langchain.evaluation.schema.PairwiseStringEvaluator
- class langchain.evaluation.schema.PairwiseStringEvaluator
Compare the output of two models (or two outputs of the same model).
Attributes
requires_input
Whether this evaluator requires an input string.
requires_reference
Whether this evaluator requires a reference label.
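As an illustration of how a caller might use these two flags, the sketch below checks them before running a comparison. It does not import langchain: `ExactMatchPairEvaluator` and `check_arguments` are hypothetical names introduced here, and only the attribute names `requires_input` and `requires_reference` come from this page.

```python
from typing import Optional


class ExactMatchPairEvaluator:
    """Hypothetical stand-in; not the langchain class itself."""

    # Mirrors the two documented attributes.
    requires_input = False
    requires_reference = True


def check_arguments(evaluator, reference: Optional[str], input: Optional[str]) -> None:
    """Raise ValueError if an argument the evaluator requires is missing."""
    if evaluator.requires_reference and reference is None:
        raise ValueError("this evaluator requires a reference label")
    if evaluator.requires_input and input is None:
        raise ValueError("this evaluator requires an input string")


# Passes silently: a reference is supplied and no input is required.
check_arguments(ExactMatchPairEvaluator(), reference="4", input=None)
```

Whether the library performs a check like this itself is not stated on this page, so treat the helper as one possible convention.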
Methods
__init__()
aevaluate_string_pairs(*, prediction, ...[, ...])
Asynchronously evaluate the output string pairs.
evaluate_string_pairs(*, prediction, ...[, ...])
Evaluate the output string pairs.
- __init__()
- async aevaluate_string_pairs(*, prediction: str, prediction_b: str, reference: Optional[str] = None, input: Optional[str] = None, **kwargs: Any) -> dict
Asynchronously evaluate the output string pairs.
- Parameters
prediction (str) – The output string from the first model.
prediction_b (str) – The output string from the second model.
reference (Optional[str], optional) – The expected output / reference string.
input (Optional[str], optional) – The input string.
**kwargs – Additional keyword arguments, such as callbacks and optional reference strings.
- Returns
A dictionary containing the preference, scores, and/or other information.
- Return type
dict
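The asynchronous call shape described above can be sketched as follows. This is a minimal stand-in that does not import langchain: `ShorterAnswerEvaluator` is a hypothetical class that copies only the documented keyword-only signature, and the `preference` and `score` keys in the returned dictionary are assumed names, since this page does not specify the keys.

```python
import asyncio
from typing import Any, Optional


class ShorterAnswerEvaluator:
    """Hypothetical evaluator that prefers the shorter of two outputs."""

    async def aevaluate_string_pairs(
        self,
        *,
        prediction: str,
        prediction_b: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        # Yield to the event loop, as a real evaluator would while awaiting I/O.
        await asyncio.sleep(0)
        if len(prediction) == len(prediction_b):
            return {"preference": None, "score": 0.5}
        if len(prediction) < len(prediction_b):
            return {"preference": "prediction", "score": 1.0}
        return {"preference": "prediction_b", "score": 0.0}


async def main() -> list:
    evaluator = ShorterAnswerEvaluator()
    # Run two independent comparisons concurrently.
    return await asyncio.gather(
        evaluator.aevaluate_string_pairs(
            prediction="Paris", prediction_b="The capital is Paris."
        ),
        evaluator.aevaluate_string_pairs(
            prediction="It is four.", prediction_b="4"
        ),
    )


results = asyncio.run(main())
```

Because each call is a coroutine, several pairs can be scheduled together with `asyncio.gather` instead of being awaited one at a time.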
- evaluate_string_pairs(*, prediction: str, prediction_b: str, reference: Optional[str] = None, input: Optional[str] = None, **kwargs: Any) -> dict
Evaluate the output string pairs.
- Parameters
prediction (str) – The output string from the first model.
prediction_b (str) – The output string from the second model.
reference (Optional[str], optional) – The expected output / reference string.
input (Optional[str], optional) – The input string.
**kwargs – Additional keyword arguments, such as callbacks and optional reference strings.
- Returns
A dictionary containing the preference, scores, and/or other information.
- Return type
dict
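A synchronous comparison that uses the optional reference argument might look like the sketch below. It does not import langchain: `ClosestToReferenceEvaluator` is a hypothetical class that follows the documented signature, the similarity measure (`difflib.SequenceMatcher`) is a choice made for this example, and the keys of the returned dictionary are assumed names.

```python
from difflib import SequenceMatcher
from typing import Any, Optional


class ClosestToReferenceEvaluator:
    """Hypothetical evaluator that prefers the output closest to the reference."""

    requires_input = False
    requires_reference = True

    def evaluate_string_pairs(
        self,
        *,
        prediction: str,
        prediction_b: str,
        reference: Optional[str] = None,
        input: Optional[str] = None,
        **kwargs: Any,
    ) -> dict:
        if reference is None:
            raise ValueError("reference is required by this evaluator")
        # Similarity of each candidate to the reference, in [0.0, 1.0].
        score_a = SequenceMatcher(None, prediction, reference).ratio()
        score_b = SequenceMatcher(None, prediction_b, reference).ratio()
        if score_a == score_b:
            preference = None
        else:
            preference = "prediction" if score_a > score_b else "prediction_b"
        return {"preference": preference, "score_a": score_a, "score_b": score_b}


result = ClosestToReferenceEvaluator().evaluate_string_pairs(
    prediction="4",
    prediction_b="five",
    reference="4",
    input="What is 2 + 2?",
)
```

All arguments are passed by keyword because the documented signature begins with `*`, which makes positional calls a `TypeError`.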