langtest.transform.accuracy.LLMEval#

class LLMEval#

Bases: BaseAccuracy

Evaluation class for Language Model performance on question-answering tasks using the Language Model Metric (LLM).

alias_name#

Alias names for the evaluation class; includes “llm_eval”.

Type:

List[str]

supported_tasks#

Supported tasks for evaluation; includes “question-answering”.

Type:

List[str]

transform(cls, test: str, y_true: List[Any], params: Dict) -> List[MinScoreSample]#

Transforms evaluation parameters and initializes the evaluation model.

run(cls, sample_list: List[MinScoreSample], *args, **kwargs) -> List[MinScoreSample]#

Runs the evaluation on a list of samples using the Language Model Metric (LLM).
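
In typical use, this metric is selected through the test configuration rather than instantiated directly. The following is a minimal sketch of such a configuration, assuming the standard langtest Harness workflow; the model name, hub, dataset, and threshold values are illustrative placeholders rather than library defaults, and the nesting of “llm_eval” under “accuracy” is an assumption based on its alias and base class.

    from langtest import Harness

    # Target model and dataset are placeholders for illustration.
    harness = Harness(
        task="question-answering",
        model={"model": "gpt-3.5-turbo", "hub": "openai"},
        data={"data_source": "BoolQ", "split": "test-tiny"},
    )

    # Enable the LLM-based accuracy metric via its alias "llm_eval".
    harness.configure({
        "tests": {
            "defaults": {"min_pass_rate": 0.65},
            "accuracy": {"llm_eval": {"min_score": 0.75}},
        }
    })

    harness.generate()
    harness.run()
    harness.report()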

__init__()#

Methods

__init__()

async_run(sample_list, y_true, y_pred, **kwargs)

Creates a task to run the accuracy measure.

run(sample_list, y_true, y_pred, **kwargs)

Runs the evaluation on a list of samples using the Language Model Metric (LLM).

transform(test, y_true, params)

Transforms evaluation parameters and initializes the evaluation model.

Attributes

alias_name

eval_model

supported_tasks

test_types

class TestConfig#

Bases: dict

clear() → None. Remove all items from D.#
copy() → a shallow copy of D#
fromkeys(iterable, value=None, /)#

Create a new dictionary with keys from iterable and values set to value.

get(key, default=None, /)#

Return the value for key if key is in the dictionary, else default.

items() → a set-like object providing a view on D's items#
keys() → a set-like object providing a view on D's keys#
pop(k[, d]) → v, remove specified key and return the corresponding value.#

If the key is not found, return the default if given; otherwise, raise a KeyError.

popitem()#

Remove and return a (key, value) pair as a 2-tuple.

Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.

setdefault(key, default=None, /)#

Insert key with a value of default if key is not in the dictionary.

Return the value for key if key is in the dictionary, else default.

update([E, ]**F) → None. Update D from dict/iterable E and F.#

If E is present and has a .keys() method, then does: for k in E: D[k] = E[k].
If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v.
In either case, this is followed by: for k in F: D[k] = F[k].

values() → an object providing a view on D's values#
async classmethod async_run(sample_list: List[MinScoreSample], y_true: List[Any], y_pred: List[Any], **kwargs)#

Creates a task to run the accuracy measure.

Parameters:
  • sample_list (List[MinScoreSample]) – List of samples to be transformed.

  • y_true (List[Any]) – True values

  • y_pred (List[Any]) – Predicted values

async static run(sample_list: List[MinScoreSample], y_true: List[Any], y_pred: List[Any], **kwargs)#

Runs the evaluation on a list of samples using the Language Model Metric (LLM).

Parameters:
  • sample_list (List[MinScoreSample]) – List of MinScoreSample instances containing evaluation information.

  • y_true (List[Any]) – List of true values for the model’s predictions.

  • y_pred (List[Any]) – List of predicted values by the model.

  • X_test (Optional) – Additional keyword argument representing the test data.

  • progress_bar (Optional) – Additional keyword argument indicating whether to display a progress bar.

  • **kwargs – Additional keyword arguments.

Returns:

List containing updated MinScoreSample instances after evaluation.

Return type:

List[MinScoreSample]
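
Because run is an asynchronous static method, it is awaited inside an event loop, and it relies on the evaluation model initialized beforehand by transform. A minimal sketch, assuming the samples, reference answers, and predictions have already been prepared; the wrapper function and the asyncio.run call are illustrative, not part of the documented API.

    import asyncio

    from langtest.transform.accuracy import LLMEval

    async def evaluate(samples, y_true, y_pred):
        # samples: list of MinScoreSample instances; y_true / y_pred: reference
        # and predicted answers. progress_bar is one of the optional keyword
        # arguments documented above.
        return await LLMEval.run(samples, y_true, y_pred, progress_bar=True)

    # Illustrative usage: updated = asyncio.run(evaluate(samples, y_true, y_pred))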

classmethod transform(test: str, y_true: List[Any], params: Dict) → List[MinScoreSample]#

Transforms evaluation parameters and initializes the evaluation model.

Parameters:
  • test (str) – The alias name for the evaluation class.

  • y_true (List[Any]) – List of true labels (not used in this method).

  • params (Dict) – Additional parameters for evaluation, including ‘model’, ‘hub’, and ‘min_score’.

Returns:

List containing a MinScoreSample instance with evaluation information.

Return type:

List[MinScoreSample]

Raises:

AssertionError – If the ‘test’ parameter is not in the alias_name list.
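
A minimal sketch of calling transform directly, assuming an OpenAI-hosted evaluation model; the model name, hub, and min_score values are placeholders.

    from langtest.transform.accuracy import LLMEval

    # 'test' must match an entry in LLMEval.alias_name, i.e. "llm_eval".
    samples = LLMEval.transform(
        test="llm_eval",
        y_true=[],  # not used by this method, per the documentation above
        params={
            "model": "gpt-3.5-turbo",  # placeholder evaluation model
            "hub": "openai",           # placeholder hub
            "min_score": 0.75,         # minimum acceptable score
        },
    )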