langtest.transform.accuracy.LLMEval#

class LLMEval#

Bases: BaseAccuracy

Evaluation class for Language Model performance on question-answering tasks using the Language Model Metric (LLM).

alias_name#

Alias names for the evaluation class; includes “llm_eval”.

Type:

List[str]

supported_tasks#

Supported tasks for evaluation; includes “question-answering”.

Type:

List[str]

transform(cls, test: str, y_true: List[Any], params: Dict) -> List[MinScoreSample]:

Transforms evaluation parameters and initializes the evaluation model.

run(cls, sample_list: List[MinScoreSample], *args, **kwargs) -> List[MinScoreSample]:

Runs the evaluation on a list of samples using the Language Model Metric (LLM).
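
In practice this class is usually exercised through a LangTest Harness configuration rather than called directly. The snippet below is a minimal sketch assuming the standard Harness workflow (configure, generate, run, report); the model ID, hub, and dataset names are illustrative placeholders, and min_score is the threshold that reaches this evaluation class.

from langtest import Harness

harness = Harness(
    task="question-answering",
    model={"model": "gpt-3.5-turbo-instruct", "hub": "openai"},  # illustrative
    data={"data_source": "BoolQ", "split": "test-tiny"},         # illustrative
)
harness.configure({
    "tests": {
        "defaults": {"min_pass_rate": 0.65},
        "accuracy": {"llm_eval": {"min_score": 0.60}},
    }
})
harness.generate().run().report()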

__init__()#

Methods

__init__()

async_run(sample_list, y_true, y_pred, **kwargs)

Creates a task to run the accuracy measure.

run(sample_list, y_true, y_pred, **kwargs)

Runs the evaluation on a list of samples using the Language Model Metric (LLM).

transform(test, y_true, params)

Transforms evaluation parameters and initializes the evaluation model.

Attributes

alias_name

eval_model

supported_tasks

test_types

async classmethod async_run(sample_list: List[MinScoreSample], y_true: List[Any], y_pred: List[Any], **kwargs)#

Creates a task to run the accuracy measure.

Parameters:
  • sample_list (List[MinScoreSample]) – List of samples to be transformed.

  • y_true (List[Any]) – True values

  • y_pred (List[Any]) – Predicted values
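
A brief sketch of driving async_run from asynchronous code. It assumes, as the description above suggests, that the method returns an asyncio task wrapping run(), so the returned task is awaited to obtain the evaluated samples; the sample and label variables are placeholders from your own pipeline.

import asyncio

from langtest.transform.accuracy import LLMEval

async def evaluate(samples, y_true, y_pred, X_test):
    # Assumption: async_run wraps run() in an asyncio task, per its description.
    task = await LLMEval.async_run(samples, y_true, y_pred, X_test=X_test)
    return await task  # expected: list of updated MinScoreSample instances

From synchronous code this coroutine would be driven with asyncio.run(evaluate(...)).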

async static run(sample_list: List[MinScoreSample], y_true: List[Any], y_pred: List[Any], **kwargs)#

Runs the evaluation on a list of samples using the Language Model Metric (LLM).

Parameters:
  • sample_list (List[MinScoreSample]) – List of MinScoreSample instances containing evaluation information.

  • y_true (List[Any]) – List of true values for the model’s predictions.

  • y_pred (List[Any]) – List of predicted values by the model.

  • X_test (Optional) – Additional keyword argument representing the test data.

  • progress_bar (Optional) – Additional keyword argument indicating whether to display a progress bar.

  • **kwargs – Additional keyword arguments.

Returns:

List containing updated MinScoreSample instances after evaluation.

Return type:

List[MinScoreSample]
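
A minimal sketch of calling run() directly from synchronous code. Because run() is an async static method it must be driven by an event loop, and the evaluation model is assumed to have been initialized beforehand by transform() (documented below); samples, labels, and X_test are placeholders from your own pipeline.

import asyncio

from langtest.transform.accuracy import LLMEval

def evaluate_samples(samples, y_true, y_pred, X_test):
    # Drive the async static method with an event loop; progress_bar is optional.
    return asyncio.run(
        LLMEval.run(samples, y_true, y_pred, X_test=X_test, progress_bar=True)
    )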

classmethod transform(test: str, y_true: List[Any], params: Dict) → List[MinScoreSample]#

Transforms evaluation parameters and initializes the evaluation model.

Parameters:
  • test (str) – The alias name for the evaluation class.

  • y_true (List[Any]) – List of true labels (not used in this method).

  • params (Dict) – Additional parameters for evaluation, including ‘model’, ‘hub’, and ‘min_score’.

Returns:

List containing a MinScoreSample instance with evaluation information.

Return type:

List[MinScoreSample]

Raises:

AssertionError – If the ‘test’ parameter is not in the alias_name list.
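
A minimal sketch of calling transform() directly. The params keys follow the documentation above (‘model’, ‘hub’, ‘min_score’); the concrete model name and hub are illustrative placeholders, and y_true is passed as an empty list because it is documented as unused here.

from langtest.transform.accuracy import LLMEval

params = {"model": "gpt-4o-mini", "hub": "openai", "min_score": 0.75}  # illustrative values
# "llm_eval" must appear in LLMEval.alias_name, otherwise an AssertionError is raised.
samples = LLMEval.transform("llm_eval", y_true=[], params=params)
# Expected: a list containing a single MinScoreSample that carries the min_score
# threshold, with the evaluation model initialized for later run()/async_run() calls.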