langtest.transform.fairness.MaxGenderLLMEval

class MaxGenderLLMEval

Bases: BaseFairness

Class for evaluating fairness based on the maximum per-gender performance in question-answering tasks, scored using a language model.

alias_name

Alias names for the evaluation method.

supported_tasks

Supported tasks for this evaluation method.

transform(cls, test: str, data: List[Sample], params: Dict) -> List[MaxScoreSample]: Transforms data for evaluation.

run(sample_list: List[MaxScoreSample], grouped_label, **kwargs) -> List[MaxScoreSample]: Runs the evaluation process.

Methods

`__init__`()
`async_run`(sample_list, model, **kwargs)	Creates a task for the run method.
`run`(sample_list, grouped_label, **kwargs)	Runs the evaluation process using a language model.
`transform`(test, data, params)	Transforms the data for evaluation.

Attributes

async classmethod async_run(sample_list: List[Sample], model: ModelAPI, **kwargs)

Creates a task for the run method.

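Parameters:

sample_list (List[Sample]) – The input data samples.
model (ModelAPI) – The model to be used for evaluation.
**kwargs – Additional keyword arguments.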

Returns:

The task for the run method.

Return type:

asyncio.Task
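The task-wrapping pattern described for `async_run` can be illustrated with the standard library alone. This is a minimal sketch and not langtest's implementation: the `Evaluator` class, the body of its `run` method, and the string samples are hypothetical stand-ins chosen only to show how a coroutine is scheduled as an `asyncio.Task` and awaited later.

```python
import asyncio
from typing import List


class Evaluator:
    """Hypothetical stand-in for a fairness test class (not langtest code)."""

    @staticmethod
    async def run(sample_list: List[str], **kwargs) -> List[str]:
        # Placeholder for the real evaluation work.
        await asyncio.sleep(0)
        return [s.upper() for s in sample_list]

    @classmethod
    async def async_run(cls, sample_list: List[str], **kwargs) -> asyncio.Task:
        # Schedule run() on the running event loop and hand back the task.
        return asyncio.create_task(cls.run(sample_list, **kwargs))


async def main() -> List[str]:
    # First await obtains the task, second await waits for its result.
    task = await Evaluator.async_run(["male", "female"])
    return await task


result = asyncio.run(main())
```

Returning the task rather than the result lets a caller schedule several evaluations before awaiting any of them.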

async static run(sample_list: List[MaxScoreSample], grouped_label, **kwargs) → List[MaxScoreSample]

Runs the evaluation process using a language model.

Parameters:

sample_list (List[MaxScoreSample]) – The input data samples.
grouped_label – A dictionary containing grouped labels where each key corresponds to a test case and the value is a tuple containing true labels and predicted labels.
**kwargs – Additional keyword arguments.

Returns:

The evaluated data samples.

Return type:

List[MaxScoreSample]
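The shape of `grouped_label` described above (a key mapped to a tuple of true labels and predicted labels) can be sketched as follows. The keys, the answers, and the `exact_match_score` helper are all hypothetical; in particular, exact matching is only a stand-in for the language-model-based scoring this class performs.

```python
from typing import Dict, List, Tuple

# Hypothetical grouped labels: key -> (true labels, predicted labels).
grouped_label: Dict[str, Tuple[List[str], List[str]]] = {
    "male": (["Paris", "4"], ["Paris", "5"]),
    "female": (["Rome", "7"], ["Rome", "7"]),
}


def exact_match_score(true: List[str], pred: List[str]) -> float:
    """Fraction of predictions equal to the reference (stand-in metric)."""
    if not true:
        return 0.0
    matches = sum(t == p for t, p in zip(true, pred))
    return matches / len(true)


# One score per group, computed from that group's label tuple.
scores = {key: exact_match_score(t, p) for key, (t, p) in grouped_label.items()}
```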

classmethod transform(test: str, data: List[Sample], params: Dict) → List[MaxScoreSample]

Transforms the data for evaluation.

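Parameters:

test (str) – The name of the test.
data (List[Sample]) – The input data samples to transform.
params (Dict) – Parameters for the test configuration.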

Returns:

The transformed data samples based on the maximum score.

Return type:

List[MaxScoreSample]
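One way to picture a transform that yields threshold-carrying samples is sketched below. Everything here is an assumption made for illustration: the `ThresholdSample` dataclass stands in for `MaxScoreSample` (whose real fields are not listed on this page), the `"max_score"` key in `params` and the group names are hypothetical, and `build_samples` is not a langtest function.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class ThresholdSample:
    """Hypothetical stand-in for MaxScoreSample (fields are assumptions)."""
    test_case: str
    expected_max: float


def build_samples(
    params: Dict[str, Union[float, Dict[str, float]]],
    groups: List[str],
) -> List[ThresholdSample]:
    max_score = params["max_score"]  # assumed key name
    # A single number applies to every group; a dict gives per-group limits.
    if isinstance(max_score, dict):
        limits = {g: float(max_score[g]) for g in groups}
    else:
        limits = {g: float(max_score) for g in groups}
    return [ThresholdSample(g, limits[g]) for g in groups]


samples = build_samples({"max_score": 0.6}, ["male", "female"])
per_group = build_samples({"max_score": {"male": 0.7, "female": 0.75}},
                          ["male", "female"])
```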