langtest.transform.fairness.MaxGenderLLMEval

class MaxGenderLLMEval

Bases: BaseFairness

Class for evaluating fairness based on the maximum gender performance in question-answering tasks, using a language model.

alias_name

Alias names for the evaluation method.

Type:

List[str]

supported_tasks

Supported tasks for this evaluation method.

Type:

List[str]

transform(cls, test: str, data: List[Sample], params: Dict) -> List[MaxScoreSample]: Transforms data for evaluation.

run(sample_list: List[MaxScoreSample], grouped_label, **kwargs) -> List[MaxScoreSample]: Runs the evaluation process.

__init__()

Methods

__init__()

async_run(sample_list, model, **kwargs)

Creates a task for the run method.

run(sample_list, grouped_label, **kwargs)

Runs the evaluation process using a language model.

transform(test, data, params)

Transforms the data for evaluation.

Attributes

alias_name

eval_model

supported_tasks

test_types

class TestConfig

Bases: dict

clear() → None. Remove all items from D.

copy() → a shallow copy of D

fromkeys(iterable, value=None, /)

Create a new dictionary with keys from iterable and values set to value.

get(key, default=None, /)

Return the value for key if key is in the dictionary, else default.

items() → a set-like object providing a view on D's items

keys() → a set-like object providing a view on D's keys

pop(k[, d]) → v, remove specified key and return the corresponding value.

If the key is not found, return the default if given; otherwise, raise a KeyError.

popitem()

Remove and return a (key, value) pair as a 2-tuple.

Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.

setdefault(key, default=None, /)

Insert key with a value of default if key is not in the dictionary.

Return the value for key if key is in the dictionary, else default.

update([E, ]**F) → None. Update D from dict/iterable E and F.

If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]. If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v. In either case, this is followed by: for k in F: D[k] = F[k].
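The three update paths described above can be seen in a small sketch. `ThresholdConfig` and its keys are hypothetical stand-ins chosen for illustration; they are not taken from langtest, and any `dict` subclass behaves the same way.

```python
class ThresholdConfig(dict):
    """Hypothetical dict subclass standing in for a config mapping."""


config = ThresholdConfig(male=0.6)

# E has a .keys() method: every key of E is copied into the dict.
config.update({"female": 0.7})

# E lacks .keys(): it is consumed as an iterable of (key, value) pairs.
config.update([("unknown", 0.5)])

# Keyword arguments (F) are applied after E, so they win on conflicts.
config.update({"male": 0.1}, male=0.65)

print(config)
```

Because the keyword arguments are applied last, `male` ends up with the keyword value rather than the value supplied in the positional mapping.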

values() → an object providing a view on D's values

async classmethod async_run(sample_list: List[Sample], model: ModelAPI, **kwargs)

Creates a task for the run method.

Parameters:
  • sample_list (List[Sample]) – The input data to be transformed.

  • model (ModelAPI) – The model to be used for the computation.

Returns:

The task for the run method.

Return type:

asyncio.Task
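As a rough illustration of what "creates a task for the run method" can look like, the sketch below wraps a coroutine in an `asyncio.Task` with `asyncio.create_task`. The `run` body here is a placeholder and the functions are module-level rather than class methods; this is an assumption about the general pattern, not langtest's actual implementation.

```python
import asyncio
from typing import List


async def run(sample_list: List[str], **kwargs) -> List[str]:
    """Placeholder evaluation coroutine (not langtest's implementation)."""
    await asyncio.sleep(0)  # yield control once, as real I/O would
    return [s.upper() for s in sample_list]


async def async_run(sample_list: List[str], **kwargs) -> "asyncio.Task":
    """Schedule run() on the running event loop and return the task."""
    return asyncio.create_task(run(sample_list, **kwargs))


async def main() -> List[str]:
    task = await async_run(["a", "b"])
    # The task is already scheduled; awaiting it yields run()'s result.
    return await task


result = asyncio.run(main())
print(result)
```

Returning the task instead of awaiting the coroutine directly lets a caller schedule several evaluations and collect them later, for example with `asyncio.gather`.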

class max_score

Bases: dict

clear() → None. Remove all items from D.

copy() → a shallow copy of D

fromkeys(iterable, value=None, /)

Create a new dictionary with keys from iterable and values set to value.

get(key, default=None, /)

Return the value for key if key is in the dictionary, else default.

items() → a set-like object providing a view on D's items

keys() → a set-like object providing a view on D's keys

pop(k[, d]) → v, remove specified key and return the corresponding value.

If the key is not found, return the default if given; otherwise, raise a KeyError.

popitem()

Remove and return a (key, value) pair as a 2-tuple.

Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.

setdefault(key, default=None, /)

Insert key with a value of default if key is not in the dictionary.

Return the value for key if key is in the dictionary, else default.

update([E, ]**F) → None. Update D from dict/iterable E and F.

If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]. If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v. In either case, this is followed by: for k in F: D[k] = F[k].

values() → an object providing a view on D's values

async static run(sample_list: List[MaxScoreSample], grouped_label, **kwargs) → List[MaxScoreSample]

Runs the evaluation process using a language model.

Parameters:
  • sample_list (List[MaxScoreSample]) – The input data samples.

  • grouped_label – A dictionary containing grouped labels where each key corresponds to a test case and the value is a tuple containing true labels and predicted labels.

  • **kwargs – Additional keyword arguments.

Returns:

The evaluated data samples.

Return type:

List[MaxScoreSample]
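To make the shape of `grouped_label` concrete, here is a minimal synchronous sketch of a per-group evaluation. Several things in it are assumptions: `ScoreSample` is a hypothetical stand-in for `MaxScoreSample`, case-insensitive exact match replaces the language-model judgment, and the pass rule (score at or below the configured maximum) is assumed from the "max" naming rather than stated in this page.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ScoreSample:
    """Hypothetical stand-in for MaxScoreSample."""
    test_case: str                  # group name, e.g. "male"
    expected_max: float             # configured threshold
    actual: Optional[float] = None  # filled in by run()

    def is_pass(self) -> bool:
        # Assumption: a "max" test passes when the score is <= the threshold.
        return self.actual is not None and self.actual <= self.expected_max


def run(
    sample_list: List[ScoreSample],
    grouped_label: Dict[str, Tuple[List[str], List[str]]],
) -> List[ScoreSample]:
    """Score each group from its (true labels, predicted labels) tuple."""
    for sample in sample_list:
        y_true, y_pred = grouped_label[sample.test_case]
        if not y_true:
            sample.actual = 0.0
            continue
        # Exact match stands in for the LLM-based correctness judgment.
        matches = sum(
            t.strip().lower() == p.strip().lower()
            for t, p in zip(y_true, y_pred)
        )
        sample.actual = matches / len(y_true)
    return sample_list


grouped = {
    "male": (["yes", "no", "yes", "no"], ["yes", "no", "no", "no"]),
    "female": (["yes", "no"], ["no", "yes"]),
}
samples = run(
    [ScoreSample("male", 0.6), ScoreSample("female", 0.6)],
    grouped,
)
for s in samples:
    print(s.test_case, s.actual, s.is_pass())
```

The documented method is asynchronous; the sketch stays synchronous only to keep the data flow easy to follow.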

classmethod transform(test: str, data: List[Sample], params: Dict) → List[MaxScoreSample]

Transforms the data for evaluation.

Parameters:
  • test (str) – The test alias name.

  • data (List[Sample]) – The data to be transformed.

  • params (Dict) – Parameters for transformation.

Returns:

The transformed data samples based on the maximum score.

Return type:

List[MaxScoreSample]
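One plausible reading of this transformation is that it emits one expected-score sample per gender group from the configured thresholds. The sketch below follows that reading; the `max_score` parameter key, the group names, the `ExpectedMaxScore` class, and the alias string passed as `test` are all illustrative assumptions rather than confirmed details of langtest.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

# Assumed group names; the real set of groups may differ.
GROUPS = ("male", "female", "unknown")


@dataclass
class ExpectedMaxScore:
    """Hypothetical stand-in for MaxScoreSample."""
    test_type: str
    test_case: str
    max_score: float


def transform(
    test: str,
    data: List[object],
    params: Dict[str, Union[float, Dict[str, float]]],
) -> List[ExpectedMaxScore]:
    """Build one expected-score sample per group (data is unused here)."""
    raw = params["max_score"]
    if isinstance(raw, dict):
        # Per-group thresholds were given explicitly.
        thresholds = dict(raw)
    else:
        # A single number applies to every assumed group.
        thresholds = {group: float(raw) for group in GROUPS}
    return [
        ExpectedMaxScore(test, group, score)
        for group, score in thresholds.items()
    ]


uniform = transform("max_gender_llm_eval", [], {"max_score": 0.6})
per_group = transform(
    "max_gender_llm_eval", [], {"max_score": {"male": 0.5, "female": 0.7}}
)
print([s.test_case for s in uniform])
print([(s.test_case, s.max_score) for s in per_group])
```

Accepting either a single number or a per-group mapping keeps the configuration short in the common case while still allowing different thresholds per group.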