langtest.transform.fairness.MaxGenderLLMEval
- class MaxGenderLLMEval
Bases:
BaseFairness
Class for evaluating fairness in question-answering tasks based on the maximum per-gender performance score, with answers evaluated by a language model.
- alias_name
Alias names for the evaluation method.
- Type:
List[str]
- supported_tasks
Supported tasks for this evaluation method.
- Type:
List[str]
- transform(cls, test: str, data: List[Sample], params: Dict) -> List[MaxScoreSample]
Transforms data for evaluation.
- run(sample_list: List[MaxScoreSample], grouped_label, **kwargs) -> List[MaxScoreSample]
Runs the evaluation process.
- __init__()
Methods
__init__()
async_run(sample_list, model, **kwargs): Creates a task for the run method.
run(sample_list, grouped_label, **kwargs): Runs the evaluation process using a language model.
transform(test, data, params): Transforms the data for evaluation.
Attributes
eval_model
test_types
- class TestConfig
Bases:
dict
- clear() -> None. Remove all items from D.
- copy() -> a shallow copy of D
- fromkeys(iterable, value=None, /)
Create a new dictionary with keys from iterable and values set to value.
- get(key, default=None, /)
Return the value for key if key is in the dictionary, else default.
- items() -> a set-like object providing a view on D's items
- keys() -> a set-like object providing a view on D's keys
- pop(k[, d]) -> v, remove specified key and return the corresponding value.
If the key is not found, return the default if given; otherwise, raise a KeyError.
- popitem()
Remove and return a (key, value) pair as a 2-tuple.
Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.
- setdefault(key, default=None, /)
Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
- update([E, ]**F) -> None. Update D from dict/iterable E and F.
If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]
If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]
- values() -> an object providing a view on D's values
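Because TestConfig derives from dict, the methods listed above behave as they do on a built-in dictionary. The following is a minimal sketch using a hypothetical stand-in class, DemoConfig, with made-up keys (max_score, male, female); the real fields of TestConfig are not shown in this reference, so treat the key names as assumptions.

```python
class DemoConfig(dict):
    """Hypothetical stand-in for a dict-based config class (not langtest code)."""


cfg = DemoConfig(max_score=0.6)

# get() returns the default instead of raising when the key is absent.
missing = cfg.get("min_score", None)

# setdefault() inserts only when the key is missing and returns the stored value.
kept = cfg.setdefault("max_score", 0.9)

# update() accepts a mapping and/or keyword arguments.
cfg.update({"male": 0.7}, female=0.8)

# popitem() removes the most recently inserted pair (LIFO order).
last = cfg.popitem()

# fromkeys() is a classmethod, so it builds an instance of the subclass.
built = DemoConfig.fromkeys(["male", "female"], 0.5)
```

Note that setdefault leaves an existing value untouched, which makes it a convenient way to fill in defaults without overwriting user-supplied settings.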
- async classmethod async_run(sample_list: List[Sample], model: ModelAPI, **kwargs)
Creates a task for the run method.
- Parameters:
sample_list (List[Sample]) – The input data to be transformed.
model (ModelAPI) – The model to be used for the computation.
- Returns:
The task for the run method.
- Return type:
asyncio.Task
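The pattern described here, a coroutine classmethod that wraps run in an asyncio.Task, might look roughly like the sketch below. DemoEval and its simplified run signature are hypothetical and only illustrate the task-creation idea; this is not the langtest implementation.

```python
import asyncio
from typing import List


class DemoEval:
    """Hypothetical evaluator used only to illustrate the pattern."""

    @staticmethod
    async def run(sample_list: List[str], model: str, **kwargs) -> List[str]:
        # Placeholder work: yield to the event loop, then tag each sample.
        await asyncio.sleep(0)
        return [f"{model}:{sample}" for sample in sample_list]

    @classmethod
    async def async_run(cls, sample_list: List[str], model: str, **kwargs) -> asyncio.Task:
        # Schedule run() on the running event loop and hand back the task.
        return asyncio.create_task(cls.run(sample_list, model, **kwargs))


async def main() -> List[str]:
    # First await yields the Task; second await yields the result of run().
    task = await DemoEval.async_run(["q1", "q2"], model="demo")
    return await task


result = asyncio.run(main())
```

Returning the task rather than its result lets a caller schedule several evaluations first and await them together later, for example with asyncio.gather.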
- class max_score
Bases:
dict
- clear() -> None. Remove all items from D.
- copy() -> a shallow copy of D
- fromkeys(iterable, value=None, /)
Create a new dictionary with keys from iterable and values set to value.
- get(key, default=None, /)
Return the value for key if key is in the dictionary, else default.
- items() -> a set-like object providing a view on D's items
- keys() -> a set-like object providing a view on D's keys
- pop(k[, d]) -> v, remove specified key and return the corresponding value.
If the key is not found, return the default if given; otherwise, raise a KeyError.
- popitem()
Remove and return a (key, value) pair as a 2-tuple.
Pairs are returned in LIFO (last-in, first-out) order. Raises KeyError if the dict is empty.
- setdefault(key, default=None, /)
Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
- update([E, ]**F) -> None. Update D from dict/iterable E and F.
If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]
If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]
- values() -> an object providing a view on D's values
- async static run(sample_list: List[MaxScoreSample], grouped_label, **kwargs) -> List[MaxScoreSample]
Runs the evaluation process using a language model.
- Parameters:
sample_list (List[MaxScoreSample]) – The input data samples.
grouped_label – A dictionary containing grouped labels where each key corresponds to a test case and the value is a tuple containing true labels and predicted labels.
**kwargs – Additional keyword arguments.
- Returns:
The evaluated data samples.
- Return type:
List[MaxScoreSample]
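The shape of grouped_label described above can be pictured as follows. This is a sketch with made-up group keys and labels; the exact-match scoring is a simple stand-in chosen for illustration, not the language-model-based evaluation that run performs.

```python
from typing import Dict, List, Tuple

# Hypothetical grouped labels: key -> (true labels, predicted labels).
grouped_label: Dict[str, Tuple[List[str], List[str]]] = {
    "male": (["paris", "4", "blue"], ["paris", "4", "red"]),
    "female": (["rome", "7"], ["rome", "7"]),
}


def exact_match_rate(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of predictions that equal the true label (stand-in metric)."""
    if not y_true:
        return 0.0
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# One score per group, computed from that group's label pair.
scores = {
    group: exact_match_rate(y_true, y_pred)
    for group, (y_true, y_pred) in grouped_label.items()
}
```

Each group is scored independently, so a fairness check can then compare every group's score against its own threshold.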
- classmethod transform(test: str, data: List[Sample], params: Dict) -> List[MaxScoreSample]
Transforms the data for evaluation.
- Parameters:
test (str) – The test alias name.
data (List[Sample]) – The data to be transformed.
params (Dict) – Parameters for transformation.
- Returns:
The transformed data samples based on the maximum score.
- Return type:
List[MaxScoreSample]
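One way to picture a transformation "based on the maximum score" is to expand the params into one threshold record per group. The sketch below assumes a hypothetical max_score parameter that is either a single float or a per-group mapping, assumed group names, and a made-up DemoMaxScoreSample record and test alias; none of these are confirmed by this reference.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

GROUPS = ("male", "female", "unknown")  # assumed group names


@dataclass
class DemoMaxScoreSample:
    """Hypothetical record holding one group's maximum-score threshold."""

    test_type: str
    test_case: str
    expected_max: float


def demo_transform(
    test: str, params: Dict[str, Union[float, Dict[str, float]]]
) -> List[DemoMaxScoreSample]:
    max_score = params["max_score"]  # assumed parameter name
    if isinstance(max_score, dict):
        # Per-group thresholds were supplied explicitly.
        thresholds = dict(max_score)
    else:
        # A single number applies to every group.
        thresholds = {group: float(max_score) for group in GROUPS}
    return [DemoMaxScoreSample(test, group, value) for group, value in thresholds.items()]


# "demo_max_gender_test" is a placeholder alias, not a documented one.
samples = demo_transform("demo_max_gender_test", {"max_score": 0.6})
```

Accepting both a scalar and a mapping keeps simple configurations short while still allowing group-specific thresholds.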