langtest.utils.custom_types.sample.SycophancySample#
- class SycophancySample(*, original_question: str, ground_truth: str, test_type: str = None, perturbed_question: str = None, expected_results: str | List = None, actual_results: str = None, dataset_name: str = None, category: str = None, state: str = None, task: str = 'sycophancy', test_case: str = None, gt: bool = False, eval_model: str = None, ran_pass: bool = None, metric_name: str = None)#
Bases:
BaseModel
A helper object for extending Sycophancy test functionality.
This class represents a sample used for conducting Sycophancy tests. It provides attributes to store information about the original and perturbed prompts, questions, ground truth, test type, and more.
- original_question#
The original question for the test.
- Type:
str
- ground_truth#
The ground truth for evaluation.
- Type:
str
- test_type#
The type of Sycophancy test.
- Type:
str
- perturbed_question#
The perturbed question for the test.
- Type:
str
- expected_results#
The expected results for evaluation.
- Type:
Union[str, List]
- actual_results#
The actual results obtained from evaluation.
- Type:
str
- dataset_name#
The name of the dataset associated with the sample.
- Type:
str
- category#
The category of the Sycophancy test.
- Type:
str
- state#
The state of the sample.
- Type:
str
- task#
The task associated with the sample.
- Type:
str
- test_case#
The test case associated with the sample.
- Type:
str
- gt#
Indicates whether the user has requested ground-truth evaluation through the config (True/False). Defaults to False.
- Type:
bool
- to_dict() Dict[str, Any] #
Returns a dictionary representation of the sample.
- transform(func: Callable, params: Dict, **kwargs)#
Transforms the sample using a specified function.
- is_pass() bool #
Checks if the Sycophancy test passes based on evaluation results.
- run(model, **kwargs) bool #
Runs the sample through a specified model and updates result attributes.
- __init__(**data)#
Constructor method
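As a rough illustration of the constructor signature above, the sketch below uses a standard-library dataclass as a stand-in. `SampleSketch` is a hypothetical name and is not the library's class; it only mirrors a subset of the documented fields and their defaults, so that the keyword-style construction can be tried without installing langtest.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class SampleSketch:
    """Hypothetical stand-in mirroring a subset of the documented fields."""

    original_question: str            # required, as in the documented signature
    ground_truth: str                 # required, as in the documented signature
    test_type: Optional[str] = None
    perturbed_question: Optional[str] = None
    expected_results: Union[str, List, None] = None
    actual_results: Optional[str] = None
    task: str = "sycophancy"          # documented default
    gt: bool = False                  # documented default


# Only the two required fields are supplied; the rest fall back to defaults.
sample = SampleSketch(
    original_question="Is 1 + 1 = 3?",
    ground_truth="No",
)
print(sample.task, sample.gt)
```

Judging by the signature alone, the same keyword arguments would be passed to `SycophancySample`, which accepts keyword-only fields.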
Methods
- __init__(**data)
Constructor method.
- construct([_fields_set])
Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data.
- copy(*[, include, exclude, update, deep])
Duplicate a model, optionally choose which fields to include, exclude and change.
- dict(*[, include, exclude, by_alias, ...])
Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
- from_orm(obj)
- is_pass()
Check if the Sycophancy test passes based on evaluation results, considering ground truth if enabled.
- is_pass_with_ground_truth(metric_function)
Check if the Sycophancy test passes based on evaluation results with respect to the ground truth.
- is_pass_without_ground_truth(metric_function)
Check if the Sycophancy test passes based on evaluation results without taking the ground truth into consideration.
- json(*[, include, exclude, by_alias, ...])
Generate a JSON representation of the model, include and exclude arguments as per dict().
- output(graded_outputs)
Check if the output is correct.
- parse_file(path, *[, content_type, ...])
- parse_obj(obj)
- parse_raw(b, *[, content_type, encoding, ...])
- run(model, **kwargs)
Run the original and perturbed sentences through the model.
- schema([by_alias, ref_template])
- schema_json(*[, by_alias, ref_template])
- to_dict()
Returns the dictionary version of the sample.
- transform(func, params, **kwargs)
Transform the sample using a specified function.
- update_forward_refs(**localns)
Try to update ForwardRefs on fields based on this Model, globalns and localns.
- validate(value)
Attributes
- eval_model
- ran_pass
- metric_name
- classmethod construct(_fields_set: SetStr | None = None, **values: Any) Model #
Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed. Behaves as if Config.extra = 'allow' was set since it adds all passed values.
- copy(*, include: AbstractSetIntStr | MappingIntStrAny | None = None, exclude: AbstractSetIntStr | MappingIntStrAny | None = None, update: DictStrAny | None = None, deep: bool = False) Model #
Duplicate a model, optionally choose which fields to include, exclude and change.
- Parameters:
include – fields to include in new model
exclude – fields to exclude from new model; as with values, this takes precedence over include
update – values to change/add in the new model. Note: the data is not validated before creating the new model: you should trust this data
deep – set to True to make a deep copy of the model
- Returns:
new model instance
- dict(*, include: AbstractSetIntStr | MappingIntStrAny | None = None, exclude: AbstractSetIntStr | MappingIntStrAny | None = None, by_alias: bool = False, skip_defaults: bool | None = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False) DictStrAny #
Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
- is_pass()#
Check if the Sycophancy test passes based on evaluation results, considering ground truth if enabled.
- Returns:
True if the test passes, False otherwise.
- Return type:
bool
- is_pass_with_ground_truth(metric_function) bool #
Check if the Sycophancy test passes based on evaluation results with respect to the ground truth.
- Returns:
True if the test passes, False otherwise.
- Return type:
bool
- is_pass_without_ground_truth(metric_function) bool #
Check if the Sycophancy test passes based on evaluation results without taking the ground truth into consideration.
- Returns:
True if the test passes, False otherwise.
- Return type:
bool
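The relationship between the three pass checks can be sketched as a simple dispatch on the gt flag. This is only an assumed reading of the descriptions above: the pass criteria, the `exact_match` metric, and the `is_pass_sketch` function are hypothetical, and the library's real `metric_function` may have a different signature and behaviour.

```python
from typing import Callable


def exact_match(reference: str, answer: str) -> bool:
    """Hypothetical metric: case-insensitive string equality."""
    return reference.strip().lower() == answer.strip().lower()


def is_pass_sketch(
    expected: str,
    actual: str,
    ground_truth: str,
    gt: bool,
    metric: Callable[[str, str], bool] = exact_match,
) -> bool:
    if gt:
        # Assumed criterion with ground truth: both the answer to the
        # original question and the answer to the perturbed question
        # must agree with the ground truth.
        return metric(ground_truth, expected) and metric(ground_truth, actual)
    # Assumed criterion without ground truth: the answer must simply
    # stay the same after the question is perturbed.
    return metric(expected, actual)


# The model answers "Yes" both times, but the ground truth is "No".
print(is_pass_sketch("Yes", "yes", "No", gt=False))  # True
print(is_pass_sketch("Yes", "yes", "No", gt=True))   # False
```

Under these assumptions, a model that is consistently wrong passes the check without ground truth and fails the check with it, which is one reason the gt flag can matter.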
- json(*, include: AbstractSetIntStr | MappingIntStrAny | None = None, exclude: AbstractSetIntStr | MappingIntStrAny | None = None, by_alias: bool = False, skip_defaults: bool | None = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, encoder: Callable[[Any], Any] | None = None, models_as_dict: bool = True, **dumps_kwargs: Any) unicode #
Generate a JSON representation of the model, include and exclude arguments as per dict().
encoder is an optional function to supply as default to json.dumps(); other arguments are as per json.dumps().
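Since the encoder argument is forwarded as default to json.dumps(), its role can be shown with the standard library alone. In the sketch below, the payload dictionary, the `created` field, and the `encode_extra` function are hypothetical and are not part of the sample class.

```python
import json
from datetime import date


def encode_extra(value):
    """Fallback called by json.dumps() for objects it cannot serialise."""
    if isinstance(value, date):
        return value.isoformat()
    raise TypeError(f"Cannot serialise {type(value).__name__}")


# 'created' is a made-up field used only to trigger the fallback.
payload = {"original_question": "Is 1 + 1 = 3?", "created": date(2024, 1, 1)}
text = json.dumps(payload, default=encode_extra)
print(text)
```

A function of the same shape could presumably be passed as encoder when calling json() on a model with non-standard field types.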
- output(graded_outputs)#
Check if the output is correct.
- run(model, **kwargs)#
Run the original and perturbed sentences through the model.
- Parameters:
model – The model to run the sentences through.
**kwargs – Additional keyword arguments.
- Returns:
True if the run is successful, False otherwise.
- Return type:
bool
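One possible reading of run() is sketched below with plain dictionaries and a callable model. Which answer is stored in expected_results and which in actual_results is an assumption here, and `run_sketch` and `fake_model` are hypothetical names, not library functions.

```python
from typing import Callable, Dict, Optional


def run_sketch(sample: Dict[str, Optional[str]], model: Callable[[str], str]) -> bool:
    # Assumption: the answer to the original question is stored as the
    # expected result, and the answer to the perturbed question as the
    # actual result.
    sample["expected_results"] = model(sample["original_question"])
    sample["actual_results"] = model(sample["perturbed_question"])
    return True


def fake_model(prompt: str) -> str:
    """Toy model that caves in whenever the prompt contains an opinion."""
    return "Agree" if "I think" in prompt else "Disagree"


sample = {
    "original_question": "Is 1 + 1 = 3?",
    "perturbed_question": "I think 1 + 1 = 3. Is 1 + 1 = 3?",
    "expected_results": None,
    "actual_results": None,
}
ok = run_sketch(sample, fake_model)
print(ok, sample["expected_results"], sample["actual_results"])
```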
- to_dict() Dict[str, Any] #
Returns the dictionary version of the sample.
- Returns:
The dictionary representation of the sample.
- Return type:
Dict[str, Any]
- transform(func: Callable, params: Dict, **kwargs)#
Transform the sample using a specified function.
- Parameters:
func (Callable) – The transformation function to apply.
params (Dict) – Parameters for the transformation function.
**kwargs – Additional keyword arguments.
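The parameter list above suggests that func is called with params and any extra keyword arguments. The sketch below assumes that func receives the original question and that its return value becomes the perturbed question; `transform_sketch` and `add_opinion` are hypothetical, and the library may pass different arguments to func.

```python
from typing import Any, Callable, Dict


def add_opinion(question: str, opinion: str) -> str:
    """Hypothetical perturbation: prepend a user opinion to the question."""
    return f"{opinion} {question}"


def transform_sketch(
    sample: Dict[str, Any],
    func: Callable[..., str],
    params: Dict[str, Any],
    **kwargs: Any,
) -> Dict[str, Any]:
    # Assumption: func maps the original question to the perturbed one.
    sample["perturbed_question"] = func(
        sample["original_question"], **params, **kwargs
    )
    return sample


sample = {"original_question": "Is 1 + 1 = 3?", "perturbed_question": None}
transform_sketch(sample, add_opinion, {"opinion": "I think the answer is yes."})
print(sample["perturbed_question"])
```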