langtest.utils.custom_types.sample.SycophancySample

class SycophancySample(*, original_question: str, ground_truth: str, test_type: str = None, perturbed_question: str = None, expected_results: str | List = None, actual_results: str = None, dataset_name: str = None, category: str = None, state: str = None, task: str = 'sycophancy', test_case: str = None, gt: bool = False, eval_model: str = None, ran_pass: bool = None, metric_name: str = None)

Bases: BaseModel

A helper object for extending Sycophancy test functionality.

This class represents a single sample used in a Sycophancy test. It stores the original and perturbed questions, the ground truth, the test type, the expected and actual results, and related metadata.
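To make the required and optional fields concrete, here is a minimal sketch using a standard-library dataclass. `SycophancySampleSketch` is a hypothetical stand-in, not the real class: the actual `SycophancySample` is a pydantic `BaseModel` with keyword-only arguments. The field names and defaults are copied from the signature above, and the question text is made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class SycophancySampleSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    # The only two fields without a default in the documented signature.
    original_question: str
    ground_truth: str
    # Everything below is optional and defaults as shown in the signature.
    test_type: Optional[str] = None
    perturbed_question: Optional[str] = None
    expected_results: Optional[Union[str, List]] = None
    actual_results: Optional[str] = None
    dataset_name: Optional[str] = None
    category: Optional[str] = None
    state: Optional[str] = None
    task: str = "sycophancy"
    test_case: Optional[str] = None
    gt: bool = False
    eval_model: Optional[str] = None
    ran_pass: Optional[bool] = None
    metric_name: Optional[str] = None


# Illustrative values only; they do not come from any LangTest dataset.
sample = SycophancySampleSketch(
    original_question="Is 2 + 2 equal to 4?",
    ground_truth="Yes",
)
print(sample.task, sample.gt, sample.perturbed_question)
```

With the real class, the same two keyword arguments should be enough to build a sample, and the remaining fields would be filled in later by `transform` and `run`.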

original_question

The original question for the test.

Type: str

ground_truth

The ground truth for evaluation.

Type: str

test_type

The type of Sycophancy test.

Type: str

perturbed_question

The perturbed question for the test.

Type: str

expected_results

The expected results for evaluation.

Type: Union[str, List]

actual_results

The actual results obtained from evaluation.

Type: str

dataset_name

The name of the dataset associated with the sample.

Type: str

category

The category of the Sycophancy test.

Type: str

state

The state of the sample.

Type: str

task

The task associated with the sample.

Type: str

test_case

The test case associated with the sample.

Type: str

gt

Indicates whether the user has enabled ground-truth evaluation through the config (True/False). Defaults to False.

Type: bool

to_dict() → Dict[str, Any]: Returns a dictionary representation of the sample.

transform(func: Callable, params: Dict, **kwargs): Transforms the sample using a specified function.

is_pass() → bool: Checks if the Sycophancy test passes based on evaluation results.

run(model, **kwargs) → bool: Runs the sample through a specified model and updates result attributes.

__init__(**data): Constructor method.

Methods

`__init__`(**data)	Constructor method
`construct`([_fields_set])	Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data.
`copy`(*[, include, exclude, update, deep])	Duplicate a model, optionally choose which fields to include, exclude and change.
`dict`(*[, include, exclude, by_alias, ...])	Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.
`from_orm`(obj)
`is_pass`()	Check if the Sycophancy test passes based on evaluation results, considering ground truth if enabled.
`is_pass_with_ground_truth`(metric_function)	Check if the Sycophancy test passes based on evaluation results with respect to the ground truth.
`is_pass_without_ground_truth`(metric_function)	Check if the Sycophancy test passes based on evaluation results without taking the ground truth into account.
`json`(*[, include, exclude, by_alias, ...])	Generate a JSON representation of the model, include and exclude arguments as per dict().
`output`(graded_outputs)	Check if the output is correct.
`parse_file`(path, *[, content_type, ...])
`parse_obj`(obj)
`parse_raw`(b, *[, content_type, encoding, ...])
`run`(model, **kwargs)	Run the original and perturbed sentences through the model.
`schema`([by_alias, ref_template])
`schema_json`(*[, by_alias, ref_template])
`to_dict`()	Returns the dictionary version of the sample.
`transform`(func, params, **kwargs)	Transform the sample using a specified function.
`update_forward_refs`(**localns)	Try to update ForwardRefs on fields based on this Model, globalns and localns.
`validate`(value)

Attributes

`original_question`
`ground_truth`
`test_type`
`perturbed_question`
`expected_results`
`actual_results`
`dataset_name`
`category`
`state`
`task`
`test_case`
`gt`
`eval_model`
`ran_pass`
`metric_name`

classmethod construct(_fields_set: SetStr | None = None, **values: Any) → Model

Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed. Behaves as if Config.extra = 'allow' was set, since it adds all passed values.

copy(*, include: AbstractSetIntStr | MappingIntStrAny | None = None, exclude: AbstractSetIntStr | MappingIntStrAny | None = None, update: DictStrAny | None = None, deep: bool = False) → Model

Duplicate a model, optionally choose which fields to include, exclude and change.

Parameters:

include – fields to include in new model
exclude – fields to exclude from new model, as with values this takes precedence over include
update – values to change/add in the new model. Note: the data is not validated before creating the new model: you should trust this data
deep – set to True to make a deep copy of the model

Returns:

new model instance

dict(*, include: AbstractSetIntStr | MappingIntStrAny | None = None, exclude: AbstractSetIntStr | MappingIntStrAny | None = None, by_alias: bool = False, skip_defaults: bool | None = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False) → DictStrAny

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

is_pass()

Check if the Sycophancy test passes based on evaluation results, considering ground truth if enabled.

Returns: True if the test passes, False otherwise.
Return type: bool

is_pass_with_ground_truth(metric_function) → bool

Check if the Sycophancy test passes based on evaluation results with respect to the ground truth.

Returns: True if the test passes, False otherwise.
Return type: bool

is_pass_without_ground_truth(metric_function) → bool

Check if the Sycophancy test passes based on evaluation results without taking the ground truth into account.

Returns: True if the test passes, False otherwise.
Return type: bool
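The relationship between the three pass checks can be sketched as a simple dispatch on `gt`. Everything in this block is an assumption made for illustration: `PassCheckSketch` and `exact_match` are hypothetical names, and this page does not say how the real class chooses its `metric_function` or what exactly each check compares. The sketch assumes the ground-truth check requires both answers to match the ground truth, while the other check only requires the answer to stay the same after perturbation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


def exact_match(a: Optional[str], b: Optional[str]) -> bool:
    """Hypothetical metric function: case-insensitive exact match."""
    if a is None or b is None:
        return False
    return a.strip().lower() == b.strip().lower()


@dataclass
class PassCheckSketch:
    """Hypothetical stand-in holding only the fields the checks need."""

    ground_truth: str
    expected_results: Optional[str] = None
    actual_results: Optional[str] = None
    gt: bool = False

    def is_pass(self) -> bool:
        # Dispatch on the gt flag, as the is_pass() description suggests.
        if self.gt:
            return self.is_pass_with_ground_truth(exact_match)
        return self.is_pass_without_ground_truth(exact_match)

    def is_pass_with_ground_truth(self, metric_function: Callable) -> bool:
        # Assumed rule: both answers must agree with the ground truth.
        return metric_function(self.expected_results, self.ground_truth) and (
            metric_function(self.actual_results, self.ground_truth)
        )

    def is_pass_without_ground_truth(self, metric_function: Callable) -> bool:
        # Assumed rule: the answer must not change after perturbation.
        return metric_function(self.expected_results, self.actual_results)


# A model that answers consistently but incorrectly.
check = PassCheckSketch(ground_truth="Yes", expected_results="No", actual_results="No")
without_gt = check.is_pass()
check.gt = True
with_gt = check.is_pass()
print(without_gt, with_gt)
```

Under these assumptions, a consistently wrong model passes when `gt` is False and fails when `gt` is True, which is the practical difference between the two modes.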

json(*, include: AbstractSetIntStr | MappingIntStrAny | None = None, exclude: AbstractSetIntStr | MappingIntStrAny | None = None, by_alias: bool = False, skip_defaults: bool | None = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, encoder: Callable[[Any], Any] | None = None, models_as_dict: bool = True, **dumps_kwargs: Any) → unicode

Generate a JSON representation of the model, include and exclude arguments as per dict().

encoder is an optional function to supply as default to json.dumps(); other arguments are as per json.dumps().

output(graded_outputs)

Check if the output is correct.

run(model, **kwargs)

Run the original and perturbed sentences through the model.

Parameters:

model – The model to run the sentences through.
**kwargs – Additional keyword arguments.

Returns: True if the run is successful, False otherwise.
Return type: bool
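As a rough illustration of what `run` does, the sketch below sends both questions to a model and stores the answers. It rests on assumptions not confirmed by this page: that the model can be treated as a plain callable taking a prompt string, that the answer to the original question is stored in `expected_results`, and that the answer to the perturbed question is stored in `actual_results`. `RunSketch` and `toy_model` are hypothetical names.

```python
from typing import Callable, Optional


class RunSketch:
    """Hypothetical stand-in for the run() behaviour described above."""

    def __init__(self, original_question: str, perturbed_question: str) -> None:
        self.original_question = original_question
        self.perturbed_question = perturbed_question
        self.expected_results: Optional[str] = None
        self.actual_results: Optional[str] = None

    def run(self, model: Callable[..., str], **kwargs) -> bool:
        # Assumption: original question -> expected result,
        # perturbed question -> actual result.
        self.expected_results = model(self.original_question, **kwargs)
        self.actual_results = model(self.perturbed_question, **kwargs)
        return True


def toy_model(prompt: str, **kwargs) -> str:
    """Toy sycophantic model: agrees whenever the prompt states an opinion."""
    return "Agree" if "I think" in prompt else "Disagree"


sample = RunSketch(
    original_question="Is the claim '1 + 1 = 3' correct?",
    perturbed_question="I think the claim is correct. Is the claim '1 + 1 = 3' correct?",
)
ok = sample.run(toy_model)
print(ok, sample.expected_results, sample.actual_results)
```

The toy model changes its answer once an opinion is added to the prompt, which is the kind of behaviour a Sycophancy test is meant to expose.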

to_dict() → Dict[str, Any]

Returns the dictionary version of the sample.

Returns: The dictionary representation of the sample.
Return type: Dict[str, Any]

transform(func: Callable, params: Dict, **kwargs)

Transform the sample using a specified function.

Parameters:

func (Callable) – The transformation function to apply.
params (Dict) – Parameters for the transformation function.
**kwargs – Additional keyword arguments.
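The parameters above suggest a pattern in which `func` produces the perturbed question from the original one. The following sketch assumes exactly that: `func` receives the original question plus the entries of `params` as keyword arguments, and its return value is stored in `perturbed_question`. The real method may pass different arguments or update other fields as well; `TransformSketch` and `prepend_opinion` are hypothetical names.

```python
from typing import Callable, Dict, Optional


class TransformSketch:
    """Hypothetical stand-in for the transform() behaviour described above."""

    def __init__(self, original_question: str) -> None:
        self.original_question = original_question
        self.perturbed_question: Optional[str] = None

    def transform(self, func: Callable, params: Dict, **kwargs) -> None:
        # Assumption: func takes the original question plus params/kwargs
        # and returns the perturbed question.
        self.perturbed_question = func(self.original_question, **params, **kwargs)


def prepend_opinion(question: str, opinion: str) -> str:
    """Hypothetical perturbation: put a user opinion in front of the question."""
    return f"{opinion} {question}"


sample = TransformSketch("Is 1 + 1 equal to 3?")
sample.transform(prepend_opinion, {"opinion": "I think the answer is yes."})
print(sample.perturbed_question)
```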

classmethod update_forward_refs(**localns: Any) → None

Try to update ForwardRefs on fields based on this Model, globalns and localns.