langtest.utils.custom_types.sample.SycophancySample#

class SycophancySample(*, original_question: str, ground_truth: str, test_type: str = None, perturbed_question: str = None, expected_results: str | List = None, actual_results: str = None, dataset_name: str = None, category: str = None, state: str = None, task: str = 'sycophancy', test_case: str = None, gt: bool = False, eval_model: str = None, ran_pass: bool = None, metric_name: str = None)#

Bases: BaseModel

A helper object for extending Sycophancy test functionality.

This class represents a sample used for conducting Sycophancy tests. It provides attributes to store information about the original and perturbed prompts, questions, ground truth, test type, and more.

original_question#

The original question for the test.

Type:

str

ground_truth#

The ground truth for evaluation.

Type:

str

test_type#

The type of Sycophancy test.

Type:

str

perturbed_question#

The perturbed question for the test.

Type:

str

expected_results#

The expected results for evaluation.

Type:

Union[str, List]

actual_results#

The actual results obtained from evaluation.

Type:

str

dataset_name#

The name of the dataset associated with the sample.

Type:

str

category#

The category of the Sycophancy test.

Type:

str

state#

The state of the sample.

Type:

str

task#

The task associated with the sample.

Type:

str

test_case#

The test case associated with the sample.

Type:

str

gt#

Indicates whether the user wants ground truth to be used, as set through the config (True/False). Defaults to False.

Type:

bool
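The fields above can be pictured with a small stand-in. The sketch below is not the langtest class: `SampleSketch` is a hypothetical standard-library dataclass that only mirrors the field names and defaults from the signature, so it runs without langtest or pydantic installed. The real class is keyword-only and is a pydantic model; the stand-in is neither.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional, Union


@dataclass
class SampleSketch:
    """Hypothetical stand-in mirroring the documented fields; not the langtest class."""

    # Required fields (no default in the documented signature).
    original_question: str
    ground_truth: str
    # Optional fields, with the defaults shown in the documented signature.
    test_type: Optional[str] = None
    perturbed_question: Optional[str] = None
    expected_results: Union[str, List, None] = None
    actual_results: Optional[str] = None
    dataset_name: Optional[str] = None
    category: Optional[str] = None
    state: Optional[str] = None
    task: str = "sycophancy"
    test_case: Optional[str] = None
    gt: bool = False
    eval_model: Optional[str] = None
    ran_pass: Optional[bool] = None
    metric_name: Optional[str] = None


# Only the two required fields are supplied; everything else keeps its default.
sample = SampleSketch(
    original_question="What is 2 + 2?",
    ground_truth="4",
)
fields = asdict(sample)
```

With the real class, the equivalent call would pass the same keyword arguments to `SycophancySample`.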

to_dict() Dict[str, Any]#

Returns a dictionary representation of the sample.

transform(func: Callable, params: Dict, **kwargs)#

Transforms the sample using a specified function.

is_pass() bool#

Checks if the Sycophancy test passes based on evaluation results.

run(model, **kwargs) bool#

Runs the sample through a specified model and updates result attributes.

__init__(**data)#

Constructor method

Methods

__init__(**data)

Constructor method

construct([_fields_set])

Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data.

copy(*[, include, exclude, update, deep])

Duplicate a model, optionally choose which fields to include, exclude and change.

dict(*[, include, exclude, by_alias, ...])

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

from_orm(obj)

is_pass()

Check if the Sycophancy test passes based on evaluation results, considering ground truth if enabled.

is_pass_with_ground_truth(metric_function)

Check if the Sycophancy test passes based on evaluation results with respect to the ground truth.

is_pass_without_ground_truth(metric_function)

Check if the Sycophancy test passes based on evaluation results, without taking the ground truth into account.

json(*[, include, exclude, by_alias, ...])

Generate a JSON representation of the model, include and exclude arguments as per dict().

output(graded_outputs)

Check if the output is correct.

parse_file(path, *[, content_type, ...])

parse_obj(obj)

parse_raw(b, *[, content_type, encoding, ...])

run(model, **kwargs)

Run the original and perturbed sentences through the model.

schema([by_alias, ref_template])

schema_json(*[, by_alias, ref_template])

to_dict()

Returns the dictionary version of the sample.

transform(func, params, **kwargs)

Transform the sample using a specified function.

update_forward_refs(**localns)

Try to update ForwardRefs on fields based on this Model, globalns and localns.

validate(value)

Attributes

original_question

ground_truth

test_type

perturbed_question

expected_results

actual_results

dataset_name

category

state

task

test_case

gt

eval_model

ran_pass

metric_name

classmethod construct(_fields_set: SetStr | None = None, **values: Any) Model#

Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed. Behaves as if Config.extra = 'allow' was set, since it adds all passed values.

copy(*, include: AbstractSetIntStr | MappingIntStrAny | None = None, exclude: AbstractSetIntStr | MappingIntStrAny | None = None, update: DictStrAny | None = None, deep: bool = False) Model#

Duplicate a model, optionally choose which fields to include, exclude and change.

Parameters:
  • include – fields to include in new model

  • exclude – fields to exclude from new model; as with values, this takes precedence over include

  • update – values to change/add in the new model. Note: the data is not validated before creating the new model: you should trust this data

  • deep – set to True to make a deep copy of the model

Returns:

new model instance

dict(*, include: AbstractSetIntStr | MappingIntStrAny | None = None, exclude: AbstractSetIntStr | MappingIntStrAny | None = None, by_alias: bool = False, skip_defaults: bool | None = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False) DictStrAny#

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

is_pass()#

Check if the Sycophancy test passes based on evaluation results, considering ground truth if enabled.

Returns:

True if the test passes, False otherwise.

Return type:

bool
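One plausible reading of "considering ground truth if enabled" is a dispatch on the `gt` flag. The sketch below is an assumption and not the langtest implementation: `is_pass_sketch` and `exact_match` are hypothetical names, and the two pass rules are guesses at what the with-ground-truth and without-ground-truth checks might compare.

```python
from typing import Callable


def exact_match(a: str, b: str) -> bool:
    """Hypothetical metric: case-insensitive, whitespace-trimmed equality."""
    return a.strip().lower() == b.strip().lower()


def is_pass_sketch(
    gt: bool,
    ground_truth: str,
    expected_results: str,
    actual_results: str,
    metric: Callable[[str, str], bool] = exact_match,
) -> bool:
    """Assumed pass rule; the real langtest logic may differ."""
    if gt:
        # Assumption: with ground truth enabled, both answers must match it.
        return metric(expected_results, ground_truth) and metric(
            actual_results, ground_truth
        )
    # Assumption: otherwise the two answers only need to agree with each other.
    return metric(expected_results, actual_results)


# The model gives the same wrong answer before and after perturbation.
consistent_but_wrong = is_pass_sketch(False, "4", "5", "5")
checked_against_truth = is_pass_sketch(True, "4", "5", "5")
```

Under these assumptions, a model that is consistently wrong passes when `gt` is False and fails when `gt` is True, which is the distinction the flag appears intended to capture.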

is_pass_with_ground_truth(metric_function) bool#

Check if the Sycophancy test passes based on evaluation results with respect to the ground truth.

Returns:

True if the test passes, False otherwise.

Return type:

bool

is_pass_without_ground_truth(metric_function) bool#

Check if the Sycophancy test passes based on evaluation results, without taking the ground truth into account.

Returns:

True if the test passes, False otherwise.

Return type:

bool

json(*, include: AbstractSetIntStr | MappingIntStrAny | None = None, exclude: AbstractSetIntStr | MappingIntStrAny | None = None, by_alias: bool = False, skip_defaults: bool | None = None, exclude_unset: bool = False, exclude_defaults: bool = False, exclude_none: bool = False, encoder: Callable[[Any], Any] | None = None, models_as_dict: bool = True, **dumps_kwargs: Any) unicode#

Generate a JSON representation of the model, include and exclude arguments as per dict().

encoder is an optional function to supply as default to json.dumps(), other arguments as per json.dumps().

output(graded_outputs)#

Check if the output is correct.

run(model, **kwargs)#

Run the original and perturbed sentences through the model.

Parameters:
  • model – The model to run the sentences through.

  • **kwargs – Additional keyword arguments.

Returns:

True if the run is successful, False otherwise.

Return type:

bool
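As a rough illustration of running both questions through a model, the sketch below uses a hypothetical `RunSketch` class and a toy callable in place of a real model. It assumes the answer to the original question is stored in `expected_results` and the answer to the perturbed question in `actual_results`; the documentation above does not confirm this mapping, and the real method also accepts `**kwargs`, omitted here.

```python
from typing import Callable, Optional


class RunSketch:
    """Hypothetical stand-in holding only the fields that a run would touch."""

    def __init__(self, original_question: str, perturbed_question: str) -> None:
        self.original_question = original_question
        self.perturbed_question = perturbed_question
        self.expected_results: Optional[str] = None
        self.actual_results: Optional[str] = None

    def run(self, model: Callable[[str], str]) -> bool:
        # Assumption: the answer to the original question becomes the expected
        # result, and the answer to the perturbed question the actual result.
        self.expected_results = model(self.original_question)
        self.actual_results = model(self.perturbed_question)
        return True


def swayed_model(prompt: str) -> str:
    """Toy model that agrees with an opinion stated in the prompt."""
    return "5" if "I think the answer is 5" in prompt else "4"


sample = RunSketch(
    original_question="What is 2 + 2?",
    perturbed_question="I think the answer is 5. What is 2 + 2?",
)
ok = sample.run(swayed_model)
```

Here the toy model changes its answer once the prompt contains an opinion, so the two stored results differ.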

to_dict() Dict[str, Any]#

Returns the dictionary version of the sample.

Returns:

The dictionary representation of the sample.

Return type:

Dict[str, Any]

transform(func: Callable, params: Dict, **kwargs)#

Transform the sample using a specified function.

Parameters:
  • func (Callable) – The transformation function to apply.

  • params (Dict) – Parameters for the transformation function.

  • **kwargs – Additional keyword arguments.
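The parameters above suggest a pattern in which `func` produces the perturbed question from the original one. The sketch below is a guess at that pattern, not the langtest implementation: `TransformSketch` and `add_opinion` are hypothetical, and the way `params` and `**kwargs` are forwarded to `func` is an assumption.

```python
from typing import Callable, Dict, Optional


class TransformSketch:
    """Hypothetical stand-in holding only the two question fields."""

    def __init__(self, original_question: str) -> None:
        self.original_question = original_question
        self.perturbed_question: Optional[str] = None

    def transform(self, func: Callable, params: Dict, **kwargs) -> None:
        # Assumption: func takes the original question plus params/kwargs
        # and returns the perturbed question.
        self.perturbed_question = func(self.original_question, **params, **kwargs)


def add_opinion(question: str, opinion: str, prefix: str = "I think") -> str:
    """Hypothetical perturbation: prepend a stated opinion to the question."""
    return f"{prefix} {opinion} {question}"


sample = TransformSketch("What is 2 + 2?")
sample.transform(add_opinion, {"opinion": "the answer is 5."})
```

After the call, `sample.perturbed_question` holds the original question with the opinion prepended, while `original_question` is unchanged.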

classmethod update_forward_refs(**localns: Any) None#

Try to update ForwardRefs on fields based on this Model, globalns and localns.