langtest.datahandler.datasource.DataFactory

class DataFactory(file_path: dict, task: TaskManager, **kwargs)

Bases: object

Data factory for creating Dataset objects.

The DataFactory class is responsible for creating instances of the correct Dataset type based on the file extension.
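The extension-based dispatch described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the `CsvDataset` and `JsonDataset` classes, the `LOADERS` registry, and the `pick_loader` function are hypothetical names introduced only for this example. The only detail taken from this page is that the path is read from the 'data_source' key of the `file_path` dictionary.

```python
import os
from typing import Dict


class CsvDataset:
    """Hypothetical stand-in for a CSV-backed dataset loader."""

    def __init__(self, path: str) -> None:
        self.path = path


class JsonDataset:
    """Hypothetical stand-in for a JSON-backed dataset loader."""

    def __init__(self, path: str) -> None:
        self.path = path


# Hypothetical registry: file extension (without the dot) -> loader class.
LOADERS: Dict[str, type] = {"csv": CsvDataset, "json": JsonDataset}


def pick_loader(file_path: dict):
    """Return a loader instance chosen from the extension of file_path['data_source']."""
    source = file_path["data_source"]
    _, ext = os.path.splitext(source)
    ext = ext.lstrip(".").lower()  # normalize ".CSV" -> "csv"
    if ext not in LOADERS:
        raise ValueError(f"Unsupported file extension: {ext!r}")
    return LOADERS[ext](source)
```

Keeping the mapping in a single dictionary means that supporting a new format only requires registering one more class, and unsupported extensions fail early with a clear error.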

__init__(file_path: dict, task: TaskManager, **kwargs) → None

Initializes DataFactory object.

Parameters:
  • file_path (dict) – Dictionary containing a 'data_source' key with the path to the dataset.

  • task (TaskManager) – Task to be evaluated.

Methods

__init__(file_path, task, **kwargs)

Initializes DataFactory object.

export(data, output_path)

Exports the data to the corresponding format and saves it to 'output_path'.

filter_curated_bias(tests_to_filter, bias_data)

Filters curated bias data into a list of samples.

load()

Loads the data for the correct Dataset type.

load_curated_bias(file_path)

Loads curated bias data into a list of samples.

load_raw()

Loads the data in a raw format.

Attributes

CURATED_BIAS_DATASETS

data_sources

export(data: List[Sample], output_path: str) → None

Exports the data to the corresponding format and saves it to ‘output_path’.

Parameters:
  • data (List[Sample]) – data to export

  • output_path (str) – path to save the data to

classmethod filter_curated_bias(tests_to_filter: List[str], bias_data: List[Sample]) → List[Sample]

Filters curated bias data into a list of samples.

Parameters:
  • tests_to_filter (List[str]) – names of the tests to use

  • bias_data (List[Sample]) – curated bias samples to filter

Returns:

list of processed samples

Return type:

List[Sample]
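As a rough illustration of this kind of filtering, the sketch below keeps only the samples whose test name appears in `tests_to_filter`. It makes assumptions that this page does not confirm: the `Sample` dataclass here is a hypothetical, minimal stand-in, the `test_type` field name is assumed, and `filter_by_test` is an illustrative function rather than the library's method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """Hypothetical minimal sample; the real Sample type is not shown on this page."""

    original: str
    test_type: str  # assumed field holding the name of the test


def filter_by_test(tests_to_filter: List[str], bias_data: List[Sample]) -> List[Sample]:
    """Keep the samples whose test_type is one of the requested test names."""
    wanted = set(tests_to_filter)  # set lookup avoids rescanning the list per sample
    return [sample for sample in bias_data if sample.test_type in wanted]


# Usage with made-up test names and texts.
samples = [
    Sample("first text", "test_a"),
    Sample("second text", "test_b"),
    Sample("third text", "test_a"),
]
kept = filter_by_test(["test_a"], samples)
```

The original order of `bias_data` is preserved, so downstream code that relies on sample ordering is unaffected by the filter.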

load() → List[Sample]

Loads the data for the correct Dataset type.

Returns:

Loaded text data.

Return type:

List[Sample]

classmethod load_curated_bias(file_path: str) → List[Sample]

Loads curated bias data into a list of samples.

Parameters:

file_path (str) – path to the file to load

Returns:

list of processed samples

Return type:

List[Sample]

load_raw()

Loads the data in a raw format.