langtest.datahandler.datasource.HuggingFaceDataset

class HuggingFaceDataset(source_info: dict, task: TaskManager, **kwargs)

Bases: BaseDataset

Example dataset class that loads data using the Hugging Face dataset library.

__init__(source_info: dict, task: TaskManager, **kwargs)

Initialize the HuggingFaceDataset class.

Parameters:
  • source_info (dict) – Dictionary describing the dataset to load, including its name.

  • task (TaskManager) – Task to be evaluated on.
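
A minimal sketch of how the constructor arguments might be assembled. The key names `data_source`, `split`, and `subset`, and the helper names `build_source_info` and `make_dataset`, are assumptions for illustration and are not confirmed by this page; only the import path and the constructor signature are taken from it. The `task` argument is expected to be a `TaskManager` instance supplied by the caller.

```python
from typing import Any, Dict, Optional


def build_source_info(
    data_source: str, split: str = "test", subset: Optional[str] = None
) -> Dict[str, Any]:
    """Assemble a source_info dict (hypothetical helper; key names are assumed)."""
    if not data_source:
        raise ValueError("data_source must be a non-empty dataset name")
    info: Dict[str, Any] = {"data_source": data_source, "split": split}
    # Only include the subset key when a configuration was actually requested.
    if subset is not None:
        info["subset"] = subset
    return info


def make_dataset(source_info: Dict[str, Any], task: Any):
    """Instantiate HuggingFaceDataset (hypothetical wrapper around the constructor)."""
    # Imported lazily so build_source_info works without langtest installed.
    from langtest.datahandler.datasource import HuggingFaceDataset

    return HuggingFaceDataset(source_info, task)
```

Keeping the dictionary construction separate from the instantiation makes it possible to validate the configuration before any dataset library is imported.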

Methods

__init__(source_info, task, **kwargs)

Initialize the HuggingFaceDataset class.

export_data(data, output_path)

Exports the data to the corresponding format and saves it to 'output_path'.

load_data()

Load the specified data based on the task.

load_raw_data()

Loads data into a list.

Attributes

COLUMN_NAMES

LIB_NAME

data_sources

supported_tasks

export_data(data: List[Sample], output_path: str)

Exports the data to the corresponding format and saves it to ‘output_path’.

Parameters:
  • data (List[Sample]) – Data to export.

  • output_path (str) – Path to save the data to.
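
As a hedged illustration of the call order, the sketch below loads samples and passes them straight to `export_data`. The helper name `load_and_export` is hypothetical; it assumes only that the object exposes `load_data()` and `export_data(data, output_path)` as documented on this page, and it makes no assumption about the output file format.

```python
from typing import Any


def load_and_export(dataset: Any, output_path: str) -> int:
    """Load samples from a dataset object and write them to output_path.

    `dataset` is assumed to behave like HuggingFaceDataset: load_data()
    returns a list, and export_data(data, output_path) saves that list.
    """
    samples = dataset.load_data()
    dataset.export_data(samples, output_path)
    # Return the number of exported samples so callers can log or verify it.
    return len(samples)
```

Because the helper depends only on the two documented methods, it can be exercised with a small stand-in object in tests.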

load_data() → List[Sample]

Load the specified data based on the task.

Parameters:
  • feature_column (str) – Name of the column containing the input text or document.

  • target_column (str) – Name of the column containing the target label or summary.

  • split (str) – Name of the split to load (e.g., train, validation, test).

  • subset (str) – Name of the configuration or subset to load.

Returns:

Loaded data as a list of Sample objects.

Return type:

List[Sample]

Raises:

ValueError – If an unsupported task is provided.
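
Since `load_data()` is documented to raise `ValueError` for an unsupported task, a caller may want to catch that case explicitly. The following is a sketch under that assumption only; `try_load` is a hypothetical helper, not part of the library.

```python
from typing import Any, List, Optional, Tuple


def try_load(dataset: Any) -> Tuple[List[Any], Optional[str]]:
    """Call load_data() and report a ValueError as a message instead of raising.

    Returns (samples, None) on success, or ([], error_message) when
    load_data() raises ValueError, for example for an unsupported task.
    """
    try:
        return dataset.load_data(), None
    except ValueError as exc:
        return [], str(exc)
```

Other exception types are deliberately left uncaught so that unrelated failures still surface to the caller.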

load_raw_data() → List

Loads data into a list.