langtest.datahandler.datasource.HuggingFaceDataset
- class HuggingFaceDataset(source_info: dict, task: TaskManager, **kwargs)
Bases: BaseDataset
Example dataset class that loads data using the Hugging Face datasets library.
- __init__(source_info: dict, task: TaskManager, **kwargs)
Initialize the HuggingFaceDataset class.
- Parameters:
source_info (dict) – Dictionary describing the dataset to load, including its name.
task (TaskManager) – Task to be evaluated on.
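As a rough illustration of what source_info might contain, the sketch below builds a plain dictionary. The key names are assumptions: they mirror the parameter names documented for load_data (feature_column, target_column, split, subset), plus a hypothetical "data_source" key for the dataset name. This page does not confirm the exact keys, so check them against your installed version before relying on them.

```python
# Hypothetical source_info dictionary. All key names are assumptions
# borrowed from the parameter names listed for load_data; "data_source"
# is an assumed key for the dataset name.
source_info = {
    "data_source": "imdb",      # name of the Hugging Face dataset
    "split": "test",            # split to load
    "feature_column": "text",   # column holding the input text
    "target_column": "label",   # column holding the target label
}

# With langtest installed, construction would look roughly like this
# (task_manager stands for an existing TaskManager instance):
# from langtest.datahandler.datasource import HuggingFaceDataset
# dataset = HuggingFaceDataset(source_info, task=task_manager)
# samples = dataset.load_data()
```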
Methods
__init__(source_info, task, **kwargs) – Initialize the HuggingFaceDataset class.
export_data(data, output_path) – Exports the data to the corresponding format and saves it to 'output_path'.
load_data() – Load the specified data based on the task.
load_raw_data() – Loads data into a list.
Attributes
COLUMN_NAMES
LIB_NAME
data_sources
supported_tasks
- export_data(data: List[Sample], output_path: str)
Exports the data to the corresponding format and saves it to ‘output_path’.
- Parameters:
data (List[Sample]) – Data to export.
output_path (str) – Path to save the data to.
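The page does not say which file format export_data writes. As a minimal sketch of the general idea, assuming a CSV target and plain dictionaries in place of Sample objects, the hypothetical helper below (export_rows_to_csv is not part of langtest) writes rows to output_path and reads them back.

```python
import csv
import os
import tempfile
from typing import Dict, List


def export_rows_to_csv(rows: List[Dict[str, str]], output_path: str) -> None:
    """Hypothetical helper: write a list of dict rows to a CSV file."""
    if not rows:
        raise ValueError("No rows to export.")
    # Use the keys of the first row as the CSV header.
    fieldnames = list(rows[0].keys())
    with open(output_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


rows = [
    {"text": "A great film.", "label": "positive"},
    {"text": "Dull and slow.", "label": "negative"},
]
output_path = os.path.join(tempfile.mkdtemp(), "export.csv")
export_rows_to_csv(rows, output_path)

# Read the file back to confirm the round trip.
with open(output_path, newline="", encoding="utf-8") as handle:
    exported = [dict(row) for row in csv.DictReader(handle)]
```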
- load_data() -> List[Sample]
Load the specified data based on the task.
- Parameters:
feature_column (str) – Name of the column containing the input text or document.
target_column (str) – Name of the column containing the target label or summary.
split (str) – Name of the split to load (e.g., train, validation, test).
subset (str) – Name of the configuration or subset to load.
- Returns:
Loaded data as a list of Sample objects.
- Return type:
List[Sample]
- Raises:
ValueError – If an unsupported task is provided.
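The behaviour described above (select a feature column and a target column, return a list of samples, raise ValueError for an unsupported task) can be sketched without the library. Everything in this block is illustrative: SimpleSample is a hypothetical stand-in for Sample, SUPPORTED_TASKS is an invented set, and rows_to_samples is not a langtest function.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SimpleSample:
    """Hypothetical stand-in for langtest's Sample type."""
    original: str
    expected: str


# Illustrative task names only; the real supported_tasks may differ.
SUPPORTED_TASKS = {"text-classification", "summarization"}


def rows_to_samples(
    rows: List[Dict[str, str]],
    task: str,
    feature_column: str,
    target_column: str,
) -> List[SimpleSample]:
    """Map raw rows to samples, rejecting unsupported tasks."""
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"Unsupported task: {task!r}")
    return [
        SimpleSample(
            original=str(row[feature_column]),
            expected=str(row[target_column]),
        )
        for row in rows
    ]


rows = [
    {"text": "A great film.", "label": "positive"},
    {"text": "Dull and slow.", "label": "negative"},
]
samples = rows_to_samples(rows, "text-classification", "text", "label")
```

Passing a task name outside SUPPORTED_TASKS raises ValueError before any rows are read, which mirrors the documented failure mode.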
- load_raw_data() -> List
Loads data into a list.