langtest.datahandler.datasource.PandasDataset
- class PandasDataset(file_path: str, task: TaskManager, **kwargs)
Bases: BaseDataset
Class to handle Pandas datasets. Subclass of BaseDataset.
- __init__(file_path: str, task: TaskManager, **kwargs) -> None
Initializes a PandasDataset object.
- Parameters:
file_path (str) – The path to the data file.
task (TaskManager) – The task the data is loaded and preprocessed for.
- Raises:
ValueError – If the specified task is unsupported.
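The constructor contract above (reject an unsupported task with a ValueError) can be illustrated with a small standalone sketch. This is not langtest's implementation: `check_task` and the task names in `SUPPORTED_TASKS` are hypothetical, chosen only to show the validation pattern. The real set of tasks is exposed through the `supported_tasks` attribute and is not listed on this page.

```python
from typing import Iterable

# Hypothetical task names for illustration only; the real values live in
# PandasDataset.supported_tasks and are not listed on this page.
SUPPORTED_TASKS = {"ner", "text-classification"}


def check_task(task_name: str, supported: Iterable[str] = SUPPORTED_TASKS) -> str:
    """Return task_name unchanged, or raise ValueError if it is unsupported."""
    supported = set(supported)
    if task_name not in supported:
        raise ValueError(
            f"Unsupported task {task_name!r}; expected one of {sorted(supported)}"
        )
    return task_name
```

Failing early in this way means a misconfigured task surfaces at construction time rather than later, while the data is being loaded.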
Methods
__init__(file_path, task, **kwargs) – Initializes a PandasDataset object.
export_data(data, output_path) – Exports the data to the corresponding format and saves it to 'output_path'.
load_data() – Load data from a CSV file and preprocess it based on the specified task.
load_raw_data([standardize_columns]) – Loads data from a file into a raw list of dicts.
renamed_extensions([inverted]) – Rename the file extensions to the correct format.
Attributes
COLUMN_NAMES
data_sources
dataset_size
supported_tasks
- export_data(data: List[Sample], output_path: str)
Exports the data to the corresponding format and saves it to ‘output_path’.
- load_data() -> List[Sample]
Load data from a CSV file and preprocess it based on the specified task.
- Returns:
A list of preprocessed data samples.
- Return type:
List[Sample]
- load_raw_data(standardize_columns: bool = False) -> List[Dict]
Loads data from a file into a raw list of dicts.
- Parameters:
standardize_columns (bool) – Whether to standardize column names.
- Returns:
The parsed file as a list of dicts.
- Return type:
List[Dict]
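To make the return shape concrete, here is a minimal standalone sketch of reading a CSV into a list of dicts with optional column standardization. It uses only the standard library and is an assumption about the general idea, not langtest's code: `load_raw_rows` and the `COLUMN_ALIASES` table are hypothetical, and the real standardization rules are not documented on this page.

```python
import csv
import io
from typing import Dict, List, TextIO

# Hypothetical alias table: maps variant header spellings to one canonical name.
COLUMN_ALIASES = {
    "text": "text",
    "sentence": "text",
    "label": "label",
    "labels": "label",
}


def load_raw_rows(handle: TextIO, standardize_columns: bool = False) -> List[Dict[str, str]]:
    """Parse CSV text into a list of dicts, one dict per data row."""
    rows = [dict(row) for row in csv.DictReader(handle)]
    if not standardize_columns:
        return rows
    # Rename known headers to their canonical form; leave unknown ones as-is.
    return [
        {COLUMN_ALIASES.get(key.strip().lower(), key): value for key, value in row.items()}
        for row in rows
    ]


sample = io.StringIO("Sentence,Labels\nhello world,greeting\n")
rows = load_raw_rows(sample, standardize_columns=True)
```

With `standardize_columns=False` the original header spellings are kept, so the keys of each dict match the file exactly.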
- classmethod renamed_extensions(inverted: bool = False) -> Dict[str, str]
Rename the file extensions to the correct format.
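The page does not say what the returned mapping contains or what `inverted` does. As a rough sketch, assuming the method maps file extensions to format names and that `inverted=True` swaps keys and values, the behavior might look like the following. `renamed_extensions_sketch` and the entries in `EXTENSION_TO_FORMAT` are hypothetical.

```python
from typing import Dict

# Hypothetical extension-to-format mapping; the real mapping used by
# PandasDataset is not shown on this page.
EXTENSION_TO_FORMAT: Dict[str, str] = {"csv": "csv", "tsv": "csv", "xlsx": "excel"}


def renamed_extensions_sketch(inverted: bool = False) -> Dict[str, str]:
    """Return extension -> format, or format -> extension when inverted."""
    if not inverted:
        return dict(EXTENSION_TO_FORMAT)
    inverted_map: Dict[str, str] = {}
    for extension, fmt in EXTENSION_TO_FORMAT.items():
        # Several extensions may share a format; keep the first one seen.
        inverted_map.setdefault(fmt, extension)
    return inverted_map
```

Because a Dict[str, str] cannot hold two extensions for one format, an inverted mapping has to pick one. This sketch keeps the first entry in insertion order.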