langtest.datahandler.datasource.PandasDataset
- class PandasDataset(file_path: str, task: TaskManager, **kwargs)
Bases: BaseDataset
Class to handle Pandas datasets. Subclass of BaseDataset.
- __init__(file_path: str, task: TaskManager, **kwargs) -> None
Initializes a PandasDataset object.
- Parameters:
file_path (str) – The path to the data file.
task (TaskManager) – The task the dataset is loaded and preprocessed for.
- Raises:
ValueError – If the specified task is unsupported.
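The constructor's task check can be pictured with a minimal sketch. It is not langtest's implementation: the class name `TaskCheckedDataset`, the task names, and the use of a plain string in place of a `TaskManager` are all assumptions made for illustration.

```python
from typing import List


class TaskCheckedDataset:
    """Hypothetical stand-in that mirrors only the documented ValueError behaviour."""

    # Illustrative values; the real `supported_tasks` attribute may differ.
    supported_tasks: List[str] = ["ner", "text-classification"]

    def __init__(self, file_path: str, task: str, **kwargs) -> None:
        # Reject tasks the dataset class does not know how to preprocess.
        if task not in self.supported_tasks:
            raise ValueError(
                f"Unsupported task {task!r}; expected one of {self.supported_tasks}"
            )
        self.file_path = file_path
        self.task = task
        self.kwargs = kwargs


dataset = TaskCheckedDataset("reviews.csv", task="ner")
```

Validating in `__init__` surfaces an unsupported task before any file is read.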
Methods
__init__(file_path, task, **kwargs) – Initializes a PandasDataset object.
export_data(data, output_path) – Exports the data to the corresponding format and saves it to 'output_path'.
load_data() – Load data from a CSV file and preprocess it based on the specified task.
load_raw_data([standardize_columns]) – Loads data from a file into raw lists of strings.
renamed_extensions([inverted]) – Rename the file extensions to the correct format.
Attributes
COLUMN_NAMES
data_sources
supported_tasks
- export_data(data: List[Sample], output_path: str)
Exports the data to the corresponding format and saves it to ‘output_path’.
- load_data() -> List[Sample]
Load data from a CSV file and preprocess it based on the specified task.
- Returns:
A list of preprocessed data samples.
- Return type:
List[Sample]
- load_raw_data(standardize_columns: bool = False) -> List[Dict]
Loads data from a file into raw lists of strings
- Parameters:
standardize_columns (bool) – Whether to standardize column names.
- Returns:
The parsed file as a list of dicts, one per row.
- Return type:
List[Dict]
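The `List[Dict]` shape returned by `load_raw_data` can be illustrated with the standard library alone. This is a sketch under stated assumptions, not langtest's code: `csv.DictReader` stands in for the pandas-based loader, and `COLUMN_ALIASES`, `standardize`, and `load_raw_rows` are hypothetical names. How langtest actually standardizes columns is not specified in this reference.

```python
import csv
import io
from typing import Dict, List, TextIO

# Hypothetical alias table; langtest's real COLUMN_NAMES may look different.
COLUMN_ALIASES: Dict[str, List[str]] = {
    "text": ["text", "sentence", "sentences"],
    "label": ["label", "labels", "class"],
}


def standardize(name: str) -> str:
    """Map a column name to its standard form, or return it unchanged."""
    key = name.strip().lower()
    for standard, aliases in COLUMN_ALIASES.items():
        if key in aliases:
            return standard
    return name


def load_raw_rows(handle: TextIO, standardize_columns: bool = False) -> List[Dict[str, str]]:
    """Parse CSV text into one dict per row, optionally renaming columns."""
    rows = []
    for row in csv.DictReader(handle):
        if standardize_columns:
            row = {standardize(column): value for column, value in row.items()}
        rows.append(dict(row))
    return rows


csv_text = "Sentence,Label\nGood movie,pos\nBad plot,neg\n"
raw_rows = load_raw_rows(io.StringIO(csv_text))
standard_rows = load_raw_rows(io.StringIO(csv_text), standardize_columns=True)
```

With `standardize_columns=False` the keys stay exactly as they appear in the file header. With `True` they are renamed through the alias table.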
- classmethod renamed_extensions(inverted: bool = False) -> Dict[str, str]
Rename the file extensions to the correct format.
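One plausible reading of the `inverted` flag is that it swaps the keys and values of an extension-to-format mapping. The sketch below shows that reading only. The mapping contents and the function name `renamed_extensions_sketch` are assumptions, not values taken from langtest.

```python
from typing import Dict

# Hypothetical extension-to-format table, chosen so that every value is unique
# and the mapping can be inverted without losing entries.
EXTENSION_TO_FORMAT: Dict[str, str] = {
    "xlsx": "excel",
    "pkl": "pickle",
    "h5": "hdf",
}


def renamed_extensions_sketch(inverted: bool = False) -> Dict[str, str]:
    """Return the extension-to-format mapping, or its inverse when requested."""
    if inverted:
        return {fmt: ext for ext, fmt in EXTENSION_TO_FORMAT.items()}
    return dict(EXTENSION_TO_FORMAT)


forward = renamed_extensions_sketch()
backward = renamed_extensions_sketch(inverted=True)
```

Inversion is only lossless when no two extensions map to the same format name, which is why the illustrative table avoids duplicates.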