langtest.datahandler.datasource.SparkDataset

class SparkDataset(file_path: str | dict, task: TaskManager, **kwargs)

Class to handle Spark datasets. Subclass of BaseDataset.

__init__(file_path: str | dict, task: TaskManager, **kwargs) → None

Initializes a SparkDataset object.

Parameters:

Methods

`__init__`(file_path, task, **kwargs)	Initializes a SparkDataset object.
`export_data`(data, output_path)	Exports the data to the corresponding format and saves it to 'output_path'.
`load_data`()	Load data from any file and preprocess it based on the specified task.
`load_raw_data`()	Load data from a file into raw lists of strings.

Attributes

export_data(data: List[Sample], output_path: str)

Exports the data to the corresponding format and saves it to 'output_path'.

load_data() → List[Sample]

Load data from any file and preprocess it based on the specified task.

load_raw_data() → List[Dict]

Load data from a file into raw lists of strings.
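
The relationship between the three methods documented above can be illustrated with a minimal, self-contained sketch. It does not use langtest or Spark: `ToyDataset`, `ToySample`, and `round_trip_demo` are hypothetical stand-ins that mirror only the documented method names and return types, a plain string replaces the `TaskManager` argument, and the CSV layout with `text` and `label` columns is an assumption made for illustration.

```python
import csv
import os
import tempfile
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ToySample:
    """Hypothetical stand-in for langtest's Sample type."""
    text: str
    label: str


class ToyDataset:
    """Hypothetical stand-in that mirrors the documented method names only."""

    def __init__(self, file_path: str, task: str, **kwargs) -> None:
        self.file_path = file_path
        self.task = task  # plain string here; the real class takes a TaskManager
        self.kwargs = kwargs

    def load_raw_data(self) -> List[Dict]:
        # Read every row as a dict, with no task-specific processing.
        with open(self.file_path, newline="", encoding="utf-8") as fh:
            return list(csv.DictReader(fh))

    def load_data(self) -> List[ToySample]:
        # Turn the raw rows into sample objects (assumed columns: text, label).
        return [ToySample(row["text"], row["label"]) for row in self.load_raw_data()]

    def export_data(self, data: List[ToySample], output_path: str) -> None:
        # Write the samples back out in the same CSV layout.
        with open(output_path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=["text", "label"])
            writer.writeheader()
            for sample in data:
                writer.writerow({"text": sample.text, "label": sample.label})


def round_trip_demo() -> List[Dict]:
    """Write a tiny CSV, load it, export it, and return the re-read raw rows."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "in.csv")
        dst = os.path.join(tmp, "out.csv")
        with open(src, "w", newline="", encoding="utf-8") as fh:
            fh.write("text,label\ngood movie,pos\nbad movie,neg\n")
        dataset = ToyDataset(src, task="text-classification")
        samples = dataset.load_data()
        dataset.export_data(samples, dst)
        return ToyDataset(dst, task="text-classification").load_raw_data()
```

In this sketch `load_data` builds on `load_raw_data`, so the raw rows stay available for inspection while the processed samples are what gets passed to `export_data`. Whether SparkDataset is structured the same way internally is not stated here.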