langtest.datahandler.datasource.ConllDataset
- class ConllDataset(file_path: str, task: TaskManager)
Bases: BaseDataset
Class to handle CoNLL files. Subclass of BaseDataset.
- __init__(file_path: str, task: TaskManager) -> None
Initializes ConllDataset object.
- Parameters:
file_path (str) – Path to the data file.
task (TaskManager) – Task to perform.
Methods
__init__(file_path, task) – Initializes ConllDataset object.
export_data(data, output_path) – Exports the data to the corresponding format and saves it to 'output_path'.
load_data() – Loads data from a CoNLL file.
load_raw_data() – Loads the dataset into a list of tokens and labels.
Attributes
COLUMN_NAMES
data_sources
supported_tasks
- export_data(data: List[NERSample], output_path: str)
Exports the data to the corresponding format and saves it to ‘output_path’.
- Parameters:
data (List[NERSample]) – data to export
output_path (str) – path to save the data to
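To illustrate what exporting to a CoNLL-style layout involves, here is a minimal sketch. It is not the library's implementation: `to_conll` is a hypothetical helper, the input is plain `(token, label)` tuples rather than `NERSample` objects, and the two-column layout is a simplifying assumption (real CoNLL files often carry extra columns such as POS and chunk tags).

```python
from typing import List, Tuple


def to_conll(sentences: List[List[Tuple[str, str]]]) -> str:
    """Serialize (token, label) pairs into a two-column CoNLL-style string.

    Hypothetical helper for illustration only; not part of langtest.
    """
    blocks = []
    for sentence in sentences:
        # One "token label" line per token in the sentence.
        blocks.append("\n".join(f"{token} {label}" for token, label in sentence))
    # Sentences are separated by a blank line; the text ends with a newline.
    return "\n\n".join(blocks) + "\n"


conll_text = to_conll(
    [[("John", "B-PER"), ("sleeps", "O")], [("Paris", "B-LOC")]]
)
```

Writing `conll_text` to `output_path` with a regular file write would complete the picture; the actual method may format its columns differently.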
- load_data() -> List[NERSample]
Loads data from a CoNLL file.
- Returns:
List of formatted sentences from the dataset.
- Return type:
List[NERSample]
- load_raw_data() -> List[Dict]
Loads the dataset into a list of tokens and labels.
- Returns:
List of dicts containing tokens and labels.
- Return type:
List[Dict]
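As a rough picture of the "list of dicts containing tokens and labels" shape, the sketch below groups CoNLL lines into sentences. It is an assumption-laden illustration, not the library's code: `read_conll_sentences` is a hypothetical function, the dictionary keys `"tokens"` and `"labels"` are guesses at the layout, and it assumes the token is in the first column and the NER tag in the last.

```python
from typing import Dict, List


def read_conll_sentences(text: str) -> List[Dict[str, List[str]]]:
    """Group CoNLL lines into sentences of tokens and labels.

    Hypothetical helper for illustration only; not part of langtest.
    """
    sentences: List[Dict[str, List[str]]] = []
    tokens: List[str] = []
    labels: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        # A blank line or a -DOCSTART- marker closes the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if tokens:
                sentences.append({"tokens": tokens, "labels": labels})
                tokens, labels = [], []
            continue
        columns = line.split()
        tokens.append(columns[0])   # assumed: first column is the token
        labels.append(columns[-1])  # assumed: last column is the NER tag
    if tokens:
        # Flush the final sentence when the text has no trailing blank line.
        sentences.append({"tokens": tokens, "labels": labels})
    return sentences


sample = """-DOCSTART- -X- -X- O

John NNP B-NP B-PER
lives VBZ B-VP O
in IN B-PP O
Paris NNP B-NP B-LOC
"""
parsed = read_conll_sentences(sample)
```

Here `parsed` holds a single sentence with four tokens and their tags; the real `load_raw_data()` may use different keys or keep additional columns.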