langtest.datahandler.datasource.ConllDataset

class ConllDataset(file_path: str, task: TaskManager)

Bases: BaseDataset

Class to handle CoNLL files. Subclass of BaseDataset.

__init__(file_path: str, task: TaskManager) → None

Initializes ConllDataset object.

Parameters:
  • file_path (str) – Path to the data file.

  • task (TaskManager) – The task to perform.

Methods

__init__(file_path, task)

Initializes ConllDataset object.

export_data(data, output_path)

Exports the data to the corresponding format and saves it to 'output_path'.

load_data()

Loads data from a CoNLL file.

load_raw_data()

Loads the dataset into a list of dicts containing tokens and labels.

Attributes

COLUMN_NAMES

data_sources

supported_tasks

export_data(data: List[NERSample], output_path: str)

Exports the data to the corresponding format and saves it to ‘output_path’.

Parameters:
  • data (List[NERSample]) – data to export

  • output_path (str) – path to save the data to
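
As a rough illustration of what exporting to a CoNLL-style file involves, the sketch below writes one token and its label per line, with a blank line between sentences. It does not call langtest: `write_conll` is a hypothetical helper, plain dicts with assumed `tokens` and `labels` keys stand in for NERSample objects, and the two-column layout is an assumption, so the columns export_data actually writes may differ.

```python
import os
import tempfile
from typing import Dict, List


def write_conll(sentences: List[Dict[str, List[str]]], output_path: str) -> None:
    """Write sentences as 'token label' lines, one blank line per sentence.

    Hypothetical helper for illustration; not part of langtest.
    """
    with open(output_path, "w", encoding="utf-8") as f:
        for sentence in sentences:
            for token, label in zip(sentence["tokens"], sentence["labels"]):
                f.write(f"{token} {label}\n")
            # A blank line marks the end of a sentence.
            f.write("\n")


# Assumed stand-in for List[NERSample]: plain dicts of tokens and labels.
sentences = [
    {"tokens": ["EU", "rejects"], "labels": ["B-ORG", "O"]},
    {"tokens": ["Peter", "Blackburn"], "labels": ["B-PER", "I-PER"]},
]

output_path = os.path.join(tempfile.mkdtemp(), "sample.conll")
write_conll(sentences, output_path)

# Read the file back to inspect what was written.
with open(output_path, encoding="utf-8") as f:
    written = f.read()
```

Writing to a temporary directory keeps the example free of side effects; in practice output_path would point wherever the exported file should live.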

load_data() → List[NERSample]

Loads data from a CoNLL file.

Returns:

List of formatted sentences from the dataset.

Return type:

List[NERSample]

load_raw_data() → List[Dict]

Loads the dataset into a list of dicts containing tokens and labels.

Returns:

List of dicts, each containing tokens and labels.

Return type:

List[Dict]
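
To make the "list of dicts containing tokens and labels" shape concrete, here is a minimal standalone sketch of parsing CoNLL-style text. It is not the library's implementation: `parse_conll` is a hypothetical function, the `tokens` and `labels` key names are assumptions, and it assumes the common CoNLL-2003 layout in which the first column is the token, the last column is the NER label, blank lines separate sentences, and `-DOCSTART-` lines mark document boundaries.

```python
from typing import Dict, List


def parse_conll(text: str) -> List[Dict[str, List[str]]]:
    """Parse CoNLL-style text into one dict per sentence.

    Hypothetical helper for illustration; not part of langtest.
    """
    sentences: List[Dict[str, List[str]]] = []
    tokens: List[str] = []
    labels: List[str] = []

    for line in text.splitlines():
        line = line.strip()
        # A blank line or a -DOCSTART- marker ends the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if tokens:
                sentences.append({"tokens": tokens, "labels": labels})
                tokens, labels = [], []
            continue
        columns = line.split()
        tokens.append(columns[0])   # first column: the token
        labels.append(columns[-1])  # last column: the NER label

    # Flush the final sentence if the text does not end with a blank line.
    if tokens:
        sentences.append({"tokens": tokens, "labels": labels})
    return sentences


sample = """-DOCSTART- -X- -X- O

EU NNP B-NP B-ORG
rejects VBZ B-VP O
German JJ B-NP B-MISC
call NN I-NP O

Peter NNP B-NP B-PER
Blackburn NNP I-NP I-PER
"""

raw = parse_conll(sample)
```

Each entry in `raw` pairs a sentence's tokens with its labels by position, which is the kind of intermediate structure a loader can then convert into richer sample objects.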