langtest.pipelines.utils.data_helpers.ner_dataset.NERDataset#
- class NERDataset(tokens: List[List[str]], labels: List[List[str]], tokenizer: PreTrainedTokenizer | PreTrainedTokenizerFast, max_length: int = 128)#
Bases:
Dataset
Dataset for NER task
- __init__(tokens: List[List[str]], labels: List[List[str]], tokenizer: PreTrainedTokenizer | PreTrainedTokenizerFast, max_length: int = 128)#
Constructor method
- Parameters:
tokens (List[List[str]]) – list of tokens per sample
labels (List[List[str]]) – list of labels per sample
tokenizer (Union[PreTrainedTokenizer, PreTrainedTokenizerFast]) – tokenizer to use
max_length (int) – number of maximum tokens to use per sample
Methods
__init__
(tokens, labels, tokenizer[, max_length])Constructor method