Category "huggingface-tokenizers"

Tokenization with HuggingFace BartTokenizer

I am trying to use a pretrained BART model to train a pointer-generator network with the HuggingFace Transformers library. Example input of the task: from transformers
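
For context, a minimal sketch of tokenizing a batch with a pretrained BART tokenizer; the checkpoint name "facebook/bart-base" is an assumption, not necessarily the one from the question:

    # Hedged sketch: tokenize one example into padded, truncated tensors
    # suitable as model input. "facebook/bart-base" is an assumed checkpoint.
    from transformers import BartTokenizer

    tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")

    batch = tokenizer(
        ["The quick brown fox jumps over the lazy dog."],
        max_length=32,
        padding="max_length",
        truncation=True,
        return_tensors="pt",
    )
    print(batch["input_ids"].shape)  # torch.Size([1, 32])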

RoBERTa classifier: cannot generate single prediction

I have successfully trained a text emotion classifier by fine-tuning a RoBERTa language model, mostly using a helpful notebook found online. Now I am trying to write
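
A common fix for single predictions is to let the tokenizer add the batch dimension the model expects. A hedged sketch, assuming a RoBERTa sequence-classification checkpoint at a placeholder path:

    # Hedged sketch: "path/to/checkpoint" is a placeholder for the
    # fine-tuned model directory; the base tokenizer is assumed to be
    # "roberta-base".
    import torch
    from transformers import RobertaTokenizer, RobertaForSequenceClassification

    tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
    model = RobertaForSequenceClassification.from_pretrained("path/to/checkpoint")
    model.eval()

    # return_tensors="pt" yields shape [1, seq_len], i.e. a batch of one,
    # which the model expects even for a single text.
    inputs = tokenizer("I am so happy today!", return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    print(logits.argmax(dim=-1).item())  # predicted class index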

HuggingFace models work only once, then spit out a tokenizer error

I am following along with this example on HuggingFace's website, trying to work with Twitter sentiment. I am running Python 3.9 on PyCharm. The code works fine
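
One frequently reported cause with that tutorial is that it calls save_pretrained(MODEL) into a local folder named after the hub id, so the second run loads a partial local copy instead of the hub checkpoint. A hedged sketch of a workaround, saving to a distinct local directory (the checkpoint name is an assumption):

    # Hedged sketch: save under a path that cannot shadow the hub id,
    # so from_pretrained(MODEL) keeps resolving to the hub copy.
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    MODEL = "cardiffnlp/twitter-roberta-base-sentiment"  # assumed checkpoint
    tokenizer = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL)

    # Save (if needed at all) to a separate local directory.
    tokenizer.save_pretrained("./local-twitter-sentiment")
    model.save_pretrained("./local-twitter-sentiment")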

How to apply max_length to truncate the token sequence from the left in a HuggingFace tokenizer?

In the HuggingFace tokenizer, the max_length argument specifies the maximum length of the tokenized text. I believe it truncates the sequence to max_length-2 (
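
In recent versions of transformers, the tokenizer's truncation_side attribute controls which end gets cut, so max_length can keep the end of the sequence instead of the beginning. A minimal sketch, assuming a generic checkpoint:

    # Hedged sketch: truncate from the left so the last tokens survive.
    # "bert-base-uncased" is an assumed checkpoint for illustration.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    tokenizer.truncation_side = "left"  # default is "right"

    enc = tokenizer("a very long sentence " * 50, max_length=16, truncation=True)
    print(len(enc["input_ids"]))  # 16, with tokens kept from the right end

The same option can also be passed at load time, e.g. AutoTokenizer.from_pretrained(..., truncation_side="left").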

AttributeError: 'GPT2TokenizerFast' object has no attribute 'max_len'

I am just using the HuggingFace Transformers library and get the following message when running run_lm_finetuning.py: AttributeError: 'GPT2TokenizerFast' object
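
This error typically comes from running an older script against a newer transformers release, where the tokenizer attribute max_len was deprecated and removed in favor of model_max_length. A minimal sketch of the current attribute:

    # Hedged sketch: read the model's maximum sequence length via the
    # attribute that replaced the removed .max_len.
    from transformers import GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    print(tokenizer.model_max_length)  # 1024 for GPT-2; use this, not .max_len

Replacing tokenizer.max_len with tokenizer.model_max_length inside run_lm_finetuning.py (or pinning an older transformers version that still ships the script) resolves the AttributeError.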