Category "huggingface-transformers"

Weird behaviour when fine-tuning Hugging Face BERT model with TensorFlow

I am trying to fine-tune a Hugging Face BERT model using TensorFlow (on Colab Pro with GPU enabled) for tweet sentiment analysis. I followed, step by step, the guide on

M2M100Tokenizer.from_pretrained 'NoneType' object is not callable

I have the following chunk of code from this link: from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer hi_text = "जीव

OSError: You seem to have cloned a repository without having git-lfs installed. Please install git-lfs and run git lfs install followed by git lfs pul

I'm using Jupyter Labs on AWS SageMaker. Kernel: conda_pytorch_p36 and did Restart & Run All. I git cloned this repo. Attempt at installing git-lfs: !curl -

Type of adapters for machine translation (AdapterHub tutorial)

I'm following this guide which explains how to apply adapters to a model for a binary classification task, and I want to adapt it to a machine translation task.

Transformers model from Hugging Face throws error that specific classes couldn't be loaded

Hi, after running the code below, I get the following error. ValueError: Could not load model facebook/bart-large-mnli with any of the following classes: (<c

Solving "CUDA out of memory" when fine-tuning GPT-2 (HuggingFace)

I get the recurring CUDA out of memory error when using the HuggingFace Transformers library to fine-tune a GPT-2 model and can't seem to solve it, despite my

RuntimeError: Found dtype Long but expected Float when fine-tuning using Trainer API

I'm trying to fine-tune a BERT model for sentiment analysis (classifying text as positive/negative) with the Hugging Face Trainer API. My dataset has two columns, Text

How to early-stop an autoregressive model with a list of stop words?

I am using the GPT-Neo model from transformers to generate text. Because the prompt I use starts with '{', I would like to stop the sentence once the pairing '}'

RoBERTa classifier: cannot generate single prediction

I have successfully trained a text emotion classifier by fine-tuning a RoBERTa language model, mostly using a helpful notebook found online. Now I am trying to writ

Deploying Huggingface model for inference - pytorch-scatter issues

It's my first time with SageMaker, and I'm having issues when trying to execute this script I took from this Huggingface model (deploy tab) from sagemaker.huggi

How to specify a proxy in transformers pipeline

I am using the sentiment-analysis pipeline as described here. from transformers import pipeline classifier = pipeline('sentiment-analysis') It's failing with a con

How to load a custom dataset from CSV in Hugging Face

I would like to load a custom dataset from a CSV file using huggingface-transformers

How to train a BERT model from scratch with Hugging Face?

I found an answer about training a model from scratch in this question: How to train BERT from scratch on a new domain for both MLM and NSP? One answer uses Trainer and

Early stopping in Bert Trainer instances

I am fine tuning a BERT model for a multiclass classification task. My problem is that I don't know how to add "early stopping" to those Trainer instances. Any

Import of transformers package throwing value_error

I have successfully installed the transformers package in my Jupyter Notebook from the Anaconda administrator console using the command 'conda install -c conda-forge tr

Continual pre-training vs. Fine-tuning a language model with MLM

I have some custom data I want to use to further pre-train the BERT model. I've tried the two following approaches so far: Starting with a pre-trained BER

Hugging Face transformers module not recognized by Anaconda

I am using Anaconda, Python 3.7, Windows 10. I tried to install transformers following https://huggingface.co/transformers/ in my env. I am aware that I must have eith

Hugging Face transformers not downloading based on requirements list / pip freeze

A pip freeze yields the following for Hugging Face transformers: git+https://github.com/huggingface/transformers.git@8ddbfe975264a94f124684a138a2a5ca89a2bd0d

T5Tokenizer requires the SentencePiece library but it was not found in your environment

I am trying to explore T5. This is the code: !pip install transformers from transformers import T5Tokenizer, T5ForConditionalGeneration qa_input = """question: Wh

How to output the list of probabilities on each token via model.generate?

Right now I have: model = GPTNeoForCausalLM.from_pretrained(model_name) tokenizer = GPT2Tokenizer.from_pretrained(model_name) input_ids = tokenizer(prompt, retu