How to load a pre-trained word2vec MODEL file and reuse it?
I want to use a pre-trained word2vec model, but I don't know how to load it in Python.
The file is a MODEL file (703 MB).
It can be downloaded here:
http://devmount.github.io/GermanWordEmbeddings/
Solution 1:[1]
Just for loading:
import gensim
# Load the pre-trained Word2Vec model from disk.
model = gensim.models.Word2Vec.load("modelName.model")
Now you can train the model as usual. Also, if you want to be able to save it and retrain it multiple times, here's what you should do:
model.train(...)  # insert the proper training parameters here
"""
If you don't plan to train the model any further, calling
init_sims will make the model much more memory-efficient
If `replace` is set, forget the original vectors and only keep the normalized
ones = saves lots of memory!
replace=True if you want to reuse the model
"""
model.init_sims(replace=True)
# save the model for later use
# for loading, call Word2Vec.load()
model.save("modelName.model")
Solution 2:[2]
Use KeyedVectors to load the pre-trained model.
from gensim.models import KeyedVectors
# Load word vectors stored in the binary word2vec format.
word2vec_path = 'path/GoogleNews-vectors-negative300.bin.gz'
w2v_model = KeyedVectors.load_word2vec_format(word2vec_path, binary=True)
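Once loaded, the vectors can be queried directly; a quick sanity check might look like this (the query word 'dog' is just an illustrative example, and the 300-dimensional shape is specific to the GoogleNews vectors):
# Look up the raw 300-dimensional vector for a word.
vector = w2v_model['dog']
print(vector.shape)  # (300,)

# Find the most similar words by cosine similarity.
print(w2v_model.most_similar('dog', topn=5))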
Solution 3:[3]
I used the same model in my code, and since I couldn't load it, I asked the author about it. His answer was that the model has to be loaded in binary format:
gensim.models.KeyedVectors.load_word2vec_format(w2v_path, binary=True)
This worked for me, and I think it should work for you, too.
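Putting it together for the German model from the question, a hedged end-to-end sketch could look like this (the local filename german.model and the query word 'Auto' are assumptions for illustration):
import gensim

# Load the downloaded German embeddings in binary word2vec format.
w2v_path = 'german.model'  # assumed path to the downloaded MODEL file
model = gensim.models.KeyedVectors.load_word2vec_format(w2v_path, binary=True)

# Example query: nearest neighbours of a German word.
print(model.most_similar('Auto', topn=5))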
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source
---|---
Solution 1 | AbtPst
Solution 2 | Nilani Algiriyage
Solution 3 | Risa