I'm new to NLP and teaching myself by working on a classification task. I have a pickle file with data like this: {'food' : {'f1.txt', 'f2.txt', 'f
I am using the word_associate package in R Markdown to create word clouds across a grouping variable with multiple categories. I would like the titles of each w
I know how to generate the next word in Keras with an LSTM, but how do I predict the previous word? For example, if you have two words like "car" and "running", then it sho
import pandas as pd from sklearn.feature_extraction.text import TfidfTransformer from sklearn.feature_extraction.text import TfidfVectorizer import path import
I was wondering: if I'm doing text categorization (with spaCy, using their textcat-multi component for example), will the results improve if an NER component
I am really struggling to make things work with the new spaCy v3 version. The documentation is full. However, I am trying to run a training loop in a script. (I
I want to implement a voice separation project. I have a voice dataset with no background noise and a noise dataset with sounds such as engine sound, horn sound
I need a TF-IDF value for a word that is found in a number of documents, not only in a single or specific document. For example, consider this corpus c
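For the TF-IDF question above, one possible reading is that a single score per word is wanted across the whole corpus. The excerpt is cut off before its example corpus, so the following is only a minimal sketch under stated assumptions: the toy `corpus`, the function names, the `tf = count / length` and `idf = ln(N / df)` variants, and the choice to average the per-document scores are all hypothetical, not taken from the question.

```python
import math


def tfidf_per_document(term, docs):
    """Return the TF-IDF of `term` in each tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    Other weighting schemes exist and give different numbers.
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain the term at all.
    df = sum(1 for doc in docs if term in doc)
    if df == 0:
        return [0.0] * n_docs
    idf = math.log(n_docs / df)
    return [(doc.count(term) / len(doc)) * idf if doc else 0.0 for doc in docs]


def corpus_tfidf(term, docs):
    """Collapse the per-document scores into one value by averaging (an assumption)."""
    scores = tfidf_per_document(term, docs)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy corpus, already tokenized on whitespace.
corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "birds sing at dawn".split(),
]

print(tfidf_per_document("cat", corpus))
print(corpus_tfidf("cat", corpus))
```

Averaging is just one way to aggregate; summing or taking the maximum over documents would be equally easy to swap in, depending on what the score is meant to express.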
I am relatively new to Python and NLTK. I have Flickr data stored in a CSV and want to remove non-English words from the tags column. I keep getting er
I'm currently trying to perform a sentiment analysis on a kwic object, but I'm afraid that the kwic() function does not return all rows it should return. I'm no
I'm a beginner at NLP, so I'm trying to reproduce the most basic Transformer ("Attention Is All You Need") code, but a question came up while doing it. In the MultiHeadAttention l
I once again have a question about the kwic() function from the quanteda package. I want to extract the five words around a specific keyword (in the example bel
When I use criterion = nn.BCELoss() for my binary classification task, it causes a problem and prints "Using a target size (torch.Size([2])) that is different
code: model = create_model() model.compile(optimize=tf.keras.optimizers.Adam(learning_rate=2e-5), loss=tf.keras.losses.BinaryCrossentropy(),
I have a question related to the continuous Bag of Words model. If I have a vocabulary size of 1000, a window size of 2, and the number of nodes in the hidden la
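The CBOW excerpt above is cut off before the hidden-layer size and the actual question, so this is only a rough sketch of how the layer shapes follow from those three quantities in the standard CBOW architecture. The function name and the hidden size of 100 are hypothetical placeholders, the window is assumed to count words on each side of the target, and bias terms are assumed absent.

```python
def cbow_shapes(vocab_size, hidden_size, window_size):
    """Describe the weight matrices of a plain CBOW model (no biases assumed)."""
    # Assumption: the window counts words on each side of the target word.
    context_words = 2 * window_size
    return {
        "context_words": context_words,
        # Input (embedding) matrix: one row per vocabulary word.
        "input_weights": (vocab_size, hidden_size),
        # Output matrix: projects the hidden vector back onto the vocabulary.
        "output_weights": (hidden_size, vocab_size),
        "total_weights": 2 * vocab_size * hidden_size,
    }


# hidden_size=100 is a made-up value; the question's own number is not visible.
shapes = cbow_shapes(vocab_size=1000, hidden_size=100, window_size=2)
for name, value in shapes.items():
    print(name, value)
```

Under these assumptions the window size changes how many context vectors are averaged into the hidden layer, but not the number of trainable weights.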
[here] I tried to do it with sp.hstack() and with
I have a subset of a dataframe that looks like: <OUT> PageNumber english_only_tags 175 flower architecture people 162 hair red bobbles
Working fine for months, then I interrupted a "bert-large-cased" download and the following code returns the error in the title: from transformers import BertMo
My code: model = SentenceTransformer('hiiamsid/sentence_similarity_spanish_es') I apply the model to the text column of the data frame prueba['encoder'] = prueb
While running the code with displacy, I see the images being created perfectly as expected. They are also projected to a server, the address of which is mention