I have tried to remove non-English words from a text. The problem is that many other words are absent from the NLTK words corpus. My code: import pandas as pd lst = ['
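A minimal sketch of the filtering idea described in this question, assuming a plain set lookup is acceptable. `ENGLISH_VOCAB`, `EXTRA_TERMS`, and `keep_known_words` are hypothetical names introduced for illustration; the tiny word set stands in for a real word list such as the NLTK words corpus, and the extra set shows one possible way to whitelist words the base list misses.

```python
import re

# Hypothetical stand-in for an English word list; in practice this could be
# built from a larger source such as the NLTK words corpus.
ENGLISH_VOCAB = {"i", "like", "this", "book", "very", "much"}

# Hypothetical whitelist for valid words that the base list does not contain.
EXTRA_TERMS = {"tokenizer"}


def keep_known_words(text, vocab):
    """Return the text with every token that is not in `vocab` removed."""
    # Tokens are runs of letters and apostrophes; everything else is dropped.
    tokens = re.findall(r"[A-Za-z']+", text)
    # Lookup is case-insensitive; the original casing is kept in the output.
    return " ".join(t for t in tokens if t.lower() in vocab)


allowed = ENGLISH_VOCAB | EXTRA_TERMS
cleaned = keep_known_words("I like this Buch and this tokenizer very much", allowed)
print(cleaned)
```

Extending the vocabulary with a domain-specific whitelist is only one option; whether it is enough depends on how many valid words the base list is missing.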
Assume there is a large record of all different kinds of inter-employee and customer communications (e.g. mails, chat transcripts, OCRed letters) which should b
I am trying to use NLP for the German language, but it does not work! I was building the pipeline and then running NER to find the entity of each element in a sentence, which is
I have imported ` from itertools import chain import nltk import sklearn import scipy.stats import sklearn_crfsuite from sklearn_crfsuite import scorers,CR
I'm trying to create a dataframe containing specific keywords-in-context using the kwic() function, but unfortunately, I'm running into some error when attempti
In the paper describing BERT, there is this paragraph about WordPiece Embeddings. We use WordPiece embeddings (Wu et al., 2016) with a 30,000 token vocab
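To make the WordPiece idea in this excerpt concrete, here is a minimal sketch of greedy longest-match-first subword splitting with a `##` continuation prefix. `TOY_VOCAB` and `wordpiece_tokenize` are hypothetical names for illustration only; a real vocabulary would have roughly 30,000 entries, and this sketch does not cover how the vocabulary itself is learned.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Split one word into subword pieces by greedy longest-match-first lookup."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that continue a word carry the "##" prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No piece matches at this position: the whole word is unknown.
            return [unk]
        pieces.append(piece)
        start = end
    return pieces


# Toy vocabulary, assumed purely for this example.
TOY_VOCAB = {"play", "##ing", "##ed", "un", "##play", "##able"}

print(wordpiece_tokenize("playing", TOY_VOCAB))
print(wordpiece_tokenize("unplayable", TOY_VOCAB))
print(wordpiece_tokenize("xyz", TOY_VOCAB))
```

Because rare words decompose into known pieces, a fixed-size vocabulary can cover open-ended text, with the unknown token used only when no piece matches.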
This is a pretty dumb question, but I couldn't find an answer anywhere, so I will take my chances here... I'm building a classifier using CatBoost. Since this is an NLP
I am trying to calculate the Meteor score for the following: print (nltk.translate.meteor_score.meteor_score( ["this is an apple", "that is an apple"], "an
I am trying to run the textEmbed function in R. Set up needed: require(quanteda) require(quanteda.textstats) require(udpipe) require(reticulate) #udpi
I have made a resume parser but to parse my resumes, I am using a for loop to run my parse function over each resume. Is there a way to vectorize this approach?
I would like to store vector features, like Bag-of-Words or Word-Embedding vectors of a large number of texts, in a dataset, stored in a SQL Database. What're t
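One possible approach to the storage question above, sketched with the standard-library `sqlite3` module: serialize each dense vector to a float32 BLOB and keep it in a column keyed by text id. The table name `features`, its columns, and the helpers `store_vector` and `load_vector` are assumptions made for this example, not an established schema; sparse Bag-of-Words vectors would likely need a different layout.

```python
import sqlite3
from array import array

# In-memory database for illustration; a file path would be used in practice.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features (text_id INTEGER PRIMARY KEY, dim INTEGER, vec BLOB)"
)


def store_vector(conn, text_id, vector):
    """Serialize a dense vector as float32 bytes and store it as a BLOB."""
    blob = array("f", vector).tobytes()
    conn.execute(
        "INSERT OR REPLACE INTO features (text_id, dim, vec) VALUES (?, ?, ?)",
        (text_id, len(vector), blob),
    )


def load_vector(conn, text_id):
    """Read a vector back as a list of floats, or None if the id is missing."""
    row = conn.execute(
        "SELECT vec FROM features WHERE text_id = ?", (text_id,)
    ).fetchone()
    if row is None:
        return None
    out = array("f")
    out.frombytes(row[0])
    return list(out)


store_vector(conn, 1, [0.5, 0.25, 0.0, 1.0])
restored = load_vector(conn, 1)
print(restored)
```

Storing float32 halves the size relative to float64 at the cost of precision, and the raw bytes use the machine's native byte order, so this layout is a trade-off rather than a general recommendation.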
I am working with the R programming language. I have the following data: text = structure(list(id = 1:8, reviews = c("I guess the employee decided to buy their
I got a PowerIterationFailedConvergence:(PowerIterationFailedConvergence(...), 'power iteration failed to converge within 100 iterations') when I tried to summ
I have some custom data I want to use to further pre-train the BERT model. I’ve tried the two following approaches so far: Starting with a pre-trained BER
I succeeded with English: python -m spacy download en_core_web_lg python -m spacy download en_core_web_sm python -m spacy download en I read https://spacy.io/mod
What does the term "downstream tasks" mean in NLP? I have seen this term used in several articles, but I can't understand the idea behind it.
text='Alice is a student.She likes studying.Teachers are giving a lot of homewok.' I am trying to get topics from a simple text (like above) with coherence scor
So about a week ago I posted this question: Issues running a Keras model with custom layers. The suggestion there was to try to make this question smaller and t
I have a text file which contains lines as shown below: Electronically signed : Wes Scott, M.D.; Jun 26 2010 11:10AM CST The patient was referred by Dr. J
I am using some text for some NLP analyses. I have cleaned the text taking steps to remove non-alphanumeric characters, blanks, duplicate words and stopwords, a