'spaCy library to extract noun phrase - ValueError: [E866] Expected a string or 'Doc' as input, but got: <class 'float'>
currently I'm trying to extract noun phrase from sentences. The sentences were stored in a column in excel file. Here the code using python:
import pandas as pd
import spacy
df = pd.read_excel("xxx.xlsx")
nlp = spacy.load("en_core_web_md")
for row in range(len(df)):
doc = nlp(df.loc[row, "Title"])
for np in doc.noun_chunks:
print(np.text)
But I got this error:
Traceback (most recent call last):
File "/Users/pusinov/PycharmProjects/textsummarizer/paper_term_extraction.py", line 10, in <module>
doc = nlp(df.loc[row, "Title"])
File "/Users/pusinov/PycharmProjects/textsummarizer/venv/lib/python3.9/site-packages/spacy/language.py", line 1002, in __call__
doc = self._ensure_doc(text)
File "/Users/pusinov/PycharmProjects/textsummarizer/venv/lib/python3.9/site-packages/spacy/language.py", line 1093, in _ensure_doc
raise ValueError(Errors.E866.format(type=type(doc_like)))
ValueError: [E866] Expected a string or 'Doc' as input, but got: <class 'float'>.
Can anyone help me to make better code? Thank you very much.
p.s. I'm still newbie in python
Solution 1:[1]
I faced a similar issue and I fixed it using
df['Title']= df['Title'].astype(str)
The use of this code will fix the problem. As you have to convert all the data values to str format (usually it happens as comment might be number, or nan or null).
Solution 2:[2]
Do null-value analysis. if you have any null values in your dataset, drop them.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Suraj Rao |
Solution 2 | Victor S |