Package sparse_dot_topn in PySpark AWS EMR Jupyter install error
I'm running a PySpark notebook in Jupyter on AWS EMR and trying to install the Python package "sparse_dot_topn", version 0.2.9.
I'm getting an error I don't understand; it mentions headers, wheels, and other topics that are over my head.
I'm hoping someone can help me understand what this long error message means and point me in a direction to fix it. I've spent a lot of time on this and I'm completely stumped. Any insight is really appreciated.
I installed the required packages first, before the package that is giving me a hard time:
sc.install_pypi_package("pandas==0.25.1")
sc.install_pypi_package("numpy")
import numpy as np
sc.install_pypi_package("cython")
import cython
sc.install_pypi_package("setuptools==54.1.3")
import setuptools
sc.install_pypi_package("scipy")
import scipy
The above works fine, but when I try to install "sparse_dot_topn" I get an error:
sc.install_pypi_package("sparse_dot_topn==0.2.9")
Error Message:
Collecting sparse_dot_topn==0.2.9
Using cached https://files.pythonhosted.org/packages/70/d5/2a3a52acd89344f0c45cae320bd41ee49573caec656834b98c5ea48669b7/sparse_dot_topn-0.2.9.tar.gz
Requirement already satisfied: setuptools>=18.0 in /mnt/tmp/1617142081620-0/lib/python3.7/site-packages (from sparse_dot_topn==0.2.9)
Requirement already satisfied: cython>=0.29.15 in /mnt/tmp/1617142081620-0/lib/python3.7/site-packages (from sparse_dot_topn==0.2.9)
Requirement already satisfied: numpy>=1.16.6 in /mnt/tmp/1617142081620-0/lib/python3.7/site-packages (from sparse_dot_topn==0.2.9)
Requirement already satisfied: scipy>=1.2.3 in /mnt/tmp/1617142081620-0/lib/python3.7/site-packages (from sparse_dot_topn==0.2.9)
Building wheels for collected packages: sparse-dot-topn
Running setup.py bdist_wheel for sparse-dot-topn: started
Running setup.py bdist_wheel for sparse-dot-topn: finished with status 'error'
Complete output from command /tmp/1617142081620-0/bin/python -u -c "import setuptools, tokenize;__file__='/mnt/tmp/pip-build-1kh7kd3g/sparse-dot-topn/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpglu65hu8pip-wheel- --python-tag cp37:
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.7
creating build/lib.linux-x86_64-3.7/sparse_dot_topn
copying sparse_dot_topn/__init__.py -> build/lib.linux-x86_64-3.7/sparse_dot_topn
copying sparse_dot_topn/awesome_cossim_topn.py -> build/lib.linux-x86_64-3.7/sparse_dot_topn
running build_ext
skipping './sparse_dot_topn/sparse_dot_topn.cpp' Cython extension (up-to-date)
skipping './sparse_dot_topn/sparse_dot_topn_threaded.cpp' Cython extension (up-to-date)
building 'sparse_dot_topn.sparse_dot_topn' extension
creating build/temp.linux-x86_64-3.7
creating build/temp.linux-x86_64-3.7/sparse_dot_topn
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python3.7m -I/tmp/1617142081620-0/lib/python3.7/site-packages/numpy/core/include -c ./sparse_dot_topn/sparse_dot_topn.cpp -o build/temp.linux-x86_64-3.7/./sparse_dot_topn/sparse_dot_topn.o -std=c++0x -pthread -O3
./sparse_dot_topn/sparse_dot_topn.cpp:32:10: fatal error: Python.h: No such file or directory
#include "Python.h"
^~~~~~~~~~
compilation terminated.
error: command 'gcc' failed with exit status 1
----------------------------------------
Running setup.py clean for sparse-dot-topn
Failed to build sparse-dot-topn
Installing collected packages: sparse-dot-topn
Running setup.py install for sparse-dot-topn: started
Running setup.py install for sparse-dot-topn: finished with status 'error'
Complete output from command /tmp/1617142081620-0/bin/python -u -c "import setuptools, tokenize;__file__='/mnt/tmp/pip-build-1kh7kd3g/sparse-dot-topn/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-zpb55prd-record/install-record.txt --single-version-externally-managed --compile --install-headers /tmp/1617142081620-0/include/site/python3.7/sparse-dot-topn:
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.7
creating build/lib.linux-x86_64-3.7/sparse_dot_topn
copying sparse_dot_topn/__init__.py -> build/lib.linux-x86_64-3.7/sparse_dot_topn
copying sparse_dot_topn/awesome_cossim_topn.py -> build/lib.linux-x86_64-3.7/sparse_dot_topn
running build_ext
skipping './sparse_dot_topn/sparse_dot_topn.cpp' Cython extension (up-to-date)
skipping './sparse_dot_topn/sparse_dot_topn_threaded.cpp' Cython extension (up-to-date)
building 'sparse_dot_topn.sparse_dot_topn' extension
creating build/temp.linux-x86_64-3.7
creating build/temp.linux-x86_64-3.7/sparse_dot_topn
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/python3.7m -I/tmp/1617142081620-0/lib/python3.7/site-packages/numpy/core/include -c ./sparse_dot_topn/sparse_dot_topn.cpp -o build/temp.linux-x86_64-3.7/./sparse_dot_topn/sparse_dot_topn.o -std=c++0x -pthread -O3
./sparse_dot_topn/sparse_dot_topn.cpp:32:10: fatal error: Python.h: No such file or directory
#include "Python.h"
^~~~~~~~~~
compilation terminated.
error: command 'gcc' failed with exit status 1
----------------------------------------
Failed building wheel for sparse-dot-topn
Command "/tmp/1617142081620-0/bin/python -u -c "import setuptools, tokenize;__file__='/mnt/tmp/pip-build-1kh7kd3g/sparse-dot-topn/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-zpb55prd-record/install-record.txt --single-version-externally-managed --compile --install-headers /tmp/1617142081620-0/include/site/python3.7/sparse-dot-topn" failed with error code 1 in /mnt/tmp/pip-build-1kh7kd3g/sparse-dot-topn/
Note:
For reference, I'm trying to follow this tutorial: https://towardsdatascience.com/fuzzy-matching-at-scale-84f2bfd0c536
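The line in the log above that actually stops the build is `fatal error: Python.h: No such file or directory`: gcc is compiling the package's C++ extension and cannot find the CPython header file in the include directory it was given (`-I/usr/include/python3.7m`). A minimal sketch for checking whether that header is present, assuming it is run in the same Python environment that performs the build; `find_python_header` is a hypothetical helper name, not part of any library, and the directory `sysconfig` reports may differ from the one passed to gcc on a given cluster:

```python
import os
import sysconfig


def find_python_header(include_dir=None):
    """Return the full path to Python.h if it exists in include_dir, else None.

    find_python_header is a hypothetical helper written for this check.
    """
    if include_dir is None:
        # Directory where this interpreter expects its C headers to live.
        include_dir = sysconfig.get_paths()["include"]
    header = os.path.join(include_dir, "Python.h")
    return header if os.path.isfile(header) else None


if __name__ == "__main__":
    include_dir = sysconfig.get_paths()["include"]
    header = find_python_header(include_dir)
    if header:
        print("Found", header)
    else:
        print("Python.h is missing from", include_dir)
```

If the header is reported missing, that would be consistent with the gcc failure above. On many Linux distributions these headers ship in a separate Python development package rather than with the interpreter itself; which package applies to a given EMR release is not something the log confirms, so treat that as an assumption to verify on the cluster.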
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow