'bz2 is module not available' when installing Pandas with pip in a Python virtual environment

I am working through the post "Numpy, Scipy, and Pandas - Oh My!" to install some Python packages, but I got stuck at the line that installs Pandas:

pip install -e git+https://github.com/pydata/pandas#egg=pandas

I changed 'wesm' to 'pydata' to get the latest version; the only other difference from the post is that I'm using pythonbrew.

I found this post, related to the error, but where is the Makefile for bz2 mentioned in the answer? Is there another way to resolve this problem?

Any help would be much appreciated. Thanks.



Solution 1:[1]

You need to build python with BZIP2 support.

Install the following package before building python:

  • Red Hat/Fedora/CentOS: yum install bzip2-devel
  • Debian/Ubuntu: sudo apt-get install libbz2-dev
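Before starting the build, it may be worth checking that the bzip2 header actually landed on the system. The following is a minimal sketch, not part of the original answer: the helper name `find_header` and the two search directories are assumptions for illustration, and some distributions may place `bzlib.h` elsewhere.

```python
from pathlib import Path
from typing import Iterable, Optional

# Assumed search locations; a distribution may install headers elsewhere.
DEFAULT_INCLUDE_DIRS = ("/usr/include", "/usr/local/include")


def find_header(
    name: str, include_dirs: Iterable[str] = DEFAULT_INCLUDE_DIRS
) -> Optional[Path]:
    """Return the first path where the header `name` exists, or None."""
    for directory in include_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # bzlib.h is the header shipped by the bzip2 development package.
    header = find_header("bzlib.h")
    if header is None:
        print("bzlib.h not found; install bzip2-devel or libbz2-dev first")
    else:
        print(f"found {header}; the Python build should be able to use bzip2")
```

If the header is missing, building Python will most likely succeed anyway but leave out the bz2 module, which is the situation described in the question.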

Extract the Python source tarball, then run the following from inside the extracted directory:

./configure
make
make install

Install pip using the new python.
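To confirm that the rebuilt interpreter really has bz2 support, a small check like the one below can be run with the new python. This is only a sketch; `has_module` and `bz2_roundtrip_ok` are hypothetical helper names, not something the build provides.

```python
import importlib
import sys


def has_module(name: str) -> bool:
    """Return True if this interpreter can import `name`."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


def bz2_roundtrip_ok() -> bool:
    """Compress and decompress a small payload; False if bz2 is missing."""
    if not has_module("bz2"):
        return False
    import bz2

    payload = b"pandas needs bz2 " * 8
    return bz2.decompress(bz2.compress(payload)) == payload


if __name__ == "__main__":
    # Print the interpreter path so it is clear which build was tested.
    print("interpreter:", sys.executable)
    print("bz2 usable:", bz2_roundtrip_ok())
```

Printing `sys.executable` matters here because several Python installations can coexist, and it is easy to test the old one by accident.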

Alternative:

Install a binary Python distribution using yum or apt that was built with BZIP2 support.

See also: ImportError: No module named bz2 for Python 2.7.2

Solution 2:[2]

I spent a lot of time searching the internet and found only partial answers. Here is what you need to do to make it work; follow every step.

  1. Run sudo apt-get install libbz2-dev (thanks to Freek Wiekmeijer for this).
    You also need to build Python with bz2 support; a Python that was already installed won't work. To do that:

  2. Download a stable Python version from https://www.python.org/downloads/source/ and extract the gzipped source tarball. You can use wget https://python-tar-file-link.tgz to download it and tar -xvzf python-tar-file.tgz to extract it into the current directory.

  3. Go into the extracted folder, then run the following commands one at a time:

    • ./configure
    • make
    • make install
  4. This builds a python executable with support for the bz2 library you installed in step 1.

  5. Since this Python doesn't have pip installed, the idea is to create a virtual environment with the newly built Python and then install pandas with pip inside that environment.

  6. You will see the python executable in the same directory. Use it to create a virtual environment:

    • ./python -m venv myenv (create myenv in the same directory or elsewhere; it's your choice)
    • source myenv/bin/activate (activate virtual environment)
    • pip install pandas (install pandas in the current environment)
  7. That's it. Now with this environment, you should be able to use pandas without error.
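One way to double-check the result without activating the environment is to ask its interpreter directly whether bz2 and pandas import. This is a rough sketch that assumes the environment is named myenv as in the steps above; `env_python` and `can_import` are hypothetical helpers.

```python
import os
import subprocess
import sys
from pathlib import Path


def env_python(env_dir: Path) -> Path:
    """Path to the interpreter inside a venv (the layout differs on Windows)."""
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def can_import(python_exe: Path, module: str) -> bool:
    """Run `python_exe -c "import module"` and report whether it succeeded."""
    result = subprocess.run(
        [str(python_exe), "-c", f"import {module}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # "myenv" is the environment name used in the steps above.
    exe = env_python(Path("myenv"))
    if not exe.exists():
        print(f"{exe} not found; create the environment first")
    else:
        for module in ("bz2", "pandas"):
            print(module, "importable:", can_import(exe, module))
```

Running the import in a subprocess tests the environment's own interpreter rather than whichever python happens to run the script.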

Solution 3:[3]

pyenv

I noticed that installing Python from source takes a long time (I am doing it on an i7 :/ ), especially the make and make test steps.

A simpler and quicker solution was to install another version of Python (I used Python 3.7.8) using pyenv. Install it using these steps.

It not only solved the problem of using multiple Python instances on the same system but also maintains my virtual environments without virtualenvwrapper (which turned buggy on my newly set up ubuntu-20.04).

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 bagerard
Solution 2 Vasiliy Artamonov
Solution 3 Pe Dro