pandas read_csv throwing ValueError: Invalid file path or buffer object type: <class 'list'>

I want to read a CSV file passed as a command-line argument. I thought I could use argparse's FileType object directly, but I'm getting errors.

from argparse import ArgumentParser, FileType
from pandas import read_csv

if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument("input_file_path", help="Input CSV file", type=FileType('r'), nargs=1)
    df = read_csv(parser.parse_args().input_file_path, sep="|")
    print(df.to_string())

pandas read_csv fails on the FileType object when I run the program as shown below. What am I missing?

python csv_splitter.py test.csv

Traceback (most recent call last):
  File "csv_splitter.py", line 7, in <module>
    df = read_csv(parser.parse_args().input_file_path, sep="|")
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 605, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 457, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 814, in __init__
    self._engine = self._make_engine(self.engine)
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 1045, in _make_engine
    return mapping[engine](self.f, **self.options)  # type: ignore[call-arg]
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 1862, in __init__
    self._open_handles(src, kwds)
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 1357, in _open_handles
    self.handles = get_handle(
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\common.py", line 558, in get_handle
    ioargs = _get_filepath_or_buffer(
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\common.py", line 371, in _get_filepath_or_buffer
    raise ValueError(msg)
ValueError: Invalid file path or buffer object type: <class 'list'>


Solution 1:[1]

pd.read_csv cannot read a list of files, only one at a time.

To read multiple files into one dataframe, use pd.concat with a generator:

df = pd.concat(pd.read_csv(p) for p in paths)

Or pd.concat with map:

df = pd.concat(map(pd.read_csv, paths))
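A minimal sketch of the generator form, using in-memory buffers in place of real files so it runs anywhere. The sample data and the ignore_index=True argument are illustrative assumptions, not part of the original answer; ignore_index simply renumbers the rows of the combined frame.

```python
import io

import pandas as pd

# Hypothetical stand-ins for real CSV paths: read_csv accepts file-like objects too.
paths = [
    io.StringIO("a,b\n1,2\n"),
    io.StringIO("a,b\n3,4\n"),
]

# Read each source separately, then stack the resulting frames row-wise.
df = pd.concat((pd.read_csv(p) for p in paths), ignore_index=True)
print(df.to_string())
```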

In OP's case, even though nargs=1 limits the arg parser to consuming 1 file, it still returns a list of that 1 file object:

print(parser.parse_args().input_file_path)
# [ <_io.TextIOWrapper> ]
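The list behaviour can be checked with argparse alone. The sketch below uses plain string arguments instead of FileType so that no real file is needed; parse_with_nargs is a hypothetical helper introduced only for this comparison.

```python
from argparse import ArgumentParser


def parse_with_nargs(argv, nargs=None):
    """Parse one positional argument with the given nargs setting."""
    parser = ArgumentParser()
    parser.add_argument("input_file_path", nargs=nargs)
    return parser.parse_args(argv).input_file_path


as_list = parse_with_nargs(["test.csv"], nargs=1)  # nargs=1 wraps the value in a list
as_scalar = parse_with_nargs(["test.csv"])         # default nargs yields the bare value

print(type(as_list).__name__, as_list)
print(type(as_scalar).__name__, as_scalar)
```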

So just index the single file:

df = pd.read_csv(parser.parse_args().input_file_path[0], sep="|")
#                                                   ^^^
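Another option, assuming the script only ever takes one file, is to leave out nargs=1 so that argparse stores the file object itself rather than a one-element list. The sketch below is one way this might look; load_frame and the temporary sample file are hypothetical and exist only to make the example runnable.

```python
import os
import tempfile
from argparse import ArgumentParser, FileType

from pandas import read_csv


def load_frame(argv):
    """Parse argv and load the pipe-separated CSV it names."""
    parser = ArgumentParser()
    # No nargs here, so input_file_path is a single open file, not a list.
    parser.add_argument("input_file_path", help="Input CSV file", type=FileType("r"))
    args = parser.parse_args(argv)
    # Close the handle that argparse opened once pandas has read it.
    with args.input_file_path as handle:
        return read_csv(handle, sep="|")


# Write a small sample file so the example does not depend on test.csv existing.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("a|b\n1|2\n3|4\n")
    sample_path = tmp.name

try:
    df = load_frame([sample_path])
    print(df.to_string())
finally:
    os.remove(sample_path)
```

In the real script, load_frame would be called with no explicit list (parser.parse_args() reads sys.argv by default), so that python csv_splitter.py test.csv works as in the question.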

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1