pandas read_csv throwing ValueError: Invalid file path or buffer object type: <class 'list'>
I want to read a CSV file passed as a command-line argument. I thought I could use argparse's FileType object directly, but I'm getting errors.
from argparse import ArgumentParser, FileType
from pandas import read_csv
if __name__ == "__main__":
    parser = ArgumentParser()
    parser.add_argument("input_file_path", help="Input CSV file", type=FileType('r'), nargs=1)
    df = read_csv(parser.parse_args().input_file_path, sep="|")
    print(df.to_string())
pandas read_csv is unable to read the FileType object when I run the program as shown below. What is missing?
python csv_splitter.py test.csv
Traceback (most recent call last):
  File "csv_splitter.py", line 7, in <module>
    df = read_csv(parser.parse_args().input_file_path, sep="|")
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 605, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 457, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 814, in __init__
    self._engine = self._make_engine(self.engine)
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 1045, in _make_engine
    return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 1862, in __init__
    self._open_handles(src, kwds)
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\parsers.py", line 1357, in _open_handles
    self.handles = get_handle(
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\common.py", line 558, in get_handle
    ioargs = _get_filepath_or_buffer(
  File "C:\Users\kakkrah\AppData\Roaming\Python\Python38\site-packages\pandas\io\common.py", line 371, in _get_filepath_or_buffer
    raise ValueError(msg)
ValueError: Invalid file path or buffer object type: <class 'list'>
Solution 1:[1]
pd.read_csv cannot read a list of files, only one file at a time.
To read multiple files into one dataframe, use pd.concat with a generator expression:
df = pd.concat(pd.read_csv(p) for p in paths)
or, equivalently, with map:
df = pd.concat(map(pd.read_csv, paths))
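As a rough illustration of the pd.concat approach, here is a minimal sketch. The in-memory buffers and their contents are made up for the example and stand in for real files; sep="|" is assumed because the question's file is pipe-delimited, and ignore_index=True is an optional extra that is not part of the answer above.

```python
import io

import pandas as pd

# Two in-memory stand-ins for pipe-delimited CSV files (hypothetical data).
buffers = [
    io.StringIO("id|name\n1|alpha\n2|beta\n"),
    io.StringIO("id|name\n3|gamma\n"),
]

# read_csv handles one source at a time; concat stitches the frames together.
# ignore_index=True renumbers the rows 0..n-1 instead of repeating 0, 1, 0.
df = pd.concat(
    (pd.read_csv(buf, sep="|") for buf in buffers),
    ignore_index=True,
)
print(df.to_string())
```

With real files, the list of buffers would simply be a list of paths.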
In OP's case, even though nargs=1 limits the argument parser to consuming one file, it still returns a list containing that one file object:
print(parser.parse_args().input_file_path)
# [ <_io.TextIOWrapper> ]
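The list-wrapping behaviour can be checked without pandas or any real file. The sketch below uses plain string arguments instead of FileType (an assumption made only to keep the example self-contained) and passes the argument list explicitly rather than reading it from the command line.

```python
from argparse import ArgumentParser

# With nargs=1, argparse wraps the single value in a one-element list.
with_nargs = ArgumentParser()
with_nargs.add_argument("input_file_path", nargs=1)
wrapped = with_nargs.parse_args(["test.csv"]).input_file_path
print(wrapped)  # ['test.csv']

# Without nargs, argparse stores the value itself.
without_nargs = ArgumentParser()
without_nargs.add_argument("input_file_path")
bare = without_nargs.parse_args(["test.csv"]).input_file_path
print(bare)  # test.csv
```

So an alternative to indexing is to drop nargs=1 from add_argument altogether.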
So just index the single file:
df = pd.read_csv(parser.parse_args().input_file_path[0])
#                                                   ^^^
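Another way to sidestep the problem, sketched here as an alternative rather than as the answer's own method, is to omit nargs and pass the path string straight to read_csv, letting pandas open and close the file. The helper name load_frame, the explicit argv parameter, and the temporary test file are all hypothetical and exist only to make the example runnable.

```python
import os
import tempfile
from argparse import ArgumentParser

import pandas as pd


def load_frame(argv):
    """Parse a single CSV path from argv and load it as a DataFrame."""
    parser = ArgumentParser()
    # No nargs and no FileType: the parsed value is a plain path string.
    parser.add_argument("input_file_path", help="Input CSV file")
    args = parser.parse_args(argv)
    return pd.read_csv(args.input_file_path, sep="|")


# Demonstration with a throwaway pipe-delimited file (made-up contents).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.csv")
    with open(path, "w", newline="") as fh:
        fh.write("id|name\n1|alpha\n")
    df = load_frame([path])

print(df.to_string())
```

In a real script, load_frame(sys.argv[1:]) would take the place of the temporary-file demonstration.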
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow