Parsing in memory CSV files from zip archives
I'm working on a new library which will allow the user to parse any file (xlsx, csv, json, tar, zip, txt) into generators.
Now I'm stuck at zip archives: when I try to parse a csv from one, I get io.UnsupportedOperation: seek immediately after elem.seek(0). The csv file is a simple one, 4 rows by 4 columns. If I parse the csv directly using the csv_parser I get what I want, but trying to parse it from a zip archive... boom. Error!
with open("/Users/ro/Downloads/archive_file/csv.zip", 'r') as my_file_file:
    asd = parse_zip(my_file_file)
    print asd
Where parse_zip is:
def parse_zip(element):
    """Function for manipulating zip files"""
    try:
        my_zip = zipfile.ZipFile(element, 'r')
    except zipfile.BadZipfile:
        raise err.NestedArchives(element)
    else:
        my_file = my_zip.open('corect_csv.csv')
        # print my_file
        my_mime = csv_tsv_parser.parse_csv_tsv(my_file)
        print list(my_mime)
And parse_csv_tsv is:
def _csv_tsv_parser(element):
    """Helper function for csv and tsv files that returns a generator"""
    for row in element:
        if any(s for s in row):
            yield row

def parse_csv_tsv(elem):
    """Function for manipulating all the csv files"""
    dialect = csv.Sniffer().sniff(elem.readline())
    elem.seek(0)
    data_file = csv.reader(elem, dialect)
    read_data = _csv_tsv_parser(data_file)
    yield '', read_data
Where am I wrong? Is the way I'm opening the file OK or...?
Solution 1:[1]
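ZipFile.open returns a file-like ZipExtFile object that inherits from io.BufferedIOBase. io.BufferedIOBase does not implement seek itself (the default inherited from io.IOBase raises io.UnsupportedOperation), and ZipExtFile in the Python version used here does not override it, hence the exception. Python 3.7 and later add seek to ZipExtFile, but only when the file object used to open the archive is itself seekable.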
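As an illustration of the point above, here is a minimal Python 3 sketch of a stream that has no seek, together with a generic fallback that copies such a stream into an io.BytesIO so that readline() followed by seek(0) works. ForwardOnlyStream and ensure_seekable are hypothetical names made up for this example, not part of zipfile or of the question's code, and buffering the whole member in memory is only reasonable for small files.

```python
import io


class ForwardOnlyStream(io.BufferedIOBase):
    """Hypothetical read-only stream that, like the base class, cannot seek."""

    def __init__(self, payload):
        self._inner = io.BytesIO(payload)

    def readable(self):
        return True

    def read(self, size=-1):
        # Delegate reading; seek() and seekable() are deliberately not
        # overridden, so the io.IOBase defaults apply.
        return self._inner.read(size)


def ensure_seekable(stream):
    """Return the stream itself if it can seek, else an in-memory copy."""
    if stream.seekable():
        return stream
    # Reads the remaining content into memory (assumes the data is small).
    return io.BytesIO(stream.read())


if __name__ == "__main__":
    raw = ForwardOnlyStream(b"a,b\n1,2\n")
    try:
        raw.seek(0)
    except io.UnsupportedOperation as exc:
        print("seek failed:", exc)

    buffered = ensure_seekable(ForwardOnlyStream(b"a,b\n1,2\n"))
    first_line = buffered.readline()
    buffered.seek(0)  # works on the in-memory copy
    print(first_line, buffered.read())
```

With a wrapper like this, the original readline() plus seek(0) logic could stay unchanged, at the cost of holding the whole member in memory.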
However, ZipExtFile does provide a peek method, which returns a number of bytes without moving the file pointer. So changing
dialect = csv.Sniffer().sniff(elem.readline())
elem.seek(0)
to
num_bytes = 128 # number of bytes to read
dialect = csv.Sniffer().sniff(elem.peek(n=num_bytes))
solves the problem.
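For readers on Python 3, where peek returns bytes and csv.Sniffer expects text, the same idea might look like the sketch below. It is not the question's library code: read_csv_from_zip, its parameters, the UTF-8 encoding and the restricted delimiter list are all assumptions made for this example.

```python
import csv
import io
import zipfile


def read_csv_from_zip(zip_source, member_name, sample_size=128):
    """Return the non-empty rows of one CSV member of a zip archive."""
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member_name) as member:
            # peek() returns buffered bytes without advancing the position,
            # so no seek(0) is needed after sniffing.
            sample = member.peek(sample_size).decode("utf-8", errors="replace")
            # Assumption: the file uses one of these common delimiters.
            dialect = csv.Sniffer().sniff(sample, delimiters=",;\t")
            # csv.reader needs text, so wrap the binary member.
            text = io.TextIOWrapper(member, encoding="utf-8", newline="")
            return [row for row in csv.reader(text, dialect) if any(row)]


if __name__ == "__main__":
    # Build a small archive in memory instead of reading one from disk.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("demo.csv", "a,b,c,d\n1,2,3,4\n5,6,7,8\n")
    buffer.seek(0)
    print(read_csv_from_zip(buffer, "demo.csv"))
```

The sample is decoded with errors="replace" because a fixed-size peek may cut a multi-byte character in half; that only affects the text given to the sniffer, not the rows that are read afterwards.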
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | snakecharmerb |