Python ZipFile doesn't recognize file as zipfile

I have a bunch of zipfiles (all from the same source) that I'm trying to extract, and one of them throws an error:

BadZipFile: File is not a zip file

Here is the code that I'm using (which worked for all 187 other files of the exact same format):

import os
import zipfile

# os.path.join builds filesystem paths; urlparse.urljoin is meant for URLs.
for filename in sorted(os.listdir('/home/sarahwie/Documents/zip/subset/')):
    if filename.endswith(".xml.zip"):
        zip_file = zipfile.ZipFile(os.path.join('/home/sarahwie/Documents/zip/subset/', filename))

I ran it on a different machine and got the following traceback:

<zipfile.ZipFile object at 0x7fe33c066950>
Traceback (most recent call last):
  File "/home/sarahwie/anaconda2/lib/python2.7/site-packages/IPython/core/ultratb.py", line 1118, in get_records
    return _fixed_getinnerframes(etb, number_of_lines_of_context, tb_offset)
  File "/home/sarahwie/anaconda2/lib/python2.7/site-packages/IPython/core/ultratb.py", line 300, in wrapped
    return f(*args, **kwargs)
  File "/home/sarahwie/anaconda2/lib/python2.7/site-packages/IPython/core/ultratb.py", line 345, in _fixed_getinnerframes
    records = fix_frame_records_filenames(inspect.getinnerframes(etb,     context))
  File "/home/sarahwie/anaconda2/lib/python2.7/inspect.py", line 1049, in getinnerframes
    framelist.append((tb.tb_frame,) + getframeinfo(tb, context))
  File "/home/sarahwie/anaconda2/lib/python2.7/inspect.py", line 1009, in getframeinfo
    filename = getsourcefile(frame) or getfile(frame)
  File "/home/sarahwie/anaconda2/lib/python2.7/inspect.py", line 454, in getsourcefile
    if hasattr(getmodule(object, filename), '__loader__'):
  File "/home/sarahwie/anaconda2/lib/python2.7/inspect.py", line 483, in getmodule
    file = getabsfile(object, _filename)
  File "/home/sarahwie/anaconda2/lib/python2.7/inspect.py", line 467, in getabsfile
    return os.path.normcase(os.path.abspath(_filename))
  File "/home/sarahwie/anaconda2/lib/python2.7/posixpath.py", line 364, in abspath
    cwd = os.getcwd()
OSError: [Errno 2] No such file or directory
ERROR: Internal Python error in the inspect module.
Below is the traceback from this internal error.


Unfortunately, your original traceback can not be constructed.

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/home/sarahwie/anaconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.pyc in run_code(self, code_obj, result)
   2900             if result is not None:
   2901                 result.error_in_exec = sys.exc_info()[1]
-> 2902             self.showtraceback()
   2903         else:
   2904             outflag = 0

/home/sarahwie/anaconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.pyc in showtraceback(self, exc_tuple, filename, tb_offset, exception_only)
   1828                     except Exception:
   1829                         stb = self.InteractiveTB.structured_traceback(etype,
-> 1830                                             value, tb, tb_offset=tb_offset)
   1831 
   1832                     self._showtraceback(etype, value, stb)

/home/sarahwie/anaconda2/lib/python2.7/site-packages/IPython/core/ultratb.pyc in structured_traceback(self, etype, value, tb, tb_offset, number_of_lines_of_context)
   1390         self.tb = tb
   1391         return FormattedTB.structured_traceback(
-> 1392             self, etype, value, tb, tb_offset, number_of_lines_of_context)
   1393 
   1394 

/home/sarahwie/anaconda2/lib/python2.7/site-packages/IPython/core/ultratb.pyc in structured_traceback(self, etype, value, tb, tb_offset, number_of_lines_of_context)
   1298             # Verbose modes need a full traceback
   1299             return VerboseTB.structured_traceback(
-> 1300                 self, etype, value, tb, tb_offset, number_of_lines_of_context
   1301             )
   1302         else:

/home/sarahwie/anaconda2/lib/python2.7/site-packages/IPython/core/ultratb.pyc in structured_traceback(self, etype, evalue, etb, tb_offset, number_of_lines_of_context)
   1182                 structured_traceback_parts += formatted_exception
   1183         else:
-> 1184             structured_traceback_parts += formatted_exception[0]
   1185 
   1186         return structured_traceback_parts

IndexError: string index out of range

I know it's not a multiple-zip because I manually unzipped it on the same machine. Any ideas?
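
For narrowing this down, here is a minimal diagnostic sketch (not a fix). It assumes Python 3, and the helper name `find_bad_zips` and its `suffix` parameter are made up for illustration. It walks a directory, asks `zipfile.is_zipfile` about each candidate, and records the first four bytes of every file that fails the check. An ordinary, non-empty zip archive normally begins with `PK\x03\x04`, so a different header may hint at a truncated download or a file that is really something else (for example an HTML error page saved with a `.zip` name).

```python
import os
import zipfile


def find_bad_zips(directory, suffix=".xml.zip"):
    """Return (filename, first_four_bytes) for each file zipfile rejects.

    Hypothetical helper for diagnosis only; it does not repair anything.
    """
    bad = []
    for filename in sorted(os.listdir(directory)):
        if not filename.endswith(suffix):
            continue
        path = os.path.join(directory, filename)
        # is_zipfile looks for the zip end-of-central-directory record.
        if zipfile.is_zipfile(path):
            continue
        # Read the leading bytes so the header can be inspected by eye.
        with open(path, "rb") as fh:
            header = fh.read(4)
        bad.append((filename, header))
    return bad
```

Calling `find_bad_zips('/home/sarahwie/Documents/zip/subset/')` would return a list of the files that `zipfile` does not accept, together with their leading bytes. Separately, the `OSError: [Errno 2]` raised from `os.getcwd()` in the second traceback usually means the shell's current working directory no longer exists, which may be unrelated to the zip file itself.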



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
