zipfile in Python produces not quite normal ZIP files
In my project, a set of files is created and packed into a ZIP archive to be used on an Android phone. The Android application opens such ZIP files to read initial data and then stores the results of its work in the same ZIPs. I have no access to the source code of the Android app or to the old script that generated the ZIP files (in fact, I do not know how the old ZIP files were created). However, the structure of the ZIP archive is known, and I have written a new Python script to produce the same files.
I ran into the following problem: ZIP files produced by my script cannot be opened by the Android app (an error message about incorrect file structure appears), but if I unpack all the contents and pack them back into a new ZIP file with the same name using WinZip, 7-Zip, or "Send to -> Compressed (zipped) folder" (in Windows 7), the file is processed normally on the phone. This leads me to conclude that the problem is not in the Android application.
The code snippet for packing the folder into a ZIP file was as follows:
import os, shutil, zipfile

# make zip
try:
    with zipfile.ZipFile(prefix + '.zip', 'w') as zipf:
        for root, dirs, files in os.walk(prefix):
            for file in files:
                zipf.write(os.path.join(root, file))
    # remove dir, that was packed
    shutil.rmtree(prefix)
    # Report about resulting
    print('File ' + prefix + '.zip was created')
except:
    print('Unexpected error occurred while creating file ' + prefix + '.zip')
After I noticed that the files were not compressed, I added the compression option:
zipfile.ZipFile(prefix + '.zip', 'w', zipfile.ZIP_DEFLATED)
but this didn't solve my problem, and setting allowZip64 to True didn't change the situation either.
By the way, a ZIP file produced with zipfile.ZIP_DEFLATED is about 5 kilobytes smaller than the ZIP file produced by Windows and about 14 kilobytes smaller than 7-Zip's result for the same archive content. At the same time, I can open all of these ZIP files for visual comparison in both 7-Zip and Windows Explorer.
So I have three related questions:
1) What may cause such strange behavior of my script with zipfile?
2) How else can I influence zipfile?
3) How can I check a ZIP file created with zipfile to find possible structure problems, or to make sure there are none?
Of course, if I have to give up using zipfile, I can use an external archiver (e.g. 7-Zip) to pack the files, but I would like to find an elegant solution if one exists.
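As a starting point for the third question, here is a minimal sketch of an integrity check using only the standard library. The helper name check_zip is made up for illustration. ZipFile.testzip() reads every member and verifies its CRC, so it catches corrupted data, but it would not reveal entries that a particular reader expects and that are simply absent.

```python
import zipfile

def check_zip(path):
    """Print basic metadata for every entry and return the name of the
    first member that fails its CRC check (None if all members pass).

    check_zip is a hypothetical helper name, not part of zipfile.
    """
    with zipfile.ZipFile(path) as zf:
        for info in zf.infolist():
            print(info.filename,
                  'compress_type =', info.compress_type,
                  'size =', info.file_size,
                  'compressed =', info.compress_size,
                  'create_system =', info.create_system)
        # testzip() reads all members and checks CRCs and headers.
        return zf.testzip()
```

The same checks are available from the command line with `python -m zipfile -l archive.zip` (list) and `python -m zipfile -t archive.zip` (test).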
UPDATE:
In order to check the content of the ZIP file created with zipfile, I did the following:
import os, shutil, zipfile
from contextlib import closing

# make zip
flist = []
try:
    with zipfile.ZipFile(prefix + '.zip', 'w', zipfile.ZIP_DEFLATED) as zipf:
        for root, dirs, files in os.walk(prefix):
            for file in files:
                zipf.write(os.path.join(root, file))
                # Store item in the list
                flist.append(os.path.join(root, file).replace("\\","/"))
    # remove dir, that was packed
    shutil.rmtree(prefix)
    # Report about resulting
    print('File ' + prefix + '.zip was created')
except:
    print('Unexpected error occurred while creating file ' + prefix + '.zip')
# Check of zip
with closing(zipfile.ZipFile(prefix + '.zip')) as zfile:
    for info in zfile.infolist():
        print(info.filename + \
            ' (extra = ' + str(info.extra) + \
            '; compress_type = ' + ('ZIP_DEFLATED' if info.compress_type == zipfile.ZIP_DEFLATED else 'NOT ZIP_DEFLATED') + \
            ')')
        # remove item from list
        if info.filename in flist:
            flist.remove(info.filename)
        else:
            print(info.filename + ' is unexpected item')
print('Number of items that were missed:')
print(len(flist))
This produced the following output:
File en_US_00001.zip was created
en_US_00001/en_US_00001_0001/en_US_00001_0001_big.png (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0001/en_US_00001_0001_info.xml (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0001/en_US_00001_0001_small.png (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0001/en_US_00001_0001_source.pkl (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0001/en_US_00001_0001_source.tex (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0001/en_US_00001_0001_user.png (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0002/en_US_00001_0002_big.png (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0002/en_US_00001_0002_info.xml (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0002/en_US_00001_0002_small.png (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0002/en_US_00001_0002_source.pkl (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0002/en_US_00001_0002_source.tex (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0002/en_US_00001_0002_user.png (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0003/en_US_00001_0003_big.png (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0003/en_US_00001_0003_info.xml (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0003/en_US_00001_0003_small.png (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0003/en_US_00001_0003_source.pkl (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0003/en_US_00001_0003_source.tex (extra = b''; compress_type = ZIP_DEFLATED)
en_US_00001/en_US_00001_0003/en_US_00001_0003_user.png (extra = b''; compress_type = ZIP_DEFLATED)
Number of items that were missed:
0
Thus, everything that was written was read back, but the question remains: was everything necessary written? For example, in the comments Harold mentioned relative paths; perhaps that is the key to the answer.
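One way to approach "was everything necessary written?" is to compare the entry names against an archive that the app is known to accept, such as one repacked with 7-Zip. The sketch below assumes such a reference archive is available; the function name missing_entries and both parameter names are made up for illustration.

```python
import zipfile

def missing_entries(reference_zip, candidate_zip):
    """Return entry names present in reference_zip but absent from
    candidate_zip, sorted alphabetically.

    Both arguments are paths; the names are hypothetical.
    """
    with zipfile.ZipFile(reference_zip) as ref, \
         zipfile.ZipFile(candidate_zip) as cand:
        return sorted(set(ref.namelist()) - set(cand.namelist()))
```

Directory entries show up in namelist() as names ending in "/", so any that are missing from the candidate archive would be listed here.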
UPDATE 2
When I replaced zipfile with a call to the external 7-Zip program, the code
import shutil, subprocess, zipfile
from contextlib import closing

# make zip
subprocess.call(["7z.exe","a",prefix + ".zip", prefix])
shutil.rmtree(prefix)
# Check of zip
with closing(zipfile.ZipFile(prefix + '.zip')) as zfile:
    for info in zfile.infolist():
        print(info.filename)
        print(' (extra = ' + str(info.extra) + '; compress_type = ' + str(info.compress_type) + ')')
print('Values for compress_type:')
print(str(zipfile.ZIP_DEFLATED) + ' = ZIP_DEFLATED')
print(str(zipfile.ZIP_STORED) + ' = ZIP_STORED')
produces the following result
Creating archive en_US_00001.zip
Compressing en_US_00001\en_US_00001_0001\en_US_00001_0001_big.png
Compressing en_US_00001\en_US_00001_0001\en_US_00001_0001_info.xml
Compressing en_US_00001\en_US_00001_0001\en_US_00001_0001_small.png
Compressing en_US_00001\en_US_00001_0001\en_US_00001_0001_source.pkl
Compressing en_US_00001\en_US_00001_0001\en_US_00001_0001_source.tex
Compressing en_US_00001\en_US_00001_0001\en_US_00001_0001_user.png
Compressing en_US_00001\en_US_00001_0002\en_US_00001_0002_big.png
Compressing en_US_00001\en_US_00001_0002\en_US_00001_0002_info.xml
Compressing en_US_00001\en_US_00001_0002\en_US_00001_0002_small.png
Compressing en_US_00001\en_US_00001_0002\en_US_00001_0002_source.pkl
Compressing en_US_00001\en_US_00001_0002\en_US_00001_0002_source.tex
Compressing en_US_00001\en_US_00001_0002\en_US_00001_0002_user.png
Compressing en_US_00001\en_US_00001_0003\en_US_00001_0003_big.png
Compressing en_US_00001\en_US_00001_0003\en_US_00001_0003_info.xml
Compressing en_US_00001\en_US_00001_0003\en_US_00001_0003_small.png
Compressing en_US_00001\en_US_00001_0003\en_US_00001_0003_source.pkl
Compressing en_US_00001\en_US_00001_0003\en_US_00001_0003_source.tex
Compressing en_US_00001\en_US_00001_0003\en_US_00001_0003_user.png
Everything is Ok
en_US_00001/
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00Faf\xd2Y\xf9\xd1\x01Faf\xd2Y\xf9\xd1\x01%\xc9c\xd2Y\xf9\xd1\x01'; compress_type = 0)
en_US_00001/en_US_00001_0001/
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xbe(e\xd2Y\xf9\xd1\x01\xbe(e\xd2Y\xf9\xd1\x016\xf0c\xd2Y\xf9\xd1\x01'; compress_type = 0)
en_US_00001/en_US_00001_0001/en_US_00001_0001_big.png
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00G\x17d\xd2Y\xf9\xd1\x01G\x17d\xd2Y\xf9\xd1\x01G\x17d\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0001/en_US_00001_0001_info.xml
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00X>d\xd2Y\xf9\xd1\x01X>d\xd2Y\xf9\xd1\x01X>d\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0001/en_US_00001_0001_small.png
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00z\x8cd\xd2Y\xf9\xd1\x01ied\xd2Y\xf9\xd1\x01ied\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0001/en_US_00001_0001_source.pkl
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\x8b\xb3d\xd2Y\xf9\xd1\x01\x8b\xb3d\xd2Y\xf9\xd1\x01\x8b\xb3d\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0001/en_US_00001_0001_source.tex
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xad\x01e\xd2Y\xf9\xd1\x01\xad\x01e\xd2Y\xf9\xd1\x01\xad\x01e\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0001/en_US_00001_0001_user.png
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xbe(e\xd2Y\xf9\xd1\x01\xbe(e\xd2Y\xf9\xd1\x01\xbe(e\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0002/
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x005:f\xd2Y\xf9\xd1\x015:f\xd2Y\xf9\xd1\x01\xcfOe\xd2Y\xf9\xd1\x01'; compress_type = 0)
en_US_00001/en_US_00001_0002/en_US_00001_0002_big.png
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xe0ve\xd2Y\xf9\xd1\x01\xcfOe\xd2Y\xf9\xd1\x01\xcfOe\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0002/en_US_00001_0002_info.xml
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xf1\x9de\xd2Y\xf9\xd1\x01\xe0ve\xd2Y\xf9\xd1\x01\xe0ve\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0002/en_US_00001_0002_small.png
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\x02\xc5e\xd2Y\xf9\xd1\x01\x02\xc5e\xd2Y\xf9\xd1\x01\x02\xc5e\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0002/en_US_00001_0002_source.pkl
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\x13\xece\xd2Y\xf9\xd1\x01\x13\xece\xd2Y\xf9\xd1\x01\x13\xece\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0002/en_US_00001_0002_source.tex
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00$\x13f\xd2Y\xf9\xd1\x01$\x13f\xd2Y\xf9\xd1\x01$\x13f\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0002/en_US_00001_0002_user.png
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x005:f\xd2Y\xf9\xd1\x015:f\xd2Y\xf9\xd1\x015:f\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0003/
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xdf\xc0g\xd2Y\xf9\xd1\x01\xdf\xc0g\xd2Y\xf9\xd1\x01Faf\xd2Y\xf9\xd1\x01'; compress_type = 0)
en_US_00001/en_US_00001_0003/en_US_00001_0003_big.png
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00W\x88f\xd2Y\xf9\xd1\x01W\x88f\xd2Y\xf9\xd1\x01W\x88f\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0003/en_US_00001_0003_info.xml
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00h\xaff\xd2Y\xf9\xd1\x01h\xaff\xd2Y\xf9\xd1\x01h\xaff\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0003/en_US_00001_0003_small.png
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\x9b$g\xd2Y\xf9\xd1\x01y\xd6f\xd2Y\xf9\xd1\x01y\xd6f\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0003/en_US_00001_0003_source.pkl
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xacKg\xd2Y\xf9\xd1\x01\xacKg\xd2Y\xf9\xd1\x01\xacKg\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0003/en_US_00001_0003_source.tex
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xce\x99g\xd2Y\xf9\xd1\x01\xce\x99g\xd2Y\xf9\xd1\x01\xce\x99g\xd2Y\xf9\xd1\x01'; compress_type = 8)
en_US_00001/en_US_00001_0003/en_US_00001_0003_user.png
(extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xdf\xc0g\xd2Y\xf9\xd1\x01\xdf\xc0g\xd2Y\xf9\xd1\x01\xdf\xc0g\xd2Y\xf9\xd1\x01'; compress_type = 8)
Values for compress_type:
8 = ZIP_DEFLATED
0 = ZIP_STORED
As I understand it, the most important findings are:
- there are entries for folders (e.g. en_US_00001/, en_US_00001/en_US_00001_0001/) that were not in the ZIP produced by my use of zipfile
- folders have compress_type == ZIP_STORED, while files have compress_type == ZIP_DEFLATED
- the extra fields have different values (quite long strings were generated)
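The long extra values in the 7-Zip output start with the bytes 0x0a 0x00, which matches the NTFS extra field (header ID 0x000A) described in the ZIP format specification: four reserved bytes followed by an attribute with tag 0x0001 that holds modification, access, and creation times as Windows FILETIME values. The sketch below decodes that field under this assumption; the function names are made up for illustration and are not part of zipfile.

```python
import struct
from datetime import datetime, timedelta, timezone

NTFS_HEADER_ID = 0x000A   # extra-field ID for NTFS timestamps
NTFS_TIMES_TAG = 0x0001   # attribute tag: mtime, atime, ctime
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_datetime(ticks):
    """Convert a FILETIME (100 ns ticks since 1601-01-01 UTC) to datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)

def parse_ntfs_extra(extra):
    """Return a dict with 'mtime', 'atime', 'ctime' taken from an NTFS
    extra field, or None if the field is absent (hypothetical helper)."""
    pos = 0
    while pos + 4 <= len(extra):
        header_id, size = struct.unpack_from('<HH', extra, pos)
        body = extra[pos + 4:pos + 4 + size]
        pos += 4 + size
        if header_id != NTFS_HEADER_ID:
            continue
        attr_pos = 4  # skip the 4 reserved bytes
        while attr_pos + 4 <= len(body):
            tag, attr_size = struct.unpack_from('<HH', body, attr_pos)
            if tag == NTFS_TIMES_TAG and attr_pos + 4 + 24 <= len(body):
                times = struct.unpack_from('<QQQ', body, attr_pos + 4)
                keys = ('mtime', 'atime', 'ctime')
                return {k: filetime_to_datetime(t) for k, t in zip(keys, times)}
            attr_pos += 4 + attr_size
    return None
```

Calling parse_ntfs_extra(info.extra) on the entries above would show that the extra data carries timestamps only, which suggests it is unlikely to be what the Android application depends on.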
Solution 1:[1]
Based on the differences listed in UPDATE 2 of the question and on examples from another question about zipfile, I tried the following code to add directory entries to the ZIP file and check the result:
import os, shutil, zipfile
from contextlib import closing

# make zip
try:
    with zipfile.ZipFile(prefix + '.zip', 'w', zipfile.ZIP_DEFLATED) as zipf:
        # explicit entry for the top-level directory
        info = zipfile.ZipInfo(prefix+'\\')
        zipf.writestr(info, '')
        for root, dirs, files in os.walk(prefix):
            for d in dirs:
                # explicit entry for every subdirectory
                info = zipfile.ZipInfo(os.path.join(root, d)+'\\')
                zipf.writestr(info, '')
            for file in files:
                zipf.write(os.path.join(root, file))
    # remove dir, that was packed
    shutil.rmtree(prefix)
    # Report about resulting
    print('File ' + prefix + '.zip was created')
except:
    print('Unexpected error occurred while creating file ' + prefix + '.zip')
# Check zip content
with closing(zipfile.ZipFile(prefix + '.zip')) as zfile:
    for info in zfile.infolist():
        print(info.filename)
        print(' (extra = ' + str(info.extra) + '; compress_type = ' + str(info.compress_type) + ')')
print('Values for compress_type:')
print(str(zipfile.ZIP_DEFLATED) + ' = ZIP_DEFLATED')
print(str(zipfile.ZIP_STORED) + ' = ZIP_STORED')
Output is
File en_US_00001.zip was created
en_US_00001/
(extra = b''; compress_type = 0)
en_US_00001/en_US_00001_0001/
(extra = b''; compress_type = 0)
en_US_00001/en_US_00001_0002/
(extra = b''; compress_type = 0)
en_US_00001/en_US_00001_0003/
(extra = b''; compress_type = 0)
en_US_00001/en_US_00001_0001/en_US_00001_0001_big.png
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0001/en_US_00001_0001_info.xml
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0001/en_US_00001_0001_small.png
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0001/en_US_00001_0001_source.pkl
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0001/en_US_00001_0001_source.tex
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0001/en_US_00001_0001_user.png
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0002/en_US_00001_0002_big.png
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0002/en_US_00001_0002_info.xml
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0002/en_US_00001_0002_small.png
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0002/en_US_00001_0002_source.pkl
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0002/en_US_00001_0002_source.tex
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0002/en_US_00001_0002_user.png
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0003/en_US_00001_0003_big.png
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0003/en_US_00001_0003_info.xml
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0003/en_US_00001_0003_small.png
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0003/en_US_00001_0003_source.pkl
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0003/en_US_00001_0003_source.tex
(extra = b''; compress_type = 8)
en_US_00001/en_US_00001_0003/en_US_00001_0003_user.png
(extra = b''; compress_type = 8)
Values for compress_type:
8 = ZIP_DEFLATED
0 = ZIP_STORED
Adding a trailing slash to directory names (+'\\' or +'/') turned out to be mandatory.
Most importantly, the ZIP file is now properly accepted by the Android application.
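For reference, here is a portable variant of the same idea, written as a sketch rather than a drop-in replacement. It always uses '/' in entry names, because ZipInfo only converts the platform's own path separator, so a trailing '\\' would presumably not yield a directory entry outside Windows. The function name zip_tree_with_dirs and its parameters are made up for illustration.

```python
import os
import zipfile

def zip_tree_with_dirs(src_dir, zip_path):
    """Zip src_dir into zip_path, writing an explicit (stored, empty)
    entry for every directory before the files it contains.

    Hypothetical helper; entry names are relative to src_dir's parent.
    """
    parent = os.path.dirname(os.path.abspath(src_dir))
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zipf:
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()   # make the walk order deterministic
            files.sort()
            rel_root = os.path.relpath(root, parent).replace(os.sep, '/')
            # Directory entry: name ends with '/', no data, ZIP_STORED.
            zipf.writestr(zipfile.ZipInfo(rel_root + '/'), '')
            for name in files:
                zipf.write(os.path.join(root, name), rel_root + '/' + name)
```

With this layout each directory entry precedes its files, which is closer to the order in the 7-Zip listing, although nothing in the question shows that the entry order matters to the application.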
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Community |