Zipfile in Python file permission

I used the zipfile library to extract files from a zip archive, and after unzipping I found that the permissions of my files had been corrupted:

import zipfile
fh = open('sample.zip', 'rb')
z = zipfile.ZipFile(fh)
print(z.namelist())
for name in z.namelist():
    z.extract(name, '/tmp/')
fh.close()

When I use the Linux unzip tool, this issue doesn't happen. I tried using

os.system('unzip sample.zip')

but I still want to do this with zipfile.



Solution 1:[1]

The related Python issues provide some insight as to why the issue existed in the first place: https://bugs.python.org/issue18262 and https://bugs.python.org/issue15795

Additionally the original Zip spec can be found here: https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT

The important sections are:

4.4.2 version made by (2 bytes)

        4.4.2.1 The upper byte indicates the compatibility of the file
        attribute information.  If the external file attributes 
        are compatible with MS-DOS and can be read by PKZIP for 
        DOS version 2.04g then this value will be zero.  If these 
        attributes are not compatible, then this value will 
        identify the host system on which the attributes are 
        compatible.  Software can use this information to determine
        the line record format for text files etc.  

        4.4.2.2 The current mappings are:

         0 - MS-DOS and OS/2 (FAT / VFAT / FAT32 file systems)
         1 - Amiga                     2 - OpenVMS
         3 - UNIX                      4 - VM/CMS
         5 - Atari ST                  6 - OS/2 H.P.F.S.
         7 - Macintosh                 8 - Z-System
         9 - CP/M                     10 - Windows NTFS
        11 - MVS (OS/390 - Z/OS)      12 - VSE
        13 - Acorn Risc               14 - VFAT
        15 - alternate MVS            16 - BeOS
        17 - Tandem                   18 - OS/400
        19 - OS X (Darwin)            20 thru 255 - unused
...
4.4.15 external file attributes: (4 bytes)

       The mapping of the external attributes is
       host-system dependent (see 'version made by').  For
       MS-DOS, the low order byte is the MS-DOS directory
       attribute byte.  If input came from standard input, this
       field is set to zero.

That means the external file attributes are system-specific. Interpreting attributes written for a different system could make things worse. If we only care about UNIX, we can check ZipInfo.create_system and compare it with 3 (for UNIX). Unfortunately, the spec says little more about how to interpret the external attributes.
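As a minimal sketch of what those two fields look like from Python, the snippet below builds a tiny archive in memory and reads back create_system and external_attr for each entry. The entry name run.sh, its contents, and the helper describe_entries are made up for illustration.

```python
import io
import zipfile

ZIP_UNIX_SYSTEM = 3  # "UNIX" in the 4.4.2.2 mapping quoted above

def describe_entries(zf):
    """Return (filename, create_system, external_attr) for every entry."""
    return [(i.filename, i.create_system, i.external_attr)
            for i in zf.infolist()]

# Build a small archive in memory so the example needs no files on disk.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    info = zipfile.ZipInfo('run.sh')        # hypothetical entry name
    info.create_system = ZIP_UNIX_SYSTEM    # mark the entry as made on UNIX
    info.external_attr = 0o100755 << 16     # regular file, rwxr-xr-x
    zf.writestr(info, '#!/bin/sh\necho hi\n')

# Read the archive back and inspect the stored metadata.
with zipfile.ZipFile(buf) as zf:
    entries = describe_entries(zf)
    for name, system, attr in entries:
        print(name, system, hex(attr))
```

An entry whose create_system is not 3 was written with a different attribute convention, so its upper bits should not be read as a UNIX mode.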

The Forensics Wiki has some detail on this: http://forensicswiki.org/wiki/Zip#External_file_attributes

For UNIX (3), the external attributes field is 4 bytes long and consists of:

+---------+----------+--------+-------------------------------------------------------+
| Offset  | Size     | Value  | Description                                           |
+---------+----------+--------+-------------------------------------------------------+
|      0  | 1 byte   |        | FAT (MS-DOS) file attributes.                         |
|      1  | 1 byte   |        | Unknown                                               |
|      2  | 16 bits  |        | The UNIX mode (or permission).                        |
|         |          |        | The value seems to be similar to stat.st_mode value.  |
+---------+----------+--------+-------------------------------------------------------+

While that is rather observational, it seems to be the consensus.
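The layout above can be sketched with a few bit operations. This assumes the field is read as a single 32-bit integer, as ZipInfo.external_attr exposes it, so offset 0 is the low-order byte and offset 2 is the upper 16 bits. The helper name split_external_attr is hypothetical.

```python
import stat

def split_external_attr(external_attr):
    """Split the 32-bit field into (dos_attributes, unix_mode)."""
    dos_attributes = external_attr & 0xFF        # offset 0: low-order byte
    unix_mode = (external_attr >> 16) & 0xFFFF   # offset 2: upper 16 bits
    return dos_attributes, unix_mode

# Example value: a regular file with rwxr-xr-x permissions.
dos, mode = split_external_attr(0o100755 << 16)

permissions = stat.S_IMODE(mode)   # permission bits only, file type stripped
is_regular = stat.S_ISREG(mode)    # True for a regular file
readable = stat.filemode(mode)     # ls-style string

print(oct(permissions), is_regular, readable)
```

If the value really does mirror st_mode, the functions in the stat module can be used on it directly, which is what the sketch relies on.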

Putting this together:

import os
from zipfile import ZipFile

ZIP_UNIX_SYSTEM = 3  # 'version made by' host system value for UNIX

def extract_all_with_permission(zf, target_dir):
  for info in zf.infolist():
    extracted_path = zf.extract(info, target_dir)

    # Only interpret the attributes if the entry was created on UNIX.
    if info.create_system == ZIP_UNIX_SYSTEM:
      # The UNIX mode is stored in the upper 16 bits.
      unix_attributes = info.external_attr >> 16
      if unix_attributes:
        os.chmod(extracted_path, unix_attributes)

with ZipFile('sample.zip', 'r') as zf:
  extract_all_with_permission(zf, '/tmp')

You might ask why we want to preserve the permissions in the first place. Arguably, only the executable flag matters. In that case, a slightly safer option is to restore just the executable flag, and only for regular files.

import os
from zipfile import ZipFile
from stat import S_IXUSR

ZIP_UNIX_SYSTEM = 3  # 'version made by' host system value for UNIX

def extract_all_with_executable_permission(zf, target_dir):
  for info in zf.infolist():
    extracted_path = zf.extract(info, target_dir)

    # Skip directories and entries not created on UNIX.
    if info.create_system == ZIP_UNIX_SYSTEM and os.path.isfile(extracted_path):
      unix_attributes = info.external_attr >> 16
      # Add the owner-executable bit to whatever mode extract() produced.
      if unix_attributes & S_IXUSR:
        os.chmod(extracted_path, os.stat(extracted_path).st_mode | S_IXUSR)

with ZipFile('sample.zip', 'r') as zf:
  extract_all_with_executable_permission(zf, '/tmp')
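One way to check this kind of helper is a small round trip: write an entry with a known mode, extract it into a temporary directory, and compare the resulting st_mode. The sketch below is self-contained and uses its own compact helper, extract_with_mode, which differs from the code above in that it masks the value with stat.S_IMODE before calling os.chmod. The names extract_with_mode, roundtrip_mode, and run.sh are made up for illustration, and the check is only meaningful on a POSIX system.

```python
import os
import stat
import tempfile
import zipfile

ZIP_UNIX_SYSTEM = 3

def extract_with_mode(zf, target_dir):
    """Extract every entry, then reapply the stored permission bits."""
    paths = []
    for info in zf.infolist():
        path = zf.extract(info, target_dir)
        # Assumption: keep only the permission bits, drop the file type.
        mode = stat.S_IMODE(info.external_attr >> 16)
        if info.create_system == ZIP_UNIX_SYSTEM and mode:
            os.chmod(path, mode)
        paths.append(path)
    return paths

def roundtrip_mode(mode=0o755):
    """Write one entry with `mode`, extract it, return the mode on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, 'sample.zip')
        with zipfile.ZipFile(archive, 'w') as zf:
            info = zipfile.ZipInfo('run.sh')   # hypothetical entry name
            info.create_system = ZIP_UNIX_SYSTEM
            info.external_attr = (stat.S_IFREG | mode) << 16
            zf.writestr(info, '#!/bin/sh\n')
        out_dir = os.path.join(tmp, 'out')
        with zipfile.ZipFile(archive) as zf:
            (path,) = extract_with_mode(zf, out_dir)
        return stat.S_IMODE(os.stat(path).st_mode)

if os.name == 'posix':
    print(oct(roundtrip_mode(0o755)))
```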

Solution 2:[2]

import zipfile
import os

unZipFile = zipfile.ZipFile("sample.zip", "r")

tmp_dir = "/tmp"
try:
    for info in unZipFile.infolist():
        real_path = unZipFile.extract(info, tmp_dir)

        # permission
        unix_attributes = info.external_attr >> 16
        target = os.path.join(tmp_dir, info.filename)
        if unix_attributes:
            os.chmod(target, unix_attributes)

        if not real_path:
            print("Extract failed: " + info.filename)
finally:
    unZipFile.close()

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Jeason