How to change directory with zipfile?

I can find general questions about zipfile, but nothing that covers my specific question (it feels like an obvious one, so apologies if it has already been answered).

I have a zip file with multiple directories. I want to cd into a directory within the zip file and then process the files in it.

I have this code:

from zipfile import ZipFile

with ZipFile(input_path, 'r') as zip:
    for f in zip.namelist():
        zinfo = zip.getinfo(f)
        # ZipInfo.is_dir() is true for entries whose name ends with '/'
        if zinfo.is_dir():
            print(f)

Which prints:

dir1/
dir1/dir2/
dir1/dir2/dir3/

I now want to cd to dir3 only, and process the files, without unzipping the whole zipfile.

Can someone show me the code to do this?

Update 1: Following a suggestion, I tried this:

with ZipFile(input_path, 'r') as zip:
    for f in zip.namelist():
        zinfo = zip.getinfo(f)
        if zinfo.is_dir():
            if zinfo.filename == 'dir1/dir2/dir3':
                zip2 = zip.open(zip.path('dir1/dir2/dir3'))
                for f2 in zip2:
                    zinfo2 = zip.getinfo(f2)
                    print(zinfo2)

There is no error, but nothing prints, even though there are definitely files in the directory (I can see them when I unpack the zip manually).



Solution 1:[1]

As far as I know, you can't cd within a zipfile. However, here's a way to find a subdirectory in one and iterate through the files in it. The code below uses a recursive helper function named find_target_dir() to locate the target subfolder, and then iterates over the files in it and prints their names.

from zipfile import ZipFile, Path as ZipPath


def find_target_dir(sub_directory, target_dir_name):
    """Depth-first search for a directory named target_dir_name."""
    for member in sub_directory.iterdir():
        if member.is_dir():
            if member.name == target_dir_name:
                return member
            # Keep searching sibling directories if this branch has no match.
            found = find_target_dir(member, target_dir_name)
            if found is not None:
                return found
    return None


def get_zipfile_dir_members(input_path, target_dir_name):
    with ZipFile(input_path, 'r') as zfile:
        root_path = ZipPath(zfile)
        target_dir = find_target_dir(root_path, target_dir_name)
        if target_dir is not None:
            print(f'Files in {target_dir}:')
            for file_name in target_dir.iterdir():
                if file_name.is_file():
                    print(f'    {file_name.name}')


get_zipfile_dir_members('./test_archive.zip', 'dir3')

Below is sample output from a test archive that has subdirectories like the ones described in your question, plus one or more regular files that I added at each level for testing purposes.

Here is a list showing what's in the test_archive.zip file:

[Image: contents of test archive]

Here's the printed output from the script:

Files in ./test_archive.zip/dir1/dir2/dir3/:
    d3_1.txt
    d3_2.txt
    d3_3.txt
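
If the full path of the directory is already known, as it is in the question, the search may not be needed at all. The following is only a minimal sketch under that assumption: read_files_in_dir() is a hypothetical helper name, and the demo archive and its file contents are made up for illustration. It points zipfile.Path straight at the directory and reads each file's bytes without extracting anything to disk.

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile, Path as ZipPath


def read_files_in_dir(zip_path, dir_in_zip):
    """Return {file name: bytes} for files directly inside dir_in_zip.

    Hypothetical helper: only the requested members are read, and
    nothing is written to disk.
    """
    # zipfile.Path treats a location as a directory only if it ends with '/'.
    if not dir_in_zip.endswith('/'):
        dir_in_zip += '/'
    contents = {}
    with ZipFile(zip_path, 'r') as zfile:
        target = ZipPath(zfile, at=dir_in_zip)
        if not target.exists():
            return contents  # no such directory in the archive
        for member in target.iterdir():
            if member.is_file():
                contents[member.name] = member.read_bytes()
    return contents


# Demo with a throwaway archive (names and contents are illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / 'demo.zip'
    with ZipFile(archive, 'w') as zf:
        zf.writestr('dir1/top.txt', 'top level')
        zf.writestr('dir1/dir2/dir3/d3_1.txt', 'first')
        zf.writestr('dir1/dir2/dir3/d3_2.txt', 'second')
    files = read_files_in_dir(archive, 'dir1/dir2/dir3')
    for name, data in sorted(files.items()):
        print(name, data.decode())
```

Note the trailing slash: ZipInfo.is_dir() is true only for names that end with '/', so a comparison against 'dir1/dir2/dir3' (without the slash) inside an is_dir() check never matches. That would explain why the attempt in Update 1 prints nothing.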

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1