How to unzip specific subfolders of a zip archive with Python zipfile

I want to unzip particular subfolders from a list of zip archives using the zipfile module.

The paths to the archives are stored in a CSV file, each with the subfolder path appended, like so:

C:/path_to/archive.zip/y/z/subfolder_a
C:/path_to_a_different/archive.zip/x/subfolder_b
...

My goal is to unzip only these specific subfolders to another location while maintaining file structure. Here is what I have come up with.

import zipfile
import os

# CSV file listing archive paths with the subfolder location appended
paths_file = 'C:\\paths.csv'

with open(paths_file, 'r') as paths:
    for path in paths:
        path = path.rstrip('\n')
        archive_path = path.split('.zip')[0] + '.zip'
        path_to_subfolder = path.split('.zip')[1]

        # Replaces backslashes by slashes and removes leading/trailing slashes
        path_to_subfolder = path_to_subfolder.replace('\\', '/').strip('/')

        # Defines name of output folder
        output_folder_name = os.path.split(path_to_subfolder)[-1]  # subfolder_a, subfolder_b

        # Extracts relevant subfolders to output location
        with zipfile.ZipFile(archive_path) as archive:
            for file in archive.namelist():
                if file.startswith(path_to_subfolder):
                    archive.extract(file, 'C:/output/{}'.format(output_folder_name))

The folder structure after running this looks like:

C:/output/subfolder_a/y/z/subfolder_a/...

but the folder structure should look like this:

C:/output/subfolder_a/...

What can I change to fix this?
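
For reference, the extra y/z/subfolder_a levels appear because ZipFile.extract() always recreates a member's full path inside the archive underneath the target directory. One possible way around that, sketched here under assumptions rather than offered as the definitive fix, is to strip the subfolder prefix from each member name and copy the file contents manually. The function name extract_subfolder and the parameter output_root are hypothetical names chosen for this illustration.

```python
import os
import shutil
import zipfile


def extract_subfolder(archive_path, path_to_subfolder, output_root):
    """Copy one subfolder of a zip archive to output_root/<subfolder name>.

    Hypothetical helper for illustration: the parent directories that
    precede the subfolder inside the archive are not recreated.
    """
    # Zip member names use forward slashes; the trailing slash keeps
    # 'subfolder_a' from also matching 'subfolder_ab'.
    prefix = path_to_subfolder.replace('\\', '/').strip('/') + '/'
    target_dir = os.path.join(output_root, os.path.basename(prefix.rstrip('/')))

    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            if not name.startswith(prefix):
                continue
            relative_name = name[len(prefix):]
            if not relative_name:
                continue  # the entry for the subfolder itself
            parts = relative_name.split('/')
            if '..' in parts:
                continue  # skip names that would escape target_dir
            destination = os.path.join(target_dir, *parts)
            if name.endswith('/'):
                # Directory entry: create it and move on
                os.makedirs(destination, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(destination), exist_ok=True)
            # Copy the file contents to the shortened path
            with archive.open(name) as source, open(destination, 'wb') as target:
                shutil.copyfileobj(source, target)
    return target_dir
```

In the loop from the question, this would be called once per CSV line, for example extract_subfolder(archive_path, path_to_subfolder, 'C:/output'), assuming the archive path and subfolder path have already been split apart as shown above.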



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
