'Create a zip with only .pdf and .xml files from one directory

I would love to know how i can zip only all pdfs from the main directory without including the subfolders.

I've tried several times changing the code, without any succes with what i want to achieve.

import zipfile
 
fantasy_zip = zipfile.ZipFile('/home/rob/Desktop/projects/zenjobv2/archivetest.zip', 'w')
 
for folder, subfolders, files in os.walk('/home/rob/Desktop/projects/zenjobv2/'):
 
    for file in files:
        if file.endswith('.pdf'):
            fantasy_zip.write(os.path.join(folder, file), os.path.relpath(os.path.join(folder,file), '/home/rob/Desktop/projects/zenjobv2/'), compress_type = zipfile.ZIP_DEFLATED)
        elif file.endswith('.xml'):
            fantasy_zip.write(os.path.join(folder, file), os.path.relpath(os.path.join(folder,file), '/home/rob/Desktop/projects/zenjobv2/'), compress_type = zipfile.ZIP_DEFLATED)
fantasy_zip.close()

I expect that a zip is created only with the .pdfs and .xml files from the zenjobv2 folder/directory without including any other folders/subfolders.



Solution 1:[1]

You are looping through the entire directory tree with os.walk(). It sounds like you want to just look at the files in a given directory. For that, consider os.scandir(), which returns an iterator of all files and subdirectories in a given directory. You will just have to filter out elements that are directories:

root = "/home/rob/Desktop/projects/zenjobv2"
for entry in os.scandir(root):
    if entry.is_dir():
        continue  # Just in case there are strangely-named directories
    if entry.path.endswith(".pdf") or entry.path.endswith(".xml"):
        # Process the file at entry.path as you see fit

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Dharman