Extract files inside zip sub folders with Python zipfile
I have a zip folder that contains files and child zip folders. I am able to read the files placed in the parent folder, but how can I get to the files inside the child zip folders? Here is my code to read the files inside the parent folder:
from io import BytesIO
import pandas as pd
import requests
import zipfile
url1 = 'https://www.someurl.com/abc.zip'
r = requests.get(url1)
z = zipfile.ZipFile(BytesIO(r.content))
temp = pd.read_csv(z.open('mno.csv'))
My question is: if, let's say, I have a child zip folder xyz.zip containing the file pqr.csv, how can I read this file?
Solution 1:[1]
Use another BytesIO object to open the contained zip file:
from io import BytesIO
import pandas as pd
import requests
import zipfile
# Read outer zip file
url1 = 'https://www.someurl.com/abc.zip'
r = requests.get(url1)
z = zipfile.ZipFile(BytesIO(r.content))
# Let's say the archive is:
# zippped_folder/pqr.zip (which contains pqr.csv)
# Read the contained zip file: z.read() returns its bytes, which BytesIO wraps
pqr_zip = zipfile.ZipFile(BytesIO(z.read('zippped_folder/pqr.zip')))
temp = pd.read_csv(pqr_zip.open('pqr.csv'))
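The same idea can be tried without a network call. The sketch below is only an illustration: it builds a small sample archive in memory as a stand-in for r.content, and the helper names build_nested_zip and read_nested_csv are hypothetical. It uses the standard csv module so that it runs without pandas; the file object returned by inner.open() could be passed to pd.read_csv instead.

```python
import csv
import io
import zipfile


def build_nested_zip() -> bytes:
    """Build a sample outer archive in memory (stand-in for r.content)."""
    inner_buf = io.BytesIO()
    with zipfile.ZipFile(inner_buf, "w") as inner:
        inner.writestr("pqr.csv", "a,b\n1,2\n3,4\n")
    outer_buf = io.BytesIO()
    with zipfile.ZipFile(outer_buf, "w") as outer:
        outer.writestr("mno.csv", "x,y\n5,6\n")
        outer.writestr("zippped_folder/pqr.zip", inner_buf.getvalue())
    return outer_buf.getvalue()


def read_nested_csv(outer_bytes: bytes, inner_zip_name: str, csv_name: str):
    """Return the rows of csv_name stored inside inner_zip_name."""
    with zipfile.ZipFile(io.BytesIO(outer_bytes)) as outer:
        # read() returns the child archive as bytes
        inner_bytes = outer.read(inner_zip_name)
    # Wrap the bytes in a second BytesIO so ZipFile gets a seekable file object
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        with inner.open(csv_name) as fh:
            text = io.TextIOWrapper(fh, encoding="utf-8", newline="")
            return list(csv.reader(text))


rows = read_nested_csv(build_nested_zip(), "zippped_folder/pqr.zip", "pqr.csv")
print(rows)
```

Nothing is written to disk here: both the outer and the inner archive live in memory, which suits small downloads but may be costly for very large archives.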
Solution 2:[2]
After some trial and error, I solved the problem with this code:
zz = zipfile.ZipFile(BytesIO(z.read(z.namelist()[i])))
temp2 = pd.read_csv(zz.open('pqr.csv'))
# where i is the index position of the child zip folder in the namelist() list, in this case the 'xyz.zip' folder.
# z.read() returns the child archive's bytes; passing the bare name to ZipFile would look for a file on disk.
# For example, if the 'xyz.zip' folder was third in the list, the command would be:
zz = zipfile.ZipFile(BytesIO(z.read(z.namelist()[2])))
Alternatively, if the index position is not known, the child archive can be opened by name:
zz = zipfile.ZipFile(BytesIO(z.read('xyz.zip')))
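When neither the index nor the name of the child archive is known in advance, one option is to scan namelist() for entries ending in .zip and descend into each. The following is a rough sketch under assumptions of my own: the helper iter_nested_files, the "!/" separator used to label nested paths, and the sample archive are all hypothetical and not part of the answers above.

```python
import io
import zipfile


def iter_nested_files(zf: zipfile.ZipFile, prefix: str = ""):
    """Yield (path, data) for every non-zip member, descending into child zips."""
    for name in zf.namelist():
        if name.endswith("/"):
            continue  # skip directory entries
        data = zf.read(name)
        if name.lower().endswith(".zip"):
            # Open the child archive from its bytes and recurse into it
            with zipfile.ZipFile(io.BytesIO(data)) as child:
                yield from iter_nested_files(child, prefix + name + "!/")
        else:
            yield prefix + name, data


# Sample data: an outer archive holding mno.csv and xyz.zip (which holds pqr.csv)
inner_buf = io.BytesIO()
with zipfile.ZipFile(inner_buf, "w") as inner:
    inner.writestr("pqr.csv", "a,b\n1,2\n")
outer_buf = io.BytesIO()
with zipfile.ZipFile(outer_buf, "w") as outer:
    outer.writestr("mno.csv", "x,y\n5,6\n")
    outer.writestr("xyz.zip", inner_buf.getvalue())

with zipfile.ZipFile(io.BytesIO(outer_buf.getvalue())) as z:
    found = dict(iter_nested_files(z))

print(sorted(found))
```

Each value in found holds the raw bytes of a file, so a CSV could then be loaded with pd.read_csv(BytesIO(found[path])). Detecting child archives by file extension is a simplification; zipfile.is_zipfile on the wrapped bytes may be a more robust check.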
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | tdelaney |
Solution 2 |