Read CSV starting with string from Zipfile

I'm trying to loop through a folder of zip files and extract only the CSV files whose names start with a certain prefix.

Here is the code:

for name in glob.glob(path + '/*.zip'):
    zf = zipfile.ZipFile(name)
    csv_file = pd.read_csv(zf.open('Common_MarketResults*.csv'))
    df = pd.concat(csv_file, axis=0).reset_index()

The CSV file name has a date after the prefix I am using, and that date is different in every zip file. I am receiving the following error message:

KeyError: "There is no item named 'Common_MarketResults*.csv' in the archive"



Solution 1:[1]

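ZipFile.open() needs the exact member name and does not expand wildcards. Listing the archive's contents with namelist() and checking each filename against the prefix and extension made this possible.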

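import glob
import zipfile

import pandas as pd

sub = 'Common_MarketResults'
suf = '.csv'
data = []

# path is the folder that holds the zip files, as in the question
for name in glob.glob(path + '/*.zip'):
    with zipfile.ZipFile(name) as zf:
        for s in zf.namelist():
            base = s.rsplit('/', 1)[-1]  # ignore any folder inside the archive
            if base.startswith(sub) and base.endswith(suf):
                # read every matching member, not just the last one found
                with zf.open(s) as f:
                    csv_file = pd.read_csv(f)
                csv_file['file_name'] = s
                data.append(csv_file)

# pd.concat raises ValueError if no matching file was found
df = pd.concat(data, ignore_index=True)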
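
If you would rather keep the wildcard pattern from the question, the same filtering can probably be done with the standard library's fnmatch module. The sketch below is only an illustration: matching_members is a hypothetical helper name, and the archive is built in memory with made-up file names so that the snippet runs on its own.

```python
import fnmatch
import io
import zipfile


def matching_members(zf, pattern):
    """Return archive member names whose base name matches a shell-style pattern."""
    return [
        member
        for member in zf.namelist()
        # Compare only the base name so members inside folders still match.
        if fnmatch.fnmatchcase(member.rsplit('/', 1)[-1], pattern)
    ]


# Build a small in-memory archive to try the helper on (made-up file names).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    zf.writestr('Common_MarketResults_20200101.csv', 'a,b\n1,2\n')
    zf.writestr('Other_Report_20200101.csv', 'a,b\n3,4\n')
    zf.writestr('readme.txt', 'not a csv')

with zipfile.ZipFile(buffer) as zf:
    matches = matching_members(zf, 'Common_MarketResults*.csv')
    print(matches)  # ['Common_MarketResults_20200101.csv']
```

Each name in matches can then be passed to zf.open() and on to pd.read_csv(), the same way the loop above does.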

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 alex cruz