Why is my ZipFile not found while trying to load it into Python?
This code I cannot change:
%matplotlib inline
from collections import Counter, defaultdict, OrderedDict
from bs4 import BeautifulSoup
import os
from tqdm import tqdm_notebook
import glob
import nltk
import zipfile
import math
import pandas as pd
import sys
import itertools
def loadShakespeare():
    if 'shaks200.zip' in os.listdir():
        return 'shaks200.zip'
    elif os.path.exists('../../data/Week1/'):
        return '../../data/Week1/shaks200.zip'
    elif os.path.exists('../../../data/Week1/'):
        return '../../../data/Week1/shaks200.zip'
The following code I am allowed to change:
def index_collection(shaks200):
    # With zipfile we can read the file without opening the zip file
    archive = zipfile.ZipFile('shaks200.zip', 'r')
    namelist = [x for x in archive.namelist() if '.xml' in x]
    MyIndex = defaultdict(lambda: defaultdict(int)) # initialize MyIndex
    for infile in tqdm_notebook(namelist): # loop over each file
        f = archive.open(infile)
    return MyIndex
%time Shakespeare = index_collection(loadShakespeare())
Shakespeare['the'], Shakespeare['witch']
This code gives me FileNotFoundError: [Errno 2] No such file or directory: 'shaks200.zip'
The location of the file is C:\Users\joris\Desktop\Zoekmachines\IR0_2020_Student_Repo\IR0_2020_Student_Repo\Data\Week1
Solution 1:[1]
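I think you need to pass the absolute path of the file. When you pass a relative path such as 'shaks200.zip', Python resolves it against the current working directory, and the file is not there, which is what causes this error. Note also that index_collection opens the hard-coded string 'shaks200.zip' rather than its shaks200 argument, so the path returned by loadShakespeare() is never actually used.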
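As a minimal sketch of that fix, assuming only the editable function needs to change: the version below opens the path it receives instead of the literal file name, and converts it to an absolute path first so that any error shows where Python actually looked. The variable names zip_path and my_index and the explicit existence check are illustrative additions, not part of the original code, and the progress bar is left out to keep the example dependent on the standard library only.

```python
import os
import zipfile
from collections import defaultdict


def index_collection(shaks200):
    """Open the archive at the path passed in and walk its XML members."""
    # Resolve the (possibly relative) path against the current working
    # directory, so an error message shows the full path that was tried.
    zip_path = os.path.abspath(shaks200)
    if not os.path.isfile(zip_path):
        raise FileNotFoundError(
            f"No zip file at {zip_path} (current directory: {os.getcwd()})"
        )

    my_index = defaultdict(lambda: defaultdict(int))  # initialize the index
    with zipfile.ZipFile(zip_path, 'r') as archive:
        namelist = [x for x in archive.namelist() if x.endswith('.xml')]
        for infile in namelist:  # loop over each XML file in the archive
            with archive.open(infile) as f:
                f.read()  # placeholder: the real indexing logic goes here
    return my_index
```

Called as index_collection(loadShakespeare()), this uses whichever location loadShakespeare() found. If loadShakespeare() matches none of its branches it returns None, and os.path.abspath(None) raises a TypeError, which is a hint that the notebook is running from an unexpected directory.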
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Kashyap Sharma |