'FileNotFoundError when reading .h5 file from S3 in python using Pandas
For some reason, when I attempt to read a hdf file from S3 using the pandas.read_hdf() method, I get a FileNotFoundError when I put an s3 url. The file definitely exists and I have tried using the pandas.read_csv() method with a csv file in the same s3 directory and that works. Is there something else I need to be doing? Here's the code:
import boto3
import h5py
import s3fs
import pandas as pd
csvDataframe = pd.read_csv('s3://BUCKET_NAME/FILE_NAME.csv', key='df')
print("Csv data:")
print(csvDataframe)
dataframe = pd.read_hdf('s3://BUCKET_NAME/FILE_NAME.h5', key='df')
print("Hdf data:")
print(dataframe)
Here is the error:
FileNotFoundError: File s3://BUCKET_NAME/FILE_NAME.h5 does not exist
In the actual code, BUCKET_NAME and FILE_NAME are replaced with their actual strings.
Solution 1:[1]
Please make sure file extension is .h5
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | RatheeshTS |