FileNotFoundError when reading .h5 file from S3 in Python using pandas

When I try to read an HDF5 file from S3 with pandas.read_hdf(), I get a FileNotFoundError if I pass an S3 URL. The file definitely exists, and pandas.read_csv() works on a CSV file in the same S3 directory. Is there something else I need to do? Here's the code:

import boto3
import h5py
import s3fs
import pandas as pd

csvDataframe = pd.read_csv('s3://BUCKET_NAME/FILE_NAME.csv')
print("Csv data:")
print(csvDataframe)
dataframe = pd.read_hdf('s3://BUCKET_NAME/FILE_NAME.h5', key='df')
print("Hdf data:")
print(dataframe)

Here is the error:

FileNotFoundError: File s3://BUCKET_NAME/FILE_NAME.h5 does not exist

In the actual code, BUCKET_NAME and FILE_NAME are replaced with their actual strings.



Solution 1:[1]

Please make sure the file extension is .h5.
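
If the extension is already correct, another likely cause is that pandas.read_hdf() works only with the local file system: as far as I know, it hands the path to PyTables, which does not understand s3:// URLs, so the local existence check fails with FileNotFoundError. A possible workaround is to download the file first and read the local copy. The sketch below is not from the original answer. The helper name read_hdf_from_s3 and its fs parameter are made up for illustration, and it assumes s3fs is installed and AWS credentials are already configured.

```python
import os
import tempfile


def read_hdf_from_s3(s3_url, key, fs=None):
    """Copy an HDF5 file from S3 to a temporary local path, then read it.

    Hypothetical helper, not part of pandas. `fs` is any object exposing
    get(remote_path, local_path); when omitted, an s3fs.S3FileSystem is
    created using whatever AWS credentials are configured.
    """
    # Imported inside the function so the helper can be defined even where
    # pandas or s3fs are not installed.
    import pandas as pd

    if fs is None:
        import s3fs
        fs = s3fs.S3FileSystem()

    with tempfile.TemporaryDirectory() as tmp_dir:
        # Keep the original file name so the local copy still ends in .h5.
        local_path = os.path.join(tmp_dir, os.path.basename(s3_url))
        fs.get(s3_url, local_path)  # download the remote object
        # The data is loaded into memory here, before the temporary
        # directory is removed on exit from the with-block.
        return pd.read_hdf(local_path, key=key)
```

With the placeholders from the question, the call would be read_hdf_from_s3('s3://BUCKET_NAME/FILE_NAME.h5', key='df'). The whole file is downloaded before reading, so this may not suit very large files.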

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 RatheeshTS