I'm trying to simplify access to datasets in various file formats (csv, pickle, feather, partitioned parquet, ...) stored as S3 objects. Since some users I support …
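One way to hide the format differences behind a single entry point is a small dispatch table over pandas readers. This is a minimal sketch, not an established API: the helper name `read_dataset`, the extension mapping, and the bucket path are all placeholders. It relies on pandas accepting `s3://` URLs when s3fs is installed (pyarrow is also needed for feather/parquet):

```python
import pandas as pd

# Map file extensions to pandas readers. The mapping itself is an
# assumption; extend it for whatever formats your users actually store.
_READERS = {
    ".csv": pd.read_csv,
    ".pkl": pd.read_pickle,
    ".feather": pd.read_feather,
    ".parquet": pd.read_parquet,
}

def read_dataset(s3_uri: str) -> pd.DataFrame:
    """Dispatch to a pandas reader based on the object's extension."""
    for ext, reader in _READERS.items():
        if s3_uri.rstrip("/").endswith(ext):
            return reader(s3_uri)
    # Partitioned parquet datasets are directories rather than single
    # objects, so fall back to read_parquet for extension-less paths.
    return pd.read_parquet(s3_uri)

df = read_dataset("s3://my-bucket/datasets/example.csv")  # hypothetical path
```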
I have two pipelines that I run. The first pipeline reads files from S3, does some processing, and updates the files. The second pipeline runs multiple jobs and …
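For the first pipeline, the read-process-write roundtrip can stay entirely in pandas. A hedged sketch, assuming CSV objects and s3fs installed; the bucket, key, and the transform are stand-ins for whatever the real pipeline does:

```python
import pandas as pd

SOURCE = "s3://my-bucket/input/data.csv"  # hypothetical object

def update_file(uri: str) -> None:
    df = pd.read_csv(uri)           # read the current S3 object
    df["processed"] = True          # stand-in for the real processing step
    df.to_csv(uri, index=False)     # write back, overwriting the object

update_file(SOURCE)
```

Note that S3 overwrites are whole-object replacements, so if the second pipeline's jobs read the same keys concurrently, they will see either the old or the new object, never a partial write.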
Pandas (v1.0.5) uses the s3fs library to connect to AWS S3 and read data. By default, s3fs uses the credentials found in the ~/.aws/credentials file under the default profile.
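To make s3fs use a different profile, two common approaches are sketched below. Pandas 1.0.5 predates the `storage_options` keyword (added in pandas 1.2), so on that version the environment variable or an explicit s3fs filesystem are the usual routes. The profile name and paths are placeholders:

```python
import os
import pandas as pd
import s3fs

# Option 1: select the profile before s3fs creates its boto session.
os.environ["AWS_PROFILE"] = "my-profile"  # placeholder profile name
df = pd.read_csv("s3://my-bucket/data.csv")

# Option 2: build the filesystem explicitly and hand pandas an open
# file object (some older s3fs releases spell the keyword profile_name).
fs = s3fs.S3FileSystem(profile="my-profile")
with fs.open("my-bucket/data.csv", "rb") as f:
    df = pd.read_csv(f)
```

On pandas 1.2 and later, the same thing can be done inline with `pd.read_csv(..., storage_options={"profile": "my-profile"})`.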
I'm looking for ways to read data from multiple partitioned directories on S3 using Python. The layout looks like data_folder/serial_number=1/cur_date=20-12-2012/abcdsd0324324.snappy.parquet.
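Because the layout uses hive-style `key=value` directories, pyarrow can discover the partitions and expose `serial_number` and `cur_date` as ordinary columns. A sketch assuming a placeholder bucket name and default credentials:

```python
import pyarrow.parquet as pq
import s3fs

fs = s3fs.S3FileSystem()  # picks up your default AWS credentials

# ParquetDataset walks the partitioned directory tree under the prefix
# and reads every .snappy.parquet file it finds.
dataset = pq.ParquetDataset("my-bucket/data_folder", filesystem=fs)
df = dataset.read().to_pandas()

# To read only some partitions, push the selection down as filters:
subset = pq.ParquetDataset(
    "my-bucket/data_folder",
    filesystem=fs,
    filters=[("serial_number", "=", "1")],
).read().to_pandas()
```

With a recent pandas/pyarrow, `pd.read_parquet("s3://my-bucket/data_folder")` does the same discovery in one call.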