How to use boto3 (or other Python) to list the contents of a _RequesterPays_ S3 bucket?

You can download a file via boto3 from a RequesterPays S3 bucket, as follows:

  s3_client.download_file('aws-naip', 'md/2013/1m/rgbir/38077/{}'.format(filename), full_path, {'RequestPayer':'requester'})

What I can't figure out is how to list the objects in the bucket. I get an authentication error when I try to call objects.all() on the bucket.

How can I use boto3 to enumerate the contents of a RequesterPays bucket? Please note this is a particular kind of bucket where the requester pays the S3 charges.



Solution 1:[1]

The boto3 documentation describes an S3.Client.list_objects method, which can be used to enumerate objects:

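import boto3
s3_client = boto3.client('s3')
# RequestPayer must be set for a Requester Pays bucket, otherwise the call is denied
resp = s3_client.list_objects(Bucket='RequesterPays', RequestPayer='requester')

# print names of all objects ('Contents' is absent when nothing matches)
for obj in resp.get('Contents', []):
    print('Object Name: %s' % obj['Key'])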

Output:

Object Name: pic.gif
Object Name: doc.txt
Object Name: page.html

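If you are getting a 403 (Access Denied), make sure that the IAM user calling the API has the s3:ListBucket permission on the bucket. Listing is governed by s3:ListBucket; s3:GetObject only covers downloading objects.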

Solution 2:[2]

You have to pass the RequestPayer kwarg to the list_objects method.

Also, according to the boto3 docs,

Note: ListObjectsV2 is the revised List Objects API and we recommend you use this revised API for new application development

Putting that together with pagination would look like:

import boto3
s3_client = boto3.client('s3')

def get_keys(bucket, prefix, requester_pays=False):
    """Get s3 objects from a bucket/prefix
    optionally use requester-pays header
    """
    extra_kwargs = {}
    if requester_pays:
        extra_kwargs = {'RequestPayer': 'requester'}

    next_token = 'init'
    while next_token:
        kwargs = extra_kwargs.copy()
        if next_token != 'init':
            kwargs.update({'ContinuationToken': next_token})

        resp = s3_client.list_objects_v2(
            Bucket=bucket, Prefix=prefix, **kwargs)

        try:
            next_token = resp['NextContinuationToken']
        except KeyError:
            next_token = None

        for contents in resp.get('Contents', []):  # 'Contents' is missing when nothing matches
            key = contents['Key']
            yield key

and would be used like this:

x = list(get_keys('aws-naip', 'co', requester_pays=True))
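
As an alternative to tracking the continuation token by hand, boto3's built-in paginator for list_objects_v2 can do the looping. The following is only a minimal sketch under some assumptions: the function name iter_keys is made up for illustration, and the client is passed in as an argument (rather than created at module level) so the function can be exercised without AWS access.

```python
def iter_keys(s3_client, bucket, prefix="", requester_pays=False):
    """Yield every key under bucket/prefix, letting boto3 handle pagination.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    iter_keys is a hypothetical helper name, not part of boto3.
    """
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    if requester_pays:
        # Acknowledge that the caller pays for the request
        kwargs["RequestPayer"] = "requester"

    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(**kwargs):
        # Pages with no matching objects carry no 'Contents' entry
        for obj in page.get("Contents", []):
            yield obj["Key"]


# Usage (needs boto3 installed and AWS credentials configured):
#   import boto3
#   keys = list(iter_keys(boto3.client("s3"), "aws-naip", "co", requester_pays=True))
```

Because this is a generator, keys are fetched one page at a time, which matters for large buckets where each request is billed to the requester.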

Solution 3:[3]

I had the same issue so here is the code:

import boto3

s3 = boto3.resource('s3')

for bucket in s3.buckets.all():
    print(bucket.name)

client = boto3.client('s3')

result = client.list_objects(Bucket='bucketname', RequestPayer='requester')
for o in result['Contents']:
    print(o['Key'])

The response to the query is a dictionary. Its 'Contents' entry is a list of dictionaries, one per object, and the 'Key' field of each one holds the path to the object. You can check the other response fields in the list_objects documentation.

Note: list_objects returns at most 1000 objects per call, so for larger buckets you have to page through the results by passing the Marker parameter on each follow-up call (the response's NextMarker field is only filled in when a Delimiter is given; otherwise use the last key returned). I guess you have already figured out how to set up the access key and secret key. Let me know if you need more details on that.
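
Paging through list_objects results beyond the first 1000 objects could look roughly like the sketch below. The helper name iter_keys_v1 is hypothetical, and the client is passed in as an argument so the loop can be tried without touching AWS; treat it as an illustration rather than a definitive implementation.

```python
def iter_keys_v1(s3_client, bucket, requester_pays=False):
    """Yield all keys in a bucket using the original list_objects API.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    iter_keys_v1 is a hypothetical helper name, not part of boto3.
    """
    kwargs = {"Bucket": bucket}
    if requester_pays:
        kwargs["RequestPayer"] = "requester"

    while True:
        resp = s3_client.list_objects(**kwargs)
        contents = resp.get("Contents", [])
        for obj in contents:
            yield obj["Key"]

        # Stop once S3 reports that there are no further results
        if not resp.get("IsTruncated") or not contents:
            break

        # NextMarker is only present when a Delimiter was supplied;
        # otherwise resume after the last key of this page
        kwargs["Marker"] = resp.get("NextMarker") or contents[-1]["Key"]


# Usage (needs boto3 installed and AWS credentials configured):
#   import boto3
#   for key in iter_keys_v1(boto3.client("s3"), "bucketname", requester_pays=True):
#       print(key)
```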

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Raf
Solution 2 perrygeo
Solution 3 Alexis Kanter