AWS S3 Sync --force-glacier-transfer

A few days ago I was experimenting with S3 and Glacier, and my data was archived. To restore it I had to use the Expedited retrieval service, which is expensive. Now I want to move all of my content from one bucket to another bucket in the same region and the same account.

When I try to sync the data, I get the following error:

Completed 10.9 MiB/~10.9 MiB (30.0 KiB/s) with ~0 file(s) remaining (calculating...)
warning: Skipping file s3://bucket/zzz0dllquplo1515993694.mp4. Object is of storage class GLACIER. Unable to perform copy operations on GLACIER objects. You must restore the object to be able to perform the operation. See aws s3 copy help for additional parameter options to ignore or force these transfers.

I am using the command below and would like to know what it will cost me in dollars. The storage class of all my files has changed from "Standard" to "Glacier", so I am forced to use the --force-glacier-transfer flag:

aws s3 sync s3://bucketname1 s3://bucketname2 --force-glacier-transfer --storage-class STANDARD


Solution 1:[1]

If you have already restored the objects and the restored copies have not yet passed their expiry date, you should be able to sync them without another restore. All recursive commands report the Glacier error because the API they use does not check whether the objects have been restored. You can read about it in the ticket where the --force-glacier-transfer flag was added:

https://github.com/aws/aws-cli/issues/1699

The --force-glacier-transfer flag does not start another restore; it simply ignores the API response saying the object is in Glacier and attempts the copy anyway. The copy fails if the object has not been restored (the flag will not try to restore it).

Note that this applies only to the recursive commands (e.g. sync, and cp/mv with --recursive). If you copy a single file, it works without the force flag.
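Before forcing a transfer, it may help to confirm that an object's restore has actually finished. The sketch below assumes the Restore field returned by aws s3api head-object has the usual form ongoing-request="false", expiry-date="..."; the function names restore_finished and object_is_restored are hypothetical helpers, not part of the AWS CLI.

```shell
# Succeeds (exit status 0) when a Restore header value reports a finished restore.
# Expected input looks like:
#   ongoing-request="false", expiry-date="Fri, 21 Dec 2012 00:00:00 GMT"
restore_finished() {
  case "$1" in
    *'ongoing-request="false"'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: fetch the Restore field for one object and check it.
# $1 = bucket, $2 = key. Requires the AWS CLI and valid credentials.
# For an object that was never restored, the query prints "None".
object_is_restored() {
  restore_header=$(aws s3api head-object --bucket "$1" --key "$2" \
                     --query Restore --output text) || return 1
  restore_finished "$restore_header"
}
```

With these defined, something like object_is_restored bucketname1 some/key.mp4 && echo "ready to copy" would report whether a single object can be copied; the bucket and key here are placeholders.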

Solution 2:[2]

Copy file of a Glacier storage class to a different bucket

You wrote: "I want to move all of my content from one bucket to another bucket in the same region same account."

If you want to copy files kept in a Glacier storage class from one bucket to another, even with the sync command, you have to restore the files first, i.e. make them available for retrieval before you can actually copy them. The exception is a file stored in the "Amazon S3 Glacier Instant Retrieval" storage class. In that case, you don't need to explicitly restore the file.

Therefore, you have to issue the restore-object command to each of the files to initiate a restore request. Then you have to wait until the restore request completes. After that, you will be able to copy your files within the number of days that you have specified during the restore request.
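As a rough sketch of the restore step described above, the helper below builds the JSON body for the --restore-request option and passes it to aws s3api restore-object. The function names, and the bucket, key, number of days and tier in the usage line, are placeholders chosen for illustration.

```shell
# Build the JSON body for `aws s3api restore-object --restore-request`.
# $1 = number of days the restored copy stays available
# $2 = restore tier (Bulk, Standard or Expedited)
restore_request_json() {
  printf '{"Days":%d,"GlacierJobParameters":{"Tier":"%s"}}\n' "$1" "$2"
}

# Initiate a restore request for a single object.
# $1 = bucket, $2 = key, $3 = days, $4 = tier
# Requires the AWS CLI and valid credentials.
restore_one() {
  aws s3api restore-object --bucket "$1" --key "$2" \
    --restore-request "$(restore_request_json "$3" "$4")"
}
```

For example, restore_one bucketname1 some/key.mp4 5 Bulk would request a Bulk restore that stays available for 5 days. The request only starts the restore; the object cannot be copied until the restore completes.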

Retrieval pricing

You also wrote: "I was wondering what would it cost me in terms of dollars".

With the command you provided, aws s3 sync s3://bucketname1 s3://bucketname2 --force-glacier-transfer --storage-class STANDARD, you copy the files from the Glacier storage class to the Standard storage class. In this case, you first pay for retrieval (a one-off charge) and then pay monthly for storing both copies of each file: one copy in the Glacier tier and another in the Standard storage class.

According to Amazon (quote),

To change the object's storage class to Amazon S3 Standard, use copy (by overwriting the existing object or copying the object into another location).

However, a file stored in the Glacier storage class can only be copied to another location in S3 within the same bucket; you cannot actually retrieve the file contents unless you restore it, i.e. make it available for retrieval.

Since you have asked "what would it cost me in terms of dollars", you will have to pay according to the retrieval prices and storage prices published by Amazon.

You can check the retrieval pricing at https://aws.amazon.com/s3/glacier/pricing/

The storage prices are available at https://aws.amazon.com/s3/pricing/

The retrieval prices depend on what kind of Glacier storage class you initially selected to store the files: "S3 Glacier Instant Retrieval", "S3 Glacier Flexible Retrieval" or "S3 Glacier Deep Archive". The storage class can be modified by lifecycle rules, so to be more correct, it is the current storage class for each file that matters.
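To see which storage class each object currently has, one option is to list the bucket and count objects per class. This is a minimal sketch: count_by_class is a hypothetical helper that reads tab-separated "key, storage class" lines, which is the shape produced by the list-objects-v2 call shown in the comment.

```shell
# Read "key<TAB>storage-class" lines on stdin and print one
# "CLASS COUNT" line per storage class, sorted by class name.
count_by_class() {
  awk -F '\t' '{ n[$2]++ } END { for (c in n) print c, n[c] }' | sort
}

# Usage against a real bucket (requires the AWS CLI and credentials;
# bucketname1 is the placeholder bucket name from the question):
#   aws s3api list-objects-v2 --bucket bucketname1 \
#     --query 'Contents[].[Key,StorageClass]' --output text | count_by_class
```

Objects reported as GLACIER or DEEP_ARCHIVE need a restore before copying, while STANDARD objects do not.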

Unless you store your files in the "S3 Glacier Instant Retrieval" storage class, the cheapest option is to first restore the files (make them available for retrieval) using the "Bulk" retrieval option (restore tier), which is free for "S3 Glacier Flexible Retrieval" and very cheap for "S3 Glacier Deep Archive". Thus you can copy the files with minimal restoration costs, if any.

Since you prefer to use command-line, you can use the Perl script to make the files available for retrieval with the "Bulk" retrieval option (restore tier). Otherwise, the aws s3 sync command that you gave will use the "Standard" restore tier.

As of today, in the Ohio US region, the prices for retrieval are the following.

For "S3 Glacier Instant Retrieval", at the time of writing, it costs $0.03 per GB to restore, with no other options. For "S3 Glacier Flexible Retrieval", the "Standard" retrieval costs $0.01 per GB while "Bulk" retrieval is free. For "S3 Glacier Deep Archive", the "Standard" retrieval costs $0.02 per GB while "Bulk" costs $0.0025 per GB.

You will also pay for retrieval requests regardless of the data size. However, for "S3 Glacier Instant Retrieval" you won't pay for retrieval requests, and for "Bulk" the retrieval request costs are minimal (S3 Glacier Deep Archive), if not free (S3 Glacier Flexible Retrieval).
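To turn the per-GB prices quoted above into a rough figure, multiply the data size by the price. The sketch below does only that: it ignores per-request fees and ongoing storage charges, the 500 GB size is an invented example, and the prices are the ones quoted in this answer, which may have changed since.

```shell
# Estimate a one-off retrieval charge in USD: size in GB times price per GB.
# $1 = data size in GB, $2 = price per GB in USD
retrieval_cost() {
  awk -v gb="$1" -v price="$2" 'BEGIN { printf "%.2f\n", gb * price }'
}

retrieval_cost 500 0.01     # Flexible Retrieval, "Standard" tier: prints 5.00
retrieval_cost 500 0.0025   # Deep Archive, "Bulk" tier: prints 1.25
```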

Solution 3:[3]

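#!/usr/bin/env bash
# Usage: pass the partition date as the first argument.
BUCKET=my-bucket
DATE=$1
BPATH=/pathInBucket/FolderPartitioDate=$DATE
DAYS=5
# List every key under the prefix. The key is the 4th column of `aws s3 ls`
# output, so keys containing spaces would need different parsing.
aws s3 ls "s3://$BUCKET$BPATH" --recursive | awk '{print $4}' | while read -r x; do
  echo "1:Restore $x"
  # Request a temporary restored copy that stays available for $DAYS days.
  aws s3api --profile sriAthena restore-object --bucket "$BUCKET" --key "$x" \
    --restore-request "Days=$DAYS,GlacierJobParameters={Tier=Standard}"
  echo "2:Monitor $x"
  # The Restore field in the output shows whether the restore is still in progress.
  aws s3api head-object --bucket "$BUCKET" --key "$x"
done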

https://aws.amazon.com/premiumsupport/knowledge-center/restore-s3-object-glacier-storage-class/

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 John Eikenberry
Solution 2
Solution 3 Adamsistron