AWS S3 replication: DstObjectHardDeleted error during replication
Background: We are currently trying to cut over from one AWS account to another. This includes getting a full copy of our S3 buckets into the new account, including all historical versions and timestamps. We first set up replication to the new account's S3 buckets, ran a batch replication job to copy the historical data, and then tested against it. Afterward, we emptied the destination bucket to remove the data added during testing, and then tried to redo the replication/batch job.
Now it seems AWS will not replicate the objects because it sees that they previously existed in the destination bucket. Looking at the batch job's output, every object shows this:
{bucket} {key} {version} failed 500 DstObjectHardDeleted Currently object can't be replicated if this object previously existed in the destination but was recently deleted. Please try again at a later time
After seeing this, I deleted the destination bucket completely and recreated it, in the hope that it would flush out any previous traces of the data, and then I retried it. The same error occurs.
I cannot find any information on this error or even an acknowledgement in the AWS docs that this is expected or a potential issue.
Can anyone tell me how long we have to wait before replicating again? An hour? 24 hours?
Is there any documentation on this error in AWS?
Is there any way to get around this limitation?
Update: I retried periodically throughout the day and never got an upload to replicate. I also tried replicating to a third bucket instead, and then initiating replication from that new bucket to the original target. It throws the same error.
Update2: This post was made on a Friday. Retried the jobs today (the following Monday), and the error remains unchanged.
Update3: Probably the last update. The short version is that I gave up and created a different bucket to replicate to. If anyone has information on this, I'm still interested; I just can't waste any more time on it.
Solution 1:[1]
Batch Replication does not support re-replicating objects that were hard-deleted (permanently deleted by specifying the object's version ID) from the destination bucket.
Below are possible workarounds for this limitation:
Copy the source objects in place with an S3 Batch Copy job. Copying the objects in place creates new versions of them in the source bucket and automatically initiates replication to the destination. You may also use a custom script or the AWS CLI to do the in-place copy in the source bucket (see the sketch after this list).
Re-replicate these source objects to a different/new destination bucket.
Run the aws s3 sync command. It will copy objects to the destination bucket with new version IDs (version IDs will differ between the source and destination buckets). If you are syncing a large number of objects, run it at the prefix level and estimate how long the copy will take based on your network throughput. Run the command in the background by appending "&". You may also do a dry run before the actual copy; refer to the AWS CLI documentation for more options.
aws s3 sync s3://SOURCE-BUCKET/prefix1 s3://DESTINATION-BUCKET/prefix1 --dryrun > output.txt
aws s3 sync s3://SOURCE-BUCKET/prefix1 s3://DESTINATION-BUCKET/prefix1 > output.txt &
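As a rough sketch of the in-place copy mentioned in the first workaround: the AWS CLI can rewrite each object onto itself, which creates a new version ID that the existing replication rule then picks up. The bucket and prefix names below are placeholders, and the dummy metadata key is only there to force S3 to accept the self-copy. Note that --metadata-directive REPLACE overwrites user-defined metadata, so test on a small prefix first.
# preview the self-copy first
aws s3 cp s3://SOURCE-BUCKET/prefix1/ s3://SOURCE-BUCKET/prefix1/ --recursive --metadata-directive REPLACE --metadata touched=true --dryrun
# then run the real in-place copy (creates new version IDs, which triggers replication)
aws s3 cp s3://SOURCE-BUCKET/prefix1/ s3://SOURCE-BUCKET/prefix1/ --recursive --metadata-directive REPLACE --metadata touched=true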
In summary, you can use S3 Batch Copy or S3 replication to an existing destination bucket only for objects with new version IDs. To replicate the source bucket's existing object versions, you will have to use a different/new destination bucket.
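If you take the new-destination-bucket route, the existing replication rule can be pointed at the new bucket by rewriting the replication configuration. This is only a sketch: the role ARN, account ID, and bucket names are placeholders, and the IAM role must already allow replication into the new bucket. A minimal replication.json might look like:
{
  "Role": "arn:aws:iam::111122223333:role/s3-replication-role",
  "Rules": [
    {
      "ID": "replicate-all-to-new-bucket",
      "Status": "Enabled",
      "Priority": 1,
      "Filter": {},
      "DeleteMarkerReplication": { "Status": "Disabled" },
      "Destination": { "Bucket": "arn:aws:s3:::NEW-DESTINATION-BUCKET" }
    }
  ]
}
Apply it to the source bucket, then rerun the batch replication job against the new destination:
aws s3api put-bucket-replication --bucket SOURCE-BUCKET --replication-configuration file://replication.json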
Solution 2:[2]
We encountered the same thing and tried the same process you outlined. We did get some of the buckets to succeed in the second account's replication batch job, but the largest of those held just under 2 million objects. We have had to use the AWS CLI to sync the data or use the DataSync service (this process is still ongoing and may have to run many times, breaking up the records).
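For the DataSync option, a minimal CLI sketch might look like the following; all ARNs, role names, and bucket names are placeholders, and the access role must grant DataSync read/write on the respective buckets. Note that, like aws s3 sync, DataSync copies the current objects and assigns new version IDs rather than preserving the source version history.
# create a DataSync location for each bucket (role ARN is a placeholder)
aws datasync create-location-s3 --s3-bucket-arn arn:aws:s3:::SOURCE-BUCKET --s3-config BucketAccessRoleArn=arn:aws:iam::111122223333:role/datasync-s3-role
aws datasync create-location-s3 --s3-bucket-arn arn:aws:s3:::DESTINATION-BUCKET --s3-config BucketAccessRoleArn=arn:aws:iam::111122223333:role/datasync-s3-role
# create a task from source to destination and start an execution
aws datasync create-task --source-location-arn SOURCE-LOCATION-ARN --destination-location-arn DESTINATION-LOCATION-ARN
aws datasync start-task-execution --task-arn TASK-ARN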
It appears that when large buckets are deleted in the first account, their metadata hangs around for a long time. We moved about 150 buckets with varying amounts of data, and only about half made it to the second account via the two-step replication. The lesson I learned: if you can control the names of your buckets and change them during the move, do that.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source
---|---
Solution 1 | ariels
Solution 2 | Mariah