How do I delete all except the latest 5 recently updated/new files from AWS S3?
I can fetch the five most recently updated files from AWS S3 using the command below:
aws s3 ls s3://somebucket/ --recursive | sort | tail -n 5 | awk '{print $4}'
Now I need to delete every file in the bucket except those last 5 files returned by the command above. Say the command returns 1.txt, 2.txt, 3.txt, 4.txt, 5.txt; I need to delete everything in S3 except 1.txt, 2.txt, 3.txt, 4.txt, and 5.txt.
Solution 1:[1]
Use the aws s3 rm command with multiple --exclude options (assuming the last 5 files do not follow a common pattern). Note that --exclude patterns are matched against keys relative to the given prefix, so the bucket name is not repeated inside them:
aws s3 rm s3://somebucket/ --recursive --exclude "1.txt" --exclude "2.txt" --exclude "3.txt" --exclude "4.txt" --exclude "5.txt"
CAUTION: Run it with the --dryrun option first and verify that the files listed for deletion do not include the 5 files you want to keep before actually removing anything.
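If you would rather not hard-code the file names, a rough shell sketch (untested; it assumes the same somebucket example and that no object key contains whitespace) can build the --exclude flags from the listing used in the question:

# Keys of the 5 most recently modified objects.
keep=$(aws s3 ls s3://somebucket/ --recursive | sort | tail -n 5 | awk '{print $4}')

# Turn each key into an --exclude flag.
excludes=()
for key in $keep; do
    excludes+=(--exclude "$key")
done

# Dry-run first; drop --dryrun once the output looks right.
aws s3 rm s3://somebucket/ --recursive --dryrun "${excludes[@]}"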
Solution 2:[2]
Use a negative number with head to get all but the last n lines:
aws s3 ls s3://somebucket/ --recursive | sort | head -n -5 | while read -r line ; do
echo "Removing ${line}"
aws s3 rm s3://somebucket/${line}
done
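Note that head -n -5 (a negative line count) is a GNU coreutils extension and is not available in BSD/macOS head. If that assumption does not hold on your machine, an untested awk equivalent that prints all but the last 5 listing lines could be slotted into the same loop:

# Portable replacement for "head -n -5": buffer the lines, then print all but the last 5.
aws s3 ls s3://somebucket/ --recursive | sort | awk -v keep=5 '{ lines[NR] = $0 } END { for (i = 1; i <= NR - keep; i++) print lines[i] }'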
Solution 3:[3]
Short story: based on @bcattle's answer, this works with AWS CLI v2:
aws s3 ls s3://[BUCKET_NAME] --recursive | awk 'NF>1{print $4}' | grep . | sort | head -n -5 | while read -r line ; do
echo "Removing ${line}"
aws s3 rm s3://[BUCKET_NAME]/${line}
done
Long story: under CLI v2, aws s3 ls returns not only the file path but also the date and size columns. That is not what the script above expects, since only the file path (object key) should be concatenated with the bucket URI, which is why awk 'NF>1{print $4}' is used to pull out just the key.
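One remaining limitation: printing column 4 with awk truncates object keys that contain spaces. A rough, untested variant (assuming the usual aws s3 ls column layout of date, time, and size before the key, and the same [BUCKET_NAME] placeholder) strips the leading columns with sed instead:

aws s3 ls s3://[BUCKET_NAME] --recursive | sort | head -n -5 | sed -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} +[0-9]+ //' | while read -r key ; do
echo "Removing ${key}"
aws s3 rm "s3://[BUCKET_NAME]/${key}"
done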
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | helloV |
| Solution 2 | bcattle |
| Solution 3 | emilie zawadzki |