UPDATED 9/4/2012: I accidentally left a hard-coded AWS_VOLUME_IDS setting in the script, which I had inserted while debugging my own copy and forgot to remove before posting it here. I’ve removed it. D’oh!
UPDATED 8/19/2012: The logic in my original script for determining which backups to preserve was incorrect. It is updated below.
The easiest way to robustly back up an Amazon EBS volume is to take a snapshot. Whereas EBS volumes are stored in only a single availability zone, such that a catastrophic failure in that zone could destroy your backups along with your EBS volume, snapshots are stored in S3 and replicated across all availability zones in a region, resulting in a ridiculously low likelihood of data loss. (Nevertheless, if your disaster recovery plans need to account for the possibility that an entire EC2 region could kick the bucket, you need to back up your data some other way in addition to the mechanism outlined here.)
Many people have posted their solutions to the “automatically backing up an EBS volume on a regular basis” problem. Here’s mine.
To use it, create /etc/sysconfig/aws with settings for AWS_ACCESS_KEY, AWS_SECRET_KEY, and AWS_VOLUME_IDS. The latter should contain one or more whitespace-separated volume IDs to back up.
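For example (the keys below are AWS’s standard documentation placeholders and the volume IDs are made up; substitute your own):

# /etc/sysconfig/aws
AWS_ACCESS_KEY=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
AWS_VOLUME_IDS="vol-12345678 vol-9abcdef0"

Since the script sources this file with the shell, stick to plain VAR=value assignments with no spaces around the equals sign.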
The account whose access and secret keys you specify must have at least the ec2:CreateSnapshot, ec2:DescribeSnapshots, and ec2:DeleteSnapshot permissions.
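If you manage that with a dedicated IAM user, a minimal policy along these lines should be enough (a sketch only; attach it however you normally manage IAM policies):

{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateSnapshot",
        "ec2:DescribeSnapshots",
        "ec2:DeleteSnapshot"
      ],
      "Resource": "*"
    }
  ]
}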
Every time the script runs, it creates a new snapshot for each specified volume, then prunes previous backup snapshots of the same volume as follows (the snippet after this list shows the concrete cutoff dates this works out to):
- Save daily backups for the past week.
- Save weekly backups for the past month.
- Save monthly backups for the past year.
- Prune everything else.
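To see which cutoff dates this policy translates to on any given day, you can run the same date arithmetic the script uses by itself:

cutoffs="$(date '+%F')"
for days in 1 2 3 4 5 6; do
    cutoffs="$cutoffs $(date -d "$days days ago" '+%F')"
done
for weeks in 2 3 4; do
    cutoffs="$cutoffs $(date -d "$weeks weeks ago" '+%F')"
done
for months in 2 3 4 5 6 7 8 9 10 11 12; do
    cutoffs="$cutoffs $(date -d "$months months ago" '+%F')"
done
echo $cutoffs | tr ' ' '\n'

Roughly one snapshot survives per cutoff interval; anything older than the oldest cutoff (a year ago) is deleted.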
You can save the script in /etc/cron.daily to run your backups automatically on a daily basis.
#!/bin/bash -le
# -l above so /etc/profile.d/aws-apitools-common.sh is loaded
# Daily snapshots, preserving weekly and monthly backups for a year.
# AWS_ACCESS_KEY, AWS_SECRET_KEY, and AWS_VOLUME_IDS are defined in /etc/sysconfig/aws.
export AWS_ACCESS_KEY AWS_SECRET_KEY
. /etc/sysconfig/aws

volume_ids="$AWS_VOLUME_IDS"
set -- $volume_ids
if [ $# = 0 ]; then
    echo "Can't determine volume IDs" 1>&2
    exit 1
fi

for volume_id in $volume_ids; do
    ec2-create-snapshot "$volume_id" --description 'Automated volume backup'

    # Prune old snapshots. Build the list of cutoff dates, newest first:
    # today, each of the past six days, then weekly and monthly boundaries
    # going back a year.
    cutoffs="$(date '+%F')"
    for days in 1 2 3 4 5 6; do
        cutoffs="$cutoffs $(date -d "$days days ago" '+%F')"
    done
    for weeks in 2 3 4; do
        cutoffs="$cutoffs $(date -d "$weeks weeks ago" '+%F')"
    done
    for months in 2 3 4 5 6 7 8 9 10 11 12; do
        cutoffs="$cutoffs $(date -d "$months months ago" '+%F')"
    done
    set $cutoffs

    # Walk this volume's backup snapshots newest first (field 5 is the start
    # timestamp). Keep one snapshot per cutoff interval, delete the extras,
    # and delete everything older than the oldest cutoff.
    ec2-describe-snapshots |
    grep "$volume_id.*Automated volume backup" |
    sort -k +5r |
    while read type id volume status timestamp rest; do
        if [ "$volume" != "$volume_id" ]; then
            continue
        fi
        if [ $# = 0 ]; then
            # Older than the oldest cutoff: prune.
            ec2-delete-snapshot "$id"
            continue
        fi
        date=$(expr "$timestamp" : '\(....-..-..\)')
        if [[ $1 < $date ]]; then
            # Still newer than the current cutoff: remember this snapshot as
            # the candidate to keep and delete the previously remembered
            # (newer) candidate, if any.
            if [ -n "$last" ]; then
                ec2-delete-snapshot "$last"
            fi
            last="$id"
            continue
        fi
        # Crossed the current cutoff: keep the surviving candidate and move
        # on to the next cutoff.
        last=""
        shift
    done
done
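For example, to install it and give it a test run (the file name is arbitrary; anything executable in /etc/cron.daily will do):

sudo install -m 755 ebs-snapshot-backup /etc/cron.daily/ebs-snapshot-backup
sudo /etc/cron.daily/ebs-snapshot-backup

ec2-create-snapshot prints a SNAPSHOT line with the new snapshot’s ID for each volume, so you can tell right away whether your keys and volume IDs are set up correctly.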
“Whereas EBS volumes are stored in only a single region, such that a catastrophic failure in that region could destroy your backups along with your EBS volume, snapshots are stored in S3 and replicated across all regions, resulting in a ridiculously low likelihood of data loss.”
If you are relying on this technique to offer redundancy for your backups, you may want to review the S3 documentation: http://docs.amazonwebservices.com/AmazonS3/2006-03-01/UG/Introduction.html
Specifically, “Objects stored in a Region never leave the Region unless you explicitly transfer them to another Region.” Since snapshots are stored in S3, they are just as susceptible as anything else should that region become unavailable for some reason. They are replicated across zones within a region, but NOT to other regions.
You’re correct, I misspoke. EBS snapshots aren’t replicated across regions. They are, however, replicated across availability zones, giving them a significantly higher level of redundancy than EBS volumes. Nevertheless, you’re correct that if your disaster recovery plans need to account for the possibility of an entire Amazon region kicking the bucket, you need to back up your data using some other mechanism in addition to snapshots. I’ve updated the text above to reflect this.
Nice but I think it has a couple of bugs, possibly from copy pasting?
Yeah, looks like some < and & characters got translated into HTML entities incorrectly. I've fixed it. Thanks for pointing that out!