UPDATED 9/4/2012: I accidentally had a hard-coded AWS_VOLUME_IDS setting in the script, which I inserted while debugging my own copy of the script and forgot to remove before posting the script here. I’ve removed it. D’oh!
UPDATED 8/19/2012: The logic in my original script for determining which backups to preserve was incorrect. It is updated below.
The easiest way to robustly back up an Amazon EBS volume is to take a snapshot. An EBS volume is stored in only a single availability zone, so a catastrophic failure in that zone could destroy any backups kept there along with the volume itself. Snapshots, by contrast, are stored in S3 and replicated across multiple availability zones in the region, resulting in a ridiculously low likelihood of data loss. (Nevertheless, if your disaster recovery plans need to account for the possibility that an entire EC2 region could kick the bucket, you need to back up your data some other way in addition to the mechanism outlined here.)
Many people have posted their solutions to the “automatically backing up an EBS volume on a regular basis” problem. Here’s mine.
To use it, create /etc/sysconfig/aws with settings for AWS_ACCESS_KEY, AWS_SECRET_KEY, and AWS_VOLUME_IDS in it. The latter should contain one or more whitespace-separated volume IDs to be backed up.
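As a rough illustration, the file might look like the following. All three values are made-up placeholders, not real credentials or volume IDs. Because the script sources the file as shell, the values use ordinary shell assignment syntax.

```shell
# /etc/sysconfig/aws
# Placeholder values only (assumptions for illustration); substitute your own.
AWS_ACCESS_KEY='AKIAEXAMPLEEXAMPLE'
AWS_SECRET_KEY='example-secret-key'
# One or more volume IDs, separated by whitespace.
AWS_VOLUME_IDS='vol-11111111 vol-22222222'
```

Since the file holds credentials, it is probably wise to make it readable only by root.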
The account with the access and secret keys you specify must have at least ec2:CreateSnapshot, ec2:DescribeSnapshots, and ec2:DeleteSnapshot permissions.
Every time the script runs, it creates a new snapshot for each specified volume, then prunes previous backup snapshots of the same volume as follows:
- Save daily backups for the past week.
- Save weekly backups for the past month.
- Save monthly backups for the past year.
- Prune everything else.
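To make the schedule above concrete, here is a minimal sketch that prints the cutoff dates it implies, newest first. The function name retention_cutoffs and its optional reference-date argument are assumptions added for illustration (the script always starts from the current date), and the sketch relies on GNU date for the relative date expressions. It uses UTC so the output does not depend on the local time zone.

```shell
#!/bin/bash
# Sketch: print the retention cutoff dates, newest first.
# $1 is an optional reference date (YYYY-MM-DD); defaults to today (UTC).
# Requires GNU date for -d with relative expressions.
retention_cutoffs() {
    local ref="${1:-$(date -u '+%F')}" n
    echo "$ref"
    # Daily backups for the past week.
    for n in 1 2 3 4 5 6; do
        date -u -d "$ref $n days ago" '+%F'
    done
    # Weekly backups for the past month.
    for n in 2 3 4; do
        date -u -d "$ref $n weeks ago" '+%F'
    done
    # Monthly backups for the past year.
    for n in 2 3 4 5 6 7 8 9 10 11 12; do
        date -u -d "$ref $n months ago" '+%F'
    done
}
```

Calling retention_cutoffs 2012-08-19 prints 21 dates, starting with 2012-08-19 itself: 1 for the reference day, 6 daily, 3 weekly, and 11 monthly.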
You can save the script in /etc/cron.daily to run your backups automatically on a daily basis.
#!/bin/bash -le
# -l above so /etc/profile.d/aws-apitools-common.sh is loaded

# Daily snapshot, preserving weekly and monthly for a year.

# Put AWS_ACCESS_KEY, AWS_SECRET_KEY, and AWS_VOLUME_IDS here.
. /etc/sysconfig/aws
export AWS_ACCESS_KEY AWS_SECRET_KEY

volume_ids="$AWS_VOLUME_IDS"

set -- $volume_ids
if [ $# = 0 ]; then
    echo "Can't determine volume IDs" 1>&2
    exit 1
fi

for volume_id in $volume_ids; do
    ec2-create-snapshot $volume_id \
        --description 'Automated volume backup'

    # Prune old snapshots

    cutoffs="$(date '+%F')"
    for days in 1 2 3 4 5 6; do
        cutoffs="$cutoffs $(date -d "$days days ago" '+%F')"
    done
    for weeks in 2 3 4; do
        cutoffs="$cutoffs $(date -d "$weeks weeks ago" '+%F')"
    done
    for months in 2 3 4 5 6 7 8 9 10 11 12; do
        cutoffs="$cutoffs $(date -d "$months months ago" '+%F')"
    done

    set $cutoffs

    ec2-describe-snapshots | grep "$volume_id.*Automated volume backup" |
    sort -k +5r | while read type id volume status timestamp rest; do
        if [ "$volume" != $volume_id ]; then
            continue
        fi
        if [ $# = 0 ]; then
            ec2-delete-snapshot $id
            continue
        fi
        date=$(expr "$timestamp" : '\(....-..-..\)')
        if [[ $1 < $date ]]; then
            if [ -n "$last" ]; then
                ec2-delete-snapshot $last
            fi
            last="$id"
            continue
        fi
        last=""
        shift
    done
done
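If you want to see what the pruning loop would do before letting it delete anything, the following sketch repeats the same comparison logic as a dry run. The function name plan_prune and its input format are assumptions made for illustration: it takes the cutoff dates (newest first) as arguments, reads "snapshot-id date" pairs (newest first) on standard input, and prints the IDs that would be deleted instead of calling ec2-delete-snapshot. The snapshot IDs in the example are made up.

```shell
#!/bin/bash
# Dry-run sketch of the pruning pass; nothing is deleted.
# Arguments: cutoff dates, newest first (YYYY-MM-DD).
# Standard input: "snapshot-id date" pairs, newest first.
plan_prune() {
    local id date last=""
    while read -r id date; do
        if [ $# = 0 ]; then
            # No cutoffs left: this snapshot is older than every cutoff.
            echo "$id"
            continue
        fi
        if [[ $1 < $date ]]; then
            # Newer than the current cutoff: drop the previous candidate
            # and remember this one instead.
            if [ -n "$last" ]; then
                echo "$last"
            fi
            last="$id"
            continue
        fi
        # On or before the current cutoff: keep it, move to the next cutoff.
        last=""
        shift
    done
}

# Example with three cutoffs and six made-up snapshots.
plan_prune 2012-08-19 2012-08-18 2012-08-05 <<'EOF'
snap-a 2012-08-19
snap-b 2012-08-18
snap-c 2012-08-17
snap-d 2012-08-16
snap-e 2012-08-05
snap-f 2012-08-04
EOF
```

With this input the sketch prints snap-c and snap-f. Of the two snapshots that fall between the second and third cutoffs, only the older one (snap-d) survives, and snap-f is older than every cutoff.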