15.3 EBS Snapshots: Incremental Backups to S3
Right, let’s talk about EBS snapshots. This is where we stop crossing our fingers and start actually backing up our data. An EBS volume is great, but it’s stuck in a single Availability Zone. If that AZ has a really bad day, your volume has a bad day. Snapshots are your escape pod. They’re incremental, point-in-time backups of your EBS volumes that get stored in the highly durable, multi-AZ wonderland of S3.
The magic word here is incremental. The first time you snapshot a volume, it’s a full copy of every block you’ve written. After that, AWS only saves the blocks that have changed since your last snapshot. This is why your second snapshot often finishes in a fraction of the time and costs pennies. It’s not cloning the whole drive again; it’s just jotting down a note about what’s different. Deleting works the same way in reverse: when you delete a snapshot, AWS only purges the data blocks that no other snapshot references, so every snapshot you keep can still restore the full volume. So, no, you won’t nuke your entire backup history by deleting an old one. The only data that disappears is whatever that one snapshot alone was holding onto.
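To make the delete behaviour concrete, here is a toy model in shell. It is only a sketch of the bookkeeping idea, not how AWS implements it: each snapshot is a list of made-up block:version labels, and blocks_freed_by_delete is a hypothetical helper that prints the labels no other snapshot references.

```shell
# Toy model only: three daily snapshots of a three-block volume.
# Block b1 changed on day 2, block b2 changed on day 3.
day1="b0:v1 b1:v1 b2:v1"
day2="b0:v1 b1:v2 b2:v1"
day3="b0:v1 b1:v2 b2:v3"

# Usage: blocks_freed_by_delete "<snapshot to delete>" "<other snapshot>"...
# Prints the block versions that only the deleted snapshot references.
blocks_freed_by_delete() {
  target=$1
  shift
  others=" $* "
  for ref in $target; do
    case "$others" in
      *" $ref "*) ;;          # still referenced by a snapshot we keep
      *) echo "$ref" ;;       # unique to the deleted snapshot, so it is freed
    esac
  done
}

echo "Deleting day 1 frees: $(blocks_freed_by_delete "$day1" "$day2" "$day3")"
echo "Deleting day 2 frees: $(blocks_freed_by_delete "$day2" "$day1" "$day3")"
```

In this model, deleting day 1 frees only b1:v1 and deleting day 2 frees nothing at all, yet day 3 still lists every block it needs for a full restore.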
How to Create a Snapshot (The Right Way)
You can snap a volume while it’s in use, but for anything beyond a dev instance playing tic-tac-toe, you need to get the filesystem consistent. The API doesn’t care about your operating system’s feelings; it just copies blocks from the virtual disk. If those blocks are mid-write, you’ll get a snapshot that looks like it was taken during an earthquake.
For Linux, we fsfreeze. This tells the OS to flush everything to disk and temporarily pause writes. It’s the difference between a clean family photo and one where the dog is a blur.
# Let's assume our volume is mounted at /important-data
# Freeze the filesystem
sudo fsfreeze -f /important-data
# Create the snapshot using the AWS CLI
# Pro-tip: Always tag your snapshots. Future You will send Past You a thank you note.
aws ec2 create-snapshot \
  --volume-id vol-1234567890abcdef0 \
  --description "Daily backup for my important stuff" \
  --tag-specifications 'ResourceType=snapshot,Tags=[{Key=Name,Value=my-app-daily-backup}]'
# ...wait for the command to acknowledge the snapshot creation...
# Unfreeze the filesystem immediately afterwards
sudo fsfreeze -u /important-data
For a production database or something even more sensitive, you’d coordinate with your application to quiesce its writes first. This is non-negotiable.
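One weakness of running the freeze steps by hand is that a failed AWS call can leave the filesystem frozen. A minimal sketch of a wrapper that always unfreezes might look like this. The function name snapshot_frozen is hypothetical, and it assumes the AWS CLI is configured and that you can run fsfreeze through sudo.

```shell
# Usage: snapshot_frozen <mount-point> <volume-id>
# Prints the new snapshot ID on success; returns non-zero on failure.
snapshot_frozen() {
  mount_point=$1
  volume_id=$2

  # If the freeze itself fails there is nothing to undo.
  sudo fsfreeze -f "$mount_point" || return 1

  # Start the snapshot and remember whether the call succeeded.
  if snapshot_id=$(aws ec2 create-snapshot \
      --volume-id "$volume_id" \
      --description "Consistent snapshot of $mount_point" \
      --query SnapshotId --output text); then
    status=0
  else
    status=$?
  fi

  # Unfreeze no matter what happened above.
  sudo fsfreeze -u "$mount_point"

  [ "$status" -eq 0 ] && echo "$snapshot_id"
  return "$status"
}
```

You’d call it as snapshot_frozen /important-data vol-1234567890abcdef0. Application-level quiescing for a database would still have to happen before the freeze; this sketch only covers the filesystem.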
The Anatomy of a Snapshot Operation
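When you issue that create-snapshot command, here’s what happens in the background. AWS doesn’t copy terabytes of data to S3 before answering you. First, it initiates the snapshot, which creates a record and pins the point in time. Writes that land after that moment are not part of the snapshot, which is why you can unfreeze the filesystem as soon as the command returns. Then it copies the changed blocks to S3 in the background, which can take anywhere from minutes to hours depending on how much data has changed. The snapshot status stays pending until that copy finishes, and you need to wait for completed before you create a volume from it. The on-demand trick lives on the restore side: a volume created from a completed snapshot is usable right away, and AWS pulls any block that hasn’t been loaded yet from S3 the first time you touch it. It’s a fantastic piece of engineering that makes this all seem instantaneous.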
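If a script needs to know when the background copy has finished, it can poll the snapshot state. The sketch below assumes a configured AWS CLI. The helper name wait_for_snapshot and the POLL_SECONDS variable are hypothetical, and the 15 second default is just a guess at a sensible interval.

```shell
# Usage: wait_for_snapshot <snapshot-id>
# Returns 0 once the snapshot is completed, non-zero on any other end state.
wait_for_snapshot() {
  snapshot_id=$1
  while :; do
    state=$(aws ec2 describe-snapshots \
      --snapshot-ids "$snapshot_id" \
      --query 'Snapshots[0].State' \
      --output text) || return 1

    case "$state" in
      completed) return 0 ;;
      pending)   sleep "${POLL_SECONDS:-15}" ;;
      *)         echo "snapshot $snapshot_id is in state: $state" >&2
                 return 1 ;;
    esac
  done
}
```

The AWS CLI also ships a built-in waiter, aws ec2 wait snapshot-completed, which is simpler but gives up after a fixed number of attempts, so a large first snapshot can outlast it.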
Restoring: It’s Not a Disk Clone
This is the most common mental hurdle. You don’t “restore” a snapshot onto an existing volume. Think of a snapshot as a recipe. To get your data back, you use the recipe to bake a whole new volume.
# Create a new volume from your snapshot
# You must specify an AZ, and it has to match the instance you'll attach to
aws ec2 create-volume \
  --snapshot-id snap-0a1234567890abcdef \
  --availability-zone us-east-1a \
  --volume-type gp3 \
  --tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=my-app-restored-volume}]'
The new volume will be a perfect, healthy, and—this is crucial—unattached copy of your original data at the moment of the snapshot. You then attach it to an instance, mount it, and verify everything is there. This is your primary recovery method.
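The create, attach, and verify flow can be strung together in a few CLI calls. This is a sketch under assumptions: restore_to_instance is a hypothetical helper name, /dev/sdf is only an example device name, and the AWS CLI is assumed to be configured. It looks up the instance’s AZ first so the new volume is created in the same one.

```shell
# Usage: restore_to_instance <snapshot-id> <instance-id> [device]
# Prints the ID of the newly created and attached volume.
restore_to_instance() {
  snapshot_id=$1
  instance_id=$2
  device=${3:-/dev/sdf}

  # Find the AZ the instance lives in; the volume must be created there.
  az=$(aws ec2 describe-instances \
    --instance-ids "$instance_id" \
    --query 'Reservations[0].Instances[0].Placement.AvailabilityZone' \
    --output text) || return 1

  # Bake a new volume from the snapshot.
  volume_id=$(aws ec2 create-volume \
    --snapshot-id "$snapshot_id" \
    --availability-zone "$az" \
    --volume-type gp3 \
    --query VolumeId --output text) || return 1

  # A volume has to reach the "available" state before it can be attached.
  aws ec2 wait volume-available --volume-ids "$volume_id" || return 1

  aws ec2 attach-volume \
    --volume-id "$volume_id" \
    --instance-id "$instance_id" \
    --device "$device" > /dev/null || return 1

  echo "$volume_id"
}
```

After that, log in to the instance, find the new disk with lsblk (the name the OS shows can differ from the device you requested), and mount it somewhere out of the way to check the contents before you rely on it.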
Common Pitfalls and “Oh Crap” Moments
- The AZ Trap: Remember, you create a new volume in a specific AZ. If your instance is in us-east-1a and you restore the volume to us-east-1b by mistake, you can’t attach it. The snapshot itself isn’t tied to an AZ, so the fix is to create another volume from the same snapshot in the correct AZ and delete the stray one. It’s annoying. Always double-check the AZ parameter.
- The Size Problem: When you restore, you can specify a size larger than the original volume (useful for giving yourself breathing room), but you cannot specify a size smaller than the original. The new volume must be at least as big as the original’s allocated size. If you do go bigger, the filesystem still has to be grown before it can use the extra space.
- The Permissions Nightmare: I’ve seen teams lock themselves out of their own backups. Your snapshots are owned by your AWS account, but if you encrypt a volume with a custom KMS key, you must ensure that the user/role creating the snapshot (and later restoring it) has permissions to use that key. If you lose access to the KMS key, your encrypted snapshots become very expensive bricks. Test your restore permissions before you have a disaster.
- The Cost Creep: While incremental, snapshots aren’t free. Keeping 100 daily snapshots of a 1 TB volume that changes 5% daily doesn’t mean you’re storing 100 full copies, but the increments add up: a rough upper bound is 1 TB + (99 * 50 GB), or about 6 TB. The real math is more complex, since it depends on how full the volume is and which blocks actually change, but the point is: old snapshots cost money. Implement a lifecycle policy to delete them automatically after a certain period (Amazon Data Lifecycle Manager exists for exactly this). Don’t be that person who gets a surprise bill for snapshots of a project that was decommissioned two years ago.
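For the cost-creep arithmetic, a back-of-the-envelope calculation is enough to see the scale. The sketch below is an upper bound only. It assumes the first snapshot stores the full 1 TB (taken as 1000 GB), that each later snapshot stores a fresh 50 GB, and that the price is 5 cents per GB-month, which is a placeholder to replace with the current rate for your region. Both function names are hypothetical.

```shell
# Usage: snapshot_storage_gb <full-gb> <daily-change-gb> <snapshot-count>
# Upper bound: one full snapshot plus (count - 1) increments.
snapshot_storage_gb() {
  full_gb=$1
  daily_change_gb=$2
  count=$3
  echo $(( full_gb + (count - 1) * daily_change_gb ))
}

# Usage: snapshot_monthly_cents <stored-gb> <price-in-cents-per-gb-month>
snapshot_monthly_cents() {
  echo $(( $1 * $2 ))
}

total_gb=$(snapshot_storage_gb 1000 50 100)
cents=$(snapshot_monthly_cents "$total_gb" 5)   # 5 cents is an assumed price

echo "Upper bound: ${total_gb} GB stored"
printf 'About $%d.%02d per month at the assumed price\n' \
  $(( cents / 100 )) $(( cents % 100 ))
```

With these inputs the bound comes out at 5950 GB, roughly six times the size of the volume itself, which is the kind of number that makes a retention policy feel less optional.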