15.1 EBS Volume Types: gp3, gp2, io2 Block Express, st1, sc1
Right, let’s talk about disks in the cloud. EBS volumes are the virtual hard drives you attach to your EC2 instances. They’re persistent, network-attached storage, which is a fancy way of saying they live on a shelf in an AWS data center somewhere and get to your server over a network cable. This is the first thing to internalize: your “local” disk is not inside your server at all; it sits elsewhere in the same Availability Zone. This network hop is the source of both its flexibility and most of its performance quirks.
AWS, in its infinite wisdom, has given us a veritable smorgasbord of volume types. They fall into two main camps: the Solid State Drives (SSDs) for when you need fast random I/O (your boot volumes, databases, etc.), and the clunkier, cheaper Hard Disk Drives (HDDs) for when you need to move large, sequential chunks of data (like log processing or data warehouse storage).
The SSD Workhorses: gp3 and gp2
For probably 90% of what you’ll do, you’ll be choosing between gp3 and its predecessor, gp2. The gp stands for “General Purpose,” and they’re the sensible Toyota Camrys of the EBS world—reliable, performant, and not too flashy.
gp2 was the default for years, and its performance model was… interesting. Your baseline IOPS (Input/Output Operations Per Second) were directly tied to the size of the volume. You got 3 IOPS per GiB, with a floor of 100 and a cap of 16,000. Volumes under 1 TiB could burst to 3,000 IOPS for a while on a credit bucket, but for sustained performance you had to over-provision storage you didn’t want. It was like being forced to buy a bigger car just to get a faster engine.
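To make the gp2 arithmetic concrete, here is a minimal shell sketch of that sizing rule. The function names are my own invention, not anything AWS ships, and the numbers assumed are the ones above: 3 IOPS per GiB, a 100 IOPS floor, and a 16,000 IOPS cap.

```shell
# Hypothetical helpers (not an AWS tool): gp2 baseline IOPS arithmetic.
# Assumes 3 IOPS per GiB, a floor of 100 IOPS, and a cap of 16,000 IOPS.
gp2_baseline_iops() {
  local size_gib="$1"
  local iops=$(( size_gib * 3 ))
  if [ "$iops" -lt 100 ]; then iops=100; fi
  if [ "$iops" -gt 16000 ]; then iops=16000; fi
  echo "$iops"
}

# Smallest gp2 volume (in GiB) whose baseline reaches a target IOPS figure.
gp2_size_for_iops() {
  local target_iops="$1"
  echo $(( (target_iops + 2) / 3 ))   # integer ceiling of target / 3
}

# Usage:
#   gp2_baseline_iops 100     # baseline for a 100 GiB volume
#   gp2_size_for_iops 16000   # GiB you must buy to reach the cap
```

Run the second function with 16,000 and you can see the problem: hitting the cap means paying for more than 5 TiB of storage whether you need it or not.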
Enter gp3. This is the volume type we deserved all along. With gp3, AWS finally decoupled performance from capacity. You get a baseline of 3,000 IOPS and 125 MiB/s of throughput regardless of volume size. You can then scale the IOPS up to 16,000 and the throughput up to 1,000 MiB/s independently, and for a reasonable price. This is a no-brainer. Unless you’re dealing with a legacy system or some bizarre cost optimization, always choose gp3. You’ll get better performance for less money 99% of the time.
Here’s how you create one. Notice I’m not even bothering with gp2.
# Create a 100 GiB gp3 volume in us-east-1a
aws ec2 create-volume \
--volume-type gp3 \
--size 100 \
--availability-zone us-east-1a
That’s it. You get your 3,000 IOPS and 125 MiB/s baseline. To provision more performance:
# Create a 100 GiB gp3 volume with 10,000 IOPS and 500 MiB/s throughput
aws ec2 create-volume \
--volume-type gp3 \
--size 100 \
--iops 10000 \
--throughput 500 \
--availability-zone us-east-1a
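Independent does not mean unconstrained. As far as I know, AWS couples the gp3 knobs in two ways: provisioned IOPS above the baseline are limited to 500 per GiB, and throughput is limited to 0.25 MiB/s per provisioned IOPS. The sketch below is a hypothetical pre-flight check built on those two assumptions plus the limits quoted in this section; verify the numbers against the current EBS documentation before trusting it.

```shell
# Hypothetical pre-flight check for gp3 settings; not part of the AWS CLI.
# Assumed limits: 3,000-16,000 IOPS, 125-1,000 MiB/s, at most 500 IOPS per GiB
# above the baseline, and at most 0.25 MiB/s of throughput per provisioned IOPS.
gp3_check() {
  local size_gib="$1"
  local iops="$2"
  local throughput="$3"
  if [ "$iops" -lt 3000 ] || [ "$iops" -gt 16000 ]; then
    echo "iops must be between 3000 and 16000"; return 1
  fi
  if [ "$throughput" -lt 125 ] || [ "$throughput" -gt 1000 ]; then
    echo "throughput must be between 125 and 1000 MiB/s"; return 1
  fi
  # The per-GiB ratio only matters once you provision above the free baseline.
  if [ "$iops" -gt 3000 ] && [ "$iops" -gt $(( size_gib * 500 )) ]; then
    echo "iops exceeds 500 per GiB for a ${size_gib} GiB volume"; return 1
  fi
  # 0.25 MiB/s per IOPS is the same as requiring throughput * 4 <= iops.
  if [ $(( throughput * 4 )) -gt "$iops" ]; then
    echo "throughput needs at least $(( throughput * 4 )) provisioned IOPS"; return 1
  fi
  echo "ok"
}

# Usage:
#   gp3_check 100 10000 500   # the volume created above
#   gp3_check 100 3000 1000   # rejected: too much throughput for 3,000 IOPS
```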
The SSD Nitro: io2 Block Express
Now, let’s say you’re running a massive Oracle database or that one monolithic application your company can’t live without. You need more than gp3 can offer. You need low latency, high durability, and more than 16,000 IOPS. You need io2 Block Express.
This isn’t your grandfather’s EBS. io2 (and its more powerful sibling, io2 Block Express) is built on the Nitro System, which is AWS’s secret sauce for giving you bare-metal-like performance in the cloud. The key here is that an io2 Block Express volume can push up to 256,000 IOPS and 4,000 MiB/s of throughput with sub-millisecond latency, where gp3 is specced for single-digit milliseconds. It also boasts a ludicrous 99.999% durability, against 99.8% to 99.9% for the other volume types.
The catch? You pay out the nose for it. This is Ferrari territory. You also have to be running on a Nitro-based EC2 instance type (which is most of them these days, but always check).
# Creating a beast-mode io2 Block Express volume
aws ec2 create-volume \
--volume-type io2 \
--size 500 \
--iops 40000 \
--availability-zone us-east-1a
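Since “always check” is easier said than done, here is one way to do the checking. This is a sketch, not the official method: the function name is made up, it assumes a configured AWS CLI, and it relies on describe-instance-types reporting a Hypervisor value of nitro or xen. Bare-metal types report no hypervisor at all, so this helper will not confirm them even though they are Nitro hardware.

```shell
# Hypothetical helper: succeeds when the instance type reports the Nitro hypervisor.
# Returns 2 if the AWS CLI call itself fails, 1 if the hypervisor is anything else
# (including bare-metal types, which report no hypervisor).
is_nitro_instance_type() {
  local hypervisor
  hypervisor=$(aws ec2 describe-instance-types \
    --instance-types "$1" \
    --query 'InstanceTypes[0].Hypervisor' \
    --output text) || return 2
  [ "$hypervisor" = "nitro" ]
}

# Usage:
#   is_nitro_instance_type m5.large && echo "Nitro-based"
```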
The HDD Options: st1 and sc1
On the other side of the tracks, we have the HDD volumes: st1 (Throughput Optimized HDD) and sc1 (Cold HDD). Forget about IOPS with these; their currency is throughput, measured in MiB/s. They’re cheap, but they’re for sequential workloads only—think big log files, ETL jobs, or data backups. If you try to run a database with random reads/writes on one of these, it will perform so poorly I will personally sense a disturbance in the Force.
st1 is the more performant of the two, topping out at 500 MiB/s of throughput per volume. sc1 is the absolute bottom of the barrel, topping out at 250 MiB/s: the cheapest storage AWS will sell you that’s still technically attached to a computer. Use it for data you need to access infrequently and only in large, sequential streams.
A crucial pitfall: HDD volumes have a performance credit system. They can burst well above their baseline throughput, but every burst spends credits, and the bucket only refills while the volume runs below its baseline. If you hammer them with constant data transfers, you’ll exhaust the credits and your throughput will plummet to that baseline, which scales with volume size: a measly 40 MiB/s per TiB on st1 or a truly pathetic 12 MiB/s per TiB on sc1. They’re moody that way.
# Creating a large, cheap st1 volume for log storage (2,048 GiB = 2 TiB)
aws ec2 create-volume \
--volume-type st1 \
--size 2048 \
--availability-zone us-east-1a
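Because HDD throughput depends on how much disk you buy, it is worth doing the sums before you size a volume. The helper below is a hypothetical sketch using per-TiB rates as I understand AWS documents them: st1 at 40 MiB/s baseline and 250 MiB/s burst, capped at 500; sc1 at 12 MiB/s baseline and 80 MiB/s burst, capped at 250. Treat those figures as assumptions to confirm, and note that integer division rounds down.

```shell
# Hypothetical helper: print "<baseline> <burst>" throughput in MiB/s
# for an st1 or sc1 volume of the given size in GiB.
hdd_throughput() {
  local vol_type="$1"
  local size_gib="$2"
  local base_rate burst_rate cap
  case "$vol_type" in
    st1) base_rate=40; burst_rate=250; cap=500 ;;
    sc1) base_rate=12; burst_rate=80;  cap=250 ;;
    *)   echo "unknown HDD volume type: $vol_type" >&2; return 1 ;;
  esac
  # Rates are per TiB, so scale by size_gib / 1024 (rounded down).
  local baseline=$(( size_gib * base_rate / 1024 ))
  local burst=$(( size_gib * burst_rate / 1024 ))
  if [ "$baseline" -gt "$cap" ]; then baseline=$cap; fi
  if [ "$burst" -gt "$cap" ]; then burst=$cap; fi
  echo "$baseline $burst"
}

# Usage:
#   hdd_throughput st1 2048   # a 2 TiB st1 volume
#   hdd_throughput sc1 2048   # the same size on sc1
```

The practical upshot: a small st1 volume has a small baseline and a small burst ceiling, so if you need sustained throughput you may end up buying capacity to get it, which is the gp2 story all over again.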
The best practice is simple: use gp3 for almost everything. Use io2 Block Express for your most performance-sensitive, IOPS-hungry databases. Use st1 for your big data throughput workloads, and only use sc1 if you’re on a budget so tight you’re considering reusing coffee grounds. And for heaven’s sake, make sure your volume’s AZ matches your EC2 instance’s AZ. They can’t talk across Availability Zones, and the error message you get is about as helpful as a screen door on a submarine.
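If you would rather catch the AZ mismatch yourself than decode that error, a pre-attach check is cheap. What follows is a minimal sketch under a few assumptions: a configured AWS CLI, a function name I made up, and placeholder volume and instance IDs in the usage note that you would swap for your own.

```shell
# Hypothetical pre-attach check: compare the AZ of a volume and an instance.
# Returns 0 when they match, 1 on a mismatch, 2 if either lookup fails.
same_az() {
  local volume_id="$1"
  local instance_id="$2"
  local volume_az instance_az
  volume_az=$(aws ec2 describe-volumes \
    --volume-ids "$volume_id" \
    --query 'Volumes[0].AvailabilityZone' \
    --output text) || return 2
  instance_az=$(aws ec2 describe-instances \
    --instance-ids "$instance_id" \
    --query 'Reservations[0].Instances[0].Placement.AvailabilityZone' \
    --output text) || return 2
  if [ "$volume_az" != "$instance_az" ]; then
    echo "AZ mismatch: volume in $volume_az, instance in $instance_az" >&2
    return 1
  fi
}

# Usage (placeholder IDs):
#   same_az vol-0123456789abcdef0 i-0123456789abcdef0 &&
#     aws ec2 attach-volume \
#       --volume-id vol-0123456789abcdef0 \
#       --instance-id i-0123456789abcdef0 \
#       --device /dev/sdf
```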