15.7 EBS Performance: IOPS, Throughput, and the Nitro System

Right, let’s talk about making your EBS volumes go fast. Because if you just pick a size and hope for the best, you’re going to have a bad time. Performance here boils down to two things you’re constantly balancing: IOPS (Input/Output Operations Per Second) and Throughput (MB/s). Think of IOPS as how many times you can knock on a door, and throughput as how much stuff you can shove through it once it’s open. A tiny, rapid-fire knock isn’t moving a sofa.

The Performance Levers: IOPS, Throughput, and Size

EBS gives you two main SSD volume families: the general-purpose workhorse (gp3) and the provisioned-performance thoroughbred (io2/io1). For years, gp2 was the default, and its baseline performance was tied directly to its size (3 IOPS per GiB). This was, frankly, a bizarre design choice. Want more performance? You had to pay for more storage you didn’t need. Thank goodness gp3 fixed this.

With gp3, performance is decoupled. You get a baseline of 3,000 IOPS and 125 MB/s, and you can provision more IOPS and throughput independently of the volume size. Need 10,000 IOPS on a 50 GiB volume? You can do that. This is how it should have always worked.
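Because the knobs are independent, an existing gp3 volume can also be retuned in place rather than recreated. The following is a minimal sketch built on `aws ec2 modify-volume`. The function name, the example volume ID, and the `AWS_CLI` override (used only so the command can be dry-run with `echo`) are illustrative assumptions.

```shell
# Hypothetical helper: change IOPS and throughput on an existing gp3 volume
# without touching its size. AWS_CLI is an assumed override for dry runs;
# when unset, the real aws binary is used.
retune_gp3() {
    local volume_id="$1" iops="$2" throughput="$3"
    "${AWS_CLI:-aws}" ec2 modify-volume \
        --volume-id "$volume_id" \
        --iops "$iops" \
        --throughput "$throughput"
}

# Dry run: print the arguments instead of calling AWS.
# AWS_CLI=echo retune_gp3 vol-0123456789abcdef0 10000 250
```

Modifications are applied in the background, so the new numbers may take a while to show up on a busy volume.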

The io2/io1 volumes are for when you have a serious database or other IO-intensive workload. You directly provision the IOPS you need (up to a frankly silly 256,000 IOPS on an io2 Block Express volume), and you pay for that performance.

Here’s the reality check: your chosen EC2 instance type has its own EBS bandwidth and IOPS limits. You can provision a volume capable of 4,000 MB/s, but if you attach it to a t3.micro, you’re going to get t3.micro performance. The instance is the chokepoint.

Let’s see how you check this in practice. This AWS CLI command shows you the performance limits of your instance.

# Describe your instance type's capabilities. Replace with your actual instance type.
aws ec2 describe-instance-types --instance-types m5dn.8xlarge --query "InstanceTypes[0].EbsInfo"

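The output contains an "EbsOptimizedInfo" block. Look for "BaselineThroughputInMBps" and "MaximumThroughputInMBps" (MB/s), and "BaselineIops" and "MaximumIops" (IOPS). Your aggregate EBS performance across all volumes attached to the instance cannot exceed these limits. On smaller instance types the maximum is a burst figure, so plan around the baseline if you need sustained performance.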
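To make the chokepoint concrete, here is a sketch that pulls only the EBS-optimized limits for an instance type and works out what a volume can actually deliver on it. The helper names and the `AWS_CLI` dry-run override are hypothetical. The query path follows the `EbsInfo.EbsOptimizedInfo` block of the `describe-instance-types` output.

```shell
# Hypothetical helper: print the EBS-optimized limits (throughput, IOPS,
# bandwidth) for one instance type. AWS_CLI is an assumed override for dry runs.
ebs_limits() {
    "${AWS_CLI:-aws}" ec2 describe-instance-types \
        --instance-types "$1" \
        --query "InstanceTypes[0].EbsInfo.EbsOptimizedInfo" \
        --output json
}

# The effective ceiling is the smaller of the volume limit and the instance
# limit. Both arguments are integers in the same unit (MB/s or IOPS).
effective_limit() {
    local volume_limit="$1" instance_limit="$2"
    if [ "$volume_limit" -lt "$instance_limit" ]; then
        echo "$volume_limit"
    else
        echo "$instance_limit"
    fi
}

# Example with an assumed instance limit of 850 MB/s:
# effective_limit 4000 850
```

The same comparison applies to IOPS. Remember that the instance limit is shared by every volume attached to it.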

The Nitro System: Why This Actually Works Now

You might be wondering, “Why does any of this work reliably?” The answer, and the reason EBS performance isn’t a complete joke anymore, is the Nitro System. In the old days, hypervisor overhead was a real performance killer. Nitro offloads virtualization, networking, and storage I/O to dedicated hardware, leaving only a lightweight hypervisor on the host. This means when your instance talks to its EBS volume, it’s talking over a dedicated network channel with minimal overhead, getting you much closer to the performance you’re actually paying for. It’s the unsung hero of modern EC2. If your instance is not Nitro-based (check the AWS docs), assume your EBS performance will be worse. Use Nitro.
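If you would rather ask the API than dig through the docs, the `describe-instance-types` output includes a `Hypervisor` field. A small sketch follows. The function name and the `AWS_CLI` dry-run override are assumptions made for illustration.

```shell
# Hypothetical helper: print the hypervisor reported for an instance type.
# Nitro-based virtualized types report "nitro"; older generations report "xen".
# AWS_CLI is an assumed override for dry runs.
hypervisor_of() {
    "${AWS_CLI:-aws}" ec2 describe-instance-types \
        --instance-types "$1" \
        --query "InstanceTypes[0].Hypervisor" \
        --output text
}

# Usage (requires AWS credentials):
# hypervisor_of m5dn.8xlarge
```

Bare-metal instance types may not report a hypervisor at all, so treat an empty or "None" result as a case to check by hand.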

Provisioning in Practice: A Code Example

Enough theory. Let’s create a volume that doesn’t suck. Here’s how you provision a gp3 volume with more than the baseline performance using the AWS CLI. Notice we set IOPS and throughput separately from size.

# Create a 100 GiB gp3 volume with 10,000 IOPS and 250 MB/s throughput
aws ec2 create-volume \
    --volume-type gp3 \
    --size 100 \
    --iops 10000 \
    --throughput 250 \
    --availability-zone us-east-1a \
    --tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=My-High-Perf-Volume}]'

# For an io2 volume, you MUST specify the IOPS. Throughput is not set separately:
# it scales with provisioned IOPS (up to 256 KiB per I/O) until it hits the volume's throughput cap.
aws ec2 create-volume \
    --volume-type io2 \
    --size 500 \
    --iops 20000 \
    --availability-zone us-east-1a

The Gotchas: Where They Get You

The Burst Balance Trap: gp3 has no burst bucket. Its 3,000 IOPS and 125 MB/s baseline is sustained around the clock, and anything above it has to be provisioned and paid for. The credit bucket lives on gp2, where volumes under 1 TiB bank I/O credits and burst to 3,000 IOPS, and on the HDD types (st1, sc1), which burst on throughput. Sustain a heavy load on one of those and you’ll drain the bucket and crash back to the baseline until it refills. A common pitfall is seeing great performance for a few minutes, then wondering why your batch job grinds to a halt later. If you need sustained performance, you must provision it.
The I/O Size Rule: EBS counts operations, not bytes. On SSD-backed volumes, a single I/O of up to 256 KiB counts as one operation, and larger requests are split into several (HDD-backed volumes count in units of up to 1 MiB). That makes hitting your IOPS number and hitting your MB/s number two different problems. If your application does a lot of 4 KiB writes, it will run out of IOPS long before it reaches the volume’s maximum throughput: 3,000 IOPS at 4 KiB is only about 12 MB/s. This is an application-tuning issue.
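The 64 KiB Joke: For gp3, throughput provisioned over 125 MB/s only materializes if your I/O is large enough to use it, because delivered throughput is IOPS multiplied by I/O size. At 64 KiB per operation you need 8,000 IOPS to reach 500 MB/s; at 16 KiB, the baseline 3,000 IOPS tops out around 47 MB/s. If your workload does mostly small I/O (e.g., database transaction logs), you might pay for 500 MB/s and never see it. It’s a classic “read the fine print” moment.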
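Whatever the volume type, delivered throughput is bounded by IOPS multiplied by I/O size. Here is a quick sketch of that arithmetic in plain shell. The function names are made up for illustration, and the math uses integer KiB and MiB/s.

```shell
# Throughput ceiling in MiB/s for a given IOPS figure and I/O size in KiB.
# Integer arithmetic, so the result is rounded down.
max_mibps() {
    local iops="$1" io_kib="$2"
    echo $(( iops * io_kib / 1024 ))
}

# IOPS needed to sustain a target MiB/s at a given I/O size in KiB (rounded up).
iops_needed() {
    local target_mibps="$1" io_kib="$2"
    echo $(( (target_mibps * 1024 + io_kib - 1) / io_kib ))
}

# Examples:
# max_mibps 3000 4      # small writes at the gp3 baseline
# iops_needed 500 64    # IOPS required for 500 MiB/s at 64 KiB per I/O
```

Run your real I/O size through this before paying for throughput. If the IOPS figure it asks for is higher than what you have provisioned, the extra MB/s will sit unused.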

So, the golden rule: don’t guess. Use CloudWatch metrics like VolumeReadOps, VolumeWriteOps, VolumeReadBytes, and VolumeWriteBytes to see what your application is actually doing, then provision accordingly. It’s the only way to avoid both wasting money and suffering terrible performance.
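As a starting point for that measurement, the sketch below fetches per-period sums for one EBS metric with `aws cloudwatch get-metric-statistics` and turns them into average IOPS and average I/O size. The helper names, the default 300-second period, and the `AWS_CLI` dry-run override are assumptions. The metric names and the `AWS/EBS` namespace are the standard ones.

```shell
# Hypothetical helper: fetch the per-period Sum of one EBS metric for a volume.
# start/end are ISO 8601 timestamps. AWS_CLI is an assumed override for dry runs.
ebs_metric_sum() {
    local volume_id="$1" metric="$2" start="$3" end="$4" period="${5:-300}"
    "${AWS_CLI:-aws}" cloudwatch get-metric-statistics \
        --namespace AWS/EBS \
        --metric-name "$metric" \
        --dimensions "Name=VolumeId,Value=$volume_id" \
        --start-time "$start" \
        --end-time "$end" \
        --period "$period" \
        --statistics Sum
}

# Average IOPS: operations summed over a period, divided by the period in seconds.
avg_iops() {
    local ops="$1" period_seconds="$2"
    echo $(( ops / period_seconds ))
}

# Average I/O size in KiB: bytes moved divided by operations performed.
avg_io_kib() {
    local bytes="$1" ops="$2"
    if [ "$ops" -eq 0 ]; then echo 0; return; fi
    echo $(( bytes / ops / 1024 ))
}
```

Feed the Sum of VolumeReadOps and VolumeWriteOps into `avg_iops`, and the matching byte and operation sums into `avg_io_kib`. Together those two numbers tell you whether you are bound by IOPS or by throughput.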