8.2 ASG Configuration: Min, Max, Desired Capacity

Right, let’s talk about the three numbers that actually define your Auto Scaling Group’s personality: Min, Max, and Desired capacity. This is the trifecta, the holy trinity of ASG configuration. Get these wrong, and you’re either hemorrhaging cash on idle instances or frantically paging yourself at 3 AM because your application can’t handle the load. No pressure.

Think of these values as the strict parent, the ambitious dreamer, and the sensible, current state of your fleet.

Min Capacity: The strict parent. This is the absolute minimum number of instances that must be running at all times. Even if it’s 3 AM on a Sunday and a tumbleweed is the only visitor to your site, AWS will fight to keep this many instances healthy and running. Set this to what you need for baseline traffic and, crucially, high availability. If your app needs at least two instances spread across two AZs to not fall over, your min is 2. Going to zero is a bold choice (we’ll get to that).
Max Capacity: The ambitious dreamer’s absolute limit. This is the hard ceiling, the “break glass in case of emergency” upper bound that your scaling policies can never push the ASG past. (AWS may briefly run slightly over it while rebalancing instances across AZs, but that excess is temporary.) This is your cost-containment guardrail. You might set this to 10 because that’s what your budget allows, or to 50 because that’s the limit of your application’s database connections. It’s there to save you from a runaway scaling policy that would otherwise try to launch a thousand instances and bill you for a small country’s GDP.
Desired Capacity: The sensible, current state. This is the number of instances your ASG aims to have right now. It’s the Goldilocks zone. When you first create the ASG, this is the number it starts with. From then on, this value dances between min and max based on your scaling policies. But here’s the key thing: you can also set it manually, and the ASG will obediently add or remove instances to meet that number, as long as it’s within the min/max bounds. Ask for a number outside those bounds and the request is rejected. It’s the immediate, manual override.
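
The one rule tying the three numbers together is min <= desired <= max. Here is a minimal sketch of that check as a local shell function, handy as a pre-flight test in a deploy script. The name check_capacity is made up for illustration and is not part of the AWS CLI; AWS performs its own validation regardless.

```shell
# Hypothetical helper (not an AWS CLI feature): verify min <= desired <= max
# locally before asking AWS to change anything.
# Usage: check_capacity MIN DESIRED MAX
check_capacity() {
    local min=$1 desired=$2 max=$3
    if [ "$min" -gt "$max" ]; then
        echo "invalid: min ($min) is greater than max ($max)" >&2
        return 1
    fi
    if [ "$desired" -lt "$min" ] || [ "$desired" -gt "$max" ]; then
        echo "invalid: desired ($desired) is outside [$min, $max]" >&2
        return 1
    fi
    echo "ok: $min <= $desired <= $max"
}

check_capacity 2 5 10    # prints: ok: 2 <= 5 <= 10
```

A call like check_capacity 2 11 10 returns a non-zero status, which is your cue to raise max first or pick a smaller number.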

The Relationship and Initial Setup

When you create an ASG, min and max are required; desired is optional and defaults to min if you leave it out. The most common sane default is to set all three to the same number initially. This creates a static, stable fleet. It’s your starting point before you add the dynamic scaling magic.

# Using AWS CLI to create a simple, static ASG first
aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name MyStaticFleet \
    --launch-template LaunchTemplateId=lt-0c8c8a8f8e8e8e8e8,Version='$Latest' \
    --min-size 2 \
    --max-size 2 \
    --desired-capacity 2 \
    --vpc-zone-identifier "subnet-123456,subnet-654321"

See? No scaling policies yet. Just two instances, spread across two AZs (assuming those two subnets live in different AZs), being reliable. This is how you’d run a critical service that doesn’t see highly variable traffic.
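
To confirm the group really came up the way you asked, you can read the three numbers back. A small sketch: show_capacity is a hypothetical wrapper name, while the describe-auto-scaling-groups subcommand and the MinSize, MaxSize, DesiredCapacity and Instances fields are standard AWS CLI output. It assumes your credentials and region are already configured.

```shell
# Hypothetical wrapper: print min, max, desired, and the current instance
# count for one Auto Scaling group, as tab-separated text.
# Usage: show_capacity GROUP_NAME
show_capacity() {
    aws autoscaling describe-auto-scaling-groups \
        --auto-scaling-group-names "$1" \
        --query 'AutoScalingGroups[0].[MinSize,MaxSize,DesiredCapacity,length(Instances)]' \
        --output text
}
```

Run it as show_capacity MyStaticFleet. If the last number lags behind desired, the group is most likely still launching instances.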

How Scaling Policies Interact

This is where it gets fun. Once you attach a scaling policy (say, to scale out on CPU), the desired capacity becomes a puppet of that policy. The policy will adjust the desired value, and the ASG’s job is to make the reality match the desire.

Let’s say min=2, max=10, and desired=2. A CPU spike triggers your scaling policy, which says “set desired capacity to 6.” The ASG sees that desired (6) is greater than current (2), so it starts launching 4 new instances. If another alarm later says “scale in, set desired to 3,” the ASG will terminate instances until it’s back to 3.

The min and max are the rails on the bowling lane; the desired capacity is the ball, bouncing between them based on the game being played (your scaling rules).
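
The bowling-lane picture boils down to simple arithmetic: take the current desired capacity, apply the policy's adjustment, then clamp the result to the rails. Below is a toy model of a "change capacity by N" style adjustment. The function name next_desired is invented for this sketch, and it only illustrates the bounds; it is not how AWS implements scaling internally.

```shell
# Toy model: apply a capacity adjustment, then clamp to [min, max].
# Usage: next_desired CURRENT ADJUSTMENT MIN MAX
next_desired() {
    local current=$1 adjustment=$2 min=$3 max=$4
    local target=$(( current + adjustment ))
    if [ "$target" -lt "$min" ]; then target=$min; fi
    if [ "$target" -gt "$max" ]; then target=$max; fi
    echo "$target"
}

next_desired 2 4 2 10     # scale out by 4 from 2: prints 6
next_desired 8 5 2 10     # 13 would cross max: prints 10
next_desired 3 -4 2 10    # -1 would cross min: prints 2
```

Note the contrast with a manual change: a policy that overshoots gets capped at the rail, whereas a manual request for an out-of-bounds number is refused outright.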

The Pitfalls and “Wait, What?” Moments

Setting Min to 0: This is the serverless dream, right? Pay only for what you use! It’s a fantastic cost-saving measure for pre-production environments, batch processing, or truly spiky workloads. But the gotcha is cold starts. When scaling from zero, you’re waiting for an instance to boot, configure itself, and pass health checks before it can serve traffic. Your latency will spike dramatically during that period. For a user-facing web app, this is often a terrible idea. For a background video processing queue? Perfect.
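Ignoring Instance Health: The desired capacity is what you want, but the ASG will only ever try to meet it with healthy instances. When an instance fails its health check, the ASG terminates it and launches a replacement, so if you set desired=4 but only have 2 healthy instances, expect 2 new ones. If you have 6 instances, 2 of them unhealthy, and you drop desired to 4, the unhealthy ones will be terminated either way, but which instances a scale-in removes is decided by the termination policy, so don’t count on it surgically picking the sick ones. Health checks are the foundation everything else is built on.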
Manual Interventions: You can absolutely use set-desired-capacity to manually override the scaling policies. This is great for a planned event. But remember: the scaling policies are still there, lurking. In five minutes, an alarm could fire and override your manual change, scaling you right back to where you were. If you need a prolonged manual override, you should also temporarily suspend the AlarmNotification process so alarms can’t trigger your dynamic scaling policies, and resume it when you’re done. (Scheduled actions are a separate process, so check for those too.) Otherwise, it’s like trying to hold a beach ball underwater.

# Suspend alarm-driven scaling first so it can't fight the manual change
aws autoscaling suspend-processes \
    --auto-scaling-group-name MyDynamicFleet \
    --scaling-processes AlarmNotification

# Manually scale to 5 instances for the deployment (must be within min/max)
aws autoscaling set-desired-capacity \
    --auto-scaling-group-name MyDynamicFleet \
    --desired-capacity 5

# Do your deployment, then resume and let it handle the scale-in
aws autoscaling resume-processes \
    --auto-scaling-group-name MyDynamicFleet \
    --scaling-processes AlarmNotification
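
The classic way this goes wrong is forgetting the resume step and discovering a week later that nothing has scaled since. Here is a small sketch of a check you could drop into a post-deploy script. The function name alarm_scaling_suspended is hypothetical; SuspendedProcesses and ProcessName are fields in the standard describe-auto-scaling-groups output.

```shell
# Hypothetical helper: exit status 0 if AlarmNotification is currently
# suspended for the given group, non-zero otherwise.
# Usage: alarm_scaling_suspended GROUP_NAME
alarm_scaling_suspended() {
    aws autoscaling describe-auto-scaling-groups \
        --auto-scaling-group-names "$1" \
        --query 'AutoScalingGroups[0].SuspendedProcesses[].ProcessName' \
        --output text | tr -s '[:space:]' '\n' | grep -qx 'AlarmNotification'
}

# Example (commented out so nothing runs against your account by accident):
# if alarm_scaling_suspended MyDynamicFleet; then
#     echo "WARNING: alarm-driven scaling is still suspended" >&2
# fi
```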

The takeaway? These three numbers are a conversation between you, your budget, your performance requirements, and AWS’s automation. You’re not just setting values; you’re defining the rules of engagement for your entire fleet. Choose wisely.