8.4 Scaling Policies: Target Tracking, Step Scaling, Simple Scaling

Right, so you’ve got your Auto Scaling Group (ASG) set up. It’s got your instances, it knows which subnets to use, it’s all looking good. But now we get to the real magic: telling it when to scale. This is where you move from just having a group of instances to having a genuinely intelligent, reactive system. Or, you know, you create a terrifying feedback loop that spins up a thousand instances and bills your company for a small moon-landing mission. Let’s avoid that second one.

The three policies you’ll deal with are Simple, Step, and Target Tracking Scaling. Think of them as a spectrum from “I have a very specific, clunky need” to “I just want things to work, please.”

The Dinosaur: Simple Scaling

We’ll start with Simple Scaling because it’s, well, simple. And also because you should almost never use it for anything new. It’s the legacy option. Here’s how it works: you say “if metric X is above threshold Y for a period of Z, then add one instance.” Then, it adds the instance and goes into a cooldown period, where it refuses to take any further scaling actions, like a teenager who needs to decompress after taking out the trash.

The problem is glaringly obvious. What if adding one instance wasn’t enough? The alarm will still be breaching, but the ASG is just sitting there in its cooldown, twiddling its thumbs, while your application melts. You have to wait for the entire cooldown to expire before it will even consider adding another instance. It’s hopelessly inefficient for anything with rapid or variable load changes.
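To put numbers on that, here’s a toy Python simulation of the cooldown problem. It’s a sketch under loud assumptions, not how AWS implements anything: the function name, the tick-based clock, and the load units are all invented for illustration, and it ignores how long instances take to launch.

```python
def simulate_simple_scaling(load_per_tick, capacity, threshold=70.0,
                            per_instance=100.0, cooldown_ticks=5):
    """Return the group's capacity after each tick under a toy simple-scaling policy.

    load_per_tick: total work arriving each tick (hypothetical units).
    per_instance:  work one instance can handle at 100% utilization (hypothetical).
    """
    history = []
    cooldown_left = 0
    for load in load_per_tick:
        utilization = 100.0 * load / (capacity * per_instance)
        if cooldown_left > 0:
            # The alarm may still be breaching, but the policy ignores it.
            cooldown_left -= 1
        elif utilization > threshold:
            capacity += 1                  # mirrors ScalingAdjustment: 1
            cooldown_left = cooldown_ticks
        history.append(capacity)
    return history


# Constant load of 400 units, starting with 2 instances (200% utilization).
history = simulate_simple_scaling([400] * 12, capacity=2)
print(history)
```

With this load you’d need 6 instances to get under 70%. After 12 ticks the simulated group has reached 4, because it spent most of that time sitting in cooldown.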

You might see it in insanely old CloudFormation templates, so here’s what it looks like. But consider it a history lesson, not a best practice.

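SimpleScalingPolicy:
  Type: AWS::AutoScaling::ScalingPolicy
  Properties:
    AutoScalingGroupName: !Ref MyASG
    PolicyType: SimpleScaling
    AdjustmentType: ChangeInCapacity  # adjust by an absolute number of instances
    ScalingAdjustment: 1              # add one instance each time the alarm fires
    Cooldown: 300                     # seconds to wait before the next scaling action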

The Workhorse: Step Scaling

Step Scaling is Simple Scaling’s smarter, more capable sibling. Instead of one clunky action, you define a series of steps. “If the metric is a little high, add a few instances. If it’s really high, add a lot of instances.” Crucially, it doesn’t lock itself into a cooldown: it keeps responding to the alarm even while an earlier scaling activity is still in progress, which is what you actually want.

You define these steps in a StepAdjustments array. MetricIntervalLowerBound and MetricIntervalUpperBound define the bucket for the breach amount, measured as an offset from the alarm threshold. A lower bound of 0 and an upper bound of 20 means “if we’re between 0 and 20 units above the alarm threshold, do this.”

Here’s the JSON definition of a CloudWatch alarm that would trigger this. It goes into alarm when average CPU exceeds 70% in at least one of the last two one-minute periods.

{
  "AlarmName": "High-CPU-Step-Scaling",
  "AlarmDescription": "Scale out based on CPU",
  "Metrics": [
    {
      "Id": "m1",
      "MetricStat": {
        "Metric": {
          "Namespace": "AWS/EC2",
          "MetricName": "CPUUtilization",
          "Dimensions": [{"Name": "AutoScalingGroupName", "Value": "my-asg"}]
        },
        "Stat": "Average",
        "Period": 60
      },
      "ReturnData": true
    }
  ],
  "ComparisonOperator": "GreaterThanThreshold",
  "Threshold": 70,
  "EvaluationPeriods": 2,
  "DatapointsToAlarm": 1,
  "AlarmActions": [
    "arn:aws:autoscaling:us-east-1:123456789012:scalingPolicy:policy-id:autoScalingGroupName/my-asg:policyName/StepScaleOut"
  ]
}

And the corresponding StepAdjustments in your scaling policy would look like this:

- MetricIntervalLowerBound: 0
  MetricIntervalUpperBound: 10
  ScalingAdjustment: 1
- MetricIntervalLowerBound: 10
  ScalingAdjustment: 2

This says: “CPU at 70% but under 80% (0 to 10 points above the threshold)? Add 1 instance. 80% or higher? Add 2 instances.” That assumes an AdjustmentType of ChangeInCapacity. The power here is the immediate, proportional response.
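If the bucket logic feels slippery, here’s a small Python sketch of how a step gets picked for a scale-out policy. The names (Step, pick_adjustment, STEPS) are hypothetical, and it only models the scale-out case, where the lower bound is inclusive and the upper bound is exclusive. It’s an illustration of the rule, not AWS’s code.

```python
from typing import NamedTuple, Optional


class Step(NamedTuple):
    lower: Optional[float]  # MetricIntervalLowerBound (None means no lower limit)
    upper: Optional[float]  # MetricIntervalUpperBound (None means no upper limit)
    adjustment: int         # ScalingAdjustment


def pick_adjustment(metric, threshold, steps):
    """Return the adjustment whose interval contains the breach, or 0 if none does."""
    breach = metric - threshold  # bounds are offsets from the alarm threshold
    for step in steps:
        above_lower = step.lower is None or breach >= step.lower
        below_upper = step.upper is None or breach < step.upper
        if above_lower and below_upper:
            return step.adjustment
    return 0


# The two steps from the policy above, against a 70% CPU threshold.
STEPS = [Step(0, 10, 1), Step(10, None, 2)]

for cpu in (75, 80, 95):
    print(cpu, pick_adjustment(cpu, 70, STEPS))
```

Note that 80% lands in the second bucket, not the first, because the first bucket’s upper bound is exclusive.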

The Set-it-and-Forget-it: Target Tracking

This is the one you’ll use 95% of the time because it’s brilliant. You don’t muck about with alarms and steps. You just tell the ASG, “Hey, keep the average CPU at 40%.” That’s it. It automatically creates and manages the CloudWatch alarms for you (you can see them in CloudWatch, but leave them alone) and adjusts capacity to maintain that target.

It’s a beautiful feedback loop. It will add capacity more aggressively the further you are from the target, and it scales in more cautiously than it scales out so it doesn’t overcorrect. It’s the closest thing to magic in AWS.
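AWS doesn’t publish the exact algorithm, but the intuition behind “further from target means bigger adjustment” can be sketched with a simple proportional rule. Everything here is an assumption for illustration: the function name, the default min_size and max_size, and the rule itself, which pretends the metric scales perfectly with capacity.

```python
import math


def estimate_desired_capacity(current_capacity, metric_value, target_value,
                              min_size=1, max_size=10):
    """Rough proportional estimate of the capacity needed to hit the target.

    Assumes the metric is inversely proportional to capacity, so doubling
    the instances halves the average. This is NOT the real AWS algorithm.
    """
    raw = current_capacity * metric_value / target_value
    desired = math.ceil(raw)  # round up so we land at or below the target
    return max(min_size, min(max_size, desired))


# 4 instances with a 40% CPU target, at various observed averages.
for cpu in (50.0, 80.0, 20.0):
    print(cpu, estimate_desired_capacity(4, cpu, 40.0))
```

At 50% the estimate nudges the group up by one instance. At 80% it doubles it. That’s the proportional behavior in a nutshell, and it’s also why the metric you choose needs to actually move when capacity changes.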

The catch? You have to choose a good metric, one that rises and falls in proportion to how loaded each instance is. Average CPU, average network in/out, and ALB request count per target are predefined and easy. For anything else, like queue depth per instance, you need a customized metric specification. Also, if your application is slow to boot, target tracking might get a bit eager and scale out more than needed before the new instances even start contributing, so set an instance warmup that matches your boot time, or use a warm pool, if that’s a concern.

Here’s how you’d slap a target tracking policy on an ASG with CloudFormation. It’s almost embarrassingly simple.

TargetTrackingPolicy:
  Type: AWS::AutoScaling::ScalingPolicy
  Properties:
    AutoScalingGroupName: !Ref MyASG
    PolicyType: TargetTrackingScaling
    TargetTrackingConfiguration:
      PredefinedMetricSpecification:
        PredefinedMetricType: ASGAverageCPUUtilization
      TargetValue: 40.0
      DisableScaleIn: false

That DisableScaleIn is a useful trick. Setting it to true means it will only scale out, not in. This is great for testing a new policy without the fear of it suddenly terminating all your instances while you’re still watching it.

The best practice? Start with Target Tracking. It’s the right choice for most workloads. Only drop down to the complexity of Step Scaling if your metric doesn’t rise and fall in proportion to capacity or if you need the explicit control of the step adjustments. And leave Simple Scaling in the history books where it belongs.