11.5 Concurrency: Reserved and Provisioned Concurrency

Alright, let’s talk about concurrency. Not the computer science textbook kind, but the “how many copies of your Lambda function can run at the same time” kind. This is where we stop thinking about a single function execution and start thinking about your function as a system. And like any system, it has limits. Buckle up.

First, the big picture: concurrency isn’t just about performance; it’s about availability and cost. Get it wrong, and your beautifully architected serverless application either grinds to a halt or bleeds money while doing nothing. We have two main levers to pull here: Reserved Concurrency and its more sophisticated, pricier cousin, Provisioned Concurrency. They solve very different problems.

The Default: Your Shared Sandbox

By default, all the Lambda functions in an account share one giant concurrency pool per region. It’s like a public beach: everyone gets to play, but if too many people show up, someone’s getting kicked out (Lambda rejects the invocation with a 429 TooManyRequestsException, better known as a throttle). Your individual function doesn’t have a guaranteed spot. The default quota for this pool is 1,000 concurrent executions per region, but you can beg AWS for more with a quota increase request.

This is fine for low-volume, non-critical stuff. But for anything important, it’s a recipe for disaster. Imagine your “process payment” function and your “generate weekly report” function sharing this pool. A spike in payments could throttle the report function. Annoying. But worse, a bug in the report function that causes runaway invocations could consume the entire pool and throttle all your other functions, taking your entire application down. Not exactly “fault isolation,” is it?

Reserved Concurrency: Your Private Plot of Land

This is where Reserved Concurrency comes in. It’s your way of telling AWS, “Look, I don’t care what’s happening on the rest of the beach. This specific function must have exactly this many concurrent executions available to it, and it cannot use any more than that.”

You set a hard limit. This solves two huge problems:

It guarantees capacity for that function. No other function can steal its slots.
It acts as a circuit breaker. The function can never exceed its limit, so a runaway function can’t take down the rest of your application.

You can set it in Terraform or CloudFormation, or by clicking through the console. Here’s how you’d do it with Terraform:

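resource "aws_lambda_function" "payment_processor" {
  filename      = "payment_processor.zip"
  function_name = "payment_processor"
  role          = aws_iam_role.lambda_role.arn
  handler       = "index.handler"
  runtime       = "nodejs18.x"

  # Publish a numbered version on each change so an alias can target it
  publish = true

  # This is the crucial part: a hard cap of 50 concurrent executions,
  # carved out of the regional pool for this function alone
  reserved_concurrent_executions = 50
}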

The pitfall? It’s a hard limit. If 50 invocations are already in flight, the 51st gets throttled. You need to understand your function’s traffic patterns and set this number wisely. Also, note that reserved concurrency is carved out of the regional pool, not added on top of it: reserving 50 for this function leaves 950 of the default 1,000 for everyone else. Lambda also insists that at least 100 stay unreserved, so you can’t hand out the whole pool.
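To make the circuit-breaker idea and the pool arithmetic concrete, here is a minimal sketch that caps the runaway-prone report function from the earlier example. The report_generator resource, its zip file name, and the limit of 5 are hypothetical, chosen only for illustration; the 50 and the 1,000 come from the text above.

```hcl
# Hypothetical second function: the weekly report generator.
resource "aws_lambda_function" "report_generator" {
  filename      = "report_generator.zip" # assumed artifact name
  function_name = "report_generator"
  role          = aws_iam_role.lambda_role.arn
  handler       = "index.handler"
  runtime       = "nodejs18.x"

  # Circuit breaker: even a runaway bug can occupy at most 5 slots.
  reserved_concurrent_executions = 5
}

# Pool arithmetic, assuming the default regional quota of 1,000:
#   1,000 total - 50 (payment_processor) - 5 (report_generator)
#     = 945 left in the shared pool for every other function
```

In the Terraform AWS provider, setting reserved_concurrent_executions to 0 throttles every invocation, which makes a handy emergency kill switch, while -1 (the provider default) removes the reservation and puts the function back in the shared pool.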

Provisioned Concurrency: Keeping the Engine Warm

Ah, the infamous “cold start.” You’ve heard of it. It’s the delay you get whenever AWS has to spin up a new execution environment, either because the function hasn’t been called in a while or because traffic is scaling past the environments that are already warm. That means downloading your code, starting the runtime, and running your initialization code. For user-facing APIs, it’s a terrible experience.

Provisioned Concurrency is your solution. It’s not a limit; it’s a minimum. You’re telling AWS, “Please, keep at least this number of execution environments for my function pre-warmed, initialized, and ready to go at all times.” Requests that hit a pre-warmed environment bypass the cold start entirely and run with the speed of a traditional server.

This is fantastic for latency-sensitive applications. But it comes with a crucial caveat: you pay for it. You’re paying for the idle compute time of those pre-warmed environments, whether they handle requests or not. It’s the closest thing Lambda gets to a traditional server model.

Setting it up is a two-step dance. First, you publish a version and point an alias (like PROD or LIVE) at it; then you attach a Provisioned Concurrency configuration to that alias. You can’t point Provisioned Concurrency at $LATEST, or at an alias that resolves to $LATEST; Lambda rejects the configuration. Going through an alias also means you can easily shift traffic or roll back.

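resource "aws_lambda_provisioned_concurrency_config" "payment_provisioned" {
  function_name                     = aws_lambda_function.payment_processor.function_name
  provisioned_concurrent_executions = 10

  # Crucial: point to an alias, not $LATEST. Referencing the alias resource
  # (rather than the string "PROD") makes Terraform create the alias first.
  qualifier = aws_lambda_alias.prod.name
}

# You need the alias itself
resource "aws_lambda_alias" "prod" {
  name          = "PROD"
  description   = "Production alias"
  function_name = aws_lambda_function.payment_processor.function_name

  # Must resolve to a published, numbered version: Provisioned Concurrency is
  # rejected on $LATEST. This assumes publish = true on the function resource.
  function_version = aws_lambda_function.payment_processor.version
}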

The big “gotcha” here is scaling. Provisioned Concurrency doesn’t adjust instantly; it takes minutes to ready new environments. For a sudden, massive spike, requests beyond the provisioned count spill over to ordinary on-demand environments, cold starts and all, until the provisioned pool is scaled out. You’re buying a baseline of performance, not an infinite, instant burst. To make the provisioned count follow demand, you configure Application Auto Scaling to tell AWS how and when to adjust it based on metrics, which is a whole other topic.
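Without going deep into that topic, a target-tracking setup might look roughly like the sketch below. It reuses the payment_processor function and PROD alias from the examples above; the resource labels, the policy name, the ceiling of 50, and the 70% utilization target are assumptions picked for illustration, not recommendations.

```hcl
# Register the alias's provisioned concurrency as a scalable target.
resource "aws_appautoscaling_target" "payment_pc" {
  service_namespace  = "lambda"
  scalable_dimension = "lambda:function:ProvisionedConcurrency"

  # Format: function:<function name>:<alias or version>
  resource_id = "function:${aws_lambda_function.payment_processor.function_name}:${aws_lambda_alias.prod.name}"

  min_capacity = 10 # the baseline from the example above
  max_capacity = 50 # assumed ceiling; kept at or below the reserved limit
}

# Add or remove warm environments to hold utilization near the target.
resource "aws_appautoscaling_policy" "payment_pc_utilization" {
  name               = "payment-pc-utilization" # hypothetical name
  policy_type        = "TargetTrackingScaling"
  service_namespace  = aws_appautoscaling_target.payment_pc.service_namespace
  resource_id        = aws_appautoscaling_target.payment_pc.resource_id
  scalable_dimension = aws_appautoscaling_target.payment_pc.scalable_dimension

  target_tracking_scaling_policy_configuration {
    target_value = 0.7 # assumed: scale out when about 70% of warm environments are busy

    predefined_metric_specification {
      predefined_metric_type = "LambdaProvisionedConcurrencyUtilization"
    }
  }
}
```

Once a policy like this owns the count, the fixed value in the static Provisioned Concurrency configuration will drift away from what is actually deployed, so you would probably want either to drop that static resource or to tell Terraform to ignore changes to its count.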

The Grand Strategy

So, how do you use these together? Simple.

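Use Reserved Concurrency on your production functions as a safety net. The setting itself costs nothing, so it’s cheap insurance against one function starving another; just remember that every reservation comes out of the shared pool.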
Use Provisioned Concurrency selectively on the critical, latency-sensitive functions that are worth the extra cost. Your payment API? Probably. The function that processes logs at 3 AM? Absolutely not.

Think of Reserved as your circuit breaker and Provisioned as your performance enhancer. They are not mutually exclusive; in fact, when both are set on a function, the Provisioned Concurrency count has to fit within the Reserved Concurrency limit. They work together to give you both safety and speed. Now go configure them. Your users (and your wallet) will thank you.
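If you want Terraform to flag a configuration where the two settings contradict each other, one option is a check block. This is a sketch that assumes Terraform 1.5 or later (where check blocks are available) and the resource names used in the earlier examples; the check name and message are made up for illustration.

```hcl
# Warns during plan/apply if the provisioned count outgrows the reserved cap.
check "provisioned_within_reserved" {
  assert {
    condition = (
      aws_lambda_provisioned_concurrency_config.payment_provisioned.provisioned_concurrent_executions
      <= aws_lambda_function.payment_processor.reserved_concurrent_executions
    )
    error_message = "Provisioned Concurrency must not exceed the function's Reserved Concurrency."
  }
}
```

A failed check surfaces as a warning rather than blocking the run, so treat it as an early hint; Lambda itself remains the final arbiter when the configuration is applied.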