12.4 Throttling: Default Limits, Usage Plans, and API Keys
Right, throttling. This is where we move from “Hey, my API works!” to “Oh god, my API is on fire and my wallet is melting.” Throttling is your primary defense against both accidental traffic floods and malicious denial-of-wallet attacks. AWS gives you a few tools here, and they work together in ways that are, frankly, a bit convoluted. Let’s untangle them.
First, you need to understand the two main layers of throttling you’re dealing with: the account-level limits that AWS imposes and that you can’t edit yourself, and the more flexible, configurable limits you set up for your customers.
The Unforgiving Account-Level Throttle
Before any of your fancy usage plans even get a say, every single request to your API Gateway hits a brutal throttle shared by every API in your AWS account within a Region. By default, this is a steady-state rate of 10,000 requests per second (rps) with a burst capacity of 5,000 requests. Yes, you read that right. “Burst” here is the size of the token bucket: if you’ve been quiet, you can slam it with up to 5,000 requests in a single instant, and after that the bucket only refills at the steady 10k/sec.
This limit is not negotiable via the API Gateway console. You can’t change it yourself. If you need more (and for any serious production service, you probably will), you have to file a quota increase request with AWS, through Service Quotas or a support ticket, and beg. The important thing to remember is that this throttle is always on. It’s the first bouncer at the door, and if your API gets too popular too fast, it will start handing out 429 Too Many Requests errors like candy on Halloween, and your usage plans won’t save you.
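If you want to see which account-level numbers actually apply to you, the AWS CLI can report them. The following is a minimal sketch, assuming the AWS CLI is installed and configured with credentials for the Region you care about; the function name account_throttle is made up for illustration.

```shell
#!/bin/sh
# Print the account-level throttle for the current Region as "rate<TAB>burst".
# Assumption: the AWS CLI is installed and credentials are configured.
account_throttle() {
  aws apigateway get-account \
    --query 'throttleSettings.[rateLimit,burstLimit]' \
    --output text
}

# Usage (calls the real AWS API, so it is left commented out):
#   account_throttle
```

If the numbers differ from the defaults quoted above, someone has already had a quota change applied to the account.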
Usage Plans & API Keys: Throttling Your Users
Now, the part you actually configure. A usage plan is where you define how much traffic a user (or more accurately, a consumer) of your API is allowed to send. The key here is that these limits are per API key. You attach a usage plan to one or more API stages, and then you attach API keys to that usage plan.
Think of it this way:
- The Usage Plan is the rulebook: “Thou shalt not send more than 100 requests per second.”
- The API Key is the ID card that identifies a user who has to follow that rulebook.
Here’s the kicker, and it’s the thing everyone misses: API keys are not authentication. I’ll say it again for the people in the back: they are for metering and throttling, not security. Anyone who gets ahold of the key can use it. You still need IAM, Cognito, or a Lambda authorizer for actual auth. The API key is just a way to track who’s making the requests so we can apply the throttle.
Let’s build one. First, create a usage plan via the CLI. It’s faster than clicking through 17 console pages.
# Create the usage plan. Note the throttling settings.
aws apigateway create-usage-plan \
--name "My-Starter-Plan" \
--description "For new users getting started" \
--api-stages apiId=YOUR_API_ID,stage=prod \
--throttle burstLimit=100,rateLimit=50 \
--quota limit=10000,period=MONTH
Now, create an API key and link it to that plan.
# Create the API key itself
aws apigateway create-api-key \
--name "My-User-Key" \
--enabled
# Note the key ID from the output, then link it to the plan
aws apigateway create-usage-plan-key \
--usage-plan-id YOUR_PLAN_ID \
--key-id YOUR_KEY_ID \
--key-type API_KEY
The burstLimit and rateLimit here work exactly like the account-level throttle, but scoped to this single key. In this example, a user can burst to 100 requests but is limited to a steady 50 requests per second. The quota is a separate monthly limit of 10,000 requests total.
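To get a feel for how burstLimit and rateLimit interact, here is a rough token bucket simulation. It is a simplified model with one-second ticks, not a description of how API Gateway is implemented internally, and the function name simulate_bucket is hypothetical. The bucket starts full, each request spends one token, and the bucket refills by rateLimit tokens per second up to burstLimit.

```shell
#!/bin/sh
# simulate_bucket BURST RATE REQUESTS_PER_SECOND SECONDS
# Prints one line per simulated second: "second accepted throttled".
# The body runs in a subshell, so its variables do not leak to the caller.
simulate_bucket() (
  burst=$1 rate=$2 load=$3 seconds=$4
  tokens=$burst   # the bucket starts full after a quiet period
  s=1
  while [ "$s" -le "$seconds" ]; do
    # Accept as many requests as there are tokens; the rest would get a 429.
    if [ "$load" -le "$tokens" ]; then accepted=$load; else accepted=$tokens; fi
    throttled=$((load - accepted))
    echo "$s $accepted $throttled"
    # Spend the tokens, then refill at the steady rate, capped at the burst size.
    tokens=$((tokens - accepted + rate))
    if [ "$tokens" -gt "$burst" ]; then tokens=$burst; fi
    s=$((s + 1))
  done
)

# The plan above (burstLimit=100, rateLimit=50) against a client sending 120 rps:
simulate_bucket 100 50 120 3
```

Under this model the client gets 100 requests through in the first second and then only 50 per second afterwards, because the burst allowance is spent and only the steady refill remains.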
The Crucial Gotcha: Enforcing the Damn Key
Here’s the part that feels like a design flaw: simply creating the usage plan and key does nothing. By default, your API routes don’t require an API key. You must go back and explicitly tell each method to require one. If you don’t, users can call your API without a key and completely bypass all the throttling rules you just set up. It’s madness.
You do this in the Method Request settings for each HTTP method in the console, or via the CLI:
# For a specific API resource and method, update it to require an API key
aws apigateway update-method \
--rest-api-id YOUR_API_ID \
--resource-id YOUR_RESOURCE_ID \
--http-method GET \
--patch-operations op="replace",path="/apiKeyRequired",value="true"
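One more lever: changes to a method only take effect once you redeploy the API to the stage (aws apigateway create-deployment with your REST API ID and stage name). After that, any request to GET /your-resource must include an x-api-key header. Requests without it will be rejected with a 403 Forbidden. Requests with an invalid or disabled key also get a 403. Only requests with a valid key count against your usage plan and, if they exceed its limits, get a 429.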
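A quick way to confirm the key is really being enforced is to call the endpoint with and without the header and look at the status codes. The sketch below assumes curl is available; the function names explain_status and call_api, and the URL in the usage comment, are placeholders rather than anything AWS provides. The explanations are the likely causes described above; a 403 can also come from other layers such as an authorizer.

```shell
#!/bin/sh
# Map an HTTP status code to its likely cause in an API-key-protected API.
explain_status() {
  case "$1" in
    403) echo "missing, invalid, or disabled API key" ;;
    429) echo "over a usage plan or account limit" ;;
    2??) echo "accepted and counted against the usage plan" ;;
    *)   echo "not related to API keys or throttling" ;;
  esac
}

# call_api URL [API_KEY]
# Sends one GET request and prints "status explanation".
# The body runs in a subshell, so its variables do not leak to the caller.
call_api() (
  url=$1
  key=${2:-}
  if [ -n "$key" ]; then
    status=$(curl -s -o /dev/null -w '%{http_code}' -H "x-api-key: $key" "$url")
  else
    status=$(curl -s -o /dev/null -w '%{http_code}' "$url")
  fi
  echo "$status $(explain_status "$status")"
)

# Usage (placeholder URL and key, left commented out):
#   call_api "https://YOUR_API_ID.execute-api.YOUR_REGION.amazonaws.com/prod/your-resource"
#   call_api "https://YOUR_API_ID.execute-api.YOUR_REGION.amazonaws.com/prod/your-resource" "YOUR_KEY_VALUE"
```

If the first call comes back with a 2xx instead of a 403, the method is still open and your usage plan is being bypassed.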
Best Practices and Pitfalls
- The Burst Bucket Drains: If a user hits their burst limit, they can’t burst again until their bucket “refills.” This happens at a rate equal to their rateLimit. Take a user with rateLimit=10 and burstLimit=10. If they use that entire burst, a single token comes back after a tenth of a second, but it takes a full second of sending nothing to get the whole burst back. This is a token bucket algorithm, and it’s why short, violent bursts are punished with a cool-down.
- Monitor Your Account-Level Throttle: Use CloudWatch to watch the 4XXError metric in the AWS/ApiGateway namespace. It counts every 4xx response, not just 429s, so check your access logs to confirm what you’re looking at. If you see 429s and your usage plans are fine, you’re hitting that account-level 10k RPS limit. Time to call AWS.
- Keys Are Not Secrets: Treat API keys like public identifiers, not passwords. They are often exposed in client-side code. Their only job is to allow you to apply a throttle. If you need to protect a backend endpoint, use IAM credentials or a signed request.
- WebSocket APIs Are Different: For WebSocket APIs, throttling is applied on the connection level, not per message. The limit is on the number of messages per second that can be sent to the client from your backend. It’s a whole different ballgame to prevent you from spamming your users’ sockets into oblivion.
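For the monitoring advice above, a CloudWatch query along these lines can give you a quick count of client errors for a REST API stage. Treat it as a sketch: the function name sum_4xx and the API name in the usage comment are hypothetical, and the 4XXError metric covers all 4xx responses, so a high number is a prompt to dig into access logs rather than proof of throttling.

```shell
#!/bin/sh
# sum_4xx API_NAME STAGE START END
# Sums the 4XXError metric for one REST API stage over a time window.
# START and END are ISO 8601 timestamps, e.g. 2024-01-01T00:00:00Z.
# Assumption: the AWS CLI is installed and credentials are configured.
sum_4xx() {
  aws cloudwatch get-metric-statistics \
    --namespace AWS/ApiGateway \
    --metric-name 4XXError \
    --dimensions Name=ApiName,Value="$1" Name=Stage,Value="$2" \
    --start-time "$3" --end-time "$4" \
    --period 300 \
    --statistics Sum \
    --query 'sum(Datapoints[].Sum)' \
    --output text
}

# Usage (calls the real AWS API, so it is left commented out):
#   sum_4xx "my-api" prod 2024-01-01T00:00:00Z 2024-01-01T01:00:00Z
```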
The system is powerful, but it’s a bit like a Rube Goldberg machine. You have to flip levers in the right order, or the whole thing just sits there. Set your usage plan, create your keys, and for the love of all that is holy, remember to set apiKeyRequired=true on your methods. Your wallet will thank you later.