19.3 Provisioned vs On-Demand Capacity Mode

Alright, let’s talk about the single biggest question you’ll face when you first set up a table: how are you going to pay for this thing? DynamoDB has two primary billing modes, and choosing the wrong one is a fantastic way to either blow your budget or throttle your application into the stone age. They are Provisioned Capacity and On-Demand Mode. Think of it like hiring a team: do you want a set number of full-time employees (Provisioned) or a temp agency that sends you exactly who you need, exactly when you need them, but charges an arm and a leg for the privilege (On-Demand)?

The Core Concept: RCUs and WCUs

Before we dive into the modes, you have to understand the currency. DynamoDB doesn’t bill you on server size; it bills you on throughput, measured in Read Capacity Units (RCUs) and Write Capacity Units (WCUs).

One RCU gives you one strongly consistent read per second for an item up to 4 KB. If you’re okay with eventually consistent reads (and you often should be), one RCU gets you two of those reads per second. It’s a 2-for-1 sale that’s always on. Bigger items cost more: sizes are rounded up to the next 4 KB, so a strongly consistent read of a 6 KB item burns two RCUs.

One WCU gets you one write per second for an item up to 1 KB, again rounded up, so a 2.5 KB item costs three WCUs per write. That’s it. No consistency variants here. Writing is just more expensive.

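The key thing to remember: these are units of throughput, not units of storage. Storage is billed separately, by the gigabyte. You could store a petabyte of data and, if you never read or write it, you would consume zero capacity units. Whether that also means a zero throughput bill depends on the billing mode, which is exactly what the rest of this section is about.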
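
To make the arithmetic concrete, here is a minimal sketch of the capacity math described above. The function names are made up for illustration, and it assumes 1 KB means 1,024 bytes and that every item in the workload is the same size; treat it as a back-of-the-envelope estimator, not a billing calculator.

```python
import math

READ_BLOCK_BYTES = 4 * 1024   # reads are metered in 4 KB blocks
WRITE_BLOCK_BYTES = 1024      # writes are metered in 1 KB blocks


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate the RCUs needed for a uniform read workload."""
    # Item size is rounded up to the next 4 KB block (minimum one block).
    blocks = max(1, math.ceil(item_size_bytes / READ_BLOCK_BYTES))
    units = blocks * reads_per_second
    if not strongly_consistent:
        # Eventually consistent reads cost half as much.
        units = units / 2
    return math.ceil(units)


def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate the WCUs needed for a uniform write workload."""
    # Item size is rounded up to the next 1 KB block (minimum one block).
    blocks = max(1, math.ceil(item_size_bytes / WRITE_BLOCK_BYTES))
    return math.ceil(blocks * writes_per_second)
```

For example, `read_capacity_units(6 * 1024, 10)` asks for ten strongly consistent reads per second of a 6 KB item, and `write_capacity_units(2560, 10)` asks for ten writes per second of a 2.5 KB item.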

Provisioned Capacity: Predictable Workloads on a Budget

This is the original model. You walk up to the AWS console and declare, “My table shall have 10 RCUs and 5 WCUs!” DynamoDB then provisions exactly that amount of capacity for you, 24/7. You pay for that provisioned capacity by the hour, whether you use it or not. It’s significantly cheaper than On-Demand if your traffic is steady and predictable.

The catch? If you exceed your provisioned capacity, your requests will get throttled (with a ProvisionedThroughputExceededException). The AWS SDKs retry throttled requests with backoff, so a brief blip may only show up as extra latency, but under sustained overload your application will start to fail. To handle traffic spikes, you can use auto-scaling, which is basically like setting rules for your cloud team to add or remove capacity based on metrics. It’s not instantaneous. It takes minutes to react, so you have to plan for gradual ramps, not viral explosions.

# Using Boto3 to create a table with provisioned capacity
import boto3

dynamodb = boto3.client('dynamodb')

response = dynamodb.create_table(
    TableName='MyProvisionedTable',
    KeySchema=[
        {'AttributeName': 'pk', 'KeyType': 'HASH'},  # Partition key
        {'AttributeName': 'sk', 'KeyType': 'RANGE'}   # Sort key
    ],
    AttributeDefinitions=[
        {'AttributeName': 'pk', 'AttributeType': 'S'},
        {'AttributeName': 'sk', 'AttributeType': 'N'}
    ],
    BillingMode='PROVISIONED',  # This is the default, but be explicit!
    ProvisionedThroughput={
        'ReadCapacityUnits': 10,
        'WriteCapacityUnits': 5
    }
)
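
The auto-scaling rules mentioned above are configured through the separate Application Auto Scaling service rather than through DynamoDB itself. The sketch below shows one way to wire that up with target tracking. The helper names, the policy name, and the 70% target are assumptions chosen for illustration; the request builder is kept separate from the AWS calls so you can inspect what would be sent before sending it.

```python
def autoscaling_requests(table_name, dimension, min_capacity, max_capacity,
                         target_utilization=70.0):
    """Build kwargs for register_scalable_target and put_scaling_policy.

    dimension is 'ReadCapacityUnits' or 'WriteCapacityUnits'.
    """
    metric_type = {
        'ReadCapacityUnits': 'DynamoDBReadCapacityUtilization',
        'WriteCapacityUnits': 'DynamoDBWriteCapacityUtilization',
    }[dimension]
    common = {
        'ServiceNamespace': 'dynamodb',
        'ResourceId': f'table/{table_name}',
        'ScalableDimension': f'dynamodb:table:{dimension}',
    }
    # The floor and ceiling auto-scaling is allowed to move between.
    target = dict(common, MinCapacity=min_capacity, MaxCapacity=max_capacity)
    # Target tracking: keep consumed/provisioned near target_utilization percent.
    policy = dict(
        common,
        PolicyName=f'{table_name}-{dimension}-target-tracking',  # hypothetical name
        PolicyType='TargetTrackingScaling',
        TargetTrackingScalingPolicyConfiguration={
            'TargetValue': target_utilization,
            'PredefinedMetricSpecification': {
                'PredefinedMetricType': metric_type,
            },
        },
    )
    return target, policy


def enable_autoscaling(client, table_name, dimension, min_capacity, max_capacity):
    """Apply both requests using an 'application-autoscaling' client."""
    target, policy = autoscaling_requests(
        table_name, dimension, min_capacity, max_capacity)
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)


# Usage (requires AWS credentials):
#   import boto3
#   client = boto3.client('application-autoscaling')
#   enable_autoscaling(client, 'MyProvisionedTable', 'ReadCapacityUnits', 10, 100)
```

You would call `enable_autoscaling` once per dimension, since reads and writes scale independently.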

On-Demand Mode: Unpredictable Workloads at a Premium

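Introduced later, On-Demand mode is the “set it and forget it” option. You don’t specify RCUs or WCUs. DynamoDB adapts its throughput to your actual traffic, and you just pay per request. The beauty? No capacity planning, and throttling becomes rare. Rare is not never, though: a table can still be throttled if traffic jumps far past its previous peak in a short window (AWS documents roughly double the previous peak within 30 minutes as the safe range), or if a single hot partition takes all the heat. The beast? The price per request is much higher.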

This is perfect for:

New applications with unknown traffic patterns.
Spiky, unpredictable workloads (e.g., a gaming app that goes viral).
Development and test environments where you can’t be bothered to manage scaling.

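The biggest pitfall here is cost control. A bug in your code that causes a runaway write loop? That’s going to be a very, very expensive mistake. By default, nothing caps your spend: the table-level throughput quotas are high enough that the bill will hurt long before they kick in. Newer versions of the API let you set an optional maximum throughput on an On-Demand table, and a billing alarm is cheap insurance either way.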

# Creating the exact same table, but in On-Demand mode
import boto3

dynamodb = boto3.client('dynamodb')

response = dynamodb.create_table(
    TableName='MyOnDemandTable',
    KeySchema=[
        {'AttributeName': 'pk', 'KeyType': 'HASH'},
        {'AttributeName': 'sk', 'KeyType': 'RANGE'}
    ],
    AttributeDefinitions=[
        {'AttributeName': 'pk', 'AttributeType': 'S'},
        {'AttributeName': 'sk', 'AttributeType': 'N'}
    ],
    BillingMode='PAY_PER_REQUEST'  # The magic words
    # No 'ProvisionedThroughput' key required!
)

The Critical Choice: Which One and When?

Here’s the real talk. Start with On-Demand if you’re just prototyping or if your traffic is genuinely a mystery. It removes a huge operational headache while you figure things out.

Once you understand your traffic patterns, switch to Provisioned Capacity. The cost savings are too significant to ignore for steady-state production workloads. Savings in the 60-80% range are often quoted, but the real number depends on how much of your provisioned capacity you actually use and on current pricing, so run the numbers for your own table. Use CloudWatch metrics to figure out your baseline and set up sane auto-scaling boundaries.
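
One way to run those numbers is a quick break-even calculation. The sketch below is a rough model, not a pricing tool: the function names are invented for illustration, prices are passed in as parameters because they vary by region and change over time, and it assumes each request consumes exactly one capacity unit (items of at most 1 KB for writes, or 4 KB for strongly consistent reads).

```python
HOURS_PER_MONTH = 730  # common approximation for a billing month


def provisioned_monthly_cost(provisioned_units, price_per_unit_hour,
                             hours=HOURS_PER_MONTH):
    """Cost of keeping a fixed number of capacity units provisioned."""
    return provisioned_units * price_per_unit_hour * hours


def on_demand_monthly_cost(avg_requests_per_second, price_per_million_requests,
                           hours=HOURS_PER_MONTH):
    """Cost of the same traffic billed per request."""
    total_requests = avg_requests_per_second * 3600 * hours
    return total_requests / 1_000_000 * price_per_million_requests


def break_even_utilization(price_per_unit_hour, price_per_million_requests):
    """Fraction of provisioned capacity you must use before Provisioned wins.

    One fully used capacity unit serves 3600 requests in an hour, so compare
    the hourly unit price with what those 3600 requests cost on demand.
    """
    on_demand_cost_of_full_hour = 3600 / 1_000_000 * price_per_million_requests
    return price_per_unit_hour / on_demand_cost_of_full_hour
```

If `break_even_utilization` returns 0.25 for your region's prices, Provisioned is cheaper whenever your average consumption stays above a quarter of what you provision; below that, you are paying for idle capacity and On-Demand wins.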

And listen, this is important: you can change your mind. You can switch a table from Provisioned to On-Demand and back again. AWS does limit how often a table can change modes within a 24-hour window, so check the current quota in the DynamoDB documentation before you plan on flipping back and forth. Still, this isn’t a lifelong commitment. It’s a dial you can adjust as your application evolves. The key is to monitor your costs and your throttling metrics constantly. Don’t just set it and forget it, even if you’re using the mode with that nickname.
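
Switching modes is a single update_table call. The sketch below builds the arguments for it; `billing_mode_update` is a hypothetical helper, not part of Boto3, and the only real API surface it relies on is the `BillingMode` and `ProvisionedThroughput` parameters already used in the create_table examples.

```python
def billing_mode_update(table_name, mode, read_units=None, write_units=None):
    """Build kwargs for dynamodb.update_table() to change a table's billing mode."""
    if mode == 'PAY_PER_REQUEST':
        # On-Demand takes no capacity settings.
        return {'TableName': table_name, 'BillingMode': mode}
    if mode == 'PROVISIONED':
        # Going back to Provisioned requires explicit starting capacity.
        if read_units is None or write_units is None:
            raise ValueError('PROVISIONED mode needs read_units and write_units')
        return {
            'TableName': table_name,
            'BillingMode': mode,
            'ProvisionedThroughput': {
                'ReadCapacityUnits': read_units,
                'WriteCapacityUnits': write_units,
            },
        }
    raise ValueError(f'unknown billing mode: {mode!r}')


# Usage (requires AWS credentials):
#   import boto3
#   dynamodb = boto3.client('dynamodb')
#   dynamodb.update_table(**billing_mode_update('MyOnDemandTable', 'PROVISIONED', 10, 5))
```

Keeping the argument builder separate from the call makes it easy to log or review the change before it touches a production table.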