Right, so you’ve built this magnificent, intricate castle in the sky with CloudFormation. It’s a thing of beauty. Now, imagine handing the keys to a well-meaning but caffeine-deprived colleague at 4 PM on a Friday and saying, “Sure, go ahead and update the production database instance type.” You feel that? That cold shiver down your spine? That’s what a stack policy is for.

A stack policy is essentially a giant “HANDS OFF” sign you can slap on specific resources within your CloudFormation stack. It’s a JSON document that defines which update actions are allowed on which resources and, more importantly, which ones absolutely are not. Once one is applied, CloudFormation refuses any stack update that changes a protected resource in a way the policy denies. It won’t ask for confirmation; the update simply fails with a status reason saying the action was denied by the stack policy. This is your last line of defense against an accidental terraform apply-level oopsie in your AWS account.

The Anatomy of a (Very Paranoid) Policy

Let’s crack open this JSON and see what makes it tick. The structure is blessedly simple, relying on IAM-like policy statements. Here’s the policy I use to protect the crown jewels—my production RDS database and its associated security group.

{
  "Statement" : [
    {
      "Effect" : "Deny",
      "Action" : "Update:Replace",
      "Principal": "*",
      "Resource" : "LogicalResourceId/ProductionDatabase",
      "Condition" : {
        "StringEquals" : {
          "ResourceType" : ["AWS::RDS::DBInstance"]
        }
      }
    },
    {
      "Effect" : "Deny",
      "Action" : "Update:*",
      "Principal": "*",
      "Resource" : "LogicalResourceId/ProductionDBSecurityGroup"
    },
    {
      "Effect" : "Allow",
      "Action" : "Update:*",
      "Principal": "*",
      "Resource" : "*"
    }
  ]
}

Notice the two Deny statements and one very permissive Allow? Once a stack has a policy, every resource is protected by default: any update that isn’t explicitly allowed is denied, and an explicit Deny always beats an Allow. That final Allow statement is crucial; it’s your escape valve. It says, “Deny updates to these specific nightmares, but allow updates to everything else in the stack.” Without it, you’d effectively lock down the entire stack, which is probably not what you want.
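
If the evaluation order feels slippery, here’s a toy model of it in Python. To be clear, this is my own simplified sketch, not CloudFormation’s actual implementation: it ignores Condition, Principal, NotAction, and NotResource, and the name is_update_allowed is made up for illustration. It only captures the two rules that matter here: an explicit Deny wins, and no matching Allow means no.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """Policy fields may be a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def _matches(statement, action, logical_id):
    """True if the statement's Action and Resource patterns both match."""
    resource = f"LogicalResourceId/{logical_id}"
    action_ok = any(fnmatchcase(action, p) for p in _as_list(statement["Action"]))
    resource_ok = any(fnmatchcase(resource, p) for p in _as_list(statement["Resource"]))
    return action_ok and resource_ok


def is_update_allowed(policy, action, logical_id):
    """Simplified model: explicit Deny wins; otherwise an Allow must match."""
    matching = [s for s in policy["Statement"] if _matches(s, action, logical_id)]
    if any(s["Effect"] == "Deny" for s in matching):
        return False
    return any(s["Effect"] == "Allow" for s in matching)


# The article's policy, minus the Condition block (which this sketch ignores).
POLICY = {
    "Statement": [
        {"Effect": "Deny", "Action": "Update:Replace", "Principal": "*",
         "Resource": "LogicalResourceId/ProductionDatabase"},
        {"Effect": "Deny", "Action": "Update:*", "Principal": "*",
         "Resource": "LogicalResourceId/ProductionDBSecurityGroup"},
        {"Effect": "Allow", "Action": "Update:*", "Principal": "*",
         "Resource": "*"},
    ]
}

# Same policy with the final Allow removed: nothing is allowed any more.
DENY_ONLY = {"Statement": POLICY["Statement"][:2]}

print(is_update_allowed(POLICY, "Update:Replace", "ProductionDatabase"))  # False
print(is_update_allowed(POLICY, "Update:Modify", "ProductionDatabase"))   # True
print(is_update_allowed(DENY_ONLY, "Update:Modify", "WebServer"))         # False
```

The last line is the one to remember: WebServer (a hypothetical logical ID) isn’t mentioned anywhere in the policy, yet without the catch-all Allow it is locked down too.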

The Four Flavors of Action

This is where most people’s eyes glaze over, but stick with me. The Action field isn’t about API actions like DeleteTable; it’s about the type of CloudFormation update being attempted. There are four values you can use:

  • Update:Modify: An in-place change where the physical resource survives. This covers updates with no interruption (e.g., changing a Lambda function’s memory size) as well as updates with some interruption (e.g., changing an RDS instance class).
  • Update:Replace: This is the big, scary one. It means the update requires CloudFormation to create a new physical resource and delete the old one. This is what happens when you change the name of an RDS instance or an EC2 instance’s ImageId. Protecting against this is your #1 priority.
  • Update:Delete: The resource has been removed from the template, so the update deletes it outright.
  • Update:*: This is a catch-all for any of the above. Use this for resources so fragile you don’t even want to risk a non-disruptive tweak.

How to Actually Apply This Thing

You can attach a policy when you create a stack using the AWS CLI or console. But the real magic is applying it to an existing, live stack. You do this with the set-stack-policy command. It’s terrifyingly simple.

aws cloudformation set-stack-policy \
  --stack-name MyProductionStack \
  --stack-policy-body file://./prod-stack-policy.json

And just like that, your resources are armored. To prove it works, try making a change in your template that would trigger a replacement of your protected resource and run an update. CloudFormation will fail the update and roll back without touching the protected resource, showing a status reason that should warm the cockles of your paranoid heart: something along the lines of “Action denied by stack policy,” naming the statement and the resource involved.
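
If you’d rather not poke production just to see a failure, a gentler check is to read the policy back and compare it with what you sent. Here’s a minimal sketch of that, assuming you hand it a boto3 CloudFormation client. The helper name apply_stack_policy is hypothetical; set_stack_policy and get_stack_policy are the client calls behind the CLI commands of the same name.

```python
import json


def apply_stack_policy(cfn, stack_name, policy):
    """Set a stack policy, read it back, and report whether it round-tripped.

    `cfn` is assumed to be a boto3 CloudFormation client, e.g.
    boto3.client("cloudformation"). `policy` is a plain dict.
    """
    cfn.set_stack_policy(StackName=stack_name, StackPolicyBody=json.dumps(policy))
    response = cfn.get_stack_policy(StackName=stack_name)
    # StackPolicyBody is a JSON string; it is absent if no policy is set.
    applied = json.loads(response.get("StackPolicyBody", "{}"))
    return applied == policy
```

Passing the client in, instead of creating it inside the function, keeps the helper easy to exercise against a stub before you point it at a real account.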

The Gotchas and the Gripes

Now, the honest part. Stack policies are powerful, but they have some… idiosyncrasies. Let’s call them “character-building features.”

  1. Wildcards Only Match Logical IDs. You can use a wildcard (like LogicalResourceId/ProductionDB*) to protect multiple resources, but it matches on the logical ID and nothing else. If your naming is inconsistent, the wildcard buys you nothing, and you’re back to listing every Logical Resource ID explicitly, which leads to brutally long, repetitive policy documents. To protect by type instead, set Resource to * and add a ResourceType condition. For large stacks with messy names, you’ll want to use a tool like jq or a scripting language to generate these policies.
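  2. You Can Swap It, but Never Remove It. Once a stack has a policy, you can’t delete it; set-stack-policy can only replace it with a different one. Need to do a legitimate update to a protected resource? You have two choices: you can overwrite the policy with a permissive one, perform the update, and then re-apply the strict version (clunky, and easy to forget the last step). Or, better, you can pass a temporary policy with update-stack’s --stack-policy-during-update-body option, which overrides the stack policy for that one update and leaves the stored policy untouched. Creating a change set first (for example, with aws cloudformation deploy --no-execute-changeset) shows you which resources will be touched, so you know exactly what the temporary policy needs to allow. It’s not elegant, but it works.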
  3. It Only Applies to Stack Updates. This is critical to understand. A stack policy will NOT stop someone with direct IAM permissions from going into the RDS console and deleting the database manually. It also won’t stop anyone from deleting the whole stack; that’s a job for termination protection and DeletionPolicy. It only governs updates performed through CloudFormation. This is a fence around the CFN workflow, not a substitute for proper IAM and backup policies. You still need those.
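
Since hand-writing these documents gets old fast, here’s a rough Python sketch of the “generate it with a script” approach, plus a helper for the temporary-policy dance. Everything here is illustrative: the function names are mine, and the override helper assumes the temporary policy stands in for the stored one during the update, so it keeps every Deny except the ones you’re deliberately unlocking.

```python
import json

# Catch-all statement: anything not explicitly denied may be updated.
ALLOW_ALL = {"Effect": "Allow", "Action": "Update:*", "Principal": "*", "Resource": "*"}


def deny_statement(logical_id, action="Update:*"):
    """One Deny statement; logical_id may contain a wildcard such as 'ProductionDB*'."""
    return {
        "Effect": "Deny",
        "Action": action,
        "Principal": "*",
        "Resource": f"LogicalResourceId/{logical_id}",
    }


def build_protection_policy(protected):
    """protected maps a logical ID (or wildcard pattern) to the action to deny."""
    statements = [deny_statement(lid, action) for lid, action in sorted(protected.items())]
    statements.append(dict(ALLOW_ALL))
    return {"Statement": statements}


def build_override_policy(protected, unlock):
    """Temporary policy: same protections, minus the logical IDs in `unlock`."""
    remaining = {lid: action for lid, action in protected.items() if lid not in unlock}
    return build_protection_policy(remaining)


protected = {
    "ProductionDatabase": "Update:Replace",
    "ProductionDBSecurityGroup": "Update:*",
}

# Write the long-lived policy to the file used with set-stack-policy.
with open("prod-stack-policy.json", "w") as handle:
    json.dump(build_protection_policy(protected), handle, indent=2)

# Temporary policy for one approved change to the security group.
override = build_override_policy(protected, unlock={"ProductionDBSecurityGroup"})
print(json.dumps(override, indent=2))
```

The printed override is what you would hand to --stack-policy-during-update-body for that single update; the database stays protected the whole time.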

The best practice? Use them judiciously. Protect the truly irreplaceable: your stateful resources like databases, EFS file systems, and maybe your core network components. For everything else, rely on good change management processes and pre-update change sets. Stack policies are the blunt instrument in your toolbox: incredibly effective, but you need to be careful you don’t smash your own fingers with them.