3.2 Trust Policies: Defining Who Can Assume a Role
Alright, let’s talk about the one thing standing between you and a full-blown security incident: the trust policy. This is the “who” and “how” of your IAM role. Think of the role itself as a set of super-powered permissions—a fancy costume, like Batman’s suit. The trust policy is the bouncer at the door of the Batcave who decides who gets to put that suit on. It defines which principal (a user, another role, or an AWS service) is allowed to assume this role. Without a properly configured trust policy, that powerful role is just a useless, locked-up set of permissions. No bouncer, no party.
The Anatomy of a Basic Trust Policy
At its heart, a trust policy is a JSON document that grants the sts:AssumeRole action. Let’s break down a simple, yet classic, example: letting an EC2 instance assume a role.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
See what’s happening here? The Principal is the EC2 service itself. We’re not trusting a specific instance; we’re trusting the EC2 service to assume the role on behalf of any instance launched with it. The Action is almost always sts:AssumeRole (federated principals use the sts:AssumeRoleWithSAML or sts:AssumeRoleWithWebIdentity variants); this is the magic incantation that makes the whole thing work. The Effect is “Allow” because, well, we’re trying to let someone in. You’ll almost never use “Deny” in a trust policy; that’s better handled elsewhere, like with SCPs.
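If you generate roles from code, the same document can be assembled as a plain dictionary and serialized. Here is a minimal sketch using only the Python standard library; the helper name service_trust_policy is made up for illustration and is not part of any AWS SDK.

```python
import json


def service_trust_policy(service: str) -> dict:
    """Build a trust policy that lets one AWS service principal assume the role.

    Hypothetical helper for illustration; it only builds the document.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service},
                "Action": "sts:AssumeRole",
            }
        ],
    }


policy = service_trust_policy("ec2.amazonaws.com")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The resulting string is what you would hand to IAM as the role’s trust policy when creating the role, for example through the AssumeRolePolicyDocument parameter of CreateRole.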
The Principal Field: It’s All About Trust
The Principal block is the most important part. Get this wrong, and you’re either locking yourself out or letting the entire internet in. You can specify:
- AWS ARNs: "Principal": { "AWS": "arn:aws:iam::123456789012:user/alice" } to trust a specific user. You can also use an account root (arn:aws:iam::123456789012:root) to delegate trust to that entire account: any identity there whose own IAM policies allow sts:AssumeRole on this role can assume it. Use this with extreme caution; it’s a ridiculously broad grant of trust.
- Services: "Service": "ec2.amazonaws.com" as above. Other common ones are lambda.amazonaws.com, states.amazonaws.com (for Step Functions), and s3.amazonaws.com (for features such as replication).
- Federated Users: This is for when you’re using identity federation (like SAML 2.0 or web identity federation with Login with Amazon, Facebook, etc.). It looks like "Federated": "cognito-identity.amazonaws.com" for Amazon Cognito, or the ARN of a SAML or OIDC identity provider registered in your account.
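A crucial, often-missed point: the principal is whoever you are trusting to make the AssumeRole call. For an IAM user or role, the caller itself gets the temporary credentials back. For a service principal, the service makes the call and hands the credentials to whatever is working on your behalf (an EC2 instance, a Lambda function), so the entity named in the trust policy isn’t necessarily the thing that ends up running with the permissions. It’s a subtle but critical distinction.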
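To make the three principal types concrete, here is a small sketch that builds one Allow statement per kind. The function name trust_statement and the "aws" / "service" / "federated" labels are invented for this example. The one piece of real IAM behavior it encodes is that federated principals pair with sts:AssumeRoleWithSAML or sts:AssumeRoleWithWebIdentity rather than plain sts:AssumeRole; the check for ":saml-provider/" in the ARN is a simplifying assumption.

```python
def trust_statement(kind: str, value: str) -> dict:
    """Return one Allow statement for the given principal kind.

    kind is a label made up for this sketch: "aws", "service", or "federated".
    """
    if kind == "aws":
        # IAM user ARN, IAM role ARN, or account root ARN
        principal, action = {"AWS": value}, "sts:AssumeRole"
    elif kind == "service":
        principal, action = {"Service": value}, "sts:AssumeRole"
    elif kind == "federated":
        principal = {"Federated": value}
        # SAML providers use a different STS action than web identity providers
        if ":saml-provider/" in value:
            action = "sts:AssumeRoleWithSAML"
        else:
            action = "sts:AssumeRoleWithWebIdentity"
    else:
        raise ValueError(f"unknown principal kind: {kind!r}")
    return {"Effect": "Allow", "Principal": principal, "Action": action}


user_stmt = trust_statement("aws", "arn:aws:iam::123456789012:user/alice")
service_stmt = trust_statement("service", "lambda.amazonaws.com")
cognito_stmt = trust_statement("federated", "cognito-identity.amazonaws.com")
```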
Leveling Up with Conditions
The basic policy is fine, but the real world is messy. What if you only want to trust instances from a specific AWS account? Or only under certain circumstances? This is where conditions save your bacon. Conditions add fine-grained control to your trust relationship.
Let’s say you want to allow workloads in your own account to assume a role, but only if the IAM identity they call from (for example, the role behind an EC2 instance profile) carries a specific tag, Environment=Prod. This is a fantastic practice because it adds a layer of attribute-based security.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalTag/Environment": "Prod"
        }
      }
    }
  ]
}
Notice I used the account root as the principal but immediately narrowed the scope with a condition. The condition aws:PrincipalTag/Environment checks the tag on the IAM identity (the instance profile role) that is making the AssumeRole request. This is infinitely better than just trusting the whole account. Other incredibly useful conditions include:
- aws:SourceArn and aws:SourceAccount: To ensure the request originates from a specific resource or account. Vital for locking down roles used by services like S3 and EventBridge.
- aws:RequestedRegion: To only allow the assume role action if it’s being sent to the STS endpoint in a specific region. Note that this is the region the request targets, not where the caller happens to be running.
- aws:SourceIp: For roles assumed by humans, you can restrict the assume action to only come from your corporate IP range.
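To get a feel for how a condition narrows a broad principal, here is a deliberately tiny model of the StringEquals check from the tag example. It is a toy, not IAM’s real evaluation logic: it handles only StringEquals, and the function name condition_matches and the request-context dictionaries are assumptions made up for this sketch.

```python
TRUST_CONDITION = {"StringEquals": {"aws:PrincipalTag/Environment": "Prod"}}


def condition_matches(condition: dict, context: dict) -> bool:
    """Toy model: every StringEquals key must be present in the request
    context with one of the allowed values (case-sensitive comparison)."""
    for key, allowed in condition.get("StringEquals", {}).items():
        allowed_values = [allowed] if isinstance(allowed, str) else list(allowed)
        if context.get(key) not in allowed_values:
            return False
    return True


# Hypothetical request contexts for three callers in the trusted account
prod_caller = {"aws:PrincipalTag/Environment": "Prod"}
dev_caller = {"aws:PrincipalTag/Environment": "Dev"}
untagged_caller = {}

print(condition_matches(TRUST_CONDITION, prod_caller))      # True
print(condition_matches(TRUST_CONDITION, dev_caller))       # False
print(condition_matches(TRUST_CONDITION, untagged_caller))  # False
```

The untagged caller is the case worth remembering: with StringEquals, a missing key counts as no match, so identities without the tag are shut out as well.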
The Confused Deputy Problem (And How to Solve It)
Here’s a genuinely absurd but critical scenario. Let’s say your role has a trust policy that allows s3.amazonaws.com to assume it, so S3 can act on your buckets on your behalf (replication, for example). The trust policy names the service, not your account, so on its own it can’t tell whether S3 is assuming the role for one of your resources or for something another AWS customer set up in their own account. That’s the “Confused Deputy” problem: the service (the deputy) gets tricked into using the power you delegated to it on behalf of a malicious actor.
The fix is elegant and non-negotiable: use the aws:SourceAccount condition.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "s3.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "123456789012"
        }
      }
    }
  ]
}
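This condition says “Sure, S3 service, you can assume this role, but only if you’re acting on behalf of a resource in my account (123456789012).” It slams the door shut on the confused deputy attack. For an even tighter lock, add aws:SourceArn to pin the exact resource. You should be doing this for any service role where cross-account shenanigans are possible, as long as the service in question actually populates these keys (not every service does). Service-linked roles are a different animal: AWS manages their trust policies for you.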
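If you maintain many service roles, bolting this guard on by hand gets old quickly. Here is a rough sketch of a helper that adds aws:SourceAccount, and optionally aws:SourceArn, to every statement that trusts a service principal. The function name add_source_guards and the bucket ARN are hypothetical, and the sketch assumes the service you are trusting supports these condition keys.

```python
import copy
from typing import Optional


def add_source_guards(policy: dict, account_id: str,
                      source_arn: Optional[str] = None) -> dict:
    """Return a copy of the policy with confused-deputy conditions added
    to every statement whose Principal names a service."""
    guarded = copy.deepcopy(policy)  # leave the caller's policy untouched
    for stmt in guarded["Statement"]:
        principal = stmt.get("Principal")
        if not (isinstance(principal, dict) and "Service" in principal):
            continue
        condition = stmt.setdefault("Condition", {})
        condition.setdefault("StringEquals", {})["aws:SourceAccount"] = account_id
        if source_arn is not None:
            condition.setdefault("ArnLike", {})["aws:SourceArn"] = source_arn
    return guarded


base_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "s3.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

guarded_policy = add_source_guards(
    base_policy,
    account_id="123456789012",
    source_arn="arn:aws:s3:::example-bucket",  # hypothetical bucket
)
```

Returning a copy rather than editing in place keeps the unguarded original around, which makes it easier to diff the two before you deploy anything.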
Common Pitfalls and Best Practices
- The Principal / Condition Mismatch: The most common facepalm moment. You specify a very narrow principal (like a specific user ARN) and then add a broad condition (like aws:SourceIp). Remember, the principal must first be allowed to assume the role; the conditions are additional filters. If the principal isn’t in the list, the conditions can’t rescue the request, and if the principal matches but any condition fails, the statement still doesn’t apply.
- Assuming vs. Using a Role: I’ll say it again: the trust policy governs who can assume the role. The permission policy attached to the role governs what it can do after assumption. These are two completely separate policies with separate jobs. Don’t mix them up.
- Testing in the Real World: The IAM policy simulator is a great tool, but it doesn’t simulate the AssumeRole API call with all its context (like tags on the principal). The only way to truly test a complex trust policy is to actually try to assume the role from the principal you’re expecting. Get comfortable with the aws sts assume-role command in the CLI.
- Keep it Simple, Seriously: I’ve seen trust policies that are Rube Goldberg machines of nested conditions. If you can’t explain it to a colleague in 30 seconds, it’s probably too complex, and future-you will hate past-you when it breaks at 3 AM. Use the narrowest principal possible and add only the conditions you absolutely need. Your security posture (and your sanity) will thank you.
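A quick sanity check before deploying can catch the broadest mistakes from this section. The sketch below is a heuristic and no substitute for actually assuming the role: it flags only wildcard principals and account-wide trust with no conditions, and the function name lint_trust_policy and the wording of its findings are invented for this example.

```python
from typing import List


def lint_trust_policy(policy: dict) -> List[str]:
    """Heuristic check: flag Allow statements with very broad principals."""
    findings = []
    for i, stmt in enumerate(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        has_condition = bool(stmt.get("Condition"))
        aws_values = []
        if isinstance(principal, dict):
            raw = principal.get("AWS", [])
            aws_values = [raw] if isinstance(raw, str) else list(raw)
        # An account can be named by its root ARN or by a bare 12-digit ID
        trusts_account = any(
            v.endswith(":root") or (v.isdigit() and len(v) == 12)
            for v in aws_values
        )
        if principal == "*" or "*" in aws_values:
            findings.append(f"statement {i}: wildcard principal")
        elif trusts_account and not has_condition:
            findings.append(f"statement {i}: whole account trusted with no conditions")
    return findings


broad_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
            "Action": "sts:AssumeRole",
        }
    ],
}

for finding in lint_trust_policy(broad_policy):
    print(finding)
```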