37.3 IAM Roles for Service Accounts (IRSA)
Alright, let’s talk about IAM Roles for Service Accounts, or IRSA. This is, without a doubt, one of the best things to happen to Kubernetes on AWS. Before IRSA, giving a pod permissions to, say, access an S3 bucket was a bit of a nightmare. You’d have to give the EC2 instance running your worker nodes a massive IAM role with all the permissions any pod on that node could ever need. It was the equivalent of handing out the master key to the entire building to every single tenant. Horrifying from a security perspective, and a compliance auditor’s worst nightmare.
IRSA fixes this by letting you assign an IAM role directly to a Kubernetes service account. Your pod uses this service account, and presto – it gets fine-grained, specific AWS permissions, and nothing else. It’s the principle of least privilege actually implemented, not just a nice idea we talk about in meetings.
How the Magic (Actually, Just Clever Engineering) Works
It feels like magic, but it’s really just a clever combination of a few AWS services. Here’s the play-by-play:
- You create an IAM OIDC identity provider for your cluster. This is a one-time setup that tells AWS: “Hey, trust this Kubernetes cluster (specifically, the tokens its API server signs) when it vouches for someone.” You point it at your cluster’s OIDC issuer URL, which has the form https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLEClusterID. EKS makes this easy; it’s just a few CLI commands or clicks in the console.
- You create an IAM role with a trust policy that lets identities vouched for by this new OIDC provider assume it. The critical part here is the Condition that checks the service account name and namespace. This is the bouncer that checks the ID.
- You annotate your Kubernetes service account with the exact ARN of the IAM role you just created. This is the link between the K8s object and the AWS object.
- EKS does the heavy lifting. When a pod runs with that annotated service account, a mutating admission webhook in the EKS control plane mounts a signed, projected service account token into the pod as a volume and sets the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables. The AWS SDK inside the pod then presents that token to STS (AssumeRoleWithWebIdentity) and gets back temporary credentials for the IAM role you specified. The node’s own instance role never enters the picture.
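To make the token exchange less abstract, here is a minimal sketch of the request an AWS SDK puts together inside the pod. It assumes the two environment variables the SDKs look for, AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE, are set; the function name and the session name are made up for illustration, and the sketch only builds the parameters rather than calling STS.

```python
import os


def build_sts_request(env, session_name="irsa-sketch"):
    """Collect the parameters for an STS AssumeRoleWithWebIdentity call.

    `env` is a mapping such as os.environ. `session_name` is a hypothetical
    default; real SDKs pick their own session name.
    """
    role_arn = env["AWS_ROLE_ARN"]
    token_path = env["AWS_WEB_IDENTITY_TOKEN_FILE"]

    # The token file is the projected service account token mounted in the pod.
    with open(token_path, encoding="utf-8") as token_file:
        token = token_file.read().strip()

    return {
        "Action": "AssumeRoleWithWebIdentity",
        "Version": "2011-06-15",
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "WebIdentityToken": token,
    }


if __name__ == "__main__":
    # Only meaningful inside a pod that uses an annotated service account.
    if "AWS_WEB_IDENTITY_TOKEN_FILE" in os.environ:
        request = build_sts_request(os.environ)
        print({k: v for k, v in request.items() if k != "WebIdentityToken"})
```

An SDK would send these parameters to the STS endpoint. The call itself is not signed with AWS credentials, because the token is the proof of identity.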
Let’s make this concrete. Say you have a pod that needs read-only access to a specific S3 bucket.
First, create the IAM role with a trust policy. Notice the StringLike condition: it ensures that only pods using the default service account in the my-app namespace can assume this role. (With no wildcard in the value, StringEquals would behave the same way.)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLEClusterID"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLEClusterID:sub": "system:serviceaccount:my-app:default"
        }
      }
    }
  ]
}
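If you manage more than a couple of roles, you will probably generate this document rather than hand-edit it. The following is a rough sketch of a generator, not an official tool: the function name is hypothetical, it uses StringEquals because no wildcard is involved, and it adds an aud check against sts.amazonaws.com that the policy above leaves out.

```python
import json


def irsa_trust_policy(account_id, oidc_issuer, namespace, service_account):
    """Build an IRSA trust policy as a dict (hypothetical helper)."""
    # IAM refers to the provider without the URL scheme.
    issuer = oidc_issuer
    if issuer.startswith("https://"):
        issuer = issuer[len("https://"):]

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{issuer}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Exact service account identity, no wildcards.
                        f"{issuer}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        # Assumption: also pin the token audience to STS.
                        f"{issuer}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = irsa_trust_policy(
        "123456789012",
        "https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLEClusterID",
        "my-app",
        "default",
    )
    print(json.dumps(policy, indent=2))
```

Generating the policy from the same namespace and service account names you deploy with keeps the sub string from drifting out of sync with the manifest.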
Then, attach a permission policy to the role (e.g., AmazonS3ReadOnlyAccess or a custom one for a specific bucket).
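For the custom, single-bucket option, a permission policy might look something like the output of this sketch. The bucket name my-app-data and the function name are placeholders, and your application may need more (or fewer) actions than the two shown here.

```python
import json


def s3_read_only_policy(bucket):
    """Build a read-only policy for one bucket (hypothetical helper)."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
            },
            {
                # Reading applies to the objects inside the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "my-app-data" is a placeholder bucket name.
    print(json.dumps(s3_read_only_policy("my-app-data"), indent=2))
```

The split into two statements is deliberate: s3:ListBucket is granted on the bucket ARN, while s3:GetObject is granted on the object ARNs under it.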
Now, create the service account in Kubernetes, annotating it with the role ARN.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: my-app
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/MyAppS3ReadOnlyRole
Finally, deploy your pod, ensuring it uses this service account.
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: my-app
spec:
  serviceAccountName: default # This matches the ServiceAccount we created
  containers:
    - name: my-app
      image: my-app:latest
      # EKS mounts a web identity token under /var/run/secrets/eks.amazonaws.com/serviceaccount/
      # The AWS SDKs and CLI automatically discover it and exchange it for credentials!
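The beauty is that the AWS SDKs and CLI automatically look for the injected token and exchange it for temporary credentials. As long as your SDK version is recent enough to support web identity tokens, your code doesn’t need to change a single line. It just works.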
The Gotchas and Rough Edges
This is brilliant, but it’s not perfect. Here’s what they don’t always highlight in the marketing copy:
- The OIDC Provider is Per-Cluster. Every EKS cluster gets its own issuer URL, with the region and a unique ID baked in. If you delete your cluster and recreate it, even with the same name, the issuer URL changes and every trust policy that references the old one breaks. You have to manage the provider’s lifecycle alongside your cluster.
- Trust Policy Conditions Need Care. Wildcards only work with StringLike; under StringEquals, a * is just a literal character. Want a role assumable by any service account in a namespace? You can use "...:sub": "system:serviceaccount:my-namespace:*". A pattern like system:serviceaccount:*:some-specific-sa also matches, but it trusts that name in every namespace, including one somebody creates tomorrow, so avoid it. Plan your naming conventions accordingly.
- There’s a (Theoretical) STS Throttling Limit. Every pod calls STS the first time it needs credentials. If you have a thousand pods starting simultaneously (e.g., during a massive scale-out event), you could hit STS API rate limits. It’s rare, but for hyper-scale scenarios, look into the IRSA credential cache (a sidecar project) or be mindful of your pod startup burstiness.
- Debugging is a Pain. If it doesn’t work, you’re now debugging a chain of four systems: IAM, EKS, the service account, and your pod. The error messages aren’t always helpful. aws sts assume-role-with-web-identity is your friend for testing the trust policy manually.
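When a trust policy refuses to match, it often helps to look at what the token actually claims. Below is a small debugging sketch, assuming you have copied the token out of the pod (it lives in the directory mounted under /var/run/secrets/eks.amazonaws.com/serviceaccount/). It decodes the claims without verifying the signature, and it uses fnmatch as a stand-in for IAM’s StringLike. That is only an approximation: fnmatch also understands [...] character sets, which IAM does not. Both function names are hypothetical.

```python
import base64
import fnmatch
import json


def decode_jwt_claims(token):
    """Return the claims of a JWT. Does NOT verify the signature."""
    payload_b64 = token.split(".")[1]
    # JWTs strip base64 padding; put it back before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def sub_matches(claims, pattern):
    """Approximate an IAM StringLike check on the token's sub claim."""
    return fnmatch.fnmatchcase(claims.get("sub", ""), pattern)


if __name__ == "__main__":
    import sys

    # Usage: python check_token.py <token-file> <sub-pattern>
    with open(sys.argv[1], encoding="utf-8") as token_file:
        claims = decode_jwt_claims(token_file.read().strip())
    print("iss:", claims.get("iss"))
    print("aud:", claims.get("aud"))
    print("sub:", claims.get("sub"))
    print("pattern matches:", sub_matches(claims, sys.argv[2]))
```

If the iss claim does not match the provider in the role’s Principal, or the sub claim does not match the condition value, you have found the broken link without touching STS at all.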
Despite these quirks, IRSA is an absolute game-changer. Use it. Never go back to giving your worker nodes broad permissions. Your security team will thank you, and you’ll sleep better at night.