Right, let’s talk about giving your pods an identity. Because by default, your pods running in EKS have no IAM identity of their own. They’re the digital equivalent of a hermit living off-grid: cut off from the AWS universe unless they borrow someone else’s credentials. You could solve this the old, terrible way: grant the massive, terrifying IAM permissions your app needs to the EC2 instance role of the worker node. Then every pod on that node, from your mission-critical app to that random busybox pod you forgot about, inherits those god-like powers. This is a security nightmare waiting to happen, and we’re not doing that.

This is where IAM Roles for Service Accounts (IRSA) swoops in like a superhero. It lets you assign a specific, finely scoped IAM role directly to a Kubernetes ServiceAccount, which your pod then uses. It’s pod-level IAM permissions. Beautiful, secure, and exactly how this should have worked from day one.

How the Magic (and the Math) Works

It feels like magic, but it’s really just clever cryptography and AWS being… well, a system. Here’s the play-by-play:

  1. You create an IAM OIDC Identity Provider for your cluster. This is essentially you telling AWS: “Hey, trust tokens signed by this specific Kubernetes cluster’s OIDC issuer. It’s cool.”
  2. You create an IAM role with a trust policy that says: “I will only accept requests from this specific Kubernetes ServiceAccount, and only if it presents a valid token signed by that OIDC provider we just set up.”
  3. You annotate your Kubernetes ServiceAccount with the exact ARN of that IAM role.
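  4. EKS automatically injects a special token into any pod using that ServiceAccount (a mutating admission webhook does the work when the pod is created). This isn’t the normal ServiceAccount token; it’s a projected token issued specifically for AWS, with sts.amazonaws.com as its audience. The same webhook sets the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables.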
  5. The AWS SDKs inside your pod (which are brilliantly designed to look for this) read the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables, load the token from that file, and call sts:AssumeRoleWithWebIdentity to assume the role and fetch temporary credentials.
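
If you want to see step 5 without the SDK hiding it from you, here is a rough Python approximation. It is a sketch, not the SDK's actual code: the helper names and the fallback session name are made up for illustration, and the boto3 call is only there to show which STS operation is involved.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class WebIdentityConfig:
    role_arn: str
    token_file: str
    session_name: str


def load_web_identity_config(env: Mapping[str, str] = os.environ) -> WebIdentityConfig:
    """Read the variables injected into pods that use an annotated ServiceAccount."""
    try:
        role_arn = env["AWS_ROLE_ARN"]
        token_file = env["AWS_WEB_IDENTITY_TOKEN_FILE"]
    except KeyError as missing:
        raise RuntimeError(
            f"{missing.args[0]} is not set; is the pod using the annotated ServiceAccount?"
        ) from None
    # AWS_ROLE_SESSION_NAME is optional; the fallback below is an arbitrary choice.
    session_name = env.get("AWS_ROLE_SESSION_NAME", "irsa-sketch")
    return WebIdentityConfig(role_arn, token_file, session_name)


def fetch_temporary_credentials(config: WebIdentityConfig) -> dict:
    """Exchange the projected token for temporary credentials via STS."""
    import boto3  # imported lazily so the config helper works without boto3 installed

    with open(config.token_file, encoding="utf-8") as handle:
        token = handle.read().strip()
    response = boto3.client("sts").assume_role_with_web_identity(
        RoleArn=config.role_arn,
        RoleSessionName=config.session_name,
        WebIdentityToken=token,
    )
    # Contains AccessKeyId, SecretAccessKey, SessionToken, and Expiration.
    return response["Credentials"]
```

In real code you would never write this yourself; creating a client with no explicit credentials gets you the same result. It is mostly useful for understanding what to look at when the automatic path fails.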

The key here is the trust relationship. It uses a StringEquals condition whose keys are prefixed with the cluster’s unique OIDC issuer URL, and which pins the token’s audience (aud) and subject (sub, in the form system:serviceaccount:<namespace>:<sa-name>). AWS validates the token’s signature against the OIDC provider, checks those claims, and if it all lines up, bam, temporary credentials.
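
To make those claims less abstract, the following Python sketch pulls the payload out of a token so you can eyeball the iss, aud, and sub values that AWS will compare against the trust policy. It deliberately skips signature verification, so treat it as a debugging aid only; the function names are hypothetical.

```python
import base64
import json


def decode_claims(jwt: str) -> dict:
    """Return the payload of a JWT WITHOUT verifying its signature (debugging only)."""
    parts = jwt.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload = parts[1]
    # JWTs use unpadded base64url; restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def expected_subject(namespace: str, service_account: str) -> str:
    """Build the subject string the trust policy has to match exactly."""
    return f"system:serviceaccount:{namespace}:{service_account}"
```

Inside a pod, you would feed decode_claims the contents of the file named by AWS_WEB_IDENTITY_TOKEN_FILE and compare the sub claim with expected_subject("data-processing", "s3-read-only-sa").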

Setting It Up: The Prerequisites

Before you can start, you need the OIDC provider enabled for your cluster. If you used eksctl to create your cluster recently, it probably did this for you. If not, or if you used Terraform/CloudFormation, you’ll need to add it. Here’s how to check with the AWS CLI and enable it with eksctl:

# Print the cluster's OIDC issuer URL (every EKS cluster has one); note the ID at the end
aws eks describe-cluster --name your-cluster-name --query "cluster.identity.oidc.issuer" --output text

# Check whether an IAM OIDC provider is registered for that ID
aws iam list-open-id-connect-providers --output text | grep <ID-from-the-issuer-URL>

# If nothing matched, enable it (this is idempotent)
eksctl utils associate-iam-oidc-provider --cluster your-cluster-name --approve

A Practical Example: Letting a Pod Read S3

Let’s say we have a pod in the data-processing namespace that needs read-only access to a specific S3 bucket. Here’s the entire flow.

First, create the IAM role with a trust policy. Notice the StringEquals condition: it’s the linchpin of the entire security model. (The account ID, region, and EXAMPLECLUSTERID below are placeholders for your own values.)

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.region.amazonaws.com/id/EXAMPLECLUSTERID"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.region.amazonaws.com/id/EXAMPLECLUSTERID:aud": "sts.amazonaws.com",
          "oidc.eks.region.amazonaws.com/id/EXAMPLECLUSTERID:sub": "system:serviceaccount:data-processing:s3-read-only-sa"
        }
      }
    }
  ]
}
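
Since typos in this document are so easy to make, it can help to generate it instead of hand-editing it. Here is a minimal Python sketch that builds the same trust policy from the issuer URL, account ID, namespace, and ServiceAccount name. The function name is hypothetical, and it assumes the standard "aws" partition.

```python
import json


def build_trust_policy(account_id: str, issuer_url: str,
                       namespace: str, service_account: str) -> dict:
    """Build an IRSA trust policy for exactly one ServiceAccount."""
    # IAM identifies the provider by the issuer URL without the https:// scheme.
    provider = issuer_url[len("https://"):] if issuer_url.startswith("https://") else issuer_url
    provider = provider.rstrip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    # Assumes the "aws" partition; other partitions use a different ARN prefix.
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{provider}:aud": "sts.amazonaws.com",
                        f"{provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                    }
                },
            }
        ],
    }


def render(policy: dict) -> str:
    """Serialize the policy as indented JSON, ready to paste or save to a file."""
    return json.dumps(policy, indent=2)
```

Passing the issuer URL straight from the describe-cluster output means the provider string appears in the principal and in both condition keys from a single source, which removes one whole class of copy-paste errors.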

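Then, attach a permission policy to the role. Let’s be specific. Note that s3:ListBucket applies to the bucket ARN itself and s3:GetObject to the objects under it, which is why both resources are listed.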

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-super-important-bucket",
        "arn:aws:s3:::my-super-important-bucket/*"
      ]
    }
  ]
}

Now, the Kubernetes side. Create the ServiceAccount with the critical annotation.

# serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: s3-read-only-sa
  namespace: data-processing
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/my-s3-read-only-role

Finally, deploy a pod that uses it. The AWS SDKs (for JavaScript, Python, Go, etc.) automatically know how to use the injected credentials. You don’t have to configure a thing.

# pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: data-processing
spec:
  serviceAccountName: s3-read-only-sa # This is the key line
  containers:
  - name: main
    image: my-app-image:latest
    # No need to pass AWS credentials manually!
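
For a sense of what "no configuration" looks like from the application's side, here is a small Python sketch of code that might run in that pod. The function name is hypothetical; the client is passed in as a parameter so the listing logic can be exercised without AWS access.

```python
from typing import Iterator


def iter_object_keys(s3_client, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every object key under a prefix, following continuation tokens."""
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        # list_objects_v2 is authorized by the s3:ListBucket permission.
        page = s3_client.list_objects_v2(**kwargs)
        for item in page.get("Contents", []):
            yield item["Key"]
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

In the pod you would call it as iter_object_keys(boto3.client("s3"), "my-super-important-bucket"). Notice that no access keys appear anywhere: the client picks up the role through the default credential chain.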

Common Pitfalls and The “Gotcha” Brigade

This is where I earn my keep. Here’s what will go wrong:

  1. The Trust Policy Typo: This is the number one issue. You will mess up the OIDC provider URL, the cluster ID, the namespace, or the ServiceAccount name in the trust policy. Double-check it. Then triple-check it. The error from AWS STS will be utterly unhelpful (an AccessDenied telling you that you are not authorized to perform sts:AssumeRoleWithWebIdentity, with no hint about which condition failed), so your debugging starts here.
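  2. The Cross-Cluster Gotcha: IAM is global, and so is the IAM OIDC provider resource, but each provider is tied to exactly one cluster’s issuer URL, which embeds the region and the cluster ID. If the role’s trust policy names the provider for your us-east-1 cluster and a pod in a second cluster in us-west-2 tries to assume the role, it will fail. Every cluster needs its own OIDC provider and its own entry in the trust policy.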
  3. The SDK Version: Older versions of the AWS SDKs don’t support the web identity token process. If your app is failing, make sure your language’s SDK is reasonably modern. IRSA launched in late 2019, so for any SDK release newer than that, you’re almost certainly fine.
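  4. The Annotation Isn’t Magic: The annotation only works on the ServiceAccount. You can’t put it on a Pod or a Deployment and expect it to work. The pod must reference a ServiceAccount that has the annotation, and because the token and environment variables are injected at pod creation, pods that were already running when you added the annotation must be recreated to pick it up.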
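
Because pitfall 1 fails with so little information, a small checker can save time. The Python sketch below compares a trust policy with the claims from a pod's token (however you obtained them) and lists what doesn't line up. It is a simplified illustration with a hypothetical function name: it only understands a single-string Action and StringEquals conditions with string values, and it is not how IAM itself evaluates policies.

```python
def find_trust_mismatches(trust_policy: dict, claims: dict) -> list:
    """Return human-readable mismatches between token claims and a trust policy."""
    issuer = claims.get("iss", "")
    # IAM refers to the provider by the issuer URL without the https:// scheme.
    provider = issuer[len("https://"):] if issuer.startswith("https://") else issuer
    problems = []
    for statement in trust_policy.get("Statement", []):
        if statement.get("Action") != "sts:AssumeRoleWithWebIdentity":
            continue
        federated = statement.get("Principal", {}).get("Federated", "")
        if not federated.endswith("oidc-provider/" + provider):
            problems.append(f"principal {federated!r} does not match issuer {issuer!r}")
        conditions = statement.get("Condition", {}).get("StringEquals", {})
        for key, expected in conditions.items():
            prefix, _, claim_name = key.rpartition(":")
            if prefix != provider:
                problems.append(f"condition key {key!r} is not prefixed with {provider!r}")
                continue
            actual = claims.get(claim_name)
            # The aud claim may be a list; treat a scalar as a one-element list.
            actual_values = actual if isinstance(actual, list) else [actual]
            if expected not in actual_values:
                problems.append(f"{claim_name}: policy expects {expected!r}, token has {actual!r}")
    return problems
```

An empty list means the pieces this sketch knows about agree. Anything else points you at the exact string to fix, which is more than the STS error will ever do.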

The beauty of IRSA is that it finally gives us the granularity we need. Your redis pod doesn’t need S3 permissions. Your api pod doesn’t need DynamoDB write access. Now you can build a sane, least-privilege security model without resorting to the madness of giving every node the keys to the kingdom. It’s not just a best practice; it’s the only sane way to run serious workloads on EKS.