37.8 EKS Cost Optimization: Spot Instances and Karpenter

Right, let’s talk about saving money. Because let’s be honest, the only thing more terrifying than your Kubernetes cluster melting down is the bill for the cluster that’s sitting there doing nothing. AWS is happy to sell you on-demand instances that you pay for 24/7, but we’re smarter than that. We’re going to harness two of AWS’s most powerful cost-saving tools: Spot Instances and Karpenter. One is a deeply discounted fire sale on compute capacity, and the other is the brilliant, ruthless robot that knows how to shop it.

The Beautiful Chaos of Spot Instances

First, a quick reality check. Spot Instances are AWS’s way of selling off their unused capacity, at discounts of up to 90% off the on-demand price. The catch? They can be yanked away from you with a two-minute warning (a “Spot Instance interruption notice”) whenever AWS needs the capacity back. People hear this and panic. “My nodes will just vanish! My applications will crash!”
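On the instance itself, that warning shows up in the instance metadata service under spot/instance-action as a tiny JSON document with an action and a UTC timestamp. Here is a minimal Python sketch of working out how long you have left. The function name and the sample timestamps are made up for illustration, and it takes the JSON as a string rather than fetching it, so it runs anywhere.

```python
import json
from datetime import datetime, timezone


def seconds_until_interruption(notice_json: str, now: datetime) -> float:
    """Return seconds left before the Spot interruption takes effect.

    `notice_json` is the body served at spot/instance-action, for example
    {"action": "terminate", "time": "2017-09-18T08:22:00Z"}.
    """
    notice = json.loads(notice_json)
    # The timestamp is UTC in ISO 8601 form with a trailing "Z".
    deadline = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    deadline = deadline.replace(tzinfo=timezone.utc)
    return (deadline - now).total_seconds()


if __name__ == "__main__":
    # Hypothetical notice and clock, two minutes apart.
    sample = '{"action": "terminate", "time": "2024-01-01T12:02:00Z"}'
    now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    print(seconds_until_interruption(sample, now))  # 120.0
```

In a Kubernetes cluster you rarely read this yourself; the node lifecycle tooling does it for you, and your pods simply see a normal termination signal.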

Welcome to the cloud, my friend. Your nodes can vanish for a dozen other reasons already. The entire point of Kubernetes is that your workloads are resilient, not your nodes. If the idea of a node disappearing gives you heart palpitations, you’ve got a much bigger problem with your application design than your billing strategy. We design for this. We expect failure. Spot is just another type of failure, and crucially, it’s one that gives us a heads-up.

The trick isn’t to avoid Spot; it’s to use it intelligently. You don’t run your stateful database on a Spot Instance (unless you’re a true chaos engineer). You run your stateless web apps, your batch jobs, your image processing workloads—the things that can be easily restarted elsewhere—on Spot.

Enter Karpenter: The Spot-Savvy Psychopath

The old way to handle this was with the Cluster Autoscaler. It worked, but let’s be generous and call it… deliberate. It scales node groups, and it assumes every node in a group has the same CPU and memory. To get a good mix of Spot Instances, you had to create a dizzying array of node groups for different instance sizes and availability zones. It was a manual, tedious mess.

Karpenter is the antithesis of that. It doesn’t think in node groups. It thinks in direct API calls to EC2. You tell it your requirements (e.g., “I need a machine with at least 4 CPUs and 16GB of RAM”), and it immediately goes out, looks at all the instance types and availability zones in a region, and picks the cheapest, most available option that fits. It’s a hyper-efficient, sociopathic shopper whose only goal is to get you the compute you need at the lowest possible price, and it does it in seconds, not minutes.
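To make the "cheapest thing that fits" idea concrete, here is a toy Python model of it. This is not Karpenter's algorithm: the real controller also weighs Spot availability and leaves the final pick to EC2. The Offering class, the catalog, and especially the prices are placeholders invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offering:
    instance_type: str
    zone: str
    vcpus: int
    memory_gib: int
    hourly_usd: float  # placeholder numbers, not real AWS prices


def cheapest_fit(offerings, min_vcpus, min_memory_gib):
    """Return the lowest-priced offering meeting both minimums, or None."""
    candidates = [
        o for o in offerings
        if o.vcpus >= min_vcpus and o.memory_gib >= min_memory_gib
    ]
    return min(candidates, key=lambda o: o.hourly_usd, default=None)


# Hypothetical catalog for illustration only.
CATALOG = [
    Offering("m5.xlarge", "us-east-1a", 4, 16, 0.07),
    Offering("m5.xlarge", "us-east-1b", 4, 16, 0.06),
    Offering("c5.xlarge", "us-east-1a", 4, 8, 0.05),  # cheapest, but too little memory
    Offering("r5.xlarge", "us-east-1c", 4, 32, 0.08),
]

if __name__ == "__main__":
    best = cheapest_fit(CATALOG, min_vcpus=4, min_memory_gib=16)
    print(best.instance_type, best.zone)  # m5.xlarge us-east-1b
```

Notice that the same instance type appears twice at different prices: the zone matters as much as the type, which is exactly the search space node groups made painful.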

Its true genius is in harnessing Spot. Because it can hand EC2 a long list of acceptable instance types and zones in a single request, using the price-capacity-optimized allocation strategy for Spot, it naturally gravitates towards capacity that is both cheap and less likely to be interrupted. When a Spot Instance gets the interruption notice, Karpenter (provided you have configured its interruption queue) starts provisioning a replacement node right away and cordons and drains the doomed node gracefully. Two minutes is not always enough for slow-starting pods, so expect the disruption to be minimal rather than guaranteed zero.

Installing and Configuring Karpenter

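Let’s get it running. First, you need to give it some serious IAM permissions to launch EC2 instances on your behalf. This is the scary part, so scope it as tightly as you can. The policy below is a simplified starting point rather than the full controller policy: it grants its EC2 actions on "Resource": "*", and production setups usually add conditions that restrict those actions to resources tagged for your cluster. Save this as karpenter-policy.json: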

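{
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:CreateLaunchTemplate",
                "ec2:DeleteLaunchTemplate",
                "ec2:CreateFleet",
                "ec2:RunInstances",
                "ec2:TerminateInstances",
                "ec2:CreateTags",
                "ec2:DescribeAvailabilityZones",
                "ec2:DescribeImages",
                "ec2:DescribeInstanceTypeOfferings",
                "ec2:DescribeInstanceTypes",
                "ec2:DescribeInstances",
                "ec2:DescribeLaunchTemplates",
                "ec2:DescribeSpotPriceHistory",
                "ec2:DescribeSubnets",
                "ec2:DescribeSecurityGroups",
                "ssm:GetParameter",
                "pricing:GetProducts"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::123456789012:role/KarpenterNodeRole-your-cluster-name"
        }
    ],
    "Version": "2012-10-17"
}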

Create the policy and attach it to a new IAM role that Karpenter’s ServiceAccount assumes through IAM Roles for Service Accounts (IRSA). (Check the official docs for the latest full commands and the complete policy, as they change.) Now, install Karpenter via Helm:

# Assumes an SQS interruption queue named after the cluster already exists; drop that flag if it doesn't.
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version 0.36.1 --namespace karpenter --create-namespace \
  --set settings.clusterName=your-cluster-name \
  --set settings.interruptionQueue=your-cluster-name \
  --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::123456789012:role/KarpenterControllerRole-your-cluster-name

The Magic: Provisioner and NodePool CRDs

Karpenter’s behavior is governed by custom resources. In the newer versions, these are a NodePool (the successor to the Provisioner CRD) and an EC2NodeClass (the successor to AWSNodeTemplate). I’ll show you the newer API, but be aware the old one is still common. This YAML is where the real power is.

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: spot-workloads
spec:
  template:
    spec:
      requirements:
        - key: "karpenter.k8s.aws/instance-category"
          operator: In
          values: ["c", "m", "r"] # Stick to compute-optimized, general, or memory-optimized
        - key: "karpenter.k8s.aws/instance-generation"
          operator: Gt
          values: ["2"] # Nothing older than 3rd gen, please.
        - key: "karpenter.sh/capacity-type"
          operator: In
          values: ["spot"] # <-- This is the money line. Spot only; add "on-demand" to allow fallback.
        - key: "topology.kubernetes.io/zone"
          operator: In
          values: ["us-east-1a", "us-east-1b", "us-east-1c"]
        - key: "kubernetes.io/arch"
          operator: In
          values: ["amd64"]
      nodeClassRef:
        name: default
  limits:
    cpu: 1000 # Don't let this thing spend more than 1000 cores worth of money
  disruption:
    consolidationPolicy: WhenUnderutilized # Tell it to actively try to squeeze workloads onto fewer nodes
    expireAfter: 720h # 30d; replace any node older than this, empty or not

---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2 # Amazon Linux 2
  role: KarpenterNodeRole-your-cluster-name # The role for the nodes themselves
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: your-cluster-name
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: your-cluster-name

Apply this, and you’ve just hired your robot shopper. Now, to use it, you just need to tell your pods where they can run.
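One aside before that: if the operators in the requirements block look opaque, here is a rough Python rendering of how In and Gt narrow the field. It is a sketch of the semantics, not Karpenter's code; the matches and allowed helpers are hypothetical names, and only the two operators used in the NodePool are handled.

```python
def matches(requirement: dict, labels: dict) -> bool:
    """Check one requirement (key/operator/values) against a set of labels."""
    value = labels.get(requirement["key"])
    if value is None:
        return False
    op, values = requirement["operator"], requirement["values"]
    if op == "In":
        return value in values
    if op == "Gt":
        # Gt compares integers and takes exactly one value.
        return int(value) > int(values[0])
    raise ValueError(f"operator not covered by this sketch: {op}")


REQUIREMENTS = [
    {"key": "karpenter.k8s.aws/instance-category", "operator": "In", "values": ["c", "m", "r"]},
    {"key": "karpenter.k8s.aws/instance-generation", "operator": "Gt", "values": ["2"]},
    {"key": "kubernetes.io/arch", "operator": "In", "values": ["amd64"]},
]


def allowed(labels: dict) -> bool:
    """An instance type qualifies only if every requirement matches."""
    return all(matches(r, labels) for r in REQUIREMENTS)


if __name__ == "__main__":
    m5 = {"karpenter.k8s.aws/instance-category": "m",
          "karpenter.k8s.aws/instance-generation": "5",
          "kubernetes.io/arch": "amd64"}
    t3 = dict(m5, **{"karpenter.k8s.aws/instance-category": "t",
                     "karpenter.k8s.aws/instance-generation": "3"})
    print(allowed(m5), allowed(t3))  # True False
```

The requirements are ANDed together, so every extra line shrinks the pool of candidate instance types. Keep the list as loose as your workload can tolerate.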

Directing Pods with Labels and Taints

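Karpenter acts on pod events. When a pod is created that can’t be scheduled, Karpenter provisions a node that fits its requirements. To steer a Deployment onto Spot capacity, which here means our spot-workloads NodePool, we use a simple nodeSelector. (To pin pods to one specific NodePool by name, select on the karpenter.sh/nodepool label instead.)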

apiVersion: apps/v1
kind: Deployment
metadata:
  name: spot-webserver
spec:
  replicas: 10
  selector:
    matchLabels:
      app: spot-webserver # Required; must match the pod template labels below
  template:
    metadata:
      labels:
        app: spot-webserver
    spec:
      containers:
      - name: webserver
        image: nginx
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
      nodeSelector:
        karpenter.sh/capacity-type: spot # <-- Only schedule these pods onto Spot nodes
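Karpenter sizes nodes from the resource requests on pending pods, so it is worth knowing what a Deployment like this actually asks for in total. A small Python sketch of that arithmetic follows. The helper names are hypothetical, and the quantity parsing only covers the suffixes used here, not the full Kubernetes quantity grammar.

```python
def cpu_cores(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity such as "250m" or "2" to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def memory_mib(quantity: str) -> float:
    """Convert "512Mi" or "2Gi" to MiB (only these two suffixes are handled)."""
    if quantity.endswith("Mi"):
        return float(quantity[:-2])
    if quantity.endswith("Gi"):
        return float(quantity[:-2]) * 1024
    raise ValueError(f"suffix not covered by this sketch: {quantity}")


def total_requests(replicas: int, cpu: str, memory: str):
    """Return (cores, MiB) requested by all replicas together."""
    return replicas * cpu_cores(cpu), replicas * memory_mib(memory)


if __name__ == "__main__":
    cores, mib = total_requests(replicas=10, cpu="250m", memory="512Mi")
    print(cores, mib)  # 2.5 5120.0
```

So the ten replicas ask for 2.5 cores and 5 GiB between them. The nodes Karpenter launches will be somewhat larger than that, because DaemonSets and system reservations take their cut of every node.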

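For your more sensitive workloads that need on-demand capacity (or if you have a licensing requirement), you’d create a second NodePool with values: ["on-demand"] for karpenter.sh/capacity-type, put a taint on it under spec.template.spec.taints, and give a matching toleration only to the pods that belong there. That keeps the riff-raff out.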
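If taints and tolerations feel like black magic, the matching rule is simple enough to sketch in Python. This is a simplified model of the Kubernetes rule, not the scheduler's code, and the taint key example.com/on-demand-only is a hypothetical name chosen for this example.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if a single toleration matches a single taint."""
    # An empty effect on the toleration matches any taint effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    # An empty key (valid only with Exists) matches every taint key.
    if toleration.get("key") and toleration["key"] != taint["key"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def can_schedule(pod_tolerations: list, node_taints: list) -> bool:
    """A pod fits only if every NoSchedule/NoExecute taint is tolerated."""
    blocking = [t for t in node_taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in pod_tolerations) for t in blocking)


# Hypothetical taint you might put on the on-demand NodePool.
ON_DEMAND_TAINT = {"key": "example.com/on-demand-only", "value": "true", "effect": "NoSchedule"}

if __name__ == "__main__":
    sensitive = [{"key": "example.com/on-demand-only", "operator": "Equal",
                  "value": "true", "effect": "NoSchedule"}]
    print(can_schedule(sensitive, [ON_DEMAND_TAINT]))  # True
    print(can_schedule([], [ON_DEMAND_TAINT]))         # False
```

A toleration only lets a pod onto the tainted nodes; it does not force it there. Pair it with a nodeSelector on karpenter.sh/capacity-type: on-demand if those pods must never land on Spot.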

The Gotchas and The Glory

The main pitfall? Instance type diversity. Your application must be able to run on a variety of instance types. If you’ve hard-coded CPU architecture flags or made other assumptions about the underlying hardware, Karpenter will break you until you fix it. This is a feature, not a bug. It forces you to write portable software.

The glory is the savings. With Spot discounts of up to 90%, reductions of 70-80% in your compute bill are possible, but only when most of your workload can actually run on Spot. Combine this with Fargate for your tiny, always-on workloads and Savings Plans or Reserved Instances for your truly critical, steady-state on-demand needs, and you’ve mastered the EKS cost game. Karpenter isn’t the future; it’s the present. Anyone not using it is just overpaying.
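For a sanity check on your own numbers, the back-of-the-envelope version is one multiplication. The Python sketch below assumes the on-demand share of the fleet stays at list price and ignores Savings Plans, Fargate, and interruption overhead. The function name and the inputs are illustrative, not measurements.

```python
def blended_reduction(spot_share: float, spot_discount: float) -> float:
    """Fractional cut in the compute bill versus running everything on-demand.

    spot_share:    fraction of compute spend moved to Spot (0..1)
    spot_discount: average Spot discount off the on-demand price (0..1)
    """
    if not (0 <= spot_share <= 1 and 0 <= spot_discount <= 1):
        raise ValueError("both inputs must be between 0 and 1")
    return spot_share * spot_discount


if __name__ == "__main__":
    # Illustrative inputs, not measurements.
    print(f"{blended_reduction(0.8, 0.7):.0%}")   # 56%
    print(f"{blended_reduction(1.0, 0.75):.0%}")  # 75%
```

The takeaway: the headline number depends as much on how much of the fleet you can move to Spot as on the discount itself.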