9.3 Tolerations to Schedule on Tainted Nodes

Right, so you’ve got your DaemonSet humming along, deploying its pod to every node in your cluster. It’s a beautiful thing. But then you run into the real world, and the real world has problems. Some of your nodes are, shall we say, special. Maybe they’re GPU-equipped beasts that cost more than your car, reserved for machine learning workloads. Maybe they’re edge nodes with spotty connections, or they’re just old and cranky and you don’t trust anything but a specific monitoring agent to run on them.

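This is where the Kubernetes scheduler’s “taint” mechanism comes in. A taint is basically a node’s way of saying, “Get off my lawn!” It repels all pods that haven’t explicitly said they’re okay with the particular brand of nonsense that node is serving. Your pristine DaemonSet pod is no exception. The DaemonSet controller does add a handful of tolerations automatically for built-in node-condition taints such as node.kubernetes.io/not-ready and node.kubernetes.io/unreachable, but a custom taint still repels it: the scheduler sees the taint, finds no matching toleration on the pod, and leaves that node alone.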
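For orientation, here is roughly what a taint looks like on the node object itself. This is a minimal sketch: the node name gpu-node-1 is made up for illustration, and the taint is chosen to line up with the gpu-node toleration used in this section.

```yaml
apiVersion: v1
kind: Node
metadata:
  name: gpu-node-1        # hypothetical node name
spec:
  taints:
  - key: "gpu-node"       # the taint's name
    value: "true"         # optional value attached to the key
    effect: "NoSchedule"  # what happens to pods that don't tolerate it
```

You would normally see this in the output of kubectl get node with -o yaml rather than write it by hand; the usual way to add a taint is kubectl taint nodes followed by the node name and key=value:effect.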

To get your pod onto that tainted node, you need to give it a matching “toleration.” You’re not removing the taint from the node (other pods should still be repelled!), you’re just giving your specific pod a hall pass.

A toleration isn’t an assignment; it’s a permission. It tells the scheduler, “I’m cool with hanging out on a node that has this particular taint.” The scheduler still does its normal thing based on other constraints, but this one roadblock is removed.

The Anatomy of a Toleration

Let’s look at the spec. It’s deceptively simple, which is where most of the mistakes happen. You slap this into the pod template of your DaemonSet, under spec.template.spec. It does not go directly under the DaemonSet’s top-level spec.

tolerations:
- key: "gpu-node"
  operator: "Equal"
  value: "true"
  effect: "NoSchedule"

Let’s break down what these fields actually mean because if you get them wrong, nothing happens and you’ll waste an hour wondering why.

key: This is the name of the taint you’re tolerating. It’s just a string label.
operator: This is the big one. Equal (the default if you leave operator out) means you’re saying “the taint’s value must exactly match my value field.” Alternatively, you can use Exists, which is the equivalent of saying “I tolerate any taint with this key, I don’t care what its value is.” This is a common source of confusion. Using Exists and specifying a value is invalid: the API server rejects the manifest.
value: The value you expect the taint to have. As mentioned, only used with operator: Equal.
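effect: This is the action the taint takes. The options are NoSchedule (won’t schedule new pods), PreferNoSchedule (a gentle suggestion the scheduler tries to honor), and NoExecute (this one is brutal: on top of blocking new pods, it evicts already-running pods that don’t tolerate it, and a pod that does tolerate it can set tolerationSeconds to stay only that many seconds before being evicted anyway). Your toleration’s effect must match the taint’s effect, unless you leave effect out entirely, in which case the toleration matches every effect for that key. Tolerating a NoSchedule taint does nothing for a NoExecute taint. They are different animals.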
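To make the Exists operator and the tolerationSeconds field concrete, here is a small sketch. The key example.com/maintenance is a hypothetical taint key invented for this example, and 300 seconds is an arbitrary choice.

```yaml
tolerations:
# Exists form: no value field. Matches the gpu-node taint whatever its value is.
- key: "gpu-node"
  operator: "Exists"
  effect: "NoSchedule"
# Time-limited NoExecute toleration (hypothetical key). The pod may stay on the
# node for up to 300 seconds after the taint is added, then it is evicted.
- key: "example.com/maintenance"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 300
```

Leaving tolerationSeconds out of a NoExecute toleration means the pod tolerates the taint indefinitely.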

Tolerating NoExecute Eviction

This is the one that’ll bite you if you’re not careful. You’ve happily tolerated the NoSchedule effect to get your pod onto the node. Then, later, an admin adds a NoExecute taint for some emergency maintenance. Your pod, which only tolerates NoSchedule, gets unceremoniously evicted. Whoops.

If you really need your DaemonSet pod to stick to a node through thick and thin (e.g., a node-level agent that must always run), you need to tolerate both effects.

tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "special-tenant"
  effect: "NoSchedule"
- key: "dedicated"
  operator: "Equal"
  value: "special-tenant"
  effect: "NoExecute" # This is the eviction-force-field.

For a truly bulletproof DaemonSet that clings to its node like a remora, you’ll often see this pattern, which tolerates every NoExecute taint indefinitely. Use this with extreme caution, as it can prevent necessary node maintenance.

tolerations:
- operator: "Exists" # Tolerate every taint key...
  effect: "NoExecute" # ...with the NoExecute effect.
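For completeness, the broadest possible form drops both the key and the effect. Treat this as a sketch of the extreme case rather than a recommendation: a toleration that contains only operator: Exists matches every taint, whatever its key, value, or effect.

```yaml
tolerations:
# No key and no effect: tolerates every taint on every node.
- operator: "Exists"
```

This is the sort of catch-all you find on some system-level DaemonSets that have to run everywhere. The caution above applies even more strongly here, since nothing a node is tainted with will keep this pod off it.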

The Master Node Taint and Your DaemonSet

This is the most classic, real-world example. Out of the box, many clusters (kubeadm-built ones, for instance) apply a taint to the control plane nodes: node-role.kubernetes.io/control-plane:NoSchedule, or on older versions its deprecated predecessor node-role.kubernetes.io/master:NoSchedule. This is why your nifty web app pods don’t get scheduled on the masters, which is good; you don’t want that.

But what if your DaemonSet is, say, a cluster-wide networking plugin or monitoring agent? You probably do want that on the masters. This is why core system DaemonSets like kube-proxy and Flannel/Calico include a toleration for it.

Here’s how you’d add it to your own DaemonSet. It’s so common it’s basically a best practice for any cluster-level infrastructure DaemonSet.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: my-essential-daemonset
spec:
  selector:
    matchLabels:
      app: my-essential-app
  template:
    metadata:
      labels:
        app: my-essential-app
    spec:
      tolerations:
      # This tolerates the common control-plane taint
      - key: "node-role.kubernetes.io/control-plane"
        operator: "Exists"
        effect: "NoSchedule"
      # ... and your other tolerations go here
      containers:
      - name: main
        image: my-essential-app:latest
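
If the same manifest also has to work on older clusters that still carry the legacy master taint, one option is to tolerate both keys. This is a sketch under that assumption; on current clusters the second entry simply never matches anything and is harmless.

```yaml
tolerations:
# Current control-plane taint
- key: "node-role.kubernetes.io/control-plane"
  operator: "Exists"
  effect: "NoSchedule"
# Legacy taint key, only relevant on older clusters
- key: "node-role.kubernetes.io/master"
  operator: "Exists"
  effect: "NoSchedule"
```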

The key takeaway? Tolerations are what make DaemonSets truly powerful, transforming them from a “deploy to every vanilla node” tool into a “deploy to every node that meets these criteria, no matter how weird it is” tool. Always double-check your operator and effect: getting them right is the difference between a pod that runs where you need it and one that politely avoids the problem entirely.
