19.8 Pod Disruption Budgets: Protecting Availability During Disruptions
Right, so you’ve got your pods running smoothly. They’re healthy, they’re happy, they’re serving traffic. Then you, or more likely your cluster’s automation, decide it’s time for an update, a node drain, or a scale-down. Chaos ensues. A pod gets unceremoniously evicted, your user-facing API starts coughing up 500 errors, and you get that lovely 3 AM wake-up call. We’ve all been there. The problem isn’t the disruption itself; clusters are meant to be dynamic. The problem is doing it like a bull in a china shop.
This is where the PodDisruptionBudget (PDB) comes in. Think of it not as a forcefield that prevents pods from being killed, but as a very stern, very rules-oriented bouncer for your pod eviction party. Its job is simple but critical: to ensure that a certain number or percentage of your pods remain available during voluntary disruptions. I said voluntary for a reason, and it’s crucial. We’re talking about evictions that you or your automation initiate through the Eviction API, like draining a node for maintenance or letting the cluster autoscaler remove an underused node. A PDB does not gate what workload controllers do on their own: a Deployment rollout or scale-down deletes pods directly, and those deletions count against the budget without being blocked by it. It also has absolutely no say against involuntary disruptions, like a node straight-up dying, your cloud provider yeeting an instance into the sun, or someone unplugging the server rack “to see what this button does.” For that, you need replication. The PDB is your first line of defense against your own well-intentioned chaos.
The Two Flavors of Unavailability
A PDB lets you define your tolerance for unavailability in one of two ways. You can specify minAvailable, which is the floor. “I must have at least this many pods running at all times.” Or, you can specify maxUnavailable, which is the ceiling. “I can tolerate no more than this many pods being down.” You pick one. Not both. The Kubernetes API will laugh at you if you try, and rightly so.
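Let’s say you have a deployment with a solid replicas: 10. You’re running a critical, stateful service where every pod counts. Your mantra is “lose nothing.” The manifest that follows says exactly that, with one side effect to keep in mind: when minAvailable equals the replica count, no eviction is ever allowed, so any drain touching these pods will wait until the budget is relaxed.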
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-critical-pdb
spec:
  minAvailable: 10 # "All hands on deck! Do not remove a single pod!"
  selector:
    matchLabels:
      app: my-critical-app
Conversely, you might have a stateless web frontend with 100 replicas. You’re doing a rolling node update and you know your service can handle a bit of a dip. You care more about speed of the operation than perfect availability.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-web-pdb
spec:
  maxUnavailable: 10% # "It's cool, you can take down up to 10 pods at a time."
  selector:
    matchLabels:
      app: my-web-app
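Percentages are almost always smarter than absolute numbers because they automatically adjust as you scale your deployment up and down. It’s self-adjusting configuration. One caveat: Kubernetes rounds a percentage up to a whole pod, so at small replica counts it pays to check what the number actually resolves to.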
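To see how a percentage plays out at different scales, here is a minimal Python sketch of the arithmetic. It is a simplified model, not Kubernetes code: the function names resolve and desired_healthy are made up for illustration, and the only behavior assumed from Kubernetes is the documented rule that percentages round up to a whole pod.

```python
def resolve(value, expected_pods):
    """Turn an int or an 'N%' string into an absolute pod count (rounding up)."""
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        # Integer ceiling division avoids floating-point surprises.
        return (percent * expected_pods + 99) // 100
    return int(value)


def desired_healthy(expected_pods, min_available=None, max_unavailable=None):
    """Minimum number of healthy pods the budget demands."""
    # Exactly one of the two fields may be set, mirroring the PDB spec.
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        return resolve(min_available, expected_pods)
    return max(0, expected_pods - resolve(max_unavailable, expected_pods))


# The same "10%" budget at three different scales.
for replicas in (100, 25, 4):
    floor = desired_healthy(replicas, max_unavailable="10%")
    print(f"replicas={replicas} must_stay_healthy={floor} may_be_down={replicas - floor}")
```

At 100 replicas the budget resolves to 10 pods that may be down, at 25 replicas to 3, and at 4 replicas to 1. The same manifest keeps working as the deployment scales, which is the point of using a percentage.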
The Brutal, Logical Math of Eviction
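Here’s where it gets interesting. When an eviction request comes in (say, from kubectl drain), the API server doesn’t simply delete the pod. It checks the request against the PDB whose selector matches that pod and asks: “If I allow this eviction, will the budget still be respected?” If the answer is no, the request is refused with a 429 Too Many Requests response, and the caller is expected to retry later.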
Take a PDB with maxUnavailable: 1 covering 3 replicas. If two pods are running and ready and one is already terminated, the current state is 1 unavailable. Your PDB says that’s the max. The API will now reject any new eviction request because allowing it would push the unavailable count to 2, violating the rule. kubectl drain keeps retrying, so from your terminal it looks like it hangs. This is why you sometimes see kubectl drain get stuck. It’s not broken; it’s being responsible. It’s waiting for the replacement of the missing pod to become ready, at which point evicting the next pod no longer breaks the budget.
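The scenario above can be traced with a small Python model. This is a sketch under stated assumptions rather than the real controller: the Budget class and its try_evict method are hypothetical names, the budget is an absolute maxUnavailable for simplicity, and the status codes mirror what the Eviction API documents (200 when the eviction is allowed, 429 when the budget refuses it).

```python
from dataclasses import dataclass


@dataclass
class Budget:
    expected_pods: int    # replicas the owning controller wants
    max_unavailable: int  # absolute number, to keep the model small
    healthy_pods: int     # pods that are currently ready

    @property
    def disruptions_allowed(self):
        # Healthy pods above the required floor can be evicted.
        desired_healthy = self.expected_pods - self.max_unavailable
        return max(0, self.healthy_pods - desired_healthy)

    def try_evict(self):
        """Handle one eviction request and return an HTTP-style status."""
        if self.disruptions_allowed < 1:
            return 429  # budget exhausted: the caller should retry later
        self.healthy_pods -= 1
        return 200  # eviction accepted


# 3 replicas, maxUnavailable: 1, one pod already gone.
pdb = Budget(expected_pods=3, max_unavailable=1, healthy_pods=2)
print(pdb.try_evict())   # refused: one pod is already unavailable

pdb.healthy_pods = 3     # the replacement pod becomes ready
print(pdb.try_evict())   # accepted: budget has room for one disruption
print(pdb.try_evict())   # refused again until the next replacement is ready
```

A drain is essentially this loop run per pod with retries, which is why it proceeds one pod at a time when the budget only has room for one disruption.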
Common Pitfalls and the “Why”
This is the stuff the dry docs won’t tell you, but you will absolutely trip over.
Pitfall #1: Selector Mismatch. This is the big one. You create a beautiful PDB with a selector that points to app: my-app. But your deployment’s pod template has labels app: my-app, tier: frontend. That’s fine, it matches. Then you get clever and add a canary deployment with app: my-app, track: canary. Your PDB will suddenly, and silently, start covering both your stable and canary pods. This might be what you want! But if it’s not, you’re in for a surprise. Your node drain will be blocked by the canary pod you didn’t even think about. Be surgical and precise with your selectors. Use labels like role or track to isolate your budgets.
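To make the selector behavior concrete, here is a rough Python sketch of matchLabels semantics: a pod is covered when every key/value pair in the selector appears in the pod’s labels, and extra labels on the pod are ignored. The pod names and the matches and covered helpers are hypothetical, used only for illustration; the labels are the ones from the example above.

```python
def matches(match_labels, pod_labels):
    """True when every selector pair is present in the pod's labels."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


# Hypothetical pods: one stable, one canary, one unrelated.
pods = {
    "stable-1": {"app": "my-app", "tier": "frontend"},
    "canary-1": {"app": "my-app", "track": "canary"},
    "other-1": {"app": "other-app"},
}


def covered(match_labels):
    """Names of the pods a PDB with this selector would cover."""
    return sorted(name for name, labels in pods.items() if matches(match_labels, labels))


broad = {"app": "my-app"}                       # catches stable and canary alike
narrow = {"app": "my-app", "tier": "frontend"}  # only the stable pods

print(covered(broad))   # ['canary-1', 'stable-1']
print(covered(narrow))  # ['stable-1']
```

Running the same check against the real cluster is a matter of listing pods with the PDB’s label selector before relying on the budget.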
Pitfall #2: The Unhealthy Pod Trap. A PDB counts only healthy pods toward its budget. If a pod is unready (failing its readiness probe), it doesn’t count toward the minAvailable number. The system considers it already “unavailable.” This is logical (why count something that’s not serving traffic?), but it can leave you stuck. One pod goes unhealthy and, with maxUnavailable: 1, it has just spent your entire budget: every eviction of a healthy pod is refused until it recovers, and a crash-looping pod can stall node drains indefinitely. By default, the unready pod itself can also be refused eviction while the budget is not met; setting spec.unhealthyPodEvictionPolicy: AlwaysAllow (available in recent Kubernetes releases) lets such pods be evicted regardless. Your budget isn’t broken; it’s working as designed, which is to preserve available capacity. The lesson here is that your application’s health probes are now part of your disruption budget logic. Make them meaningful.
Pitfall #3: The Singleton Service. A PDB is worth having even for a pod with only one replica, but you have to be realistic. Setting minAvailable: 1 is perfectly valid. It means “you cannot evict this pod through the Eviction API while this PDB exists.” This is a good thing! It prevents an accidental drain from taking your only database primary offline. (A direct kubectl delete pod bypasses the budget, so this guards against drains, not against everything.) The operational takeaway is that any voluntary disruption of that node now requires a two-step process: first, delete the PDB (acknowledging the risk), then perform the drain. It’s a manual safety catch, and sometimes that’s exactly what you need.