19.8 Pod Disruption Budgets: Protecting Availability During Disruptions

Right, so you’ve got your pods running smoothly. They’re healthy, they’re happy, they’re serving traffic. Then you, or more likely your cluster’s automation, decide it’s time for an update, a node drain, or a scale-down. Chaos ensues. A pod gets unceremoniously evicted, your user-facing API starts coughing up 500 errors, and you get that lovely 3 AM wake-up call. We’ve all been there. The problem isn’t the disruption itself; clusters are meant to be dynamic. The problem is doing it like a bull in a china shop.

19.7 Pod Priority and Preemption

Right, let’s talk about Pod Priority and Preemption. This is where Kubernetes stops being polite and starts getting real. Up until now, we’ve mostly talked about resource requests and limits as a way for the scheduler to make informed decisions. But with priority, we’re giving it a direct command: “This pod is more important than that one. Act accordingly.” Think of it like this: your cluster is a lifeboat. There’s only so much room (CPU and memory). If a new, critically important person needs to get on (a high-priority pod), and there’s no space, someone else might have to… unceremoniously take a swim (get preempted). It’s brutal, but for many workloads (like system-critical services, or clusters where you don’t want a CI/CD build job blocking a production web server), it’s absolutely essential.
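To make the lifeboat picture concrete, here is a deliberately simplified sketch of victim selection on a single node, counting CPU only. The function name, pod names, and priority numbers are all hypothetical, and the real scheduler weighs far more than this (memory, multiple candidate nodes, graceful termination, and so on), so treat it as an illustration of the idea rather than the actual algorithm.

```python
def pick_victims(capacity_m, running, incoming):
    """Return pod names to preempt so `incoming` fits, [] if it already
    fits, or None if preemption cannot help. CPU is in millicores."""
    free = capacity_m - sum(p["cpu_m"] for p in running)
    if incoming["cpu_m"] <= free:
        return []  # room in the lifeboat, nobody swims

    # Only strictly lower-priority pods may be preempted; try the
    # least important ones first.
    candidates = sorted(
        (p for p in running if p["priority"] < incoming["priority"]),
        key=lambda p: p["priority"],
    )
    victims = []
    for pod in candidates:
        victims.append(pod["name"])
        free += pod["cpu_m"]
        if incoming["cpu_m"] <= free:
            return victims
    return None  # even evicting every lower-priority pod is not enough


# Hypothetical node: 4 CPUs, fully booked.
running = [
    {"name": "web-frontend", "priority": 1000, "cpu_m": 2000},
    {"name": "ci-build", "priority": 10, "cpu_m": 1500},
    {"name": "nightly-report", "priority": 100, "cpu_m": 500},
]
incoming = {"name": "checkout-api", "priority": 1000, "cpu_m": 1000}
print(pick_victims(4000, running, incoming))
```

In this made-up scenario the build job is the one that goes overboard, and the production web server is never a candidate because its priority is not lower than the incoming pod's.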

19.6 ResourceQuota: Namespace-Level Resource Caps

Right, let’s talk about ResourceQuotas. This is where the fun begins, or ends, depending on whether you’re the one setting them or the one hitting them. Think of a ResourceQuota as the stern, spreadsheet-loving parent of a Kubernetes namespace. It doesn’t micromanage how each pod behaves (that’s the LimitRange’s job, which we’ll get to), but it absolutely keeps a running tally of the total requests and limits claimed by all pods in its domain. The moment the namespace tries to exceed its allowance, the API server says “nope”: the create request is rejected with a 403 Forbidden, and your new pod never exists at all. (If a Deployment asked for it, the ReplicaSet just keeps racking up FailedCreate events.) It’s the ultimate “you’ve had enough” mechanism.
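The running tally can be sketched in a few lines. This is a toy model of the admission check, not the API server's code: the `admit` function and `QuotaExceeded` exception are hypothetical names, and the quantities are plain integers (millicores and MiB) rather than real Kubernetes quantity strings. The keys `requests.cpu` and `requests.memory` mirror the resource names a ResourceQuota uses.

```python
class QuotaExceeded(Exception):
    """Models the API server refusing the create request."""


def admit(used, hard, pod_requests):
    """Return the updated usage tally, or raise if any hard cap would be
    exceeded. Resources with no hard cap are not restricted."""
    for resource, amount in pod_requests.items():
        if resource in hard and used.get(resource, 0) + amount > hard[resource]:
            raise QuotaExceeded(
                f"exceeded quota for {resource}: "
                f"used {used.get(resource, 0)}, requested {amount}, "
                f"limit {hard[resource]}"
            )
    updated = dict(used)
    for resource, amount in pod_requests.items():
        updated[resource] = updated.get(resource, 0) + amount
    return updated


# Hypothetical namespace quota: 2 CPUs and 4 GiB of requests in total.
hard = {"requests.cpu": 2000, "requests.memory": 4096}
used = {"requests.cpu": 1500, "requests.memory": 1024}

used = admit(used, hard, {"requests.cpu": 400, "requests.memory": 512})
print(used)

try:
    admit(used, hard, {"requests.cpu": 400, "requests.memory": 512})
except QuotaExceeded as err:
    print("rejected:", err)
```

Note that the check is all-or-nothing: the second pod is refused on CPU even though the memory tally still has plenty of headroom.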

19.5 LimitRange: Setting Defaults and Boundaries per Namespace

Right, let’s talk about LimitRange. This is one of those Kubernetes features that seems boring until you’ve had a pod brought to its knees because some joker deployed a memory-hogging monstrosity without setting a resources.limits field. Then it becomes the most fascinating topic in the world. Trust me. A LimitRange is essentially a bouncer for a specific namespace. Its job is simple but critical: it enforces that every pod (or container) that walks into the club plays by the resource rules. It can set defaults for requests and limits if you forget to specify them (saving you from yourself), and it can set minimum and maximum boundaries to prevent absolute chaos (saving your cluster from your colleagues). Without it, a namespace is the wild west, and someone will inevitably deploy a pod that requests 0.001 CPU and 4TB of memory.
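Here is a minimal sketch of what the bouncer does for a single resource (memory, in MiB) on a single container. The function name and dictionary shape are assumptions made for illustration; only the field names `default`, `defaultRequest`, `min`, and `max` come from the LimitRange spec. The defaulting order follows the documented behavior as I understand it: a missing limit gets the default, and a missing request follows an explicitly set limit.

```python
def apply_limit_range(container, rule):
    """Fill in missing request/limit values, then enforce min/max.
    Raises ValueError where the real admission check would reject."""
    request = container.get("request")
    limit = container.get("limit")

    if limit is None:
        limit = rule["default"]              # no limit given: use the default
        if request is None:
            request = rule["defaultRequest"]
    elif request is None:
        request = limit                      # limit given, request follows it

    if request < rule["min"]:
        raise ValueError(f"request {request}Mi is below min {rule['min']}Mi")
    if limit > rule["max"]:
        raise ValueError(f"limit {limit}Mi is above max {rule['max']}Mi")
    if request > limit:
        raise ValueError(f"request {request}Mi exceeds limit {limit}Mi")
    return {"request": request, "limit": limit}


# Hypothetical memory rule for a namespace.
rule = {"min": 64, "max": 1024, "default": 512, "defaultRequest": 256}

print(apply_limit_range({}, rule))               # saved from yourself
print(apply_limit_range({"limit": 128}, rule))   # request follows the limit

try:
    apply_limit_range({"limit": 4 * 1024 * 1024}, rule)  # the 4TB colleague
except ValueError as err:
    print("rejected:", err)
```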

19.4 QoS Classes: Guaranteed, Burstable, BestEffort

Alright, let’s talk about Quality of Service (QoS) classes. This is where Kubernetes stops being a polite container orchestrator and starts acting like a brutally honest bouncer at an overbooked nightclub. It has to decide which of your pods get the VIP treatment, which get to wait in the general admission line, and which might get unceremoniously kicked to the curb to make room for a bigger spender when things get hectic.
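How does a pod end up in one line or another? The class is derived from the requests and limits on its containers, and the rules can be sketched as a small function. The function name and the dictionary layout are illustrative assumptions, and the sketch compares quantities as plain strings, so it assumes they are already written in the same form (for example `500m` everywhere, never `0.5`).

```python
def qos_class(containers):
    """Classify a pod from its containers' CPU/memory requests and limits."""
    resources = ("cpu", "memory")

    # BestEffort: nobody set any request or limit at all.
    if all(not c.get("requests") and not c.get("limits") for c in containers):
        return "BestEffort"

    def is_guaranteed(c):
        limits = c.get("limits", {})
        requests = c.get("requests", {})
        # Both resources need a limit, and the request must equal it.
        # An omitted request is treated as equal to the limit.
        return all(
            r in limits and requests.get(r, limits[r]) == limits[r]
            for r in resources
        )

    # Guaranteed: every container passes the strict check.
    if all(is_guaranteed(c) for c in containers):
        return "Guaranteed"

    # Everything in between.
    return "Burstable"


vip = [{"requests": {"cpu": "500m", "memory": "256Mi"},
        "limits": {"cpu": "500m", "memory": "256Mi"}}]
general_admission = [{"requests": {"cpu": "250m"}}]
no_ticket = [{}]

print(qos_class(vip), qos_class(general_admission), qos_class(no_ticket))
```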

19.3 CPU Throttling vs Memory OOMKill

Alright, let’s get into the real-world consequences of getting your resource requests and limits wrong. This is where the rubber meets the road, or more accurately, where your application grinds to a halt or gets unceremoniously murdered. The key thing to remember is that Kubernetes treats CPU and memory completely differently once a container is running: CPU is compressible, so a container that wants more than its limit simply gets throttled, while memory is not, so a container that goes over its limit gets killed. Understanding this distinction is the difference between a smoothly running cluster and a 3 AM pager duty call that ruins your weekend.
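The two fates can be put side by side in a toy model. Nothing here is a real Kubernetes or kernel API: `enforce` is a hypothetical function that takes what a container wants and what its limits allow, and reports which of the two outcomes it gets. The one hard fact baked in is the exit code, since a process killed by SIGKILL reports 128 + 9 = 137.

```python
def enforce(cpu_demand_m, cpu_limit_m, mem_usage_mi, mem_limit_mi):
    """Report the outcome for one container (CPU in millicores, memory in MiB)."""
    if mem_usage_mi > mem_limit_mi:
        # Memory cannot be squeezed: the container is killed outright.
        return {"state": "OOMKilled", "exit_code": 137,
                "throttled": False, "cpu_granted_m": 0}
    # CPU can be squeezed: the container keeps running, just slower.
    return {"state": "Running", "exit_code": None,
            "throttled": cpu_demand_m > cpu_limit_m,
            "cpu_granted_m": min(cpu_demand_m, cpu_limit_m)}


greedy_cpu = enforce(cpu_demand_m=1500, cpu_limit_m=500,
                     mem_usage_mi=200, mem_limit_mi=256)
greedy_mem = enforce(cpu_demand_m=100, cpu_limit_m=500,
                     mem_usage_mi=300, mem_limit_mi=256)

print(greedy_cpu["state"], greedy_cpu["throttled"], greedy_cpu["cpu_granted_m"])
print(greedy_mem["state"], greedy_mem["exit_code"])
```

The first container grinds along at a third of the CPU it wanted; the second one is simply gone.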

19.2 Limits: The Hard Cap on Resource Consumption

Alright, let’s talk about limits. If requests are your polite, “hey, could I maybe have some more?” note to the kitchen, then limits are the bouncer at the club door. They don’t negotiate. They don’t care if your process is having the best day of its life. They just enforce the hard rule: “Thou shalt not consume more than X.” The kernel enforces this with brutal efficiency. A container goes over its memory limit (memory)? SIGKILL. Not SIGTERM. Not a gentle warning. It’s oom-killed, gone, vanished from the process table. It hits its CPU limit (cpu)? The kernel’s CPU throttler (CFS – Completely Fair Scheduler, and yes, the irony is rich) ensures the container gets precisely zero cycles beyond its quota. It’s not killed, but it’s effectively frozen in time until the next scheduling period begins (100 ms by default). It’s a hard cap. This is why you set them.
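The arithmetic behind that CPU cap is worth seeing once. A CPU limit is translated into a CFS quota: so many microseconds of CPU time per scheduling period, with the period defaulting to 100 ms. The sketch below assumes that default and a single busy thread; the function names are made up for illustration.

```python
CFS_PERIOD_US = 100_000  # default CFS period: 100 ms, in microseconds


def cfs_quota_us(cpu_limit_millicores, period_us=CFS_PERIOD_US):
    """CPU time the container may use per period, in microseconds."""
    return cpu_limit_millicores * period_us // 1000


def run_one_period(demand_us, cpu_limit_millicores, period_us=CFS_PERIOD_US):
    """Return (microseconds actually run, whether throttling occurred)
    for a workload that wants `demand_us` of CPU in one period."""
    quota = cfs_quota_us(cpu_limit_millicores, period_us)
    return min(demand_us, quota), demand_us > quota


# A 500m limit buys 50 ms of CPU time in every 100 ms window.
print(cfs_quota_us(500))              # 50000

# Wanting 80 ms: runs 50 ms, then sits frozen until the window resets.
print(run_one_period(80_000, 500))    # (50000, True)

# Wanting 30 ms: never notices the limit.
print(run_one_period(30_000, 500))    # (30000, False)
```

This is also why a throttled service shows up as latency spikes rather than errors: the work still gets done, just a window or two later.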

19.1 Requests: What the Scheduler Uses for Placement

Alright, let’s talk about the one thing that actually matters to the scheduler when it’s trying to find a home for your pod: requests. Forget limits for a moment; they’re the bouncer at the club, but requests are the guest list. The scheduler only cares about the guest list. When you define a resources.requests block in your container spec, you’re not making a polite suggestion. You’re declaring, under oath, “This container will need at least this much CPU and memory to function properly.” The scheduler takes this sworn testimony and uses it to find a node with enough spare capacity to honor your request. It’s a contract. If the node can’t fulfill it, your pod ain’t getting scheduled.
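The guest-list check boils down to simple bookkeeping: for each node, add the new pod's requests to what is already promised and see whether the total still fits. Here is a minimal sketch of that filter. The `Node` class, its field names, and the example numbers are hypothetical, and the real scheduler applies many more filters and then scores the survivors, so this covers only the resource-fit step.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    allocatable_cpu_m: int       # millicores the node can hand out
    allocatable_mem_mi: int      # MiB the node can hand out
    requested_cpu_m: int = 0     # sum of requests already scheduled here
    requested_mem_mi: int = 0

    def fits(self, cpu_m, mem_mi):
        """True if the pod's requests fit alongside what is already promised.
        Actual usage on the node plays no part in this check."""
        return (self.requested_cpu_m + cpu_m <= self.allocatable_cpu_m
                and self.requested_mem_mi + mem_mi <= self.allocatable_mem_mi)


def feasible_nodes(nodes, cpu_m, mem_mi):
    """Names of the nodes that could honor the pod's requests."""
    return [n.name for n in nodes if n.fits(cpu_m, mem_mi)]


nodes = [
    Node("node-a", 4000, 8192, requested_cpu_m=3800, requested_mem_mi=2048),
    Node("node-b", 4000, 8192, requested_cpu_m=1000, requested_mem_mi=7900),
    Node("node-c", 4000, 8192, requested_cpu_m=1000, requested_mem_mi=2048),
]

# A pod requesting 500m CPU and 512Mi memory:
print(feasible_nodes(nodes, cpu_m=500, mem_mi=512))   # ['node-c']

# A pod requesting more CPU than any node has: no home, no scheduling.
print(feasible_nodes(nodes, cpu_m=5000, mem_mi=512))  # []
```

node-a is out of CPU on paper and node-b is out of memory on paper, and paper is all that counts here.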

...