19.8 Pod Disruption Budgets: Protecting Availability During Disruptions
Right, so you’ve got your pods running smoothly. They’re healthy, they’re happy, they’re serving traffic. Then you, or more likely your cluster’s automation, decide it’s time for an update, a node drain, or a scale-down. Chaos ensues. A pod gets unceremoniously evicted, your user-facing API starts coughing up 500 errors, and you get that lovely 3 AM wake-up call. We’ve all been there. The problem isn’t the disruption itself; clusters are meant to be dynamic. The problem is carrying it out like a bull in a china shop, with nothing limiting how many pods go down at once.