10.7 Indexed Jobs: Work Queues with Stable Pod Indices

Right, so you’ve got a bunch of work to do. A big, fat, parallelizable data set. Maybe you’re resizing a million images, processing a terabyte of log files, or sending out a truly staggering number of “We miss you!” emails. You reach for a Job with a high completions count, and it works… fine. But it’s a bit, well, dumb. The pods get names like mypod-6xz8w, mypod-pq4d9—completely random. If one of your workers fails and you need to know which specific chunk of work (say, file number 42) died, good luck figuring it out from that pod name. You’re left grepping through logs like a medieval peasant.
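The fix the section title points at is the Indexed completion mode, where every Pod gets a stable index instead of being an anonymous worker. Here is a minimal sketch, assuming a hypothetical Job name, image, and command; only the field names and the JOB_COMPLETION_INDEX variable come from Kubernetes itself.

```yaml
# Sketch only: name, image, and command are illustrative assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: resize-images
spec:
  completionMode: Indexed   # each Pod gets an index from 0 to completions-1
  completions: 100          # 100 chunks of work, indices 0..99
  parallelism: 10           # at most 10 Pods running at once
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: busybox:1.36
        # Kubernetes injects JOB_COMPLETION_INDEX into each container.
        command: ["sh", "-c", "echo processing chunk $JOB_COMPLETION_INDEX"]
```

In this mode the index also appears in the Pod name (something like resize-images-42-abcde), so a dead worker tells you which chunk it was holding.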

10.6 Job Cleanup: TTL Controller and Manual Deletion

Right, so you’ve run your Job. It did its thing. The pods are sitting there, “Completed” or maybe “Failed,” cluttering up your kubectl get pods output like dirty mugs on a desk. You’re not a digital hoarder; you want this stuff cleaned up. Kubernetes, thankfully, agrees with you. Let’s talk about how it gets done, both automatically and when you need to take matters into your own hands.

The TTL Controller: Your Automatic Janitor

First shipped as alpha in 1.12, promoted to beta in 1.21, and stable since 1.23 (so don’t panic), the TTL-after-finished controller is the primary way to automate Job cleanup. It’s brilliantly simple: you tell a Job, “Hey, live for this long after you finish, and then please vanish.” You do this by setting the .spec.ttlSecondsAfterFinished field.
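As a rough illustration of that field in context, here is a sketch of a Job that asks to be deleted an hour after it finishes. The Job name, image, command, and the one-hour value are assumptions picked for the example.

```yaml
# Sketch only: name, image, command, and TTL value are illustrative assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: cleanup-demo
spec:
  ttlSecondsAfterFinished: 3600   # delete the Job and its Pods 1 hour after it finishes
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: busybox:1.36
        command: ["sh", "-c", "echo done"]
```

The timer starts when the Job reaches a finished state, whether it completed or failed. For the manual route, kubectl delete job cleanup-demo removes the Job and, by default, the Pods it created.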

10.5 ConcurrencyPolicy: Allow, Forbid, Replace

Right, so you’ve got your CronJob set up to process a queue or crunch some numbers. It works. But what happens if the previous run hasn’t finished yet and the scheduled time for the next run rolls around? Chaos? A pile-up of angry, resource-hogging Pods? By default, quite possibly: the default policy, Allow, happily starts the new run alongside the old one. Kubernetes, thankfully, gives you a steering wheel called concurrencyPolicy to decide how to handle this exact scenario. This isn’t just a suggestion; it’s a critical piece of configuration for any non-trivial CronJob.
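To make the three options concrete, here is a sketch of a CronJob that refuses to overlap with itself. The name, schedule, image, and command are assumptions for illustration; the concurrencyPolicy values are the three named in the section title.

```yaml
# Sketch only: name, schedule, image, and command are illustrative assumptions.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: queue-drain
spec:
  schedule: "*/5 * * * *"       # every five minutes
  # Allow   (default): start the new run even if the previous one is still going
  # Forbid:  skip the new run while the previous one is still going
  # Replace: cancel the running Job and start the new one in its place
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: worker
            image: busybox:1.36
            command: ["sh", "-c", "echo draining queue && sleep 30"]
```

Forbid is a reasonable pick when two overlapping runs would fight over the same data; Replace suits work where only the freshest run matters.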

10.4 CronJob: Scheduled Jobs with Cron Syntax

Alright, let’s talk about CronJobs, the part of Kubernetes that tries its best to be your old-school system scheduler but ends up being a bit more… Kubernetes-y. Which is to say, powerful, but with a few more knobs to turn and a couple of new ways to shoot yourself in the foot. The core idea is simple: you want to run a Job (one or more Pods that run to completion) on a schedule. Easy, right? You define a schedule using that familiar, slightly-cryptic cron syntax, wrap it around a Job template, and off you go. But of course, in true Kubernetes fashion, the devil is in the details: details like time zones, concurrency, and what happens when your job takes longer to run than the time between schedules.
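A minimal sketch of that shape follows, assuming a hypothetical nightly report and a cluster recent enough to support the .spec.timeZone field. The name, schedule, time zone, image, and command are all illustrative.

```yaml
# Sketch only: name, schedule, time zone, image, and command are illustrative assumptions.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  # Cron fields: minute  hour  day-of-month  month  day-of-week
  schedule: "30 2 * * *"        # 02:30 every day
  timeZone: "Europe/London"     # without this, the schedule follows the controller's local time zone
  jobTemplate:                  # a full Job spec lives under here
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: report
            image: busybox:1.36
            command: ["sh", "-c", "echo building report"]
```

Note the nesting: the CronJob holds a jobTemplate, which holds the Pod template. Most indentation mistakes in CronJob manifests happen at one of those two levels.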

10.3 Job Failure Handling: backoffLimit and activeDeadlineSeconds

Right, so you’ve got your Job set up. It’s a beautiful, snowflake-unique container image running your bespoke data-processing script. You apply the manifest, and it runs. Perfect. Then, on Tuesday, the database it depends on goes down for five minutes. Your Job pod starts, instantly faceplants, and gets restarted. It faceplants again. And again. And again. Leave the retry settings unexamined and you’ve created a pathological resource-hogging failure machine that hammers your poor, beleaguered database until the retries run out or you manually intervene. Not ideal.
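The two fields in the section title are the brakes for this. Here is a sketch, assuming a hypothetical job that always fails, with limit values chosen purely for illustration.

```yaml
# Sketch only: name, image, command, and limit values are illustrative assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: flaky-import
spec:
  backoffLimit: 4               # give up after 4 retries (the default is 6)
  activeDeadlineSeconds: 600    # hard cap: fail the Job after 10 minutes, retries or not
  template:
    spec:
      restartPolicy: Never      # each failure produces a new Pod, so old logs survive
      containers:
      - name: import
        image: busybox:1.36
        command: ["sh", "-c", "echo connecting to database && exit 1"]
```

Retries are spaced out with an exponential back-off, and activeDeadlineSeconds takes precedence: once the deadline passes, the Job is marked failed even if backoffLimit has not been reached.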

10.2 Parallelism and Completions: Controlling Concurrency

Right, so you’ve got a job that needs to do a thing. Maybe it’s processing a million images, or sending out a batch of emails. You fire up a Job, and it creates a Pod that chugs along. But what if one Pod isn’t enough? What if you need ten, or a hundred, all working in parallel to chew through a massive work queue? That’s where we get into the real power of Jobs: parallelism, and how a Job knows when it’s done.
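In manifest terms, that comes down to two numbers. Here is a sketch, with the name, image, command, and counts all assumed for the example.

```yaml
# Sketch only: name, image, command, and counts are illustrative assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: email-batch
spec:
  completions: 100    # the Job is done once 100 Pods have exited successfully
  parallelism: 10     # never run more than 10 Pods at the same time
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: sender
        image: busybox:1.36
        command: ["sh", "-c", "echo sending one batch"]
```

Both fields default to 1 when omitted, which is why a bare Job runs exactly one Pod to completion.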

10.1 Job: Run-to-Completion Workloads

Alright, let’s talk about Jobs. You’ve got your long-running services (Deployments, StatefulSets) that just hum along forever, and then you’ve got the stuff you actually want to finish. That’s what a Job is for. Think of it as a disposable, single-shot Pod with a very specific purpose: run this container, let it do its work, and when its main process exits with a code of zero, declare victory and go home. Backup scripts, database migrations, data processing batches—this is their home.
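For reference, the smallest useful Job looks roughly like this. The name, image, and command are assumptions standing in for a real migration script.

```yaml
# Sketch only: name, image, and command are illustrative assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  template:
    spec:
      restartPolicy: Never    # Jobs accept only Never or OnFailure here, not Always
      containers:
      - name: migrate
        image: busybox:1.36
        # Exit code 0 marks the Pod as succeeded, which completes the Job.
        command: ["sh", "-c", "echo running migration && exit 0"]
```

Apply it with kubectl apply -f, and kubectl get jobs shows it move to complete once the container exits cleanly.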

...