5.9 Ephemeral Containers: Debugging Running Pods

Right, so your pod is running. Or, more accurately, it’s doing something that is not “working correctly.” It’s crashing, it’s slow, it’s returning 500s for reasons known only to the eldritch gods of distributed systems. The classic, knee-jerk reaction is to kubectl exec into it and start poking around with top, curl, or your favorite debugging tool.

But what if the pod’s container is a stripped-down, distroless nightmare that doesn’t even have a shell? No /bin/bash. No /bin/sh. Not even ls. It’s just your application, sitting in a bare-bones runtime, mocking you. Or worse, what if the application has crashed and the container is now in a CrashLoopBackOff, meaning you can’t exec into it at all because there’s no running process to latch onto?

This is where the designers of Kubernetes, in a moment of sheer brilliance (or perhaps desperation), said, “Fine, you want to debug? Here. But we’re not making it easy.” And thus, Ephemeral Containers were born. They are exactly what they sound like: a temporary container that you attach to a running pod, specifically for debugging. It shares the pod’s network and IPC namespaces, and it can also join the process namespace of a container you point it at, so you can see what the other containers see. Think of it as a surgical strike of introspection. You’re not redeploying the pod or restarting its containers; you’re just clipping in a diagnostics module.

Why This Feels So Weird

The first thing you’ll notice is that you can’t add an ephemeral container with a standard kubectl apply and a YAML file. This isn’t a declarative operation; it’s a deliberate, imperative debug command. Ephemeral containers are written through a dedicated ephemeralcontainers subresource of the pod rather than through the normal pod update path, which is, frankly, a bit clunky. This is a conscious design choice. The Kubernetes designers don’t want you defining ephemeral containers in your deployments. They are a break-glass, operational tool, not a part of your application’s architecture. This separation is annoying until you realize it prevents you from accidentally shipping a debug tool into production.
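If you want to see that separate API surface for yourself, you can read the subresource directly. This is only a sketch: the helper names are made up for illustration, the default namespace is an assumption, and the response being the Pod object as JSON is what current Kubernetes versions return.

```shell
# Hypothetical helpers; only `kubectl get --raw` and the API path are real.

# Build the API path of a pod's ephemeralcontainers subresource.
ephemeral_path() {
  pod="$1"
  namespace="${2:-default}"   # assumption: default namespace unless given
  printf '/api/v1/namespaces/%s/pods/%s/ephemeralcontainers\n' "$namespace" "$pod"
}

# Read the subresource through the API server.
read_ephemeral_subresource() {
  kubectl get --raw "$(ephemeral_path "$@")"
}
```

Calling read_ephemeral_subresource broken-app-xyz hits the same endpoint that kubectl debug writes to on your behalf.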

Getting Your Hands Dirty: The `debug` Command

The modern way to do this is with kubectl debug. Let’s say you have a pod named broken-app-xyz that’s running a distroless image and you need to see what network connections it has.

kubectl debug -it pod/broken-app-xyz --image=nicolaka/netshoot --target=broken-app

Let’s break this down because it’s important:

-it: Gives you an interactive terminal session. You know this one.
--image=nicolaka/netshoot: This is the container image you’re temporarily injecting. It’s full of fantastic networking tools (tcpdump, netstat, curl, etc.). You could use busybox or any other utility image.
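--target=broken-app: This is the crucial part. It tells the ephemeral container to share the process namespace of the container named broken-app inside the pod. This is how you can see its processes. If you omit this, you still share the pod’s network namespace, but you sit in your own isolated process namespace (unless the pod already sets shareProcessNamespace: true), which is little help when you need to look at the application’s processes. Either way, the debug container has its own filesystem, built from its own image, not the application’s. Note that --target also depends on support from the container runtime.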
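If you run this often, a small wrapper keeps the flags straight. The sketch below is a hypothetical helper, not part of kubectl; it adds --container so the debug container gets a predictable name (debugger, chosen here for illustration) instead of a generated one.

```shell
# Hypothetical wrapper around the command above; names are illustrative.
debug_target() {
  pod="$1"                          # e.g. broken-app-xyz
  target="$2"                       # container whose processes you want to see
  image="${3:-nicolaka/netshoot}"   # debug image, defaults to the one used above
  kubectl debug -it "pod/$pod" \
    --image="$image" \
    --target="$target" \
    --container=debugger            # fixed name instead of a generated one
}
```

Usage would be debug_target broken-app-xyz broken-app. One caveat with a fixed name: container names must be unique within a pod, so a second session against the same pod needs a different name.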

Once you run this, you’re dropped into a shell inside the ephemeral container. Now you can run netstat -tunap and see exactly what your main application container is doing.
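The network is not the only thing you can reach. With a shared process namespace, the target’s filesystem is visible through /proc. The helpers below are a sketch meant to be pasted into the debug shell; their names are hypothetical, and they assume the application is PID 1, which is typical with --target but worth confirming with ps. Access can be denied if the debug container runs as a different user than the application.

```shell
# Run inside the ephemeral container's shell. Helper names are hypothetical.

# /proc/<pid>/root is a window into that process's root filesystem.
proc_root() {
  printf '/proc/%s/root\n' "${1:-1}"   # assumption: the app is PID 1
}

# List a directory as the target container sees it, e.g. ls_in_target /etc
ls_in_target() {
  dir="${1:-/}"
  pid="${2:-1}"
  ls -la "$(proc_root "$pid")/${dir#/}"
}
```

For example, ls_in_target /etc lists the application’s /etc rather than the debug image’s.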

Debugging a Crashed Pod: The Copy-Pod Trick

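“But my pod is crashed!” you yell. “There’s nothing running to share a process namespace with!” Correct. For this scenario, we use a different incantation. Instead of attaching to the original pod, we ask kubectl to create a copy of it with an extra debugging container added. In the copy, that extra container is an ordinary container rather than an ephemeral one, and because it just runs a shell, it stays up even while the application container next to it keeps crashing.

kubectl debug pod/broken-app-xyz -it --copy-to=broken-app-debug --share-processes --container=debugger --image=busybox

This command does a lot:

--copy-to=broken-app-debug: Creates a new pod based on the original, named broken-app-debug. The original pod is left alone.
--share-processes: This is the magic. It enables process namespace sharing for the entire copied pod (shareProcessNamespace: true in the pod spec). Now all containers in this pod can see each other’s processes.
--container=debugger and --image=busybox: Adds your new debugger container (using the busybox image) to the copy.

Nothing is paused or frozen here. The copy is a fresh pod, so the application container starts again from scratch and, if the bug is reproducible, crashes again where you can watch it. Because of -it, you are already attached to the debugger container; use ps -ef to see the application’s processes while they are alive, and look at its filesystem through /proc/<pid>/root. You do not get the original container’s state at the moment of the crash. For that, the best available evidence is kubectl logs broken-app-xyz --previous and kubectl describe pod broken-app-xyz on the original pod. If the application dies too quickly to inspect, kubectl debug also lets you override the copied container’s command (for example, --container=broken-app -- sleep 1d, if the image ships such a binary) or swap its image with --set-image. Delete the copy when you are done; nothing cleans it up for you.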
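The copy-pod workflow, including the cleanup step that is easy to forget, can be sketched as two small helpers. The function names are hypothetical, and the default copy name (the pod name plus -debug) is just an assumption for illustration.

```shell
# Hypothetical helpers; pod and container names follow this section's example.

# Copy the pod, add a busybox debugger, and share one process namespace.
debug_copy() {
  pod="$1"
  copy="${2:-$pod-debug}"   # assumption: default name for the copy
  kubectl debug "pod/$pod" -it \
    --copy-to="$copy" \
    --share-processes \
    --container=debugger \
    --image=busybox
}

# The copy is a plain pod that no controller owns, so delete it yourself.
cleanup_copy() {
  kubectl delete pod "$1"
}
```

A session would look like debug_copy broken-app-xyz broken-app-debug, followed by cleanup_copy broken-app-debug once you have what you need.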

The Gotchas (Because Of Course There Are Gotchas)

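API Limitations: Ephemeral containers work on pods owned by any controller, DaemonSets and StatefulSets included, because they are added to the running pod rather than to the controller’s template. The real restrictions are elsewhere: an ephemeral container cannot declare ports, probes, or resource requests; static pods do not support them; and once added, an ephemeral container cannot be edited or removed.
It’s Ephemeral: The name is a hint, though a slightly misleading one. An ephemeral container is never restarted, so when you exit the shell it terminates, but its entry and its logs stay attached to the pod until the pod itself is deleted. To read the logs you need the container’s name, which kubectl debug generates for you unless you pass --container: kubectl logs broken-app-xyz -c <debug-container-name>.
Security Constraints: You need RBAC permission on the pods/ephemeralcontainers subresource to add one at all, and admission policies apply to ephemeral containers too. If the namespace enforces a restrictive Pod Security Standard, or the pod sets pod-level rules such as runAsNonRoot: true, a debug image that expects to run as root may be rejected or fail to start. Container-level settings on the application (e.g., readOnlyRootFilesystem: true, allowPrivilegeEscalation: false) are not inherited by the debug container. Recent kubectl versions offer a --profile flag that applies a preset security context to the debug container. Granting the debug container more privilege than the application has is a security trade-off you must consciously make.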
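Since an ephemeral container’s entry remains in the pod after the session ends, it helps to be able to list what has been attached and pull logs by name. A minimal sketch follows; the function names are hypothetical, while the jsonpath expression reads the pod’s spec.ephemeralContainers field.

```shell
# Hypothetical helpers for finding debug containers after the fact.

# Names of all ephemeral containers recorded in the pod spec.
list_ephemeral_containers() {
  kubectl get pod "$1" -o jsonpath='{.spec.ephemeralContainers[*].name}'
}

# Logs of one ephemeral container, available for as long as the pod exists.
ephemeral_logs() {
  kubectl logs "$1" -c "$2"
}
```

Run list_ephemeral_containers broken-app-xyz first, then pass one of the printed names to ephemeral_logs.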

The bottom line is this: Ephemeral Containers are a powerful admission that the real world is messy. They are the ultimate “you’re not supposed to need this, but when you do, you’ll be thankful it exists” feature. They feel like a hack because, in many ways, they are. But they’re a brilliantly designed, indispensable hack that perfectly embodies the Kubernetes philosophy of providing escape hatches for operational reality. Use them wisely, and never, ever put them in your actual deployment manifests.
