8.7 Typical Use Cases: Databases, Kafka, Zookeeper
Right, so you’ve got your stateless web apps happily humming along on Deployments, scaling up and down without a care in the world. But now you need to run the important stuff: the things that remember who they are and where they left off. You need to run a database, a Kafka cluster, or Zookeeper. For these, a Deployment is a disaster waiting to happen. You don’t just need a Pod; you need a specific Pod with a specific identity and access to its specific data. Enter the StatefulSet, the Kubernetes controller that treats your pets like actual pets, not cattle.
The magic, and the complexity, of a StatefulSet boils down to a few key features: a stable network identity, ordered and graceful deployment and scaling, and persistent storage that is, you guessed it, stable. Let’s break down why this is non-negotiable for stateful workloads.
Stable Network Identity is Everything
When a StatefulSet Pod is created, it gets a predictable name based on the StatefulSet name and a sequential index: <statefulset-name>-<ordinal-index>. So for a StatefulSet named kafka, you get Pods kafka-0, kafka-1, kafka-2, and so on.
More importantly, each Pod gets its own stable DNS hostname inside the cluster. The Pod kafka-1 will always be reachable at kafka-1.kafka.default.svc.cluster.local (assuming the kafka Service exists in the default namespace and the cluster uses the default cluster.local domain). This is absolute gold. Your applications can hardcode connection strings to specific brokers or replicas. Your database replicas can form clusters by knowing exactly how to find their peers. This is the complete opposite of the random-hash Pod names you get with Deployments, where every new Pod is a stranger to the cluster.
Here’s a minimal service definition you need to create before the StatefulSet. Note the clusterIP: None; this is a “headless” service, meaning Kubernetes won’t load-balance it. Instead, it just provides the DNS records we desperately need for our individual Pods.
apiVersion: v1
kind: Service
metadata:
  name: kafka
  labels:
    app: kafka
spec:
  ports:
    - port: 9092
  clusterIP: None
  selector:
    app: kafka
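One way to confirm the per-Pod records exist is a throwaway Pod that resolves one of them. This is a minimal sketch: the Pod name dns-check and the busybox:1.28 image are assumptions, and the lookup should only succeed once kafka-1 is running and Ready, since a headless Service publishes records for Ready Pods by default.

```yaml
# Throwaway Pod for checking a per-Pod DNS record.
# The name "dns-check" and the image tag are assumptions, not part of the setup above.
apiVersion: v1
kind: Pod
metadata:
  name: dns-check
  namespace: default
spec:
  restartPolicy: Never
  containers:
    - name: lookup
      image: busybox:1.28   # any image that ships nslookup should do
      command:
        - nslookup
        - kafka-1.kafka.default.svc.cluster.local
```

After it completes, kubectl logs dns-check should show the lookup result; delete the Pod once you have checked it.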
Ordered, Graceful Pod Management
A StatefulSet respects order. When you scale up, it creates Pods one at a time, in order (0, then 1, then 2…). It waits for each Pod to be “Ready” before starting the next one. When you scale down, it does the reverse, terminating the highest ordinal Pod first (2, then 1, then 0…).
This is crucial. You can’t have postgres-2 try to join a cluster as a replica if postgres-1 and postgres-0 aren’t even running yet. The ordered startup ensures your cluster forms correctly. The ordered termination gives the member with the highest index time to hand off its work cleanly, or gives its peers time to mark it as dead, before the next Pod is touched.
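Because the controller waits on “Ready”, the ordering is only as meaningful as your readiness probe. The fragment below is a sketch of the two relevant settings; the probe type and timing values are assumptions to tune for your workload, not recommendations.

```yaml
# Fragment of a StatefulSet spec, not a complete manifest.
spec:
  # OrderedReady is the default. "Parallel" drops the one-at-a-time
  # behaviour for scaling operations.
  podManagementPolicy: OrderedReady
  template:
    spec:
      containers:
        - name: kafka
          readinessProbe:
            tcpSocket:
              port: 9092            # Ready once the broker port accepts connections
            initialDelaySeconds: 15 # assumed value
            periodSeconds: 10       # assumed value
```

A TCP check only proves the port is open. For a database you would likely want a probe that also confirms the instance has joined its cluster.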
Persistent Storage That Sticks Around
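This is where the volumeClaimTemplates section comes in. In a Deployment, every replica shares whatever PVC the Pod template references, so giving each instance its own volume means hand-crafting one Deployment and one PVC per instance (a nightmare). A StatefulSet instead uses the template to dynamically create a unique PersistentVolumeClaim for each Pod it creates.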
Here’s the kicker: if Pod kafka-1 dies and gets rescheduled, the StatefulSet controller ensures it gets reattached to the exact same PVC named data-kafka-1. The data is persistent across Pod restarts and even across node failures. This is the foundation of statefulness.
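The claim also outlives the Pod, and by default the StatefulSet itself. On recent Kubernetes versions this can be stated explicitly in the spec. The fragment below is a sketch that assumes your cluster supports the persistentVolumeClaimRetentionPolicy field (on by default from roughly 1.27; check your version before relying on it).

```yaml
# Fragment of a StatefulSet spec, not a complete manifest.
# Assumes a cluster version that supports this field.
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain   # keep PVCs when the StatefulSet is deleted (the default)
    whenScaled: Retain    # keep PVCs for Pods removed by a scale-down (the default)
```

Setting either value to Delete asks the controller to remove the claims automatically, which is rarely what you want for a database.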
Let’s look at a StatefulSet for a Kafka broker. Notice the serviceName field pointing to the headless service we made earlier, and the volumeClaimTemplates.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: "kafka"
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: kafka
          image: bitnami/kafka:latest
          ports:
            - containerPort: 9092
          env:
            - name: KAFKA_CFG_NODE_ID
              value: "0" # This will be overridden per-Pod, see below.
            - name: KAFKA_CFG_PROCESS_ROLES
              value: "controller,broker"
            - name: KAFKA_CFG_CONTROLLER_QUORUM_VOTERS
              # The magic: using the stable DNS to configure the cluster
              value: "0@kafka-0.kafka.default.svc.cluster.local:9093,1@kafka-1.kafka.default.svc.cluster.local:9093,2@kafka-2.kafka.default.svc.cluster.local:9093"
          volumeMounts:
            - name: data
              mountPath: /bitnami/kafka
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "my-fast-storage" # Change this to whatever you've provisioned
        resources:
          requests:
            storage: 20Gi
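To make the naming concrete, here is roughly what the claim generated for kafka-1 would look like, given the data template above. It is an illustration of the controller's output, trimmed to the essentials and assuming the default namespace; you do not apply it yourself.

```yaml
# Approximately what the StatefulSet controller creates for Pod kafka-1.
# Shown for illustration only; do not apply this by hand.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-kafka-1        # <claim template name>-<pod name>
  namespace: default        # assumed namespace
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: my-fast-storage
  resources:
    requests:
      storage: 20Gi
```

Running kubectl get pvc after the rollout should list data-kafka-0, data-kafka-1, and data-kafka-2.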
The Devil’s in the Details: Pitfalls and Best Practices
You are the orchestration logic: Kubernetes gives you the stable building blocks; it doesn’t run your database’s clustering logic. You are responsible for writing init scripts or entrypoints that read the Pod’s ordinal index (the numeric suffix of the Pod name, which is also the Pod’s hostname and so shows up in the HOSTNAME environment variable) and use it to configure the node ID. The example above is simplistic, since it hardcodes KAFKA_CFG_NODE_ID to 0 for every Pod; a better pattern is to use the Downward API to inject the Pod name as an env var and script your container to derive KAFKA_CFG_NODE_ID from it.
Stuck Pods are a nightmare: If a Pod (kafka-2) gets stuck in a Terminating state because its node is unreachable, the StatefulSet controller will not create a replacement until the old Pod object is gone, and you’ll be blocked from scaling the StatefulSet in the meantime. You should not casually force delete it, either: if the original container is still running on the unreachable node, you can end up with two kafka-2 instances claiming the same identity. Confirm the node is really dead, then clean up the Pod’s API object by hand. It’s a safety feature, but it’s painful.
Storage is your real bottleneck: The StatefulSet guarantees the PVC binding, but it doesn’t guarantee performance or that your underlying storage provider can handle the IOPS. If your cloud disk is slow, your database will be slow. This is the hardest part to get right.
Backups are on you: Persistent Volumes are durable, but they are not backups. You absolutely must have a robust, automated backup and disaster recovery strategy for the data inside these volumes. kubectl delete statefulset kafka will happily obliterate your Pods and, by default, leave the PVCs dangling. That keeps the data around, but if you’re not careful when cleaning up those orphaned claims (deleting them, or the whole namespace, on a storage class whose reclaim policy is Delete), it’s a one-way ticket to data loss city.
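The first pitfall, deriving the node ID from the Pod’s ordinal, can be sketched as a small wrapper around the container’s entrypoint. Treat this as an assumption-heavy illustration: POD_NAME is a name chosen here, and the two script paths are assumed to match the bitnami/kafka image’s entrypoint and command, so verify them against the image you actually run.

```yaml
# Fragment of the StatefulSet's Pod template (spec.template.spec); not a complete manifest.
# The other env vars and the volumeMounts from the manifest above stay as they are,
# and the hardcoded KAFKA_CFG_NODE_ID entry should be removed.
containers:
  - name: kafka
    image: bitnami/kafka:latest
    env:
      - name: POD_NAME                  # hypothetical variable name
        valueFrom:
          fieldRef:
            fieldPath: metadata.name    # e.g. "kafka-2"
    command:
      - sh
      - -c
      - |
        # Strip everything up to the last "-" to get the ordinal: kafka-2 -> 2.
        export KAFKA_CFG_NODE_ID="${POD_NAME##*-}"
        # Assumed paths for the image's original entrypoint and command.
        exec /opt/bitnami/scripts/kafka/entrypoint.sh /opt/bitnami/scripts/kafka/run.sh
```

On clusters that set the apps.kubernetes.io/pod-index label on StatefulSet Pods (added around Kubernetes 1.28), the ordinal can instead be read directly through a fieldRef on that label, which avoids the shell parsing.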
StatefulSets are the grown-up’s tool in Kubernetes. They demand more respect and understanding than Deployments, but they are the only sane way to run stateful distributed systems. They acknowledge the messy reality of state and give you the tools to manage it with some semblance of order.