17.4 Persistent Volume Claims (PVC): Requesting Storage
Right, so you’ve got your Persistent Volumes (PVs) sitting there, ready for action. They’re the disks. But you and I don’t just go around grabbing random disks off a shelf and plugging them into our servers. That would be chaos. This is Kubernetes, not a yard sale. We need a system. Enter the Persistent Volume Claim (PVC). Think of a PVC as your very polite, very specific request to the cluster: “Excuse me, I would like approximately this much storage, with these performance characteristics, please and thank you.”
The magic here is the separation of concerns. As an application developer, you don’t need to care which specific NFS share or SSD in the datacenter your data lives on. You just need to know it will be there, meeting your specs. You write a PVC manifest. It’s the cluster admin’s job (or more likely, a dynamic provisioner’s) to ensure something exists that can fulfill your request. This is Kubernetes doing what it does best: abstracting away the messy physical reality.
The Absolute Basics of a PVC Manifest
A PVC is just another Kubernetes object, defined in a YAML file. At its simplest, it needs a name, an access mode, and a request for resources. Let’s break one down.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-database-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd
This YAML is you saying: “I need at least 10 GiB of fast SSD storage that can be written to and read from by one node at a time.” Let’s dissect why you’re saying it that way.
Access Modes Are Not Suggestions
This is a common point of confusion, so let’s be direct. accessModes are hard, technical constraints, not polite preferences. Recent Kubernetes releases add a fourth mode, ReadWriteOncePod, which limits the volume to a single pod instead of a single node, but the three you will meet most often are:
- ReadWriteOnce (RWO): The volume can be mounted as read-write by a single node (several pods on that node can still share it). This is the typical mode for block storage (like AWS EBS, GCP PD). It’s perfect for stateful applications like databases.
- ReadOnlyMany (ROX): The volume can be mounted read-only by many nodes. Use case? Maybe static assets or reference data that every pod needs to read but never write to. Pretty rare.
- ReadWriteMany (RWX): The volume can be mounted as read-write by many nodes. This is the holy grail for shared workloads, but it’s also the hardest to implement. NFS and some cloud file systems (Azure Files, AWS EFS) support this. If you request this, your cluster must have a PV that supports it, or the binding will fail.
Mismatched access modes are a classic reason for a PVC to get stuck in the Pending state. A claim only binds to a PV whose accessModes list includes every mode the claim asks for, so you can’t bind a ReadWriteMany PVC to a PV that only offers ReadWriteOnce. It’s like trying to fit a square peg in a round hole, and Kubernetes will just shrug and leave the claim waiting.
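To make the matching rule concrete, here is a minimal sketch of a statically provisioned NFS volume and a claim that can bind to it. The server address, export path, and object names are hypothetical placeholders, and storageClassName is set to an empty string on both sides (an assumption made for this example) so that no dynamic provisioner gets involved.

```yaml
# Hypothetical NFS-backed PV; server and path are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-nfs-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:              # every mode this volume can offer
    - ReadWriteMany
    - ReadOnlyMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""      # no class: static binding only
  nfs:
    server: nfs.example.internal
    path: /exports/shared
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-assets-pvc
spec:
  accessModes:
    - ReadWriteMany         # must appear in the PV's accessModes list
  resources:
    requests:
      storage: 20Gi         # a 50Gi PV can satisfy a 20Gi request
  storageClassName: ""      # only consider PVs that have no class
```

Change the claim to request ReadWriteOnce and it should stay Pending, because this particular PV does not list that mode.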
Storage Classes: The Key to Dynamic Happiness
Notice the storageClassName field? This is where the real magic happens. Without a StorageClass in play (named in the claim or set as the cluster default), someone is provisioning volumes by hand, which is a pain. With one, you’re telling the cluster, “Go use the provisioning logic defined in the StorageClass named fast-ssd to automatically create a PV that meets this claim.”
The StorageClass defines the provisioner (e.g., ebs.csi.aws.com), the parameters (e.g., type: gp3), and the reclaimPolicy (e.g., Delete or Retain). Your PVC just points at the right class. This is how you automate the entire storage lifecycle. If you don’t specify a storageClassName, it will use the cluster’s default StorageClass, which may or may not be what you want (it’s often some standard block storage).
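For reference, a StorageClass that the fast-ssd claim could point at might look roughly like the sketch below. The provisioner, the type parameter, and the reclaimPolicy come from the examples in the text; the volumeBindingMode line is an assumption added for illustration and is not something the claim itself requires.

```yaml
# A sketch of the fast-ssd class, assuming the AWS EBS CSI driver is installed.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com      # which driver creates the disks
parameters:
  type: gp3                       # passed through to the provisioner
reclaimPolicy: Delete             # what happens to the PV when the PVC is deleted
# Assumption: delay provisioning until a pod actually uses the claim,
# so the disk is created in the zone where the pod lands.
volumeBindingMode: WaitForFirstConsumer
```

With WaitForFirstConsumer, a claim stays Pending until a pod references it, which is expected behavior and not a sign that something is broken.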
The Binding Process: What Actually Happens?
You run kubectl apply -f pvc.yaml. What now?
- You see kubectl get pvc show Pending. Kubernetes starts looking for a matching PV.
- If a matching PV exists: It binds to it! The PVC status flips to Bound. That PV is now yours and off the market.
- If no PV exists BUT a StorageClass is specified: The StorageClass’s provisioner kicks in. It talks to the cloud provider’s API, creates an actual disk, automatically creates a PV object representing that disk, and binds it to your PVC. You go from Pending to Bound automatically.
- If no PV exists and NO StorageClass is specified: Your PVC is stuck in Pending forever, like a sad party guest with no invite. You’ll see this if you forget the storageClassName and your cluster has no default StorageClass defined.
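The last case depends on whether the cluster has a default StorageClass. As a rough sketch, a class is marked as the default with an annotation, and a claim that leaves out storageClassName then falls through to it. The names standard and scratch-pvc are hypothetical, and the provisioner is borrowed from the earlier AWS example as an assumption.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard                  # hypothetical name
  annotations:
    # This annotation is what makes the class the cluster default.
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com      # assumption: AWS EBS CSI driver
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: scratch-pvc               # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  # storageClassName omitted on purpose: the default class is used.
```

Note that omitting the field is different from setting storageClassName: "", which asks for a PV with no class and opts out of dynamic provisioning.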
Using a Claim in a Pod
This is the payoff. You don’t reference the PV in your Pod spec; you reference the PVC. The PV is an implementation detail.
apiVersion: v1
kind: Pod
metadata:
  name: my-database-pod
spec:
  containers:
    - name: db
      image: postgres:15
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-database-pvc # This is the crucial line
When this pod is scheduled to a node, Kubernetes sees the PVC reference, finds the PV that is bound to it, attaches the underlying storage device to that node, and has the kubelet mount it into your container. It’s a beautiful, complex dance that you get to watch with a simple kubectl get pods,pvc,pv.
Best Practices and Pitfalls
- Define PVCs alongside your app manifests: Keep them in the same directory, apply them together. Your storage claim is a core part of your application’s definition.
- Understand the reclaimPolicy: If your PVC’s StorageClass has reclaimPolicy: Delete, deleting the PVC will delete the underlying PV and the storage asset! This is great for dev, terrifying for prod. For production data, you often want reclaimPolicy: Retain on the StorageClass. Then you manually remove the PV and handle the disk yourself after the PVC is deleted.
- Size matters, and you can’t easily change it: Unlike in the cloud console, you can’t just slide a Kubernetes PV size bar up. Growing a PVC has been a stable feature since Kubernetes 1.24, but it only works if the StorageClass sets allowVolumeExpansion: true and the underlying storage driver supports it, and shrinking is not supported at all. Get your size right the first time. It’s easier to request a little more than you need than to try and grow it later.
- volumeName is for control freaks: You can add a volumeName field to your PVC spec to force it to bind to a specific PV. You probably don’t need this level of micromanagement. Let the dynamic provisioner do its job.
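Putting the reclaim and sizing advice together, a production-leaning setup might look something like this sketch. The class name fast-ssd-retain and the claim name prod-database-pvc are hypothetical, the provisioner and parameters are carried over from the earlier example as assumptions, and whether expansion actually works depends on the storage driver.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd-retain           # hypothetical name
provisioner: ebs.csi.aws.com      # assumption: AWS EBS CSI driver
parameters:
  type: gp3
reclaimPolicy: Retain             # keep the PV and the disk when the PVC is deleted
allowVolumeExpansion: true        # required before a claim can be grown
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prod-database-pvc         # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi               # raised from 10Gi; re-apply to request growth
  storageClassName: fast-ssd-retain
```

Growing the claim means raising the storage request and re-applying the manifest; lowering it is rejected.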