41.3 Upgrading with kubeadm: Step-by-Step
Right, so you’ve decided to upgrade your cluster with kubeadm. Good choice. It’s the officially blessed path, which means it’s less “holding a live badger” and more “performing delicate surgery.” Still, you’re holding a scalpel, not a chainsaw, so let’s proceed with a bit of finesse. The core idea here is that kubeadm upgrades one node at a time, and it does this by first upgrading the control plane nodes, one by one, and then the workers. You don’t just throw a switch and upgrade the whole thing at once; that’s a fantastic way to schedule an unplanned outage and order a pizza for a long, sad night.
First things first: check your current cluster version and see what you can even upgrade to. kubeadm does not let you skip minor versions; you move up one minor version at a time. You can’t just leap from 1.24 to 1.27 and expect it to work. It won’t. It will laugh at you. Get there in hops instead: 1.24 to 1.25, 1.25 to 1.26, then 1.26 to 1.27.
kubeadm version
kubectl get nodes -o wide
kubeadm upgrade plan
That last command is your crystal ball. It checks the current state of your cluster and tells you exactly which versions are available for you to upgrade to, along with any component configs that might need manual intervention. Do not skip this step. I’ve skipped it. You’ll skip it. We all think we’re too clever until we’re not. Just run it.
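The one-minor-version-at-a-time rule is easy to check mechanically before you even log in to a node. Here’s a minimal sketch in shell. The function name can_upgrade is made up for illustration, and it assumes version strings shaped like v1.27.4 or 1.27.4. kubeadm upgrade plan remains the authority; this only catches the obvious mistake early.

```shell
# Hypothetical helper (not part of kubeadm): succeed only if the target is in
# the same minor release as the current version, or exactly one minor newer.
can_upgrade() {
  cur="${1#v}"; tgt="${2#v}"                 # strip an optional leading "v"
  cur_major="${cur%%.*}"; cur_rest="${cur#*.}"; cur_minor="${cur_rest%%.*}"
  tgt_major="${tgt%%.*}"; tgt_rest="${tgt#*.}"; tgt_minor="${tgt_rest%%.*}"
  [ "$cur_major" -eq "$tgt_major" ] || return 1
  hop=$(( tgt_minor - cur_minor ))
  [ "$hop" -ge 0 ] && [ "$hop" -le 1 ]       # 0 = patch upgrade, 1 = next minor
}

# Example: a three-minor jump is rejected, so the message is printed.
can_upgrade v1.24.17 v1.27.4 || echo "too big a jump: go through 1.25 and 1.26 first"
```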
Pre-flight checklist: your escape pod
Before you touch a single control plane node, you need a parachute. In this metaphor, your parachute is a backup of your etcd data and all your Kubernetes manifests. If something goes horribly wrong, you’ll want to be able to get back to where you started.
# Create a backup of etcd running on the first control plane node.
# Assumes Docker is installed on this node; match the image tag to the etcd version your cluster runs.
sudo docker run --rm -v /etc/kubernetes/pki/etcd:/etc/kubernetes/pki/etcd:ro \
  --network host \
  -v "$(pwd)":/backup \
  registry.k8s.io/etcd:3.5.3-0 \
  etcdctl snapshot save /backup/etcd-snapshot-$(date +%Y-%m-%d).db \
  --endpoints https://127.0.0.1:2379 \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key \
  --cacert /etc/kubernetes/pki/etcd/ca.crt
# And back up your manifests. They're probably in /etc/kubernetes/manifests/.
# /backup only exists inside the container above, so copy next to the snapshot on the host.
sudo cp -R /etc/kubernetes/manifests "$(pwd)/manifests-backup"
Why do we do this? Because etcd is the cluster’s brain. If you corrupt it during an upgrade, you don’t have a cluster anymore; you have a very expensive and complicated paperweight. This backup is your “oh, crap” handle.
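A backup you haven’t looked at is a hope, not a backup. As a rough sketch, the helper below (check_snapshot is a hypothetical name, not an etcd tool) refuses a missing or empty snapshot file and records a SHA-256 checksum next to it, so you can tell later whether the file has changed. It does not prove the snapshot is restorable; it only catches the truncated-file case.

```shell
# Hypothetical helper: reject a missing or zero-byte snapshot, then write
# a checksum file alongside it (<snapshot>.sha256).
check_snapshot() {
  snapshot="$1"
  if [ ! -s "$snapshot" ]; then
    echo "snapshot missing or empty: $snapshot" >&2
    return 1
  fi
  sha256sum "$snapshot" > "$snapshot.sha256"
}
```

Call it with the path of the snapshot you just wrote, and later run sha256sum -c against the .sha256 file to confirm nothing has touched the backup.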
Upgrading the first control plane node
Now for the main event. We start with the first control plane node. This is where the magic (read: anxiety) happens. The process is straightforward but demands your attention.
Drain the node. This politely (and forcefully) evicts all your pods so the upgrade doesn’t interfere with running workloads. It’s like asking guests to step out of a room so you can change the carpet.
kubectl drain <first-control-plane-node> --ignore-daemonsets --delete-emptydir-data
The --ignore-daemonsets flag is non-negotiable: kubectl drain refuses to proceed on a node running DaemonSet-managed pods, like your network plugin, unless you pass it. --delete-emptydir-data (called --delete-local-data in older kubectl releases) is a bit scary: it lets the drain evict pods that use emptyDir volumes, and that data is gone once the pod is. If you have important stuff there, you should have already handled it. This is a reminder that you shouldn’t rely on local storage for anything important.
Upgrade kubeadm itself. You can’t use an old version of kubeadm to upgrade to a new version of Kubernetes. That would be like using a 1990s road map to navigate today: it mostly works until you drive into a new lake.
# On Ubuntu/Debian (if you pinned the package with apt-mark hold, unhold it first)
sudo apt update
sudo apt install kubeadm=<new-version>-00
Run the upgrade plan again. This is a sanity check to make sure everything is still looking good before we commit.
sudo kubeadm upgrade plan
Apply the upgrade. This is the command that actually does the work. Note the --certificate-renewal flag. It defaults to true, which is what you want. Let kubeadm handle renewing all those fancy certificates for you.
sudo kubeadm upgrade apply v1.27.4
This command upgrades the static Pod manifests for the control plane components (kube-apiserver, kube-controller-manager, kube-scheduler) and also updates cluster-wide state, like the kubeadm ClusterConfiguration and the CoreDNS and kube-proxy add-ons. Watch the output. It will tell you exactly what it’s doing. If it succeeds, breathe a sigh of relief. You’re halfway there.
Update kubelet and kubectl. You’ve upgraded the control plane, but the kubelet on this node is still the old version. You need to bring it up to date and restart it.
sudo apt install kubelet=<new-version>-00 kubectl=<new-version>-00
sudo systemctl daemon-reload
sudo systemctl restart kubelet
Uncordon the node. Bring the node back online so it can start scheduling workloads again.
kubectl uncordon <first-control-plane-node>
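Before moving on to the next node, it’s worth confirming that the kubelet really is reporting the new version. A small sketch, assuming kubectl is configured on the machine you’re running it from. node_is_at is an illustrative name, and the jsonpath expression reads the kubeletVersion field from the node’s status.

```shell
# Hypothetical helper: succeed if the node's kubelet reports the expected version.
# $1 = node name, $2 = expected version, e.g. v1.27.4
node_is_at() {
  node="$1"; expected="$2"
  actual="$(kubectl get node "$node" -o jsonpath='{.status.nodeInfo.kubeletVersion}')"
  if [ "$actual" != "$expected" ]; then
    echo "$node reports '$actual', expected '$expected'" >&2
    return 1
  fi
}
```

Something like node_is_at <first-control-plane-node> v1.27.4 makes a reasonable gate in a script: if it fails, stop and look before touching the next node.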
Upgrading the rest of the control plane
For the other control plane nodes, the process is nearly identical but much simpler. You don’t run kubeadm upgrade apply again. That command is only for the first node. For subsequent control plane nodes, you use:
sudo kubeadm upgrade node
This command is far less dramatic. It only upgrades the local static Pod manifests on that specific node, ensuring they match the new cluster version. Everything around it is the same as before: drain the node, upgrade the kubeadm package, run kubeadm upgrade node, update and restart the kubelet, then uncordon the node. Rinse and repeat for every control plane node you have.
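The only real fork in the road is which kubeadm subcommand a given node gets. If you script any of this, it can help to make that decision explicit in one place. A tiny sketch, where the function name and the role labels are my own invention:

```shell
# Hypothetical helper: print the kubeadm command for a node based on its role.
# $1 = "first-control-plane", "control-plane" or "worker"; $2 = target version
kubeadm_upgrade_command() {
  case "$1" in
    first-control-plane) echo "sudo kubeadm upgrade apply $2" ;;
    control-plane|worker) echo "sudo kubeadm upgrade node" ;;
    *) echo "unknown role: $1" >&2; return 1 ;;
  esac
}
```

Only the first control plane node ever sees upgrade apply; every other node in the cluster, control plane or worker, gets upgrade node.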
Upgrading the worker nodes
Worker nodes are the easy part. For each one:
- Drain the node (kubectl drain).
- SSH into it.
- Upgrade kubeadm, then run sudo kubeadm upgrade node. This command on a worker is even simpler; it just configures the local kubelet for you.
- Upgrade and restart the kubelet (sudo apt install kubelet=<new-version> && sudo systemctl daemon-reload && sudo systemctl restart kubelet).
- Uncordon the node (kubectl uncordon).
That’s it. The pods evicted during each drain get rescheduled onto whichever other nodes have room, and once a node is uncordoned it starts accepting new pods again; nothing automatically moves the old ones back. Do the nodes one at a time, and watch your cluster’s metrics like a hawk during the process. If something goes sideways, you’ve got that etcd backup. But you probably won’t need it. You’ve got this.
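If you have more than a handful of workers, the per-node routine is worth scripting. The sketch below is only that: the function names, the DRY_RUN switch, the worker-1 and worker-2 node names, and the assumption that you can ssh to each node by its node name with passwordless sudo are all mine, not kubeadm’s. With DRY_RUN=1 (the default here) it only prints the commands, so you can read them before trusting them.

```shell
# Hypothetical wrapper: run a command, or just print it when DRY_RUN=1 (default).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Upgrade one worker: drain, upgrade over ssh, uncordon. Stops at the first failure.
# $1 = node name (assumed to also be its ssh host name), $2 = package version
upgrade_worker() {
  node="$1"; pkg_version="$2"
  run kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data || return 1
  run ssh "$node" "sudo apt-get update && sudo apt-get install -y kubeadm=$pkg_version" || return 1
  run ssh "$node" "sudo kubeadm upgrade node" || return 1
  run ssh "$node" "sudo apt-get install -y kubelet=$pkg_version && sudo systemctl daemon-reload && sudo systemctl restart kubelet" || return 1
  run kubectl uncordon "$node"
}

# One node at a time; bail out of the loop as soon as a node fails.
for node in worker-1 worker-2; do
  upgrade_worker "$node" "<new-version>" || break
done
```

The loop is deliberately serial. Draining several workers at once is how a routine upgrade turns into a capacity problem.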