Kubernetes 1.22 was recently released and the steady march towards removing the already deprecated dockershim continues. On-premises Kubernetes cluster admins may be a bit daunted by the task of migrating to a new container runtime, but fear not! We recently helped a client through the process and wanted to share our findings to help make the migration easier for everyone.
Preparations
Before you start re-adding nodes to the Kubernetes cluster, it’d be wise to set the kubelet-config configmap to use the systemd cgroup driver. You can find instructions on that here:
Migrating Kubernetes to Containerd from Docker
Start by picking your least favorite worker node, cordoning it and waiting for the workloads to drain from it.
kubectl drain <node-name> --ignore-daemonsets
Then remove it from the cluster.
kubectl delete node <node-name>
Now reset the removed node and clean it up:
kubeadm reset
Stop Docker and the kubelet:
systemctl stop kubelet
systemctl disable --now docker
Remove Docker and install (or reinstall) containerd. Some package managers remove containerd automatically as a dependency of docker-ce unless you have marked the containerd package as manually installed, so check that it is still present after Docker is gone.
Make sure you clean up the data that isn’t cleared when running kubeadm reset:
- /etc/cni/net.d/*
- /opt/cni
- /var/lib/docker
- /var/lib/dockershim
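The cleanup above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name `cleanup_leftovers` and the `ROOT` prefix variable are hypothetical and not part of the original procedure. `ROOT` exists purely so the function can be rehearsed against a scratch directory before being run as root on the real node.

```shell
#!/bin/sh
# Sketch: remove the leftovers listed above after `kubeadm reset`.
# ROOT is a hypothetical prefix for rehearsals; leave it empty on a real node.
ROOT="${ROOT:-}"

cleanup_leftovers() {
    # Remove the CNI configs but keep the net.d directory itself.
    rm -rf "${ROOT}"/etc/cni/net.d/*
    # Remove the remaining directories wholesale.
    rm -rf "${ROOT}/opt/cni" \
           "${ROOT}/var/lib/docker" \
           "${ROOT}/var/lib/dockershim"
}
```

On the node itself you would source this file and call `cleanup_leftovers` as root with `ROOT` unset. Note that /var/lib/docker holds all local images and volumes, so double-check nothing in there is still needed.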
Prepare the system for containerd as described here: https://kubernetes.io/docs/setup/production-environment/container-runtimes/#containerd
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Setup required sysctl params, these persist across reboots.
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
# Apply sysctl params without reboot
sudo sysctl --system
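If you want to confirm the settings actually took effect, a small filter over the output of `sysctl` is one way to do it. This is a sketch, and the helper name `sysctls_not_enabled` is made up for illustration. It assumes the usual `key = value` output format of `sysctl`.

```shell
#!/bin/sh
# Sketch: read `sysctl` output ("key = value" per line) on stdin and print
# every key whose value is not 1. No output means all settings are applied.
sysctls_not_enabled() {
    awk -F' *= *' '$2 != "1" { print $1 }'
}

# Usage on the node (not run here):
#   sysctl net.bridge.bridge-nf-call-iptables \
#          net.bridge.bridge-nf-call-ip6tables \
#          net.ipv4.ip_forward | sysctls_not_enabled
```

An empty result means all three parameters are set to 1. If `sysctl` reports the bridge keys as unknown, the br_netfilter module is probably not loaded.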
Now we generate the default config for containerd:
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
Next, edit the containerd config file, /etc/containerd/config.toml, to enable the systemd cgroup driver for the CRI. Below the line:
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
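Add the following line. Depending on your containerd version, the generated defaults may already contain a SystemdCgroup entry in that section; in that case set it to true instead of adding a second one: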
SystemdCgroup = true
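If you have a lot of nodes to get through, the edit can be scripted. The following is a minimal sketch, not the only way to do it: the function name `enable_systemd_cgroup` is hypothetical, and it assumes the config file contains at most one SystemdCgroup entry (as in the generated defaults, which only define the runc runtime).

```shell
#!/bin/sh
# Sketch: make sure SystemdCgroup = true is set in a containerd config file.
# Assumes at most one SystemdCgroup entry exists in the file.
enable_systemd_cgroup() {
    conf="$1"
    header='[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]'
    tmp="$(mktemp)"
    if grep -q '^[[:space:]]*SystemdCgroup[[:space:]]*=' "$conf"; then
        # Entry already present: flip its value, keeping the indentation.
        sed 's/^\([[:space:]]*\)SystemdCgroup[[:space:]]*=.*/\1SystemdCgroup = true/' \
            "$conf" > "$tmp"
    else
        # Entry absent: add it directly below the runc options header.
        awk -v header="$header" '
            { print }
            index($0, header) { print "  SystemdCgroup = true" }
        ' "$conf" > "$tmp"
    fi
    # Write back in place so ownership and permissions are preserved.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

On a node this would be called as root with `enable_systemd_cgroup /etc/containerd/config.toml`. It is worth diffing the file against a backup afterwards, since the sketch does no TOML parsing and simply matches lines.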
Start and enable containerd (if it was already running, restart it instead so the edited config is picked up):
systemctl enable --now containerd
Once containerd is up and running you’ll need to re-install kubelet and kubeadm. I found that kubelet only reliably detected the new container runtime when reinstalled. If anyone knows of a way to get this to work without reinstalling, send your answers on a postcard please :)
Finally, rejoin the node to the cluster. You should be able to verify the node’s container runtime with:
kubectl get nodes -o wide
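On a larger cluster it can be handy to list only the nodes that still need migrating. The sketch below assumes the runtime is reported in the node status field `.status.nodeInfo.containerRuntimeVersion` with a `docker://` or `containerd://` prefix; the helper name `nodes_still_on_docker` is made up for illustration.

```shell
#!/bin/sh
# Sketch: read "<node>\t<runtime>" lines on stdin and print the nodes whose
# runtime still starts with docker://.
nodes_still_on_docker() {
    awk -F'\t' '$2 ~ /^docker:\/\// { print $1 }'
}

# Usage against a live cluster (not run here):
#   kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}' \
#     | nodes_still_on_docker
```

An empty result means every node reports a runtime other than Docker.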
Once this is all done for your worker nodes, your control plane nodes are next. The steps are mostly the same, but if you run etcd in-cluster, it may be a good idea to reset your nodes before you delete them, to ensure they safely exit the etcd cluster.
Now with Docker removed and your Kubernetes cluster fully migrated to containerd, you have much less to worry about in the upcoming Kubernetes updates! You still have a little bit of time of course: dockershim is already deprecated, and the earliest it could be removed is later this year, in Kubernetes version 1.23.