37.7 EKS Add-Ons: CoreDNS, kube-proxy, Amazon VPC CNI

Right, let’s talk about the three amigos that AWS installs by default on a new EKS cluster: CoreDNS, kube-proxy, and the Amazon VPC CNI. Think of them less as optional “add-ons” and more as the “operating system” of your cluster. Without them, your cluster is a very expensive, very confused computer that can’t talk to itself or the outside world. When you run them as EKS managed add-ons, AWS takes care of installing them and publishing versions that match your cluster, though you still decide when to upgrade. That is mostly a blessing, but as we’ll see, sometimes a curse in disguise.

The Plumbing: Amazon VPC CNI

This is the most important, and often most troublesome, piece of the puzzle. The VPC CNI (Container Network Interface) is responsible for giving every Pod its own bona fide IP address from one of your VPC’s subnets. No overlays, no clever NAT tricks between Pods. It’s all just one flat network. This is fantastic because it means your Pods are first-class citizens on your VPC network. You can hit them directly from an EC2 instance (security groups permitting), and network debugging becomes a million times easier since you’re dealing with familiar AWS constructs.

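The magic happens via an agent DaemonSet that runs on every node (aws-node). It has two halves: the CNI plugin binary, which the container runtime invokes when the kubelet says, “Hey, I need a network for this new Pod,” and a local IP address management daemon (ipamd). ipamd calls the EC2 API (not the instance metadata service, which is read-only) to attach extra network interfaces (ENIs) to the instance and to assign secondary private IP addresses from the node’s subnet to them, keeping a small warm pool of addresses ready. When a Pod starts, the plugin hands it one of those IPs.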

Here’s the catch, and it’s a big one: every EC2 instance type has a hard cap on how many network interfaces it can attach and how many IPv4 addresses each interface can hold, and that cap varies wildly. With default settings a t3.medium can only host 17 Pods, while an m5.24xlarge can handle 737. You will paint yourself into a corner if you don’t plan for this. Start by finding out which instance types your nodes actually are:

# Get the instance type for each node
# (quote the column spec so the shell leaves the escaped dots in the label key alone)
kubectl get nodes -o custom-columns='NAME:.metadata.name,INSTANCE:.metadata.labels.node\.kubernetes\.io/instance-type'

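Then go look up the ENI and IP limits for that instance type in AWS’s docs. If you need more Pods per node, you can enable prefix delegation on the VPC CNI (the ENABLE_PREFIX_DELEGATION setting on aws-node, supported on Nitro-based instances). Instead of single addresses, it assigns whole /28 prefixes of 16 addresses each to an ENI, dramatically increasing the limit. It’s a lifesaver, but you have to configure it yourself, and the kubelet’s max-pods value on your nodes has to be raised to match. Don’t wait until you’re getting FailedCreatePodSandBox errors to figure this out.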
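If you’d rather not trust a table, the arithmetic behind those numbers is easy to sketch. The snippet below assumes the usual VPC CNI sizing rule: each ENI gives up one address to its own primary IP, and two extra Pods (aws-node and kube-proxy) run on the host network and need no Pod IP. The function names are made up for illustration, and the ENI and per-ENI address counts are the published figures for the two instance types mentioned above.

```shell
#!/bin/sh
# max_pods ENIS IPS_PER_ENI
# Secondary-IP mode (the default): one slot per ENI is taken by the ENI's
# primary IP, and +2 accounts for the host-network Pods.
max_pods() {
    echo $(( $1 * ($2 - 1) + 2 ))
}

# max_pods_prefix ENIS IPS_PER_ENI
# Prefix delegation: every remaining slot holds a /28 prefix (16 addresses).
# This is the raw address ceiling only; AWS recommends a much lower
# max-pods value in practice, so treat it as an upper bound.
max_pods_prefix() {
    echo $(( $1 * ($2 - 1) * 16 + 2 ))
}

max_pods 3 6      # t3.medium: 3 ENIs, 6 IPv4 addresses per ENI
max_pods 15 50    # m5.24xlarge: 15 ENIs, 50 IPv4 addresses per ENI
```

The two calls print 17 and 737, matching the figures quoted earlier. Running max_pods_prefix 3 6 shows how much headroom prefix delegation adds on the same t3.medium, at least as far as addresses go; CPU and memory will run out long before you get there.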

The Switchboard: kube-proxy

If the VPC CNI is the plumbing, kube-proxy is the switchboard operator. Its job is brutally simple: watch the Kubernetes API for Services and their endpoints (EndpointSlices, these days), and then manage the iptables rules (or IPVS, if you’re fancy) on every node so that traffic destined for a Service IP (ClusterIP) gets redirected to the IP of a ready Pod.

AWS manages this as a DaemonSet for you. The main thing you need to know here is that you’re almost always fine with the default iptables mode. The kube-proxy add-on just ensures you have a compatible version running. You can see its handiwork on any node if you’re brave enough to look:

# SSH into a node and take a peek at the madness (don't modify anything!)
# -n skips reverse DNS lookups, which otherwise make the listing painfully slow
sudo iptables -t nat -L -n | grep your-service-name

The rules are a chain of probability-based randomness: each rule for a Service matches with a carefully chosen probability, and whatever falls through goes to the next endpoint in line. It’s a hilarious way to do load balancing, but it works astonishingly well. The biggest “gotcha” with kube-proxy is that it’s another thing that can subtly break if you mess with the host’s network stack or iptables rules yourself. Just don’t.
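To see why the randomness comes out fair, here is a small sketch of the arithmetic, assuming the common pattern where the i-th of N endpoint rules matches with probability 1/(N - i + 1) and the last rule catches everything that is left. The function name is invented for this example; it only prints numbers and never touches iptables.

```shell
#!/bin/sh
# rule_probabilities N
# Print the match probability of each rule in a chain covering N endpoints.
# Rule 1 matches 1/N of the traffic, rule 2 matches 1/(N-1) of what is left,
# and so on, which gives every endpoint an equal 1/N share overall.
rule_probabilities() {
    awk -v n="$1" 'BEGIN {
        for (i = 0; i < n; i++)
            printf "rule %d: probability %.5f\n", i + 1, 1 / (n - i)
    }'
}

rule_probabilities 3
```

For three endpoints the rules get 0.33333, 0.50000 and 1.00000. The second rule only sees the two thirds of traffic that the first one passed on, so it also ends up with one third overall, and the last rule mops up the final third.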

The Phonebook: CoreDNS

Finally, CoreDNS. This is your cluster’s DNS server. When one Pod tries to find another Pod via a Service name like my-api.default.svc.cluster.local, it’s CoreDNS that answers the call. Unlike the other two, it runs as a Deployment rather than a DaemonSet, and the EKS add-on ensures it’s there and running a version that plays nice with your control plane.

Most of the time, you don’t need to touch it. But when you do, its configuration is held in a ConfigMap in the kube-system namespace. Want to add a custom DNS resolver for your .internal corporate domain? This is where you do it.

# Get the current CoreDNS configuration
kubectl -n kube-system get configmap coredns -o yaml

You’ll see a Corefile in there, a cryptic but sensible configuration that looks something like the one below. The loop plugin is my favorite: it’s there to detect DNS forwarding loops and halt CoreDNS rather than let it chase its own tail forever. The fact that they needed to build that in tells you everything you need to know about networking.

.:53 {
    errors
    health
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
      pods insecure
      fallthrough in-addr.arpa ip6.arpa
      ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}
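For the .internal corporate domain mentioned earlier, the usual approach is to add a second server block to the Corefile that forwards just that zone to your own resolver. The helper below is only a sketch that prints such a block: the function name is invented, and both the zone and the resolver address 10.0.0.2 are placeholders you would swap for your own values.

```shell
#!/bin/sh
# stub_zone_block ZONE RESOLVER_IP
# Print a hypothetical Corefile server block that forwards one private zone
# to a dedicated resolver. Both arguments are placeholders.
stub_zone_block() {
    zone="$1"
    resolver="$2"
    cat <<EOF
${zone}:53 {
    errors
    cache 30
    forward . ${resolver}
}
EOF
}

stub_zone_block internal 10.0.0.2
```

You would paste the printed block into the Corefile alongside the existing .:53 block. If CoreDNS runs as a managed add-on, check how your add-on updates resolve conflicts before relying on hand edits, since an update can overwrite changes made directly to the ConfigMap.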

The most common pitfall? Forgetting that CoreDNS caches answers (for up to 30 seconds in the example above) and pulling your hair out wondering why a DNS record isn’t updating immediately. The second most common is CoreDNS getting resource-starved in a large cluster, so keep an eye on its CPU and memory requests and on how many replicas you run.

So there you have it. The VPC CNI gives your Pods an IP and a voice, kube-proxy makes sure they can find each other, and CoreDNS lets them use friendly names instead of shouting numbers at each other. AWS manages them so you mostly don’t have to, but as with all things in the cloud, they hand you the leash of a powerful, sometimes unruly beast. It’s your job to know how to feed it.