16.5 Cilium: eBPF-Powered Networking, Observability, and Security

Alright, let’s roll up our sleeves and talk about Cilium. If the basic CNI plugins we’ve discussed are like reliable, simple sedans, Cilium is a fully loaded, self-driving concept car from the future. It still connects your pods to the network, which is its job as a CNI plugin, but it throws out the old, cumbersome iptables-based plumbing and replaces it with eBPF. This isn’t just an incremental upgrade; it’s a fundamentally different way of implementing networking and security in the kernel’s datapath, and it changes how you operate and debug a cluster.

eBPF lets Cilium load highly efficient, sandboxed programs directly into the Linux kernel, where they are verified before they run. Instead of a packet taking a tortuous path through a forest of iptables chains (and trust me, in a large cluster, that forest is a nightmare to navigate), an eBPF program attached early in the packet path can make a near-instantaneous decision: “Allow this packet,” “Drop this packet,” or “Send it over there.” All of this happens inside the kernel, without walking long rule chains and without a detour through user space. The performance and observability gains can be dramatic, especially in clusters with thousands of Services.

How Cilium Uses eBPF to Actually Make Sense of Your Cluster

Remember trying to decipher connection issues with iptables -L -v -n --line-numbers and wanting to cry? Cilium fixes that by moving the logic out of sprawling rule chains and into eBPF programs and maps that you can inspect with purpose-built tooling. For networking, it can use eBPF to replace kube-proxy entirely. In iptables mode, kube-proxy manages thousands of rules for Service load balancing, and matching a packet can mean walking through them sequentially. Cilium’s eBPF implementation handles this with a hash table lookup in an eBPF map. It’s the difference between searching for a book in a library by scanning every shelf versus just looking up its Dewey Decimal number.
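To make the library analogy concrete, here is a toy model of the two lookup strategies. It is a sketch, not how either datapath is actually implemented: the function names, the 5,000 made-up Services, and the flat rule list are all assumptions for illustration, and real kube-proxy chains are more structured than a single list.

```python
from typing import Dict, List, Optional, Tuple

# (service IP, service port, backend address) -- hypothetical data model
Rule = Tuple[str, int, str]


def lookup_linear(rules: List[Rule], ip: str, port: int) -> Tuple[Optional[str], int]:
    """Scan rules in order, the way a packet walks a rule chain.

    Returns the matching backend (or None) and how many rules were examined.
    """
    for examined, (rule_ip, rule_port, backend) in enumerate(rules, start=1):
        if rule_ip == ip and rule_port == port:
            return backend, examined
    return None, len(rules)


def build_service_map(rules: List[Rule]) -> Dict[Tuple[str, int], str]:
    """Index the same data by (IP, port), loosely analogous to a hash map keyed by Service."""
    return {(ip, port): backend for ip, port, backend in rules}


def lookup_hashed(service_map: Dict[Tuple[str, int], str], ip: str, port: int) -> Optional[str]:
    """Single hash lookup; the cost does not grow with the number of Services."""
    return service_map.get((ip, port))


# 5,000 made-up Services, each with one backend.
rules = [
    (f"10.96.{i // 256}.{i % 256}", 80, f"10.0.{i // 256}.{i % 256}:8080")
    for i in range(5000)
]
service_map = build_service_map(rules)

# Look up the Service that happens to sit at the very end of the rule list.
last_ip = rules[-1][0]
backend, examined = lookup_linear(rules, last_ip, 80)
print(f"linear scan: {backend} after examining {examined} rules")
print(f"hash lookup: {lookup_hashed(service_map, last_ip, 80)} in one step")
```

The linear version gets slower as you add Services; the hashed version does not, which is the whole point of the analogy.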

But it doesn’t stop there. Because its eBPF programs sit directly in the kernel’s network stack, Cilium can feed Hubble, a distributed observability platform built on top of it. It’s like tcpdump for your entire cluster, at a fraction of the overhead, giving you a clear view of service dependencies and traffic flows.

Here’s a taste of the CLI tooling. Forget parsing thousands of lines of iptables; this is how you check on your deployment and watch what the datapath is doing:

# Get a quick overview of the health of your Cilium deployment
cilium status

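# Watch datapath events (including policy drops) for a single endpoint.
# This runs inside the agent pod; --related-to takes a numeric endpoint ID, not a pod name.
# (The in-pod binary is cilium-dbg in recent releases and cilium in older ones.)
kubectl -n kube-system exec ds/cilium -- cilium-dbg monitor --related-to <endpoint-id>

# Hubble in action: stream real-time flow logs for everything
hubble observe --follow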

Writing Policies That Don’t Suck: The CiliumNetworkPolicy

Kubernetes-native NetworkPolicies are… fine. They’re the lowest common denominator. Cilium supercharges this with its own CRD, CiliumNetworkPolicy (CNP), which is what you actually want to use. It supports way more expressive rules, including DNS-aware security, HTTP-level allow-listing, and service account-based rules that don’t make you want to pull your hair out.

Let’s say you have an API pod that should only receive POST and GET requests on /api/v1/* from the frontend pods. Doing this with a native NetworkPolicy is impossible, because it has no concept of HTTP methods or paths. With Cilium, it’s Tuesday.

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: restrict-api-access
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: my-special-api
  ingress:
  - fromEndpoints:
    - matchLabels:
        io.kubernetes.pod.namespace: production
        k8s:app: frontend-app
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: "POST"
          path: "/api/v1/.*"
        - method: "GET"
          path: "/api/v1/.*"

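This policy is a masterpiece of specificity. It doesn’t just allow traffic from the frontend pods; it allows only certain types of HTTP traffic from them. This is L7 policy enforcement, and it’s a game-changer for reducing your attack surface. One caveat on the mechanics: eBPF enforces the L3/L4 part in the kernel and redirects matching traffic to a node-local Envoy proxy managed by Cilium, which applies the HTTP rules. You get L7 enforcement without injecting a sidecar into every pod, but it is not free of proxy overhead.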
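If you want to reason about which requests the http rules above would let through, a small model helps. This is a rough sketch under stated assumptions: HttpRule and is_allowed are hypothetical names, it assumes the method and path patterns must match the whole value, and it uses Python’s re module, whose regex dialect is not identical to the one Cilium’s proxy uses.

```python
import re
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class HttpRule:
    """One entry under rules.http in the policy above (hypothetical model)."""
    method: str  # regex for the request method
    path: str    # regex for the request path


# The two rules from the restrict-api-access policy.
RULES = (
    HttpRule(method="POST", path="/api/v1/.*"),
    HttpRule(method="GET", path="/api/v1/.*"),
)


def is_allowed(method: str, path: str, rules: Sequence[HttpRule] = RULES) -> bool:
    """A request is allowed if any single rule matches both its method and its path.

    Assumption: patterns are anchored to the full value, hence fullmatch.
    """
    return any(
        re.fullmatch(rule.method, method) is not None
        and re.fullmatch(rule.path, path) is not None
        for rule in rules
    )


for method, path in [
    ("GET", "/api/v1/users"),
    ("DELETE", "/api/v1/users"),
    ("GET", "/admin"),
    ("GET", "/api/v1"),
]:
    verdict = "allow" if is_allowed(method, path) else "deny"
    print(f"{method:6} {path:15} -> {verdict}")
```

Note the last case: under the anchoring assumption, /api/v1 without a trailing slash does not match /api/v1/.*, which is exactly the kind of surprise worth catching before the policy goes live.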

The Rough Edges and Where to Watch Your Step

It’s not all rainbows. Cilium is complex. You’re managing a core kernel-level component. You need a recent Linux kernel: older Cilium releases listed 4.9.17 as the absolute bare minimum, newer releases require considerably more recent kernels, and you really want 5.10 or later to get all the goodies. Check the system requirements for the version you plan to run. Upgrading Cilium itself is a more delicate operation than upgrading a typical DaemonSet; you’re essentially performing open-heart surgery on your cluster’s networking. You must read the release notes and have a rollback plan.
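A quick preflight check on each node can save you from finding out about an old kernel the hard way. The sketch below is an assumption-laden convenience, not an official Cilium tool: parse_kernel_release and meets_minimum are hypothetical helpers, and the 5.10 threshold is simply the "you really want" figure from the paragraph above, not a requirement taken from any particular release.

```python
import platform
import re
from typing import Tuple


def parse_kernel_release(release: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a string like '5.15.0-91-generic'."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor or 0), int(patch or 0)


def meets_minimum(release: str, minimum: Tuple[int, int, int] = (5, 10, 0)) -> bool:
    """Compare the parsed release against a minimum version tuple."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    # platform.release() reports the running kernel on Linux (same as `uname -r`).
    release = platform.release()
    status = "ok" if meets_minimum(release) else "too old for the recommended baseline"
    print(f"kernel {release}: {status}")
```

Run it on a node (or wrap it in a DaemonSet job) before the install, and adjust the minimum tuple to whatever the release you are deploying actually documents.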

Another common pitfall is assuming eBPF is magic dust that makes all hardware limitations vanish. It doesn’t. You still need sufficient CPU and memory on your worker nodes for the Cilium agent to build and load the eBPF programs and, more importantly, for the kernel to verify and run them. Under-provisioned nodes will feel the pain.

The best practice? Start in a non-production cluster. Use Hubble to understand your traffic flows before you start dropping traffic with aggressive policies. And for the love of all that is holy, use the cilium CLI tool. It is your brilliant debugger, your best friend, and the reason you’ll never go back to staring at iptables rules again. It’s the difference between having a map of the library and just knowing there are a lot of books in there.