43.1 CIS Kubernetes Benchmark: Key Controls
Right, the CIS Benchmark. Think of it not as a suggestion box, but as the collective, grumpy wisdom of every engineer who’s ever been paged at 3 AM because of a misconfigured kubelet. It’s a checklist of “please, for the love of all that is holy, do this so you don’t end up on a news website.” We’re not going to cover every single control—that’s a book in itself—but we’ll hit the high-impact, “why hasn’t this been the default?” ones that you can implement today to stop the metaphorical bleeding.
Pod Security: Your First and Best Line of Defense
Forget the castle-and-moat analogy; in Kubernetes, every pod is its own little fortress (or a poorly secured shed). The single biggest thing you can do is enforce the Pod Security Standards. This is so important that it’s graduated from a best practice to a built-in admission controller (PodSecurity), stable since Kubernetes v1.25. We’ll use the modern built-in method, because the old PodSecurityPolicy (PSP) was removed in that same release; if you’re still using it, you’re living in a museum, and I’m here to get you out.
The standards are simple: privileged, baseline, and restricted. Your goal is to run everything as restricted. If something breaks, you try baseline. If you have to run something as privileged, you need to have a long, awkward conversation with yourself about why.
Let’s enforce this at the namespace level. This is the cleanest way. First, let’s label a namespace to dry-run the policy and see what would get rejected. This is your “what fresh hell is this going to break?” step. Always do this first.
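apiVersion: v1
kind: Namespace
metadata:
  name: my-secure-app
  labels:
    # Dry-run first! audit and warn report violations but never reject a pod.
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
    # Add these two only once the warnings are gone; enforce rejects violating pods.
    # pod-security.kubernetes.io/enforce: restricted
    # pod-security.kubernetes.io/enforce-version: latest
Apply this, then deploy your workloads. The warn label sends a warning straight back to the client, so you’ll see it in the kubectl output when you create a violating pod or workload; the audit label adds an annotation to the matching entry in the API server audit log. Once you’ve fixed everything (or made a conscious decision to use baseline for a specific namespace), add the enforce labels alongside audit and warn. Now it will straight-up reject non-compliant pods. Note that enforce acts on pods, not on the Deployment that creates them, so a rejected rollout shows up as failed pod creation in the ReplicaSet’s events (kubectl get events -n my-secure-app).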
So what does restricted actually do? It forces all the things you should have been doing anyway:
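- It forbids running as root: you must set runAsNonRoot: true, and runAsUser may not be 0.
- It requires every container to drop ALL capabilities; the only one you may add back is NET_BIND_SERVICE (goodbye, NET_RAW).
- It forbids privilege escalation (allowPrivilegeEscalation: false) and limits volumes to a safe set of types. A read-only root filesystem is not part of the profile, but you should set it anyway.
- It requires a seccomp profile of RuntimeDefault or Localhost (more on that later).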
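To make that concrete, here is a minimal sketch of a pod that should pass the restricted profile. The pod name, image, and UID are placeholders invented for illustration; readOnlyRootFilesystem is included as good practice, not because the profile demands it.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restricted-demo              # hypothetical name
  namespace: my-secure-app
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001                 # any non-zero UID; this value is an assumption
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.example.com/demo/app:1.0   # placeholder image
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
        readOnlyRootFilesystem: true # not required by restricted, still worth having
```

If the image expects to write to its own filesystem, mount an emptyDir at the paths it needs rather than removing the read-only setting.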
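If you get a pod that needs to break one rule (e.g., it needs to run as root), you can’t carve out an exception inside the pod spec; Pod Security Admission makes its decision per namespace. Move that workload into its own namespace labeled baseline (or, reluctantly, privileged) instead of downgrading the namespace everything else lives in. Cluster-wide exemptions by username, runtime class, or namespace exist too, but they are set in the admission controller’s configuration, not by the workload. This is the Kubernetes way: default deny, explicit allow.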
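For reference, cluster-wide defaults and exemptions for the built-in controller are set in a file handed to the API server with --admission-control-config-file. The following is a sketch only: the chosen default levels and the exempted namespace are assumptions for illustration, not recommendations from the benchmark.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
  - name: PodSecurity
    configuration:
      apiVersion: pod-security.admission.config.k8s.io/v1
      kind: PodSecurityConfiguration
      # Applied to namespaces that carry no pod-security labels of their own.
      defaults:
        enforce: "baseline"
        enforce-version: "latest"
        audit: "restricted"
        audit-version: "latest"
        warn: "restricted"
        warn-version: "latest"
      exemptions:
        usernames: []
        runtimeClasses: []
        namespaces: ["kube-system"]   # assumed exemption; keep this list short
```

Namespace labels still override these defaults, so the per-namespace approach above keeps working unchanged.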
Kubelet Hardening: Taming the Beast on Every Node
The kubelet is the wild west of Kubernetes components. It runs on every node, with immense power, and historically its configuration was a mess of flags and hope. The CIS Benchmark rightly hammers on this. The modern way to control it is via a Kubelet Configuration File. You can define this and pass it to your worker nodes via your node bootstrap script or your tool of choice (like a MachineConfig in OpenShift).
Here’s a critical subset of a hardened kubelet configuration (kubelet-config.yaml):
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Protect kernel defaults. This is non-negotiable.
protectKernelDefaults: true
# Refuse to start if swap is enabled on the node (this is the default).
# Set it to false only if you explicitly want swap. You probably don't.
failSwapOn: true
# This is a big one: authorization mode should always be Webhook.
# It makes the kubelet check each request against the API server.
authorization:
  mode: Webhook
# And authentication should be, well, on.
authentication:
  anonymous:
    enabled: false
  webhook:
    enabled: true
  # Explicitly set the client CA file so client certificates get verified.
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
# Cap the event rate so a noisy node can't flood the API server.
# Don't set this to 0; that means unlimited.
eventRecordQPS: 5
# Rotate certificates automatically. Please.
rotateCertificates: true
# And for the love of all that is good, disable the readonly port (10255).
# Every single kubelet scanner on the internet probes this port.
readOnlyPort: 0
# Static pod manifests are read from here; keep the directory root-owned.
staticPodPath: /etc/kubernetes/manifests
Getting this configuration applied across a cluster is node-management-tool specific, but the settings themselves are universal. The readOnlyPort: 0 alone will take your nodes out of a large share of the pointless threat intelligence reports you get.
Audit Logging: Because You Can’t Fix What You Can’t See
Kubernetes audit logging is like a security camera for your API server. It records who did what, when, and from where. Without it, you’re flying blind after an incident. The problem is it can generate a truly absurd volume of data. The CIS Benchmark doesn’t just say “enable it”; it wisely tells you to log the right things at the right level.
You don’t need to log every single GET request at the Request level. You’ll drown in logs and your storage bill will give you a heart attack. Instead, you use a policy that logs metadata for most reads but full request and response bodies for critical write operations.
Here’s a snippet from a sane audit policy (audit-policy.yaml):
apiVersion: audit.k8s.io/v1
kind: Policy
# Rules are evaluated in order and the first match wins, so specific rules go first.
rules:
  # Don't log these repetitive, health-check type requests.
  - level: None
    verbs: ["get", "watch", "list"]
    resources:
      - group: "" # core
        resources: ["endpoints", "services", "services/status"]
  # For security-critical RBAC changes, log the whole request and response.
  # This is for investigating "how did that role get created?"
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete", "deletecollection"]
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["clusterroles", "roles", "clusterrolebindings", "rolebindings"]
    omitStages:
      - "RequestReceived"
  # Secrets and ConfigMaps stay at Metadata: logging their bodies would
  # copy sensitive values into the audit log.
  - level: Metadata
    resources:
      - group: "" # core
        resources: ["secrets", "configmaps"]
    omitStages:
      - "RequestReceived"
  # Log everything else in these groups at the Metadata level.
  - level: Metadata
    resources:
      - group: "" # core
      - group: "apps"
      - group: "networking.k8s.io"
    omitStages:
      - "RequestReceived"
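You then point your API server to this policy with --audit-policy-file and give it somewhere to write, for example with --audit-log-path; a policy with no backend configured records nothing. Keep in mind that rules are matched top to bottom and the first match wins, so a broad rule placed too early will swallow the specific ones below it. The key is to be strategic. Log everything you need for forensics, but not so much that you can’t find the signal in the noise.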
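As a sketch of the wiring, here is the relevant fragment of a kube-apiserver static pod manifest. It assumes a kubeadm-style layout; the file paths and retention numbers are illustrative choices, so adjust them to your own retention requirements.

```yaml
# Fragment of /etc/kubernetes/manifests/kube-apiserver.yaml (layout assumed)
spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        - --audit-policy-file=/etc/kubernetes/audit-policy.yaml   # assumed path
        - --audit-log-path=/var/log/kubernetes/audit/audit.log    # assumed path
        - --audit-log-maxage=30      # days to keep old log files
        - --audit-log-maxbackup=10   # number of rotated files to keep
        - --audit-log-maxsize=100    # MB per file before rotation
      volumeMounts:
        - name: audit-policy
          mountPath: /etc/kubernetes/audit-policy.yaml
          readOnly: true
        - name: audit-log
          mountPath: /var/log/kubernetes/audit/
  volumes:
    - name: audit-policy
      hostPath:
        path: /etc/kubernetes/audit-policy.yaml
        type: File
    - name: audit-log
      hostPath:
        path: /var/log/kubernetes/audit/
        type: DirectoryOrCreate
```

The volume mounts matter: the API server runs in a container, so without them it can neither read the policy nor persist the log on the host.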
Network Policies: The Firewall You Didn’t Know You Needed
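Out of the box, a Kubernetes cluster has no network policies at all, which means the effective default is “allow all.” This is objectively insane. It means a compromised pod in namespace A can immediately start probing your database in namespace B. The CIS Benchmark says “hey, maybe don’t do that.” You need to implement network policies to segment your cluster, and you need a CNI plugin that actually enforces them; with one that doesn’t, NetworkPolicy objects are accepted and silently ignored.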
Start with a default-deny rule in every namespace. This is the network equivalent of locking your front door.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: my-app
spec:
  podSelector: {}
  policyTypes:
    - Ingress
This policy selects all pods (podSelector: {}) and says no ingress traffic is allowed. It’s the ultimate “you can’t come in” sign. Now, your application is broken because nothing can talk to it. Good! Now you build policies that explicitly allow only the traffic that is necessary.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-app-to-db
  namespace: my-app
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgres-db
      ports:
        - protocol: TCP
          port: 5432
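This policy says “pods labeled app: api-server can only talk egress to pods labeled app: postgres-db in the same namespace, on TCP port 5432.” Everything else outbound from those pods is dropped, DNS lookups included, so you will need a separate rule for DNS. And because the default-deny policy blocks ingress to every pod in the namespace, the postgres-db pods need their own ingress rule admitting the api-server pods before the connection actually works. This is the principle of least privilege in action. It’s tedious, yes, but it’s what separates a toy cluster from a professional one. It means that if an attacker compromises your front-end web pod, they can’t connect to your database, because no rule allows it.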
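The two companion rules might look like the sketch below. The policy names are invented for illustration, and the DNS selector assumes your cluster DNS runs in kube-system with the k8s-app: kube-dns label, which is common but worth verifying in your distribution.

```yaml
# Let the database accept connections from the API pods only.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-db-from-app            # hypothetical name
  namespace: my-app
spec:
  podSelector:
    matchLabels:
      app: postgres-db
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: api-server
      ports:
        - protocol: TCP
          port: 5432
---
# Let the API pods resolve names; policies are additive, so this
# combines with allow-app-to-db rather than replacing it.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress             # hypothetical name
  namespace: my-app
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns      # assumed DNS pod label; check your cluster
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Putting namespaceSelector and podSelector in the same list item means both must match, which keeps the DNS rule from opening port 53 to every pod in kube-system.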