22.8 Kyverno: Kubernetes-Native Policy Engine

Right, so you’ve got a cluster. It’s a beautiful, humming ecosystem of pods and services. And then some maniac (it might be you, no judgment) tries to deploy a pod that mounts the host filesystem. The chaos potential is staggering. This is where policy engines come in, and Kyverno is the one that speaks Kubernetes’ language natively. It doesn’t need to translate; it just gets it.

Think of Kyverno as your cluster’s bouncer, rulebook, and automated paperwork clerk, all rolled into one. Unlike policy engines that make you learn a dedicated policy language (OPA Gatekeeper and its Rego, for example), Kyverno policies are Kubernetes Custom Resources. You define your rules in YAML, just like everything else you deploy. This is its killer feature: you don’t need to context-switch to yet another toolchain.

The Three Flavors of Kyverno Policies

Kyverno policies can do three distinct jobs, and it’s crucial to know which one you’re reaching for.

Validate: This is the classic bouncer. It checks incoming requests against rules and can either just flag the violation (Audit mode: “Hey, you probably shouldn’t do that”) or outright deny the request (Enforce mode: “Absolutely not, go fix your YAML”). Use this for enforcement: requiring labels, blocking privileged pods, forbidding :latest image tags.

Mutate: This is your overzealous but helpful paperwork clerk. It sees a request that’s almost right and fixes it on the way in. Forgot a team label? The mutation policy will slap it on for you. This is fantastic for adding default values, standardizing annotations, or injecting sidecars.

Generate: This is the magic trick. A generate policy watches for something (like a new Namespace being created) and then creates an entirely new resource elsewhere in response. Imagine automatically creating a NetworkPolicy default-deny every time a new namespace is spun up. It’s like a supercharged operator that you configure declaratively.

Anatomy of a Validate Policy

Let’s look at a classic: blocking pods from running as root. This is Validate policy 101.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-non-root-user
spec:
  background: false # We care about new pods, not existing ones
  validationFailureAction: Enforce # Deny the request. Use 'Audit' for a dry-run.
  rules:
  - name: check-security-context
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Pods must not run as root. Set spec.securityContext.runAsNonRoot to true."
      pattern:
        spec:
          securityContext:
            runAsNonRoot: true

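The background field is sneakily important. When it is true (the default), Kyverno also scans existing resources periodically and records violations in policy reports; background scans only report, they never block or change anything. Here it is false, so the rule is evaluated only at admission time. That is fine if you only care about new stuff; leave it at true if you also want a report on what is already running. The match block selects the resources the rule applies to. The validate block contains the logic. Here we’re using a pattern, which is like a stencil that the incoming resource must match. Note that this particular stencil only looks at the pod-level securityContext.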
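A Pod that sets runAsNonRoot on each container rather than at the pod level would be rejected by the pattern above. A minimal sketch of a more forgiving rule follows, assuming Kyverno’s anyPattern (the request passes if any one pattern matches) and the =() equality anchor (if the field is present, it must have this value). The policy name is hypothetical, and initContainers and ephemeralContainers are left out for brevity.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-non-root-anywhere # hypothetical name
spec:
  background: false
  validationFailureAction: Audit # dry-run first; switch to Enforce once the reports look clean
  rules:
  - name: check-run-as-non-root
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Set runAsNonRoot: true at the pod level or on every container."
      anyPattern:
      # Option 1: set at the pod level; containers may omit the field
      # but must not override it with false.
      - spec:
          securityContext:
            runAsNonRoot: true
          containers:
          - =(securityContext):
              =(runAsNonRoot): true
      # Option 2: every container sets it explicitly.
      - spec:
          containers:
          - securityContext:
              runAsNonRoot: true
```

A single-element list in a pattern is applied to every element of the matching list in the resource, which is why one containers entry covers all containers in the Pod.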

The Power (and Pitfalls) of Mutations

Mutations are incredibly powerful but can feel a bit like sorcery. Let’s automatically add a cost-center label to every Pod in the dev namespace.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-cost-center-label
spec:
  background: false
  rules:
  - name: inject-cost-center
    match:
      any:
      - resources:
          namespaces:
          - dev
          kinds:
          - Pod
    mutate:
      patchStrategicMerge:
        metadata:
          labels:
            cost-center: "dev-budget-2024"

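Why the pitfall? Order of operations. Within a single policy, mutate rules are applied top to bottom, in the order you wrote them. Across separate policies, ordering comes down to policy names, so a policy named aaa-add-label is processed before zzz-add-annotation, and an innocent rename can silently reorder things. Your mutations might depend on a previous mutation having already happened. It’s a classic “it worked on my cluster” issue. Name your mutating policies deliberately, and when one mutation truly depends on another, keep both rules in the same policy.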
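One way to make a dependent mutation predictable is to keep both steps in one policy, where rules run in the order they are written. The sketch below relies on that behavior; the policy name and the example.com/chargeback annotation are hypothetical. It also uses the +() anchor, so each value is added only when the Pod does not already carry that key, whereas the plain patch shown earlier would overwrite an existing cost-center label.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: dev-pod-defaults # hypothetical name
spec:
  background: false
  rules:
  # Rule 1 runs first: add the label only if it is missing.
  - name: add-cost-center
    match:
      any:
      - resources:
          kinds:
          - Pod
          namespaces:
          - dev
    mutate:
      patchStrategicMerge:
        metadata:
          labels:
            +(cost-center): "dev-budget-2024"
  # Rule 2 runs second and matches on the label rule 1 may have just added.
  - name: annotate-chargeback
    match:
      any:
      - resources:
          kinds:
          - Pod
          namespaces:
          - dev
          selector:
            matchLabels:
              cost-center: "dev-budget-2024"
    mutate:
      patchStrategicMerge:
        metadata:
          annotations:
            +(example.com/chargeback): "enabled" # hypothetical annotation
```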

Generating Resources with ClusterPolicies

This is where Kyverno ascends to another level. Let’s automatically create a default-deny NetworkPolicy in every new namespace. This is a massive security win.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: generate-default-deny-netpol
spec:
  rules:
  - name: gen-default-deny
    match:
      any:
      - resources:
          kinds:
          - Namespace
    generate:
      kind: NetworkPolicy
      apiVersion: networking.k8s.io/v1
      name: default-deny
      namespace: "{{request.object.metadata.name}}" # This is key!
      synchronize: true # Keep the NetworkPolicy in sync: Kyverno restores it if it is edited or deleted
      data:
        spec:
          podSelector: {}
          policyTypes:
          - Ingress
          - Egress

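The magic here is the namespace: "{{request.object.metadata.name}}" field. Kyverno has access to the admission request context, so it can take the name of the newly created namespace and use it as the namespace for the generated NetworkPolicy. The synchronize: true flag tells Kyverno to keep the generated resource in line with the rule: if someone edits or deletes the NetworkPolicy, Kyverno puts it back, and changes to the rule’s data are propagated to it. Cleanup on namespace deletion needs no help from Kyverno, since Kubernetes removes namespaced resources along with the namespace.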
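To make the variable substitution concrete: if someone creates a namespace called payments (a made-up name), the rule above should produce a NetworkPolicy roughly like the one below. Kyverno also stamps its own tracking labels on generated resources; they are omitted here.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: payments # substituted from {{request.object.metadata.name}}
spec:
  podSelector: {} # empty selector: applies to every Pod in the namespace
  policyTypes:
  - Ingress
  - Egress
```

With both policy types listed and no allow rules, all ingress and egress traffic for Pods in that namespace is denied until more specific NetworkPolicies open it up.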

Best Practices and The Gotchas

Start with validationFailureAction: Audit. Roll out new policies in audit mode first, then check the policy reports (kubectl get policyreport -A) to see what would have been blocked. Your first policy will break something you forgot about.
Be specific in your match blocks. A policy that matches every Pod in every namespace puts Kyverno in the admission path for a large share of API requests, which adds latency and widens the blast radius of a bad rule. Narrow the match and exclude blocks by kind, namespace, and label selector as far as you can.
Prefer preconditions for complex logic. Instead of building a monstrous pattern, use a preconditions block to filter resources with JMESPath expressions first. A rule whose preconditions are not met is skipped, which is cleaner and usually cheaper.
The Kyverno CLI is your best friend. Use kyverno apply /path/to/policy.yaml --cluster to evaluate a policy against the resources already in your cluster before you install it. It saves so much pain.
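Several of these tips fit into one small policy. The sketch below is an illustration, not a recommended baseline: the policy name, the team label, and the choice to exclude kube-system are all assumptions. It starts in Audit mode, narrows the match with an exclude block, and uses a preconditions block so the check runs only on CREATE requests.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label-on-create # hypothetical name
spec:
  validationFailureAction: Audit # report only; switch to Enforce later
  background: false # request.operation only exists for admission requests
  rules:
  - name: check-team-label
    match:
      any:
      - resources:
          kinds:
          - Pod
    exclude:
      any:
      - resources:
          namespaces:
          - kube-system # assumption: system Pods are out of scope
    preconditions:
      all:
      # JMESPath expression evaluated before the pattern is checked.
      - key: "{{ request.operation }}"
        operator: Equals
        value: CREATE
    validate:
      message: "Pods must carry a non-empty 'team' label."
      pattern:
        metadata:
          labels:
            team: "?*" # wildcard: at least one character
```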

Kyverno is one of those tools that, once you start using it, you wonder how you ever managed a cluster without it. It turns security and governance from a manual, nagging process into an automated, declarative system. Just treat its mutating powers with the respect they deserve.