21.7 OPA/Gatekeeper and Kyverno: Policy Engines

Alright, let’s talk about the grown-ups in the room for policy enforcement: OPA/Gatekeeper and Kyverno. You’ve got your Pod Security Standards, but on their own they’re just that: standards, a list of “thou shalts” and “thou shalt nots” sitting on a website. The built-in Pod Security Admission controller can enforce the three stock profiles, but the moment you need anything custom, you need a bouncer you can program. That’s what these policy engines are. They run as admission webhooks: the Kubernetes API server hands them each incoming request before it is persisted, and they say, “Nope, not gonna happen,” based on rules you define. Forget boring manual checks; this is where you automate your cluster’s law and order.

Now, you have two primary contenders here, and they approach the problem from fundamentally different philosophies. Choosing between them isn’t just about features; it’s about how you want to think about policy.

The Constraint Framework: OPA with Gatekeeper

First up is the duo: Open Policy Agent (OPA) and its Kubernetes-native sidekick, Gatekeeper. Think of OPA as the brilliant, generic policy brain—it can enforce rules for anything from your Kubernetes API calls to your Terraform plans. But to effectively work with Kubernetes, it needs a translator. That’s Gatekeeper’s job.

Gatekeeper provides the Custom Resource Definitions (CRDs) that let you define policies as Kubernetes-native objects. You write your rules in OPA’s language, Rego. It’s powerful, but let’s be honest, Rego has a learning curve that can feel like trying to solve a Rubik’s cube while blindfolded. You define two things: a ConstraintTemplate (the reusable policy logic) and a Constraint (the actual instance of that policy, with its parameters).

Here’s a ConstraintTemplate modeled on the non-root requirement from the Pod Security Standards (it lives in the Restricted profile, not Baseline), forbidding containers from running as UID 0.

apiVersion: templates.gatekeeper.io/v1
kind: ConstraintTemplate
metadata:
  name: k8scontainernoroot
spec:
  crd:
    spec:
      names:
        kind: K8sContainerNoRoot
      validation:
        openAPIV3Schema:
          type: object  # v1 templates require a structural schema
          properties:
            message:
              type: string
  targets:
    - target: admission.k8s.gatekeeper.v1
      rego: |
        package k8scontainernoroot

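        # Case 1: a container explicitly asks for UID 0.
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          container.securityContext.runAsUser == 0
          msg := sprintf("Container '%v' must not run as root (runAsUser=0).", [container.name])
        }

        # Case 2: runAsUser is missing on the container ("not" succeeds when
        # the field is undefined). This is deliberately strict: a pod-level
        # securityContext is ignored, and initContainers are not checked.
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not container.securityContext.runAsUser
          msg := sprintf("Container '%v' must set runAsUser to a non-zero value.", [container.name])
        }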

And then you apply a Constraint to actually enforce it:

apiVersion: constraints.gatekeeper.io/v1beta1
kind: K8sContainerNoRoot
metadata:
  name: no-root-containers
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    message: "You shall not run as root!"
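
To make the effect concrete, here is a sketch of a Pod that this constraint should admit. The Pod name and image are made up for illustration; the only part the policy cares about is the container-level runAsUser.

```yaml
# Hypothetical Pod: name and image are placeholders, not from a real registry.
apiVersion: v1
kind: Pod
metadata:
  name: nonroot-demo
spec:
  containers:
  - name: app
    image: registry.example.com/app:1.0
    securityContext:
      runAsUser: 1000   # non-zero and set on the container, so neither rule fires
```

Delete the securityContext block and the second violation rule should reject the Pod; set runAsUser to 0 and the first one should.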

The power here is immense. You can write a Rego policy for almost any imaginable scenario. The pitfall? That immense power. Gatekeeper sits in the admission path, so inefficient Rego adds latency to every API request it matches, and you have to manage the lifecycle of both the templates and the constraints.
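
One way to take some risk out of that lifecycle is to roll a constraint out in dry-run mode first. The sketch below reuses the constraint above and assumes you want to skip kube-system; the excluded namespace is an assumption, so adjust it to your cluster.

```yaml
apiVersion: constraints.gatekeeper.io/v1beta1
kind: K8sContainerNoRoot
metadata:
  name: no-root-containers
spec:
  # dryrun records violations without blocking requests; switch to deny
  # once the reported violations look right.
  enforcementAction: dryrun
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    # Assumption: system workloads are out of scope for this rule.
    excludedNamespaces: ["kube-system"]
  parameters:
    message: "You shall not run as root!"
```

While in dry-run, Gatekeeper’s audit reports violations on the constraint’s status, so you can see what would have been blocked before you flip the switch.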

The Kubernetes-Native Option: Kyverno

Kyverno (Greek for “govern”) takes a completely different approach. Its core philosophy is: “Why learn a new language? Policies should be Kubernetes resources that you write in YAML, just like everything else.” It’s a brilliant idea. Instead of Rego, you write patterns that mirror the shape of the resource you’re checking.

This makes Kyverno policies often much easier to read and write for anyone already comfortable with K8s YAML. Let’s implement that same “no root” policy.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-non-root-user
spec:
  validationFailureAction: Enforce
  background: false
  rules:
  - name: validate-runAsUser
    match:
      any:  # bare match.resources is deprecated in favor of any/all
      - resources:
          kinds:
          - Pod
    validate:
      message: "Running as root is not allowed. Please set a non-zero runAsUser."
      pattern:
        spec:
          containers:
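          - securityContext:
              runAsUser: ">0"

See? No custom language. It’s just YAML, matching on the structure of a Pod spec. The ">0" is a comparison operator, so the pattern passes only when every container sets runAsUser to something greater than zero; a container that leaves it unset fails, just like in the Gatekeeper version. (Wrap a key in =(...) if you would rather apply the check only when the field is present.) It’s declarative and intuitive. Kyverno also keeps more under one roof: the same policy resource handles validation, mutation, and image verification. Gatekeeper can mutate too, but through separate CRDs such as Assign and AssignMetadata, and it leans on external data providers for signature checks. You can fix things, not just block them. Want to add a standard label to every Namespace? Kyverno can do that on the fly. Want to ensure only signed, trusted images from your registry run? Kyverno’s your tool.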
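
The Namespace label case can be sketched roughly like this. The policy name, the label key, and its value are all invented for the example; swap in whatever your organization standardizes on.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-namespace-owner-label   # hypothetical policy name
spec:
  rules:
  - name: add-owner-label
    match:
      any:
      - resources:
          kinds:
          - Namespace
    mutate:
      patchStrategicMerge:
        metadata:
          labels:
            # +(key) adds the label only if it is not already present,
            # so an explicitly set owner is left untouched.
            +(owner): platform-team   # assumed label key and value
```

Because the patch uses the add-if-absent anchor, it behaves like a default rather than an override.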

The trade-off? While it’s catching up, some of the mind-bendingly complex logical checks you can do in raw Rego, such as rules that join data across many resources, can be awkward to express in Kyverno. It’s a tool designed for the vast majority of Kubernetes policy use cases, and it excels at them.

So, Which One Do You Pick?

This is the eternal debate. My rule of thumb is this:

Choose Kyverno if your policies are primarily about Kubernetes itself, you value a low learning curve, and you love the idea of mutating policies to set sane defaults. It feels like a natural extension of Kubernetes.
Choose OPA/Gatekeeper if you need to write policies that extend beyond Kubernetes (e.g., for Terraform, API endpoints), you have extremely complex logical requirements, or your team already has deep Rego expertise.

You can even run both. It’s not crazy. Use Kyverno for its brilliant mutations and simple validations, and keep Gatekeeper around for that one insanely complex policy that only Rego can express. The key is to stop thinking of policy as a checklist and start treating it as code: version it, test it, and deploy it through CI/CD like any other critical infrastructure.
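
As one illustration of the “test it” part, Gatekeeper ships a CLI called gator that can check a template and constraint against sample objects without a cluster. The suite below is only a sketch: the file layout (template.yaml, constraint.yaml, a samples/ directory) and the case names are assumptions, and the sample Pods are ones you would write yourself.

```yaml
# suite.yaml, run with: gator verify .
apiVersion: test.gatekeeper.sh/v1alpha1
kind: Suite
metadata:
  name: no-root-containers
tests:
- name: no-root-containers
  template: template.yaml       # assumed path to the ConstraintTemplate
  constraint: constraint.yaml   # assumed path to the Constraint
  cases:
  - name: nonroot-pod-allowed
    object: samples/nonroot-pod.yaml   # hypothetical Pod with runAsUser: 1000
    assertions:
    - violations: no
  - name: root-pod-denied
    object: samples/root-pod.yaml      # hypothetical Pod with runAsUser: 0
    assertions:
    - violations: yes
```

Wire a check like this into CI and a broken policy fails the pipeline instead of failing your deployments. Kyverno offers a comparable workflow through its own CLI.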