43.5 Runtime Security with Falco: Detecting Anomalous Behavior
Right, so you’ve got your cluster up. It’s running. You’ve hopefully locked down the API server, you’re using RBAC like a responsible adult, and your network policies are tighter than a submarine’s door. Good. But here’s the uncomfortable truth: that’s all preventative security. It’s the castle walls and the moat. It assumes the bad guy is still outside. What happens when someone, through a clever exploit or a horrifying credential leak, gets inside? You need a guard who walks the halls, listens at doors, and shouts “HEY, THAT’S WEIRD” at the top of their lungs when they see something they don’t like. That guard is Falco.
Falco is the de facto standard for runtime security in Kubernetes. It’s an intrusion detection system that hooks directly into the kernel of your nodes (via eBPF or a kernel module) to watch the system calls behind everything your pods do: every process launched, every file opened, every network connection made. It compares this firehose of events against a powerful set of rules written in YAML. When it sees something naughty, it can scream at you via Slack or PagerDuty, or hand the alert to an automated responder that kills the offending pod. Falco itself only detects; the forwarding and the killing are done by whatever you wire up to its alerts, as covered later in this section. It’s glorious.
Why Falco Uses eBPF (And Why You Should Care)
You might see old instructions that build and load a Falco kernel module. The module is still supported, but you can usually skip it: the future, and the present, is eBPF (extended Berkeley Packet Filter). Think of eBPF as safely injecting tiny, sandboxed programs directly into the Linux kernel without having to recompile it or reboot your machine. It’s a superpower. Falco uses it to tap into the syscall stream, the most raw and unfiltered source of truth about what’s happening on your node. This is why it’s so effective: it sees what a process actually asks the kernel to do, bypassing any application-level obfuscation. A malicious process can’t hide its execve call from the kernel.
Installing Falco: The Helm Way
You could install Falco manually on each node. Don’t. You’re running Kubernetes. Use the package manager. Helm is the way.
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm install falco falcosecurity/falco --namespace falco --create-namespace
This drops Falco as a DaemonSet, meaning one pod will run on every node in your cluster, watching everything on that node. It’s the most efficient way to get comprehensive coverage.
Writing Rules That Aren’t Useless
The out-of-the-box rules are decent, a great starting point. But they’re generic. The real magic happens when you write your own. This is where you stop detecting theoretical threats and start catching the stuff that would actually ruin your Tuesday.
Let’s say you have a super sensitive pod that should never, ever spawn a shell. The following rule would lose its mind if it did.
- rule: Shell Spawn in Sensitive App
  desc: A shell was spawned by a process in the sensitive-app container.
  # spawned_process is a macro from Falco's default ruleset. It limits matches
  # to new process launches instead of every syscall a running shell makes.
  condition: >
    spawned_process and
    container.name == "sensitive-app" and
    proc.name in (bash, sh, zsh)
  output: >
    A shell was spawned in a sensitive container (user=%user.name container_name=%container.name
    shell=%proc.name parent=%proc.pname cmdline=%proc.cmdline)
  priority: CRITICAL
  tags: [container, shell, mitre_execution]
Let’s break this down because the condition field is the beating heart of the rule:
- container.name == "sensitive-app": Scope it. This rule only cares about this specific container. This prevents a flood of alerts every time a developer uses kubectl exec to debug their own pod.
- proc.name in (bash, sh, zsh): The action. It’s looking for the execution of these known shell binaries.
- The combination of scope and action is what makes it precise and actionable.
The output field is your alert message; make it tell you exactly what you need to know to start investigating.
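To make the scope-plus-action logic concrete, here is a toy Python model of the two clauses discussed above and of how %field tokens in the output get filled in. This is not how Falco evaluates rules internally; the flattened event dictionaries and the helper names (condition, render, EVENTS) are made up for illustration.

```python
import re

SHELLS = {"bash", "sh", "zsh"}

OUTPUT = (
    "A shell was spawned in a sensitive container "
    "(user=%user.name container_name=%container.name "
    "shell=%proc.name parent=%proc.pname cmdline=%proc.cmdline)"
)


def condition(event: dict) -> bool:
    """Toy mirror of the rule's condition: scope clause AND action clause."""
    in_scope = event.get("container.name") == "sensitive-app"
    is_shell = event.get("proc.name") in SHELLS
    return in_scope and is_shell


def render(template: str, event: dict) -> str:
    """Swap each %field token for the event's value, or <NA> when it is missing."""
    token = re.compile(r"%([a-z_]+(?:\.[a-z_]+)+)")
    return token.sub(lambda m: str(event.get(m.group(1), "<NA>")), template)


# Hypothetical events, flattened to field -> value for illustration.
EVENTS = [
    {"container.name": "sensitive-app", "proc.name": "bash", "user.name": "root",
     "proc.pname": "runc", "proc.cmdline": "bash -i"},
    {"container.name": "sensitive-app", "proc.name": "nginx"},  # right place, not a shell
    {"container.name": "dev-sandbox", "proc.name": "bash"},     # a shell, wrong place
]

for event in EVENTS:
    if condition(event):
        print(render(OUTPUT, event))
```

Only the first event fires. Drop either clause and one of the other two starts alerting, which is exactly the noise the scoping is there to prevent.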
The Pitfall of Alert Fatigue (And How to Avoid It)
The quickest way to get your Falco alerts ignored is to have them fire constantly for nonsense. Tuning is not optional; it’s the entire job. Your first pass will be noisy. Embrace it.
- Start with a dry run: Falco doesn’t give rules a dedicated dry-run mode, so approximate one. Give a new custom rule a low priority such as DEBUG, prefix its output with a marker like dry-run, and make sure the outputs that page people only forward higher priorities. This lets you see what would have triggered without waking anyone up at 3 AM. Watch the Falco logs for a day.

    priority: DEBUG
    output: "dry-run: A shell was spawned... (cmdline=%proc.cmdline)"

- Use Exceptions: Falco’s rules have a built-in exceptions list for this. Maybe your CI/CD system legitimately needs to spawn a shell in this container for initialization. Don’t disable the rule; create an exception for that specific process or parent process.

    - rule: Shell Spawn in Sensitive App
      ...
      exceptions:
        - name: ci_cd_process
          fields: [proc.pname]
          comps: [in]
          values:
            - [[my_ci_runner]]

- Adjust Priorities: Not every alert is a “CRITICAL.” Is a shell spawn critical? Probably. Is a process writing to a log file you didn’t expect? Maybe that’s a “WARNING” while you investigate.
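Tuning goes faster when you know which rules are actually making the noise. The sketch below tallies alerts per rule from Falco’s JSON log lines. It assumes JSON output is enabled (json_output: true in falco.yaml) and relies only on the rule and priority keys of each event; the function name noisy_rules and the sample lines are invented for illustration.

```python
import json
from collections import Counter


def noisy_rules(lines, top=5):
    """Count alerts per (rule, priority) and return the most frequent ones."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip startup banners and other non-JSON log lines
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        rule = event.get("rule")
        if rule:
            counts[(rule, event.get("priority", "?"))] += 1
    return counts.most_common(top)


# Hypothetical log excerpt; real events carry more keys (time, output, output_fields...).
SAMPLE = [
    "Falco initialized",
    '{"rule": "Shell Spawn in Sensitive App", "priority": "Critical"}',
    '{"rule": "Read sensitive file untrusted", "priority": "Warning"}',
    '{"rule": "Shell Spawn in Sensitive App", "priority": "Critical"}',
    '{"rule": "broken',
]

for (rule, priority), count in noisy_rules(SAMPLE):
    print(f"{count:5d}  {priority:<10} {rule}")
```

Feed it a day of logs from the Falco pods instead of SAMPLE, and the top of the list tells you where an exception or a priority change will buy the most quiet.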
Responding to Threats, Not Just Yelling About Them
Getting an alert is step one. Doing something about it is step two. Falco’s built-in outputs (stdout, file, syslog, program, HTTP, gRPC) can send events to other systems, but for a truly robust response, integrate it with the Kubernetes API itself. This is where Falcosidekick comes in, a companion project that can forward Falco alerts to dozens of destinations and, crucially, to a webhook served by a custom service running in your cluster.
Imagine a service that receives a Falco alert for “Shell Spawn in Sensitive App” and immediately executes a kubectl delete pod on the offending pod. It’s automated incident response. This is advanced, and you need to be incredibly careful you don’t create a self-DoS system, but the capability is there. This moves you from detection to active defense.
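Here is a minimal sketch of the decision logic such a service might run, with a deletion budget as a guard against the self-DoS scenario. It assumes the alert arrives as Falco’s JSON event and that k8s.pod.name and k8s.ns.name are present in output_fields (they only appear if your Falco setup adds Kubernetes metadata). The function name plan_response and the budget numbers are hypothetical, and the code only builds the kubectl command rather than running it.

```python
import time
from collections import deque
from typing import List, Optional

WATCHED_RULE = "Shell Spawn in Sensitive App"
MAX_DELETIONS = 3      # hypothetical budget: at most 3 deletions...
WINDOW_SECONDS = 600   # ...per 10 minutes, then a human takes over

_recent_deletions: deque = deque()


def plan_response(alert: dict, now: Optional[float] = None,
                  recent: deque = _recent_deletions) -> Optional[List[str]]:
    """Return the kubectl argv that would delete the pod, or None to do nothing."""
    now = time.time() if now is None else now
    if alert.get("rule") != WATCHED_RULE:
        return None
    fields = alert.get("output_fields") or {}
    pod, namespace = fields.get("k8s.pod.name"), fields.get("k8s.ns.name")
    if not pod or not namespace:
        return None  # not enough information to act safely
    # Forget deletions that have aged out of the window.
    while recent and now - recent[0] >= WINDOW_SECONDS:
        recent.popleft()
    if len(recent) >= MAX_DELETIONS:
        return None  # budget spent: stop deleting and leave it to a human
    recent.append(now)
    return ["kubectl", "delete", "pod", pod, "--namespace", namespace, "--wait=false"]


# Hypothetical alert, trimmed to the keys this sketch reads.
ALERT = {
    "rule": "Shell Spawn in Sensitive App",
    "priority": "Critical",
    "output_fields": {"k8s.pod.name": "sensitive-app-7d9f", "k8s.ns.name": "payments"},
}

print(plan_response(ALERT, now=0.0, recent=deque()))
```

Returning the command instead of executing it keeps the risky part (actually deleting pods) separate and easy to test. A real responder would more likely call the Kubernetes API through a client library, under a service account that is only allowed to delete pods in the namespaces it protects.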
The key takeaway? Falco isn’t a “set and forget” tool. It’s a core part of your cluster’s nervous system. You need to feed it with well-tuned rules that reflect your actual environment, listen to its complaints, and refine its understanding. Do that, and you’ll have that brilliant, paranoid friend walking the halls of your cluster, making sure nothing moves without a good reason.