21.6 Seccomp and AppArmor Profiles
Right, let’s talk about making your containers less of a welcome mat for attackers. You’ve got your Pod Security Admission set up, maybe you’re even using a Restricted policy. Good. But that’s just the bouncer at the club door checking the dress code. Seccomp and AppArmor are the bouncers that actually frisk you, making sure you’re not bringing in any shady system calls or file access patterns. They are, without a doubt, two of the most effective tools in your runtime security arsenal. And yes, I know their names sound like rejected Star Wars characters, but bear with me.
The core idea is brilliantly simple: even if an attacker finds a way to execute code inside your container, their options are severely limited if you’ve already told the kernel, “Hey, this process is only allowed to do these specific things.” It’s the principle of least privilege, enforced at the kernel level.
Seccomp: The Syscall Police
Seccomp (short for Secure Computing Mode) is a Linux kernel feature that filters system calls. A system call, or syscall, is how a program asks the kernel to do something on its behalf—like opening a file or making a network connection. By default, a container has access to a broad set of syscalls, many of which it will never need. A webserver doesn’t need to mount filesystems, for instance.
The magic happens with a profile, which is just a JSON file that defines a default action (SCMP_ACT_ERRNO, SCMP_ACT_ALLOW, etc.) and a list of syscalls with their specific actions. Kubernetes lets you apply this to a pod.
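To make that structure concrete, here is a minimal sketch that builds two such profiles in Python and prints one as JSON. The helper name make_seccomp_profile and the four-syscall allow-list are made up for illustration; a real nginx needs far more syscalls than this.

```python
import json


def make_seccomp_profile(default_action, allowed_syscalls=()):
    """Build a seccomp profile dict in the JSON layout described above."""
    profile = {
        "defaultAction": default_action,
        "architectures": ["SCMP_ARCH_X86_64", "SCMP_ARCH_X86", "SCMP_ARCH_X32"],
    }
    if allowed_syscalls:
        # One rule: every listed syscall is allowed; anything else
        # falls through to defaultAction.
        profile["syscalls"] = [
            {"names": sorted(allowed_syscalls), "action": "SCMP_ACT_ALLOW"}
        ]
    return profile


# Audit-style profile: nothing is blocked, every syscall is logged.
audit_profile = make_seccomp_profile("SCMP_ACT_LOG")

# Restrictive profile: a deliberately tiny, illustrative allow-list.
strict_profile = make_seccomp_profile(
    "SCMP_ACT_ERRNO", ["read", "write", "close", "exit_group"]
)

print(json.dumps(strict_profile, indent=2))
```

The point of the design is the fall-through: you only list the exceptions, and the default action decides the fate of everything else.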
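Here’s the thing: writing these profiles from scratch is a pain. You’re basically reverse-engineering your application. The smart move is to start with an audit profile (one whose default action is SCMP_ACT_LOG, so every syscall is logged and nothing is blocked), see what your app actually uses, and then lock it down. With type Localhost, the profile path is relative to the kubelet’s seccomp directory on the node (/var/lib/kubelet/seccomp by default), so the file has to be there before the pod starts.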
apiVersion: v1
kind: Pod
metadata:
  name: my-secure-pod
spec:
  containers:
  - name: nginx
    image: nginx
    securityContext:
      seccompProfile:
        type: Localhost
        localhostProfile: profiles/audit.json # Logs but doesn't block
  # ... rest of pod spec
You run this, exercise all your app’s functionality, and then check the node’s audit logs (/var/log/audit/audit.log if auditd is running, otherwise the kernel log via journalctl). Once you have a list, you can create a restrictive profile. But honestly? You rarely need to. The default profile doesn’t come from Kubernetes itself: container runtimes such as containerd and CRI-O each ship their own, and it blocks the most dangerous syscalls out of the box. For 99% of workloads, you should just use the runtime default.
apiVersion: v1
kind: Pod
metadata:
  name: my-secure-pod
spec:
  containers:
  - name: nginx
    image: nginx
    securityContext:
      seccompProfile:
        type: RuntimeDefault # <- This is the good stuff. Use it.
  restartPolicy: Never
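Why isn’t this the actual default in Kubernetes yet? Politics and legacy concerns, mostly: without a profile, pods run Unconfined, and flipping that could break existing workloads. The kubelet does have an opt-in --seccomp-default flag (stable since v1.27) that makes RuntimeDefault the fallback for every pod on that node, but somebody still has to turn it on. It should be the default. Make it yours.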
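For the rare workload that does need a custom profile, the audit-then-restrict step described earlier can be sketched roughly like this. The sample log lines are made up (though they follow the usual type=SECCOMP record layout), and the syscall table is a tiny x86_64 subset for illustration; a real tool would need the full table for the node’s architecture.

```python
import re
from collections import Counter

# Illustrative subset of the x86_64 syscall table (assumption: x86_64 nodes).
X86_64_SYSCALLS = {0: "read", 1: "write", 3: "close", 41: "socket", 257: "openat"}

SYSCALL_RE = re.compile(r"\bsyscall=(\d+)\b")


def count_logged_syscalls(lines):
    """Count syscall numbers found in SECCOMP audit records."""
    counts = Counter()
    for line in lines:
        if "type=SECCOMP" not in line:
            continue  # ignore other audit record types
        match = SYSCALL_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def to_names(counts):
    """Translate numbers to names, keeping unknown numbers visible."""
    return sorted(X86_64_SYSCALLS.get(n, f"unknown_{n}") for n in counts)


# Made-up sample records for demonstration.
sample = [
    'type=SECCOMP msg=audit(1700000000.123:101): pid=4242 comm="nginx" arch=c000003e syscall=257 compat=0 code=0x7ffc0000',
    'type=SECCOMP msg=audit(1700000000.456:102): pid=4242 comm="nginx" arch=c000003e syscall=41 compat=0 code=0x7ffc0000',
    'type=SECCOMP msg=audit(1700000000.789:103): pid=4242 comm="nginx" arch=c000003e syscall=257 compat=0 code=0x7ffc0000',
    'type=SYSCALL msg=audit(1700000001.000:104): arch=c000003e syscall=59 success=yes',
]

counts = count_logged_syscalls(sample)
print(to_names(counts))
```

The resulting names are what you would feed into the allow-list of a restrictive profile. Anything that shows up as unknown_N is a hint that your lookup table is incomplete, not that the syscall is safe to drop.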
AppArmor: The Filesystem Bouncer
While Seccomp worries about what a process can do (syscalls), AppArmor worries about what it can access (files, directories, network access, capabilities). It’s a Mandatory Access Control (MAC) system that uses profiles loaded into the kernel. Because its rules are path-based, it’s the more granular tool for filesystem access: Seccomp can block a syscall like openat outright, but it can’t tell one path from another.
Now, a warning: AppArmor is a bit more finicky than Seccomp. It requires the module to be loaded on the host node’s kernel, and the profile must be installed on every node before you try to run a pod that uses it. This is its biggest operational hurdle.
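Checking whether a node actually has a profile loaded is easy to script. Here is a minimal sketch, assuming the usual securityfs listing at /sys/kernel/security/apparmor/profiles with one "name (mode)" entry per line; the function names and the sample text are made up for illustration.

```python
from pathlib import Path

# Usual securityfs location of the loaded-profile list.
PROFILES_FILE = Path("/sys/kernel/security/apparmor/profiles")


def parse_loaded_profiles(text):
    """Map profile name -> mode from lines like 'name (enforce)'."""
    profiles = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.endswith(")") or " (" not in line:
            continue  # skip blank or unexpected lines
        name, mode = line[:-1].rsplit(" (", 1)
        profiles[name] = mode
    return profiles


def profile_is_enforced(name, text):
    """True only if the profile is loaded and in enforce mode."""
    return parse_loaded_profiles(text).get(name) == "enforce"


# Made-up sample; on a real node you would pass PROFILES_FILE.read_text()
# (reading that file usually needs root).
sample = "docker-default (enforce)\nnginx-deny-write (enforce)\nmy-test (complain)\n"
print(profile_is_enforced("nginx-deny-write", sample))
```

Run something like this on every node (a DaemonSet or your config management tool are the usual suspects) before you schedule pods that depend on the profile.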
A profile might prevent a process from writing to /etc/passwd or only allow it to read files in /usr/share/nginx/html/. Here’s a snippet from a hypothetical profile for nginx:
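# /etc/apparmor.d/nginx-deny-write
profile nginx-deny-write flags=(attach_disconnected) {
  #include <abstractions/base>
  #include <abstractions/nginx>

  # No blanket "deny /** w," here: an explicit deny always overrides an
  # allow rule, so it would also block the log writes granted below.
  # Anything not allowed by a rule or an included abstraction is refused.

  # Read-only access to the web root
  /usr/share/nginx/html/** r,

  # Explicitly allow logging (a necessary evil)
  /var/log/nginx/** w,
}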
You’d load it on the host with sudo apparmor_parser -r -W /etc/apparmor.d/nginx-deny-write. Then, in your pod, you reference it:
apiVersion: v1
kind: Pod
metadata:
  name: my-apparmor-pod
  annotations:
    container.apparmor.security.beta.kubernetes.io/nginx: localhost/nginx-deny-write
spec:
  containers:
  - name: nginx
    image: nginx
See the awkward part? It’s an annotation, not a first-class field in the securityContext. This is one of those “questionable choices” I mentioned. The community was slow to promote this to a stable API: a proper securityContext.appArmorProfile field only arrived in Kubernetes v1.30, so on older clusters you’re stuck with the annotation.
Best Practices and Pitfalls
- Start with the Runtimes: Before you write a single line of custom profile, use seccompProfile.type: RuntimeDefault. It will block a large number of dangerous syscalls with zero effort.
- Audit First, Enforce Later: Always test with a logging or audit profile (SCMP_ACT_LOG for Seccomp, complain mode for AppArmor) before you flip the switch to enforcing mode. You will break your application if you guess wrong.
- Node Homogeneity: For AppArmor, every node a pod can land on must have the profile loaded. If the profile isn’t on a node, the pod will flat-out refuse to start on it. This can seriously mess with your scheduling.
- The Dependency Trap: Your custom profile for my-app might work until it calls curl or wget as a subprocess. Those tools will inherit the restrictive profile and likely fail. You either need to profile every binary you call or architect your app to avoid shelling out.
- They Are Not Silver Bullets: These are fantastic defense-in-depth tools, but they aren’t magic. A determined attacker with access to allowed syscalls can still do damage. They raise the barrier to entry significantly; they don’t build an impenetrable wall.
The bottom line is this: not using these tools is like building a fortress and leaving the back gate wide open. The runtime default seccomp profile is trivial to implement and provides massive bang for your buck. Start there today. AppArmor requires more operational overhead but is invaluable for locking down access patterns on critical workloads. Master them both. Your cluster will thank you.