39.3 Controlling the OOM Killer with oom_score_adj
Right, so you’ve met the OOM Killer. It’s the kernel’s panic button: a routine inside the kernel itself (not a daemon you can stop) that sacrifices a process when your system is gasping for its last breath of RAM. It’s not a graceful exit. The victim gets a SIGKILL and no chance to clean up. You can’t really turn it off, and you wouldn’t want to, because without it a single memory-hogging process could lock up your entire machine. But you can bribe it. You can whisper in its ear and say, “Hey, if you have to kill someone… please, not this one.” Or, conversely, “For the love of all that is holy, kill that one first.” That’s what oom_score_adj is for.
Think of every process on your system as having a hidden “killability” score, the oom_score. When memory gets critically low, the OOM Killer basically sorts the process list by this score and shoots the one at the top. The oom_score_adj value is your direct line to influence this score. It’s a knob you can turn from -1000 to +1000, and the kernel folds it into its own calculation, which on modern kernels is based almost entirely on how much memory the process is using (resident memory, swap, and page tables), not on how long it has been running.
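If you want to see who is currently at the top of that list, here is a minimal sketch that walks /proc and prints the highest scorers. The function name top_oom_candidates is made up for this example, and the exact numbers you see depend on your kernel version, so treat the output as a rough ranking rather than gospel.

```shell
#!/bin/sh
# top_oom_candidates [COUNT]  (hypothetical helper, not a standard tool)
# Prints: oom_score, oom_score_adj, PID, command name; highest score first.
top_oom_candidates() {
    count=${1:-10}
    for dir in /proc/[0-9]*; do
        pid=${dir#/proc/}
        # Processes can exit while we scan, so skip anything unreadable.
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        adj=$(cat "$dir/oom_score_adj" 2>/dev/null) || continue
        name=$(cat "$dir/comm" 2>/dev/null) || continue
        printf '%s\t%s\t%s\t%s\n' "$score" "$adj" "$pid" "$name"
    done | sort -k1,1nr | head -n "$count"
}

top_oom_candidates 5
```

No root is needed for this; both files are world-readable.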
How oom_score_adj Actually Works
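The math is close to simple addition, with one twist: the kernel first expresses a process’s memory use in thousandths of the memory available, so final_oom_score ≈ memory_use_in_thousandths + oom_score_adj. Each point of adjustment is therefore worth 0.1% of RAM. A value of -1000 is the ultimate “get out of jail free” card: the kernel treats it as a special case and exempts the process from OOM killing entirely. A value of +1000 is basically painting a giant target on the process’s back, screaming “SHOOT ME FIRST.” Most processes live at 0, taking their chances with the kernel’s judgment.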
You’ll find the current oom_score and oom_score_adj for any process in the /proc filesystem. Let’s say your important database has PID 42.
# Let's see how killable process 42 is
cat /proc/42/oom_score
cat /proc/42/oom_score_adj
# Let's make it much less likely to be killed
echo -1000 | sudo tee /proc/42/oom_score_adj
# Now check its oom_score again; with an adjustment of -1000 it should read 0
cat /proc/42/oom_score
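The key insight here is that you’re adjusting a score, not setting an absolute priority. The kernel’s calculation still matters, and it is relative to how much memory the machine has. On a 32GB box, a process using 20GB of RAM with an oom_score_adj of -500 will still have a higher final score than a process using 10MB with an adjustment of 0, because -500 only cancels out half the machine’s memory. You’re influencing the outcome, not dictating it. The one exception is -1000, which takes the process off the list entirely.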
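To make that arithmetic concrete, here is a small sketch of the scoring model. It is a deliberate simplification: it counts only the memory figure you hand it, ignores details like root-owned processes and cgroup limits, and the number it prints will not necessarily match what /proc/PID/oom_score reports on your kernel. The function name estimate_oom_points is made up for this example.

```shell
#!/bin/sh
# estimate_oom_points USED_MB TOTAL_MB ADJ  (hypothetical helper)
# Simplified model: memory use in thousandths of total RAM, plus the adjustment.
estimate_oom_points() {
    used_mb=$1
    total_mb=$2
    adj=$3
    # -1000 is special: the process is exempt, whatever it uses.
    if [ "$adj" -eq -1000 ]; then
        echo "exempt"
        return 0
    fi
    echo $(( used_mb * 1000 / total_mb + adj ))
}

estimate_oom_points 20480 32768 -500   # 20GB on a 32GB box, adj -500 -> 125
estimate_oom_points 10 32768 0         # 10MB on a 32GB box, adj 0    -> 0
estimate_oom_points 20480 65536 -500   # same 20GB on a 64GB box      -> -188
```

Notice how the same 20GB process goes from the likeliest victim to comfortably safe just because the machine has more RAM. The adjustment is a percentage of the box, not a fixed amount.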
The Permanent, “Right” Way: Using systemd
Poking values into /proc is fine for a quick test, but it’s ephemeral. The setting belongs to that one running process, so it vanishes as soon as the process restarts, never mind a reboot. For any serious service, you’ll want to make this permanent. If you’re on a modern Linux distro, systemd is how you do it. The designers actually got this one right; it’s a clean, simple integration.
You can set this directly in a service’s unit file. Let’s protect our precious redis-server:
sudo systemctl edit redis-server.service
This opens an override file. Add the following, saving and closing when done:
[Service]
OOMScoreAdjust=-500
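Now reload systemd and restart the service to apply the change. (systemctl edit reloads the configuration for you when you save, so the daemon-reload is only strictly needed if you edited the file by hand, but it’s harmless either way. The restart is what actually applies the new value.)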
sudo systemctl daemon-reload
sudo systemctl restart redis-server
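Systemd sets oom_score_adj on each process it starts for the service, and anything those processes fork inherits the value, so the whole service ends up covered. It’s robust and manages the lifecycle for you: the setting survives restarts and reboots because it lives in the unit, not in a PID. This is almost always the best practice.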
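It’s worth confirming the setting actually landed. The sketch below compares what systemd thinks the unit’s OOMScoreAdjust is with what the kernel reports for the service’s main process. The function name check_service_oom is invented for this example, and it assumes a systemd recent enough to support the --value flag of systemctl show.

```shell
#!/bin/sh
# check_service_oom UNIT  (hypothetical helper)
# Succeeds if the unit's configured OOMScoreAdjust matches the live process.
check_service_oom() {
    unit=$1
    configured=$(systemctl show -p OOMScoreAdjust --value "$unit")
    pid=$(systemctl show -p MainPID --value "$unit")
    # MainPID is 0 when the service is not running.
    if [ -z "$pid" ] || [ "$pid" -eq 0 ]; then
        echo "$unit: not running (configured OOMScoreAdjust=$configured)"
        return 1
    fi
    actual=$(cat "/proc/$pid/oom_score_adj")
    echo "$unit: configured=$configured actual=$actual pid=$pid"
    [ "$configured" = "$actual" ]
}

# Usage: check_service_oom redis-server.service
```

If the two numbers disagree, the usual culprit is a service that was never restarted after the override was added.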
The Pitfalls and the “Gotchas”
Here’s where I get honest. This isn’t a magic forcefield. If you protect too many processes from the OOM Killer, you’ve effectively neutered it. The kernel will still desperately need memory, so it works its way down to whatever you left unprotected, which may be something you care about more than you realized. And if every remaining candidate is fully exempt at -1000, it has nothing left to kill: the kernel panics, typically after a long stretch of thrashing. You’ve traded a single process being killed for the entire box going down. Not great.
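A quick audit helps keep this honest. The following sketch lists every process that currently has a negative oom_score_adj, most protected first, so you can see how long your “do not kill” list has quietly become. The function name list_protected is made up for this example.

```shell
#!/bin/sh
# list_protected  (hypothetical helper)
# Prints: oom_score_adj, PID, command name for every process with adj < 0.
list_protected() {
    for f in /proc/[0-9]*/oom_score_adj; do
        # Processes can exit while we scan, so skip anything unreadable.
        adj=$(cat "$f" 2>/dev/null) || continue
        [ "$adj" -lt 0 ] 2>/dev/null || continue
        dir=${f%/oom_score_adj}
        printf '%s\t%s\t%s\n' "$adj" "${dir#/proc/}" "$(cat "$dir/comm" 2>/dev/null)"
    done | sort -n
}

list_protected
```

Expect a few entries you didn’t put there yourself; some system services protect themselves out of the box. It’s the ones you added that need justifying.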
The other classic mistake is being too aggressive in the other direction. You might think, “I’ll just set oom_score_adj=1000 on all my unimportant cron jobs!” But at +1000 a job doesn’t even need to be using much memory: in any system-wide OOM event, it will be first against the wall. This might be fine… unless that “unimportant” job was in the middle of a critical financial transaction or writing to a database. You can’t easily unwind a killed process.
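If you do decide a job is genuinely disposable, a gentler approach is to raise its score by a moderate amount from a wrapper, rather than reaching for 1000. Here is a minimal sketch: it raises the score of a subshell and then replaces that subshell with the real command, which keeps the value. The function name run_disposable and the job path in the usage comment are hypothetical. Raising your own score needs no special privileges; lowering it again generally does.

```shell
#!/bin/sh
# run_disposable ADJ CMD [ARGS...]  (hypothetical helper)
# Runs CMD with its oom_score_adj set to ADJ, leaving the calling shell alone.
run_disposable() {
    adj=$1
    shift
    (
        # echo is a shell builtin, so /proc/self here is the subshell itself.
        echo "$adj" > /proc/self/oom_score_adj || exit 1
        # The adjustment survives exec, so the command inherits it.
        exec "$@"
    )
}

# Usage (hypothetical job): run_disposable 500 /usr/local/bin/rebuild-thumbnails
```

At 500 the job is treated as if it were using an extra half of the machine’s memory, so it will usually lose to anything at 0 without being a guaranteed first victim.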
My rule of thumb: use negative adjustments sparingly and strategically. Protect your absolute most critical infrastructure: your database, your main application server. Maybe one or two others. Leave everything else at 0. Use positive adjustments even more rarely, and only on processes that are truly disposable and known to be memory-hogs.
When All Else Fails: A Nuclear Option (and a Joke)
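Sometimes, you need to be sure a process won’t be killed, consequences be damned. That’s oom_score_adj at -1000, and there is nothing stronger. In old scripts and docs you may also run into oom_score_adj’s older, weirder cousin, oom_adj. It looks like a bigger hammer, but it isn’t: the interface is deprecated, uses a clunky range of -17 to +15, and the kernel simply translates whatever you write there into the equivalent oom_score_adj. Setting oom_adj to -17 automatically sets oom_score_adj to -1000. It’s the same ultimate protection, just through a legacy interface.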
It’s a testament to the fact that even kernel developers sometimes can’t bear to completely remove old, janky code. It’s the programming equivalent of that one weird tool in your garage you never use but can’t bring yourself to throw away because what if you need it. You almost certainly don’t. Stick with oom_score_adj and systemd. It’s the clean, maintainable way to tell the OOM Killer who your favorites are.
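If you ever have to decode an old script that still writes to oom_adj, the translation is easy to sketch. The function below mirrors the scaling the kernel applies, as far as I know it: the top value 15 maps straight to 1000, and everything else is scaled by 1000/17 and truncated toward zero. The function name legacy_to_score_adj is made up for this example, and it is only a calculator; it doesn’t touch /proc.

```shell
#!/bin/sh
# legacy_to_score_adj OOM_ADJ  (hypothetical helper)
# Converts a legacy oom_adj value (-17..15) to its oom_score_adj equivalent.
legacy_to_score_adj() {
    old=$1
    if [ "$old" -lt -17 ] || [ "$old" -gt 15 ]; then
        echo "oom_adj must be between -17 and 15" >&2
        return 1
    fi
    # The maximum is special-cased so that 1000 stays reachable.
    if [ "$old" -eq 15 ]; then
        echo 1000
    else
        echo $(( old * 1000 / 17 ))
    fi
}

legacy_to_score_adj -17   # -> -1000
legacy_to_score_adj 8     # -> 470
legacy_to_score_adj 15    # -> 1000
```

Use it to work out what the old value meant, then write the new number to oom_score_adj (or OOMScoreAdjust) and retire the legacy line.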