67.4 memory_profiler: Tracking Memory Usage
Right, let’s talk about memory. It’s the one resource your application can keep politely requesting from the operating system, right up until things get awkward and the process gets killed. Most performance tutorials focus on CPU time, but memory bloat is a silent killer. It slows everything down by piling extra work on the garbage collector, and it can lead to catastrophic, opaque crashes. So we’re going to stop guessing and start measuring with memory_profiler. This tool is like putting a fuel gauge on your code’s gas-guzzling SUV.
First things first, you’ll need to install it, because it doesn’t ship with Python. If you want the plots shown later, you’ll also need matplotlib.
pip install memory-profiler
Basic Line-by-Line Profiling
The killer feature of memory_profiler is its line-by-line analysis. You decorate a function, run a script, and it prints a beautiful, horrifying report of how much the process’s memory grew on each line. Let’s look at a classic offender: list comprehensions that are a little too comprehensive.
Create a script, call it demo_mprof.py:
@profile
def create_a_mess():
    # This seems innocent, right?
    innocent_list = [i for i in range(100000)]
    print("Checkpoint 1")

    # This is the real culprit. We're not just creating a list of numbers,
    # we're creating a list of lists of numbers. This is O(n^2) on memory. Yikes.
    nested_list = [[j for j in range(i)] for i in range(10000)]
    print("Checkpoint 2")
    return innocent_list, nested_list


if __name__ == "__main__":
    create_a_mess()
Now, run it using the mprof command:
mprof run demo_mprof.py
After it runs, plot the recorded samples:
mprof plot
This will fire up a matplotlib window showing the memory usage over time. You’ll see a steep climb where nested_list was created. But for the real gritty details, run the script directly with the profiler:
python -m memory_profiler demo_mprof.py
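You’ll get console output shaped something like this (abbreviated, with the comment lines left out and illustrative numbers; nested_list holds roughly 50 million elements, so expect a far larger increment on a real run):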
Filename: demo_mprof.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
     1   45.898 MiB   45.898 MiB           1   @profile
     2                                         def create_a_mess():
     3   49.633 MiB    3.734 MiB           1       innocent_list = [i for i in range(100000)]
     4   49.633 MiB    0.000 MiB           1       print("Checkpoint 1")
     5
     6  133.555 MiB   83.922 MiB       10001       nested_list = [[j for j in range(i)] for i in range(10000)]
     7  133.555 MiB    0.000 MiB           1       print("Checkpoint 2")
Look at that Increment on line 6: 83.922 MiB. That’s your smoking gun. The Occurrences column counts how many times the profiler saw that line execute. The comprehension’s body sits on the same line and runs once per outer iteration, which is where a count like 10,001 comes from. This is immediately more useful than just knowing your whole script uses a lot of memory.
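To see why the nested comprehension dwarfs the flat one, it helps to count elements. What follows is a minimal sketch using only the standard library. The helper names are made up for illustration, and sys.getsizeof here counts only each list's own pointer array, not the integer objects it points to, so treat the byte figures as a lower bound.

```python
import sys


def nested_element_count(n):
    """Total elements in [[j for j in range(i)] for i in range(n)]."""
    # Inner list i has i elements, so the total is 0 + 1 + ... + (n - 1).
    return n * (n - 1) // 2


def build_nested(n):
    """Same shape as nested_list in the demo script, at a chosen size."""
    return [[j for j in range(i)] for i in range(n)]


def list_storage_bytes(nested):
    """Bytes used by the list objects themselves (not the ints inside)."""
    return sys.getsizeof(nested) + sum(sys.getsizeof(inner) for inner in nested)


if __name__ == "__main__":
    # Element count at the demo script's size (n = 10,000).
    print(nested_element_count(10_000))

    # Doubling n should roughly quadruple the storage: that is the O(n^2).
    small = build_nested(500)
    large = build_nested(1_000)
    print(list_storage_bytes(large) / list_storage_bytes(small))
```

At the demo's size the count comes to 49,995,000 elements, against 100,000 for innocent_list. That gap is the whole story of the report.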
Why The @profile Decorator is Magic (and a Trap)
You might have noticed we just slapped @profile on our function without importing it. That’s because the memory_profiler module injects it into the builtins when you run the script with the -m memory_profiler command. It’s clever, but it’s also a classic footgun.
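The injection trick is easy to reproduce in miniature. This sketch is not memory_profiler's code: stand_in_profile is a hypothetical decorator that only counts calls, and the two runner functions are made up to mimic "launched by a profiler" versus "launched by plain python".

```python
import builtins


def stand_in_profile(func):
    """Hypothetical stand-in for the profiler's decorator: counts calls."""
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


# A tiny "script" that uses @profile without importing it.
SCRIPT = '''
@profile
def work():
    return sum(range(10))

result = work()
'''


def run_with_injected_profile(source):
    """Run source with `profile` placed in builtins, then clean up."""
    builtins.profile = stand_in_profile
    try:
        namespace = {"__name__": "__main__"}
        exec(compile(source, "<demo>", "exec"), namespace)
        return namespace
    finally:
        del builtins.profile


def run_plain(source):
    """Run source with nothing injected, like plain `python script.py`."""
    namespace = {"__name__": "__main__"}
    exec(compile(source, "<demo>", "exec"), namespace)
    return namespace


if __name__ == "__main__":
    ns = run_with_injected_profile(SCRIPT)
    print(ns["result"], ns["work"].calls)
    try:
        run_plain(SCRIPT)
    except NameError as exc:
        print("plain run failed:", exc)
```

The script never changes between the two runs. Only the launcher decides whether the bare name resolves, which is exactly why the decorator feels like magic and bites like a trap.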
Try running your script normally with python demo_mprof.py. You’ll get a NameError because the name profile isn’t defined. This is the kind of “clever” design decision that makes me sigh. The best practice? Don’t leave the decorator in your production code. Use it in your profiling scripts, which are separate from your main codebase. Or, if you must, do this:
# Fall back to a no-op decorator when the profiler has not injected one.
# This is a bit hacky but it works.
try:
    # `python -m memory_profiler` puts `profile` into builtins, so this
    # bare name lookup succeeds when running under the profiler.
    profile
except NameError:
    # Plain `python demo_mprof.py`: nothing was injected, so define a stub
    # that hands the function back unchanged.
    def profile(func):
        return func
Profiling Entire Processes and Time-Based Sampling
Sometimes the line-by-line approach is too granular. You might want to see the memory usage of a whole web server process over 24 hours. The mprof command we used earlier is your friend here. It samples memory usage at a regular interval.
# Start sampling in the background (every 0.1 seconds)
mprof run --interval=0.1 my_long_running_script.py
# ... wait for it to finish, or kill it with Ctrl+C ...
# View the plot
mprof plot
mprof run writes its samples to a .dat file, and mprof plot charts the most recent one. The chart will show you steady-state memory, periodic growth (maybe from a recurring task), and big spikes. A “sawtooth” pattern, with sharp rises followed by sharp falls back to roughly the same floor, usually means memory is being allocated and released as it should. A “staircase” pattern, where usage goes up and never comes back down, points to a memory leak or an unbounded cache. Congratulations, you’ve probably found a bug.
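If you would rather not eyeball the chart, the sawtooth-versus-staircase distinction can be roughed out in code. The sketch below rests on two assumptions: that sample lines in the .dat file look like "MEM <MiB> <timestamp>" (check your own file before trusting this), and that comparing the lowest reading in the first and last stretch of the run is a good enough heuristic. Both function names are invented for the example.

```python
def parse_mem_samples(lines):
    """Return memory readings (MiB) from lines shaped like 'MEM <mib> <ts>'.

    The line format is an assumption about mprof's .dat files; any line
    that does not match it is skipped.
    """
    readings = []
    for line in lines:
        fields = line.split()
        if len(fields) == 3 and fields[0] == "MEM":
            readings.append(float(fields[1]))
    return readings


def trough_growth(readings, windows=4):
    """How far the memory floor rose between the first and last window.

    The readings are cut into `windows` equal chunks (leftover samples at
    the end are ignored) and the minimum of each chunk is taken. A
    sawtooth keeps returning to the same floor, so the result stays near
    zero; a staircase raises the floor every time.
    """
    if windows < 2 or len(readings) < windows:
        raise ValueError("need at least one reading per window")
    size = len(readings) // windows
    troughs = [min(readings[i * size:(i + 1) * size]) for i in range(windows)]
    return troughs[-1] - troughs[0]


if __name__ == "__main__":
    sawtooth = [10.0, 30.0, 50.0, 10.0] * 4
    staircase = [10.0] * 4 + [20.0] * 4 + [30.0] * 4 + [40.0] * 4
    print(trough_growth(sawtooth))   # floor does not move
    print(trough_growth(staircase))  # floor keeps rising
```

A result near zero is no proof of health, and a positive one is no proof of a leak (a cache that is still warming up looks the same), so use it to decide where to look, not what to conclude.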
Common Pitfalls and The Garbage Collection Gotcha
The most important thing to remember is that memory_profiler is not omniscient. It’s measuring the memory of the entire Python process, which includes the Python interpreter itself and any C extensions. This leads to the biggest “aha!” moment for new users:
A line’s Increment is the change in process memory observed while that line ran, which is not always what that line itself allocated or freed.
Python uses reference counting plus a cyclic garbage collector. Most objects are freed the moment their last reference goes away, but objects caught in reference cycles just sit there until the collector decides to run, and it decides based on allocation counts, not on how much memory is in use. So the line that happens to tip the collector over its threshold gets credited with cleaning up garbage created much earlier, often as a negative Increment. The reverse happens too: the allocator can reuse memory the process already holds, so a line that clearly allocates may show an increment close to zero. It’s infuriating, but it’s life. Your job is to learn to read the narrative, not just the individual data points.
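The "garbage that just sits there" part is easy to demonstrate with the standard gc module alone. In this small sketch, Node and count_lingering_cycles are names invented for the example, and automatic collection is switched off on purpose to make the effect deterministic.

```python
import gc


class Node:
    """An object that points at itself, forming a reference cycle."""

    def __init__(self):
        # Because of this self-reference the refcount never drops to zero,
        # so only the cyclic collector can reclaim the object.
        self.me = self


def count_lingering_cycles(n):
    """Create n unreachable cycles and report how many objects the
    collector finds when it is finally asked to run."""
    gc.collect()                  # start from a clean slate
    was_enabled = gc.isenabled()
    gc.disable()                  # no automatic collections during the loop
    try:
        for _ in range(n):
            Node()                # unreachable immediately, yet not freed
        return gc.collect()       # number of unreachable objects found
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    found = count_lingering_cycles(1_000)
    print(f"collector reclaimed {found} objects in one go")
```

In a real program the collector is enabled, so that final sweep happens whenever its counters say so, and whichever line is executing at that moment shows the drop in the report.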
So, use memory_profiler not as a single source of truth, but as the most powerful flashlight you have for shining into the dark, murky corners of your program’s memory habits. It tells a story. Your job is to interpret it.