64.6 coverage.py: Measuring What Is Tested

Right, let’s talk about coverage. You’ve written some tests. You’ve run them. They pass. You feel good. But a nagging question remains: did I actually test all the code I just wrote, or did I just run the happy path and call it a day? This is where coverage.py comes in—it’s the brutally honest friend who tells you there’s spinach in your teeth. It doesn’t care about your intentions; it just reports which lines of your code were executed while your tests were running.

Think of it as a highlighter for your source code. After your test suite finishes, coverage.py shows you which lines got highlighted (executed) and, more importantly, which ones are still bleak, untouched plain text (not executed). This is invaluable. It moves you from “I think my tests are pretty good” to “I have concrete data showing that 23% of the lines in my calculate_risk() function have never been executed, which is terrifying.”
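If the highlighter idea feels abstract, here is a toy version of it built on the standard library's sys.settrace hook. This is a sketch to show the concept, not how coverage.py is actually implemented; the trace_lines helper and the body of calculate_risk are made up for illustration.

```python
import sys


def trace_lines(func, *args, **kwargs):
    """Run func and return the line offsets (relative to its def line) that executed."""
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        # Ignore every frame except the function we are inspecting.
        if frame.f_code is not code:
            return None
        if event == "line":
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    finally:
        # Always restore whatever tracer was active before.
        sys.settrace(previous)
    return executed


def calculate_risk(score):  # hypothetical function, offset 0
    if score > 700:          # offset 1
        return "low"         # offset 2
    return "high"            # offset 3


hit = trace_lines(calculate_risk, 800)
all_lines = {1, 2, 3}
print("executed offsets:", sorted(hit))               # [1, 2]
print("never executed:", sorted(all_lines - hit))     # [3]
```

One call with a high score leaves the last line untouched, and that gap is exactly the kind of thing a coverage report puts in front of you.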

Installing and Running the Damn Thing

First, get it. It’s a Python library, so it’s a one-liner.

pip install coverage

Now, you don’t run your tests with python anymore. You run them with coverage. The most straightforward way is:

# Run your tests, collecting data
coverage run -m pytest  # if you use pytest

# Or, if you're a unittest purist
coverage run -m unittest discover

# Generate the report
coverage report

This will spit out a tidy console table showing each module and its line coverage percentage. It’s a good quick hit. But the real magic is in the detailed HTML report.

coverage html

This creates an htmlcov/ directory. Open index.html in your browser. Now you can click through every file and see every single line, with the lines that never ran highlighted in red. It’s impossible to ignore a giant red block of code you forgot to test. This is where the real engineering begins.

What Coverage Is (And What It Brutally Isn’t)

Here’s the critical insight: coverage measures execution, not quality.

A 100% coverage score does not mean your code is bug-free. It means every line was run at least once. I could write a test that calls every function and asserts nothing about the results. Every line gets hit, the test is useless, and coverage is perfect. Coverage tells you what is definitely untested; it can’t tell you whether the code that did run was actually checked. It’s your job to write the tests that make that coverage meaningful.

Line coverage also says nothing about boundaries or logical paths. Look at this trivial function:

def check_value(x):
    if x > 10:
        return "High"
    else:
        return "Low"

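A test with x=15 executes the if check and the return "High" line. A test with x=5 executes the return "Low" line. You need both to get 100% coverage here. But notice that neither test touches the boundary. With x=10 the else branch is taken (10 > 10 is False), and if someone later changes the condition to x >= 10, then x=10 silently switches to the first branch while your coverage number doesn’t move at all. Coverage shows you which lines were hit; you still have to use your brain to pick inputs that probe each decision. Also, by default coverage.py measures lines only. Run it with coverage run --branch (or set branch = True in the [run] section of your config) and it will also track which way each decision went, which catches cases like an if with no else, where a single test can hit every line without ever taking the path that skips the if body.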
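The gap between "every line ran" and "every decision went both ways" is easier to see with an if that has no else. The sketch below records line-to-line jumps using sys.settrace. It is a toy, loosely in the spirit of branch measurement and not coverage.py's real mechanism; trace_arcs and price_for are hypothetical names.

```python
import sys


def trace_arcs(func, *args):
    """Return (lines, arcs) executed inside func, as offsets from its def line."""
    code = func.__code__
    lines, arcs = set(), set()
    last = [None]  # previous line offset seen in the current call

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None
        if event == "call":
            last[0] = None
        elif event == "line":
            current = frame.f_lineno - code.co_firstlineno
            lines.add(current)
            if last[0] is not None:
                arcs.add((last[0], current))
            last[0] = current
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return lines, arcs


def price_for(amount, is_member):  # hypothetical function, offset 0
    discount = 0                    # offset 1
    if is_member:                   # offset 2
        discount = 10               # offset 3
    return amount - discount        # offset 4


lines, arcs = trace_arcs(price_for, 100, True)
print(sorted(lines))    # every line ran: [1, 2, 3, 4]
print((2, 4) in arcs)   # the jump that skips the if body never happened: False
```

One test with is_member=True gives 100% line coverage, yet the non-member path has never run. That missing jump from the if straight to the return is the kind of thing branch measurement is designed to flag.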

Configuring It to Shut Up About Your Venv

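Depending on your coverage.py version and how your project is laid out, the report can fill up with files that aren’t yours: libraries in your virtual environment, vendored packages, and so on. (The standard library is skipped by default, and recent releases also skip installed third-party packages, but older versions and some layouts still let them leak in.) This is useless noise. Your venv/lib/site-packages/ directory is not your problem. A .coveragerc file lets you say explicitly what to ignore.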

# .coveragerc
[run]
omit =
    */site-packages/*
    */venv/*
    */tests/*

[report]
# List the line numbers that were never executed.
show_missing = True
# Hide files that are already at 100% so the gaps stand out.
skip_covered = True

The omit directive is crucial. It prevents third-party library code from cluttering your report. I also often omit my tests/ directory itself—I don’t care if my test code is covered, I care if my application code is covered.
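If you want a feel for which files those three patterns catch, the sketch below approximates the matching with the standard library's fnmatch module. That is an assumption made for illustration: coverage.py has its own pattern rules that differ in the details, and it matches against absolute paths. The example paths and the is_omitted helper are hypothetical.

```python
from fnmatch import fnmatchcase

# The omit patterns from the .coveragerc above.
OMIT = ["*/site-packages/*", "*/venv/*", "*/tests/*"]


def is_omitted(path, patterns=OMIT):
    """Approximate check: does any omit pattern match this absolute path?"""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


paths = [
    "/home/me/proj/venv/lib/python3.12/site-packages/requests/api.py",
    "/home/me/proj/tests/test_billing.py",
    "/home/me/proj/myproject/billing.py",
]
for path in paths:
    print("omit" if is_omitted(path) else "keep", path)
```

Only the last path, the application code, survives. If a file you expected to see is missing from the report, an overly broad omit pattern is a good first suspect.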

The Pitfalls: Lies, Damned Lies, and Statistics

The biggest mistake is chasing a high percentage like it’s a high score in a video game. Engineering teams that mandate “100% coverage or the build fails” are often rewarded with a codebase full of garbage tests that achieve nothing but hitting lines. The goal isn’t 100%; the goal is meaningful coverage. Use a high number as a guide, not a law. 95% with thoughtful tests is infinitely better than 100% with nonsense.

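Another pitfall is not understanding context. By default, coverage.py only measures the process it was started in. If you use multiprocessing or kick off a subprocess, those lines won’t show up unless you set things up for it. coverage.py has a parallel mode (parallel = True and, for multiprocessing, concurrency = multiprocessing in the [run] section) that writes one data file per process, and coverage combine merges them before you report. Plain subprocesses need extra wiring so that coverage starts inside the child too; the coverage.py documentation on measuring subprocesses walks through it.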

Integrating into Your Workflow

This shouldn’t be a thing you run manually. It should be part of your CI/CD pipeline. Every pull request should generate a coverage report, and the diff in coverage should be a talking point. “This new feature adds 200 lines of code, but only 40% of it is covered by tests. Let’s fix that before we merge.”
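The arithmetic behind that kind of comment is simple. Here is a minimal sketch, assuming you already have the set of line numbers a pull request changed and the set of line numbers the test run executed. The diff_coverage function is hypothetical and not part of coverage.py.

```python
def diff_coverage(changed_lines, executed_lines):
    """Percentage of changed lines that were executed by the test run."""
    changed = set(changed_lines)
    if not changed:
        return 100.0  # nothing changed, so nothing is uncovered
    covered = changed & set(executed_lines)
    return 100.0 * len(covered) / len(changed)


changed = range(1, 201)   # the feature adds 200 lines
executed = range(1, 81)   # the tests only reached the first 80
pct = diff_coverage(changed, executed)
print(f"{pct:.0f}% of the new lines are covered")  # 40% of the new lines are covered
```

Looking at the changed lines rather than the project total keeps the conversation focused on the code in the pull request, not on legacy gaps nobody touched.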

Tools like pytest-cov integrate it directly into your test runner for an even smoother experience.

pip install pytest-cov
pytest --cov=myproject --cov-report=term-missing

The --cov-report=term-missing flag is brilliant—it adds a column to the console output showing which specific lines are missing. No more excuses.
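The missing-lines column is compact because consecutive line numbers are collapsed into ranges. The sketch below shows that idea in a simplified form. It is not the real implementation, and the actual report may merge ranges differently (for example across blank lines or comments); format_missing is a hypothetical helper.

```python
def format_missing(line_numbers):
    """Collapse line numbers into a compact string such as '12-14, 20'."""
    ranges = []
    start = prev = None
    for n in sorted(set(line_numbers)):
        if start is None:
            start = prev = n
        elif n == prev + 1:
            prev = n  # still inside the current run of consecutive lines
        else:
            ranges.append((start, prev))
            start = prev = n
    if start is not None:
        ranges.append((start, prev))
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


print(format_missing([12, 13, 14, 20, 31, 32]))  # 12-14, 20, 31-32
```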

So, use coverage.py relentlessly. Let it be your guide, your critic, and your motivator. But never let it be your boss. You’re the engineer. It’s just a tool. A very, very good one.