64.5 Test-Driven Development: Red-Green-Refactor

Right, so you’ve heard the gospel of Test-Driven Development. You’ve seen the zealots preach about the “design tool” and the “safety net.” And you’re probably thinking, “That sounds nice, but my deadline is Friday.” I get it. Let’s cut through the dogma and talk about what TDD actually is: a fantastically productive way to write code if you use it as a disciplined feedback loop, not a religious artifact. The core rhythm is stupidly simple: Red, Green, Refactor. It’s the discipline that’s hard.

The Three-Step Cadence

Think of it like a heartbeat. One full cycle, repeated over and over.

Red: Write a failing test. Not an “I’ll just stub this out real quick” test. A genuinely, gloriously failing test. You haven’t written the implementation yet! This is the most critical step. Watching the test fail proves that it is actually testing something and can’t pass vacuously. If you skip this, you’re just writing tests after the code, which is fine, but it’s not TDD.
Green: Write the simplest, dumbest, most embarrassing code that could possibly make the test pass. I’m talking return 42; levels of stupid. Your goal here isn’t elegance, architecture, or performance. Your goal is to pass the test. This keeps you focused and prevents you from over-engineering a solution for problems you don’t even have yet.
Refactor: Now you can make it good. Now you can tidy up your embarrassing implementation, remove duplication, apply patterns, and make it clean, efficient, and readable. The key is your tests are now your backstop. You can change the implementation with confidence, knowing the tests will tell you if you break the intended behavior.

Let’s build a classic example, a Calculator class. Our first test might be for an add method.

# test_calculator.py
import pytest
from calculator import Calculator

def test_add_two_numbers():
    calculator = Calculator()
    result = calculator.add(2, 3)
    assert result == 5

Running this now (pytest) will fail spectacularly. Of course it will. We haven’t even created the calculator.py file, so the import fails before the test body runs. This is the Red stage. Good. We’ve proved the test can fail.

Now, for Green, we write the absolute minimum.

# calculator.py
class Calculator:
    def add(self, a, b):
        return 9 # Dumb, but it makes the test pass, right?

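Wait, no. That’s wrong. The test expects 5, not 9, so we’re still Red. See how even this simple step can go awry? That’s why you rerun the test after every change instead of assuming. The truly minimal fix would be return 5, but when the obvious implementation is a one-liner, you can just write it. Let’s do it properly.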

# calculator.py
class Calculator:
    def add(self, a, b):
        return a + b

Run the test. It passes. Green. We’re done. Ship it!

Just kidding. Now we Refactor. Is there anything to refactor here? Not really. It’s already pretty clean. So we move on. The cycle is complete. We add another test, maybe for adding negative numbers.

def test_add_negative_numbers():
    calculator = Calculator()
    result = calculator.add(-1, -1)
    assert result == -2

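We run the tests. They both pass. Great. No new implementation needed. Strictly speaking, this cycle never went Red: the new test passed on its first run, so it confirms existing behavior rather than driving new code. That’s fine as a regression check, but a test you have never seen fail deserves a little suspicion. If you want proof that it can fail, temporarily break add (say, return a - b) and watch it go Red. Then it’s Green (it already was) and Refactor (n/a).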

Why This Feels Backwards (And Why It Isn’t)

Your brain will scream that this is inefficient. “I know how to write an add function! Why am I pretending I don’t?” Because you’re not just testing the code; you’re testing the design of its API.

You first experience the API from the caller’s perspective (calculator.add(2, 3)). Does it feel good? Are the parameters in the right order? Is the name clear? This immediate feedback forces you to design usable interfaces. You’re writing the manual first, then building the machine to match it.

The Art of the “Simplest Thing”

The “simplest thing” rule is what separates TDD practitioners from TDD theorists. Your job in the Green phase is to be a code monkey, not a software architect.

Is the requirement to find the largest number in a list? Your first pass should be return 42, which passes the test assert find_largest([42]) == 42. Then you write a test for [1, 42, 3], and if you’re cheeky, return 42 still passes. Finally, a third test with [100, 42, 300] expects 300, which no hard-coded constant can satisfy alongside the first two, so you’re forced to write the real, generic logic.

This feels silly, but it prevents you from writing a complex, generic solution upfront for a problem that only has one test case. You let the tests drive the complexity.
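Here is a rough sketch of that progression. The names find_largest_v1, find_largest_v2, CASES, and failing_cases are made up for illustration; in a real session there would be a single find_largest that you edit in place, with pytest reporting the failures.

```python
def find_largest_v1(numbers):
    """First Green: hard-coded, just enough for the one-element test."""
    return 42


def find_largest_v2(numbers):
    """Final Green: the third test rules out any constant, so generalize."""
    largest = numbers[0]
    for n in numbers[1:]:
        if n > largest:
            largest = n
    return largest


# The three tests from the text, in the order they would be written.
CASES = [
    ([42], 42),
    ([1, 42, 3], 42),
    ([100, 42, 300], 300),
]


def failing_cases(func):
    """Return the inputs for which func gives the wrong answer."""
    return [numbers for numbers, expected in CASES if func(numbers) != expected]


print(failing_cases(find_largest_v1))  # only the third case fails
print(failing_cases(find_largest_v2))  # nothing fails
```

The hard-coded version survives two tests and only falls over on the third, which is the moment the generic loop earns its place.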

The Refactor Crucible

The Refactor step is where TDD pays its rent. This is where you clean up the mess you were allowed to make in the Green phase. You see duplication between your tests? Refactor it into a setup function or a fixture. You see a messy function in your implementation? Now is the time to break it into smaller, well-named functions, all while your test suite assures you you’re not breaking anything.
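As a small sketch of the test-side cleanup, the two calculator tests from earlier both build their own Calculator. One option is to pull that into a shared setup function. The helper name make_calculator is an assumption made for this example, and Calculator is repeated here only so the snippet runs on its own.

```python
class Calculator:
    def add(self, a, b):
        return a + b


def make_calculator():
    """Shared setup: the one place that knows how to build a Calculator."""
    # With pytest, a function decorated with @pytest.fixture could play
    # the same role; a plain helper keeps this sketch dependency-free.
    return Calculator()


def test_add_two_numbers():
    assert make_calculator().add(2, 3) == 5


def test_add_negative_numbers():
    assert make_calculator().add(-1, -1) == -2
```

If Calculator later grows constructor arguments, only make_calculator has to change, and the tests stay Green throughout the refactor.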

The beautiful part is that the “testability” of your code becomes a first-class concern. Code written under TDD tends to be less coupled and more modular because you were forced to write it in isolation, from the outside in. If it’s hard to test, it’s a design smell, and TDD makes you smell it immediately.
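To make the smell concrete, here is a hypothetical example that is not part of the calculator: a greeting function that reads the clock itself, next to a version that takes the time as a parameter. Both function names are invented for this sketch.

```python
from datetime import datetime


def greeting_hard_to_test():
    # Hidden dependency: the result changes with the wall clock,
    # so a test cannot pin down the expected value.
    hour = datetime.now().hour
    return "Good morning" if hour < 12 else "Good afternoon"


def greeting(now):
    # The dependency is passed in, so a test can choose the time.
    return "Good morning" if now.hour < 12 else "Good afternoon"


def test_greeting_in_the_morning():
    assert greeting(datetime(2024, 1, 1, 9, 0)) == "Good morning"


def test_greeting_in_the_afternoon():
    assert greeting(datetime(2024, 1, 1, 15, 0)) == "Good afternoon"
```

Writing the test first would likely push you toward the second shape, because the first one gives the test nothing stable to assert on.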