71.3 The dis Module: Disassembling Bytecode
Right, let’s get our hands dirty. You’ve heard Python is an “interpreted language,” but that’s a bit of a simplification. It’s not interpreting your my_script.py file directly. First, it compiles your beautiful, readable source code into a much simpler, more mechanical set of instructions called bytecode. This is the machine language for the Python Virtual Machine (PVM), and it’s what actually gets executed. To see this in action, you don’t need a hex editor; you have the dis module.
dis is our disassembler, our X-ray goggles, our window into the soul of the PVM. It takes a function, method, or code object and shows you the literal step-by-step instructions Python will follow. This isn’t just academic; it’s how you understand why that clever one-liner is secretly a performance nightmare, or debug something that should work but mysteriously doesn’t.
What You’re Actually Looking At
Let’s start simple. Don’t just read this, type it in.
import dis

def hello_world():
    x = 1
    y = 2
    z = x + y
    return z

dis.dis(hello_world)
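You’ll get output that looks something like this on CPython 3.6 through 3.10. The numbers on the far left depend on where the function sits in your file, and newer versions print slightly different opcodes (BINARY_OP in place of BINARY_ADD, for example), and that’s fine: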
  4           0 LOAD_CONST               1 (1)
              2 STORE_FAST               0 (x)

  5           4 LOAD_CONST               2 (2)
              6 STORE_FAST               1 (y)

  6           8 LOAD_FAST                0 (x)
             10 LOAD_FAST                1 (y)
             12 BINARY_ADD
             14 STORE_FAST               2 (z)

  7          16 LOAD_FAST                2 (z)
             18 RETURN_VALUE
Let’s break this down. Each line represents a single bytecode instruction. The first number (e.g., 4) is the line number in your source code, printed only on the first instruction generated for that line, which is incredibly helpful for mapping this back to what you wrote. The second number (0, 2, 4…) is the byte offset of the instruction within the bytecode sequence. Notice they’re all even? That’s because, since CPython 3.6, each instruction is 2 bytes (one for the opcode, one for the argument).
The name, such as LOAD_CONST, is the operation itself. The number after it is its argument. And the part in parentheses, (1), is a human-friendly representation of what that argument means: in this case, the constant value 1.
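The same columns are also available programmatically. Here is a minimal sketch, assuming the hello_world function from above. dis.get_instructions is part of the standard library, but the exact opcodes it yields depend on your Python version, so no output is shown here.

```python
import dis

def hello_world():
    x = 1
    y = 2
    z = x + y
    return z

# Each Instruction carries the fields that dis.dis() prints as columns.
instructions = list(dis.get_instructions(hello_world))
for ins in instructions:
    # offset:  byte position of the instruction
    # opname:  the operation, e.g. LOAD_CONST
    # arg:     the raw numeric argument (None if the opcode takes none)
    # argrepr: the human-friendly text shown in parentheses
    print(ins.offset, ins.opname, ins.arg, ins.argrepr)
```

Working with Instruction objects rather than printed text is handy when you want to filter or count opcodes instead of reading them by eye.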
The flow here is dead simple: load a constant onto an internal stack, store it in a local variable (x), do it again for y, then load both variables back onto the stack, add them together (BINARY_ADD pops the two top stack items, adds them, and pushes the result back on), store that result in z, and finally load z back onto the stack to return it. The PVM is a stack-based machine, which is why you see all this loading and storing. It’s the core of how everything works.
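To make the stack model concrete, here is a toy interpreter for just the five instructions in the listing. It is a sketch for building intuition, not how CPython is implemented: the run function and PROGRAM list are invented for illustration, and arguments are written as values and names rather than indices to keep it readable.

```python
def run(program):
    """Execute a list of (opname, arg) pairs on a toy stack machine."""
    stack = []        # the value stack
    fast_locals = {}  # stand-in for the indexed local-variable slots
    for opname, arg in program:
        if opname == "LOAD_CONST":
            stack.append(arg)                  # push a constant
        elif opname == "STORE_FAST":
            fast_locals[arg] = stack.pop()     # pop into a local
        elif opname == "LOAD_FAST":
            stack.append(fast_locals[arg])     # push a local's value
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)         # pop two, push the sum
        elif opname == "RETURN_VALUE":
            return stack.pop()                 # top of stack is the result
        else:
            raise ValueError(f"unsupported instruction: {opname}")

# The hello_world listing, transcribed by hand.
PROGRAM = [
    ("LOAD_CONST", 1), ("STORE_FAST", "x"),
    ("LOAD_CONST", 2), ("STORE_FAST", "y"),
    ("LOAD_FAST", "x"), ("LOAD_FAST", "y"),
    ("BINARY_ADD", None), ("STORE_FAST", "z"),
    ("LOAD_FAST", "z"), ("RETURN_VALUE", None),
]

result = run(PROGRAM)
print(result)  # 3
```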
Constants vs. Variables: The Great Stack Heist
Notice the difference between LOAD_CONST and LOAD_FAST. LOAD_CONST grabs a value from the function’s co_consts tuple, a collection of immutable values like integers, strings, and tuples of other constants. LOAD_FAST and STORE_FAST, on the other hand, are for local variables. They reference values by a fast, array-like index, not by name. This is why local variable access is so quick. The “FAST” is a literal description. The compiler has already resolved all the local variable names at compile time and replaced them with indices.
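You can inspect both tables directly on the function’s code object. Here is a small sketch, assuming the hello_world function from earlier. The local names are stable, but the exact contents of co_consts vary by Python version, so only the former is annotated.

```python
def hello_world():
    x = 1
    y = 2
    z = x + y
    return z

code = hello_world.__code__

# Local variable names; LOAD_FAST/STORE_FAST arguments index into this tuple.
print(code.co_varnames)  # ('x', 'y', 'z')

# Constants referenced by LOAD_CONST (contents vary by Python version).
print(code.co_consts)
```

The argument 2 in STORE_FAST 2 (z) is simply the position of z in co_varnames.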
Now, watch what happens when we do something slightly less straightforward.
def use_global():
    return some_global_variable

dis.dis(use_global)
  2           0 LOAD_GLOBAL              0 (some_global_variable)
              2 RETURN_VALUE
Ah, LOAD_GLOBAL. This is a more expensive operation than LOAD_FAST. The interpreter has to look up the name in the module’s global dictionary and, if it isn’t there, in the builtins, and the lookup happens every time the instruction executes. Recent CPython versions cache these lookups, which narrows the gap. This is a key insight from using dis: local variables are cheap, global lookups less so. If you’re in a tight loop, consider copying that global to a local first.
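Here is a rough way to measure the effect yourself. The names SCALE, scale_with_global, and scale_with_local are made up for this sketch, and the size of the gap depends heavily on your Python version and machine, so treat the numbers as indicative only.

```python
import timeit

SCALE = 3  # hypothetical module-level global

def scale_with_global(values):
    total = 0
    for v in values:
        total += v * SCALE  # global lookup on every iteration
    return total

def scale_with_local(values):
    scale = SCALE  # one global lookup, then a fast local
    total = 0
    for v in values:
        total += v * scale
    return total

data = list(range(1000))
for func in (scale_with_global, scale_with_local):
    seconds = timeit.timeit(lambda: func(data), number=200)
    print(f"{func.__name__}: {seconds:.4f}s")
```

If the two timings are close on your interpreter, the extra local is probably not worth the loss in readability.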
When the Compiler Outsmarts You (Or Doesn’t)
The compiler that generates this bytecode isn’t infinitely clever. It follows rules. Sometimes those rules produce results that are, frankly, a bit silly. Let’s look at a classic.
def constant_fold():
    return 5 * 3 + 1

def not_constant_fold():
    a = 5
    return a * 3 + 1

print("Constant folded:")
dis.dis(constant_fold)
print("\nNot constant folded:")
dis.dis(not_constant_fold)
Constant folded:
  2           0 LOAD_CONST               1 (16)
              2 RETURN_VALUE

Not constant folded:
  5           0 LOAD_CONST               1 (5)
              2 STORE_FAST               0 (a)

  6           4 LOAD_FAST                0 (a)
              6 LOAD_CONST               2 (3)
              8 BINARY_MULTIPLY
             10 LOAD_CONST               3 (1)
             12 BINARY_ADD
             14 RETURN_VALUE
Isn’t that brilliant? In constant_fold(), the compiler saw an expression made entirely of constants and did the math itself at compile time. The function just loads the constant 16 and returns it. No calculation needed at runtime. This is called constant folding. But in not_constant_fold(), because we stored 5 in a variable first, the compiler gave up and compiled the full sequence of instructions. It’s not smart enough to trace the value of a back to a constant. This is why you might see performance differences between seemingly equivalent code.
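You can confirm folding without reading a listing by checking whether the precomputed result appears among a function’s constants. This is a sketch with made-up function names. It deliberately uses a larger value than the example above, since newer CPython releases may encode small integers directly in the instruction rather than in co_consts.

```python
def seconds_per_day():
    return 60 * 60 * 24  # all literals: eligible for folding

def seconds_per_day_via_local():
    minute = 60
    return minute * 60 * 24  # starts from a variable: not folded

# If the compiler folded the expression, 86400 is stored as a constant.
folded = 86400 in seconds_per_day.__code__.co_consts
unfolded = 86400 in seconds_per_day_via_local.__code__.co_consts
print(folded, unfolded)  # True False on CPython
```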
Best Practices and Pitfalls
- Use it for Performance Debugging: If a function feels sluggish, dis it. You’ll quickly spot expensive global lookups (LOAD_GLOBAL) or attribute lookups (LOAD_ATTR) inside loops. The fix is often to assign the value to a local variable first.
- Understand Language Constructs: Why is a list comprehension usually faster than a for loop with .append()? dis both. You’ll see the comprehension avoids the overhead of loading the append method attribute for every single iteration.
- Don’t Obsess Over Instruction Count: Fewer bytecode instructions generally means faster code, but it’s not a perfect correlation. Some instructions (like LOAD_GLOBAL) are far more expensive than others (like LOAD_FAST). Use a profiler for the final say.
- The Stack is Your Friend: Remember the model. Almost all operations involve pushing to and popping from the stack. If an output feels confusing, mentally trace the stack’s state after each operation.
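The comprehension-versus-loop comparison can be checked in code as well as by eye. The sketch below uses two made-up functions and a small helper, all_opnames, that also walks nested code objects, because older Python versions compile a comprehension into its own code object while newer ones inline it.

```python
import dis
import types

def squares_loop(n):
    result = []
    for i in range(n):
        result.append(i * i)  # looks up the append attribute each iteration
    return result

def squares_comprehension(n):
    return [i * i for i in range(n)]

def all_opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= all_opnames(const)
    return names

loop_ops = all_opnames(squares_loop.__code__)
comp_ops = all_opnames(squares_comprehension.__code__)

# The comprehension appends with a dedicated opcode; the loop does not.
print("LIST_APPEND" in loop_ops, "LIST_APPEND" in comp_ops)
# Only the loop version refers to the name "append" at all.
print("append" in squares_loop.__code__.co_names)
```

This shows where the difference comes from, not how large it is; for that, time both versions on your own workload.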
The dis module demystifies the “magic” of Python. It turns an abstract concept like “interpreted” into a concrete, understandable process. It’s the single best tool for going from writing Python to genuinely understanding it.