72.8 sys.settrace(): Writing Your Own Debugger or Profiler
Right, so you’ve graduated from print() statements and you’re ready to get serious. You’ve used pdb and maybe even a fancy IDE debugger, and a little voice in your head asked, “How do they do that?” The answer, my friend, is sys.settrace(). It’s the arcane, powerful, and slightly terrifying incantation that allows you to hook into the CPython interpreter’s execution flow. It’s how debuggers, profilers, and coverage tools are born.
Think of it as a callback function you provide. The interpreter, in a moment of breathtaking honesty, will call your function every time it’s about to execute a new line of code (line events), call a function (call events), or return from one (return events). It hands your function a stack frame and an event name, essentially saying, “Here’s what I’m about to do. Want to do anything about it?”
The Core Concept: The Trace Function
First, you define a trace function. This function must have a specific signature. The interpreter is a stickler for formalities.
import sys
def my_trace_function(frame, event, arg):
# frame: the current stack frame object
# event: a string: 'call', 'line', 'return', 'exception', 'opcode'
# arg: meaning depends on the event. For 'return', it's the value about to be returned.
# Your logic here...
# This function must return itself to continue tracing within the current scope.
return my_trace_function
To start the magic, you tell the system to use your function:
sys.settrace(my_trace_function)
And to stop it, because you will absolutely want to stop it:
sys.settrace(None)
A Simple (and Terrible) Line Tracer
Let’s build something useless but educational: a tracer that just prints every line of code it’s about to execute. You’ll immediately see why this is a chaotic idea.
import sys
def noisy_tracer(frame, event, arg):
if event == 'line':
filename = frame.f_code.co_filename
lineno = frame.f_lineno
print(f"Executing {filename} at line {lineno}")
return noisy_tracer # Keep tracing in this scope
# Let's torture a simple function
def simple_function(x):
result = x * 2
print(f"Result is {result}")
return result
# Arm the tracer
sys.settrace(noisy_tracer)
# This call will now produce a ridiculous amount of output
simple_function(21)
# For the love of all that is holy, disable it immediately after.
sys.settrace(None)
Run that. See? It’s absurd. You get output for every single line, including the one inside the print call itself. This is the first lesson: tracing is incredibly slow and should only be used for targeted debugging or profiling, never in production code. The performance hit is monumental.
A More Practical Example: A Call Profiler
Let’s do something actually useful. We’ll write a basic profiler that records the time spent inside every function. This demonstrates handling the ‘call’ and ‘return’ events.
import sys
import time
class SimpleProfiler:
def __init__(self):
self.call_stack = [] # To track start times for nested calls
self.stats = {} # Function name -> total time spent
def trace(self, frame, event, arg):
if event == 'call':
# Get function name and store start time
func_name = frame.f_code.co_name
self.call_stack.append((func_name, time.time()))
elif event == 'return':
# Pop the last call and calculate elapsed time
if self.call_stack:
func_name, start_time = self.call_stack.pop()
elapsed = time.time() - start_time
# Accumulate the time
if func_name not in self.stats:
self.stats[func_name] = 0.0
self.stats[func_name] += elapsed
return self.trace # Keep tracing
def start(self):
sys.settrace(self.trace)
def stop(self):
sys.settrace(None)
self.print_stats()
def print_stats(self):
for func, total_time in self.stats.items():
print(f"{func}: {total_time:.6f} seconds")
# Usage
profiler = SimpleProfiler()
def slow_function():
time.sleep(0.1)
def fast_function():
pass
def main():
profiler.start()
slow_function()
for _ in range(1000):
fast_function()
profiler.stop()
if __name__ == "__main__":
main()
This will output something like:
slow_function: 0.100129 seconds
fast_function: 0.000123 seconds
main: 0.100456 seconds
Notice how main’s time includes the time of all its children? That’s the beauty of using a stack. Our tracer correctly handles the nested nature of function calls.
Gotchas and Battle-Scarred Wisdom
- It’s Global:
sys.settrace()is a global setting. It affects all threads. This is why modern debuggers often usesys.settracefor the main thread and thethreadingmodule’ssettracefor others. It’s a mess. - Return Your Function: Your trace function must return itself (or another trace function) if you want tracing to continue within the current local scope. If you return
None, tracing is disabled for that scope. This is a common foot-gun. - Performance is Abysmal: I cannot stress this enough. The interpreter is effectively doing a function call for every line and every function call. Your code will run orders of magnitude slower.
- It’s a Debugging Hammer: This isn’t something you sprinkle into your application logic. It’s a tool for building tools—namely debuggers and profilers. If you find yourself wanting to use it in your actual project code, there is almost certainly a better way.
- The
argParameter: Its value changes based on the event. For ‘return’, it’s the return value. For ’exception’, it’s the tuple(exception_type, exception_value, traceback). For ‘call’ and ’line’, it’s justNone. You have to check the event.
So, while you’ll probably never implement a full debugger yourself (leave that to the mad scientists who built pdb), understanding sys.settrace() demystifies how your tools work. It’s the secret door into the interpreter’s execution model, and knowing it exists makes you a more powerful programmer. Just wash your hands after using it.