67.7 Python-Specific Wins: Local Variables, Attribute Lookup, Slots
Right, let’s talk about making your Python code less… pokey. We’ve all been there. You’ve written something beautiful, it’s logically pristine, and then you run it. And you go get a coffee. And you come back. And it’s still chugging away. Before you start rewriting the whole thing in Rust (a noble, if dramatic, impulse), let’s look at some of the low-hanging, Python-specific fruit you can pluck for some easy speed wins.
The secret sauce here isn’t magic; it’s understanding how the Python interpreter, CPython, works under the hood. Most of our performance gains come from reducing the number of steps it has to take. And two of the biggest culprits for unnecessary steps are variable and attribute lookup.
The Speed of Local Variables
Here’s a piece of trivia that’ll win you zero bar bets but might make your code faster: accessing a local variable is dramatically quicker than accessing a global variable or an attribute. Why? It all comes down to the dictionary.
In Python, almost every scope has a dictionary behind it storing its variables. globals() gives you the global dict, locals() gives you the local one (though it’s a bit more complicated, which we’ll get to). The interpreter finds variables by looking them up in these dictionaries.
But local variables inside a function get a special treat. CPython will store them in a fast, array-like structure (not a hash table) that can be accessed by a fixed index, not a name lookup. It’s a direct pointer reference. A global variable, however, requires a dictionary lookup in the module’s globals() dict. A dictionary lookup is fast, but it’s not “array index with a known offset” fast.
Let’s prove it with a quick and dirty benchmark using timeit.
import timeit
global_var = "I'm global"
def test_local():
local_var = "I'm local"
for i in range(1_000_000):
x = local_var # Fast local lookup
def test_global():
for i in range(1_000_000):
x = global_var # Slower global lookup
# Time the operations
local_time = timeit.timeit(test_local, number=100)
global_time = timeit.timeit(test_global, number=100)
print(f"Local variable time: {local_time:.3f}s")
print(f"Global variable time: {global_time:.3f}s")
print(f"Global is {global_time / local_time:.1f}x slower")
On my machine, the global lookup is consistently about 1.5x to 2x slower. The lesson? If you’re using a global constant inside a tight loop, bring it local. Just do local_copy = global_name at the top of your function. You’ve just turned a slow dictionary lookup into a blazingly fast array reference for every subsequent access. It’s a trivial change for a potentially massive cumulative gain.
The Perils of Dot-Driven Development (Attribute Lookup)
If you thought global lookups were bad, meet their slower cousin: attribute lookup. Every time you write obj.attr, Python has to do work. It might have to look up the attribute on the instance itself (obj.__dict__), then if it’s not there, it has to walk up the chain through the class and its parent classes. This Method Resolution Order (MRO) is powerful, but it’s not free.
Now imagine doing that a million times in a loop. It adds up fast. The fix is simple and, frankly, a bit ugly: cache the attribute in a local variable.
class SomeClass:
def __init__(self, name):
self.name = name
def slow_method(self):
for i in range(1_000_000):
# This does an attribute lookup EVERY. SINGLE. TIME.
if self.name == "target":
pass
def fast_method(self):
# Cache the attribute to a local variable ONCE.
name = self.name
for i in range(1_000_000):
if name == "target":
pass
# Let's benchmark the misery
obj = SomeClass("target")
slow_time = timeit.timeit(obj.slow_method, number=10)
fast_time = timeit.timeit(obj.fast_method, number=10)
print(f"Slow method: {slow_time:.3f}s")
print(f"Fast method: {fast_time:.3f}s")
print(f"Slow is {slow_time / fast_time:.1f}x slower")
You’ll often see the fast method run twice as fast or more. The tighter the loop, the bigger the win. This is probably the single most effective micro-optimization you can make in Python. It feels like cheating, but it’s just being smart. Stop making the interpreter do the same work over and over again.
Using __slots__ to Tell Python to Cool It With the Dicts
Here’s where we get into more advanced territory. By default, every instance of a Python class has a __dict__ dictionary that stores all its attributes. This is incredibly flexible—you can add new attributes whenever you want! It’s also memory-inefficient and slow.
For a million-instance data structure, all those dictionaries represent a huge amount of overhead. Enter __slots__. It’s a class-level attribute where you tell the interpreter: “Hey, instances of this class will only ever have these specific attributes. You can pre-allocate space for them in a fixed-size array and skip creating the whole __dict__.”
class RegularUser:
def __init__(self, name, user_id):
self.name = name
self.user_id = user_id
class SlotsUser:
__slots__ = ['name', 'user_id'] # <-- This is the magic line
def __init__(self, name, user_id):
self.name = name
self.user_id = user_id
# Memory usage comparison
from sys import getsizeof
regular_users = [RegularUser("Alice", i) for i in range(1_000_000)]
slots_users = [SlotsUser("Alice", i) for i in range(1_000_000)]
print(f"Memory per regular user: {getsizeof(regular_users[0])} bytes")
print(f"Memory per slots user: {getsizeof(slots_users[0])} bytes")
You’ll see the slots version uses significantly less memory—often 2-3x less. This also translates to speed: attribute access on a slotted class is faster because it’s a direct lookup in a fixed-size array, not a hash table in __dict__.
Now, the massive, glaring caveats: Using __slots__ means you can’t dynamically add new attributes to these instances. No more user.email = "alice@example.com" if email isn’t in __slots__. It also breaks certain features like weak references unless you explicitly add '__weakref__' to your __slots__ tuple. Use __slots__ only for classes you are certain are finished and will be instantiated a vast number of times. It’s a trade-off: you’re sacrificing flexibility for performance and memory. Don’t just slap it on every class you own; you’ll regret it the first time you need to monkey-patch something for debugging. It’s a powerful tool, not a default setting.