33.6 collections.namedtuple: Lightweight Immutable Records
The collections.namedtuple function provides a highly efficient way to create simple, immutable data-holding classes. It serves as a middle ground between a basic tuple and a full-fledged class, offering the readability of named attributes while retaining the memory efficiency and performance characteristics of a tuple. Under the hood, a named tuple is a subclass of the built-in tuple type, which is why it inherits traits like immutability, iteration support, and unpacking.
Creating a namedtuple
To create a new named tuple class, you call namedtuple() from the collections module. This is a factory function: it dynamically generates a new class from the specification you provide. It requires two arguments: the name of the new class and its field names. The field names can be given as a single string of space- or comma-separated names, or as a sequence of strings such as a list.
from collections import namedtuple
# Create a 'Person' class with fields 'name', 'age', and 'job'
Person = namedtuple('Person', 'name age job')
# Alternatively, using a list or commas
# Person = namedtuple('Person', ['name', 'age', 'job'])
# Person = namedtuple('Person', 'name, age, job')
# Instantiate the new class
alice = Person('Alice', 30, 'Software Engineer')
bob = Person(name='Bob', age=25, job='Data Analyst')
print(alice) # Output: Person(name='Alice', age=30, job='Software Engineer')
print(bob.name) # Output: Bob
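Because the factory returns an ordinary class, both the equivalent field-name spellings and the tuple inheritance can be checked directly. The following is a minimal sketch; the names PersonA, PersonB, and PersonC are hypothetical and exist only to compare the three spellings side by side.

```python
from collections import namedtuple

# Three equivalent ways to spell the same field names (hypothetical class names)
PersonA = namedtuple('Person', 'name age job')
PersonB = namedtuple('Person', 'name, age, job')
PersonC = namedtuple('Person', ['name', 'age', 'job'])

# All three calls produce classes with identical fields
same_fields = PersonA._fields == PersonB._fields == PersonC._fields
print(same_fields)  # True

alice = PersonA('Alice', 30, 'Software Engineer')

# The generated class is a real subclass of tuple
print(issubclass(PersonA, tuple))  # True
print(isinstance(alice, tuple))    # True

# An instance compares equal to a plain tuple holding the same values
print(alice == ('Alice', 30, 'Software Engineer'))  # True
```

Note that the first argument sets the class's __name__ (used in the repr), which is independent of the variable the class is assigned to.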
Accessing Data: By Index and By Name
The primary advantage of a named tuple over a regular tuple is the ability to access fields by a meaningful name (alice.job) instead of a magic number index (alice[2]). This makes the code self-documenting and less prone to errors if the order of fields changes. However, because it is a tuple, all tuple operations remain available.
# Access by index (like a regular tuple)
print(alice[0]) # Output: Alice
# Access by field name (the namedtuple advantage)
print(alice.name) # Output: Alice
# Iteration and unpacking work as expected
for field in alice:
    print(field)
name, age, job = alice
print(f"{name} is a {job}") # Output: Alice is a Software Engineer
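To make "all tuple operations remain available" concrete, here is a small sketch that mixes ordinary tuple operations with named access. The people list is sample data made up for illustration.

```python
from collections import namedtuple

Person = namedtuple('Person', 'name age job')

# Sample records (illustrative data only)
people = [
    Person('Alice', 30, 'Software Engineer'),
    Person('Bob', 25, 'Data Analyst'),
]

first = people[0]

# len() and slicing behave as for any tuple; a slice is a plain tuple
print(len(first))  # 3
print(first[:2])   # ('Alice', 30)

# Named access keeps sort keys and comprehensions readable
youngest_first = sorted(people, key=lambda p: p.age)
names = [p.name for p in youngest_first]
print(names)  # ['Bob', 'Alice']

# Unpacking also works directly in a for loop
for name, age, job in people:
    print(f"{name} ({age}): {job}")
```

Compare the sort key lambda p: p.age with the positional equivalent lambda p: p[1]: the named version still reads correctly if a field is later inserted before age.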
Immutability and _replace()
A core characteristic inherited from tuple is immutability. Once an instance is created, its fields cannot be rebound. This makes named tuples a good fit for constant data such as database records or configuration settings, and instances are hashable as long as every field value is itself hashable. Attempting to assign to a field raises an AttributeError. To “modify” an instance, you use the _replace() method, which returns a new named tuple with the specified changes, leaving the original unchanged.
# alice.age = 31 # This would raise AttributeError (the exact message varies by Python version)
# Create a new instance with the 'age' field updated
alice_updated = alice._replace(age=31)
print(alice_updated) # Output: Person(name='Alice', age=31, job='Software Engineer')
print(alice is alice_updated) # Output: False (they are different objects)
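The sketch below exercises the behavior just described: a rejected assignment, a non-destructive _replace(), and use of an instance as a dictionary key. The salaries mapping and its figure are invented for illustration.

```python
from collections import namedtuple

Person = namedtuple('Person', 'name age job')
alice = Person('Alice', 30, 'Software Engineer')

# Assignment to a field is rejected with AttributeError
try:
    alice.age = 31
    assignment_failed = False
except AttributeError:
    assignment_failed = True
print(assignment_failed)  # True

# _replace() builds a new instance; the original keeps its values
older = alice._replace(age=31)
print(alice.age, older.age)  # 30 31

# With hashable field values, instances work as dict keys and set members
salaries = {alice: 120_000}  # hypothetical figure
print(salaries[Person('Alice', 30, 'Software Engineer')])  # 120000
```

The dictionary lookup succeeds with a freshly built instance because equality and hashing are based on the field values, exactly as with a plain tuple.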
_fields, _asdict(), and _make()
Named tuples provide several useful helpers, all prefixed with an underscore so they cannot clash with field names. The _fields class attribute is a tuple of the field names, which is useful for introspection. The _asdict() method returns a regular dict in Python 3.8+ (an OrderedDict in earlier versions) mapping field names to their values, which is convenient for serializing to JSON when the values themselves are JSON-serializable. The _make() class method creates a new instance from an iterable (like a list or a database cursor row); it is equivalent to unpacking the iterable into the constructor, as in Person(*row).
print(Person._fields) # Output: ('name', 'age', 'job')
# Convert to a dictionary
person_dict = alice._asdict()
print(person_dict) # Output: {'name': 'Alice', 'age': 30, 'job': 'Software Engineer'}
# Create an instance from an iterable
data_from_db = ['Charlie', 40, 'Manager']
charlie = Person._make(data_from_db)
print(charlie) # Output: Person(name='Charlie', age=40, job='Manager')
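As a rough illustration of how these helpers combine in practice, the sketch below serializes one record with json and loads others from CSV text. The CSV content is invented sample data, and io.StringIO stands in for a real file.

```python
import csv
import io
import json
from collections import namedtuple

Person = namedtuple('Person', 'name age job')

# _asdict() yields a mapping that json.dumps() accepts directly
alice = Person('Alice', 30, 'Software Engineer')
payload = json.dumps(alice._asdict())
print(payload)  # {"name": "Alice", "age": 30, "job": "Software Engineer"}

# Sample CSV input; every value produced by csv.reader is a string
raw = io.StringIO("Charlie,40,Manager\nDana,35,Designer\n")
people = [Person._make(row) for row in csv.reader(raw)]
print(people[0])  # Person(name='Charlie', age='40', job='Manager')

# _make() performs no type conversion, so convert explicitly where needed
people = [p._replace(age=int(p.age)) for p in people]
print(people[1].age + 1)  # 36
```

_make() also raises a TypeError if a row has the wrong number of values, which gives a cheap sanity check on input shape.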
Inheritance and Adding Methods
While the primary use case is data storage, you can extend a generated named tuple class via inheritance. This allows you to add new methods or override default ones, such as __str__. To keep the memory efficiency of the named tuple, declare __slots__ = () in the subclass; without it, every instance of the subclass gains its own __dict__.
class EnhancedPerson(Person):
    __slots__ = ()  # prevent creation of a per-instance __dict__
    def __str__(self):
        return f"{self.name} ({self.age}) - {self.job}"
    @property
    def is_senior(self):
        return self.age >= 65
david = EnhancedPerson('David', 68, 'Retired')
print(david) # Output: David (68) - Retired
print(david.is_senior) # Output: True
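One detail worth verifying when subclassing is whether instances carry a per-instance __dict__. The sketch below compares a subclass with and without an empty __slots__ declaration. WithDict and Slotted are hypothetical names, and the expected results assume a modern CPython (3.8 or later).

```python
from collections import namedtuple

Person = namedtuple('Person', 'name age job')

class WithDict(Person):
    """Subclass without __slots__: each instance gains a __dict__."""

class Slotted(Person):
    """Subclass with empty __slots__: no per-instance __dict__."""
    __slots__ = ()

    @property
    def initials(self):
        return self.name[0]

base = Person('David', 68, 'Retired')
loose = WithDict('David', 68, 'Retired')
tight = Slotted('David', 68, 'Retired')

print(hasattr(base, '__dict__'))   # False
print(hasattr(loose, '__dict__'))  # True
print(hasattr(tight, '__dict__'))  # False

# Without __slots__, arbitrary attributes can be attached by accident
loose.nickname = 'Dave'

# With __slots__ = (), the same assignment is rejected
try:
    tight.nickname = 'Dave'
    rejected = False
except AttributeError:
    rejected = True
print(rejected)        # True
print(tight.initials)  # D
```

Besides saving memory, the empty __slots__ keeps the subclass effectively immutable, since stray attributes cannot be added to instances.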
Common Pitfalls and Best Practices
- Immutability: Remember that instances are immutable. This is a feature for data integrity but can be a limitation if you need mutable records. For mutable alternatives, consider types.SimpleNamespace or dataclasses.dataclass.
- Field Names: Field names must be valid identifiers; they cannot be Python keywords (e.g., class, def) and cannot start with an underscore. If your data source has such headers, you must rename them. The rename parameter (namedtuple('MyClass', 'name class def', rename=True)) automatically replaces invalid names with positional ones (here, _1 and _2).
- Default Values: Since Python 3.7, namedtuple accepts a defaults parameter whose values apply to the rightmost fields; for example, namedtuple('Person', 'name age job', defaults=['Unknown']) makes job optional. On older versions, a common workaround is to override __new__ in a subclass. If you need richer defaults, such as a fresh list per instance, a dataclass is a better choice.
- Memory Usage: Named tuples are memory efficient because the generated class declares an empty __slots__, so instances have no per-instance __dict__. A subclass regains a __dict__ unless it also declares __slots__ = (). Each instance stores only its field values, which makes named tuples well suited to creating large numbers of objects.
- Type Hints: The class produced by the factory carries no field type information. To declare field types, use typing.NamedTuple, which builds an equivalent tuple subclass from annotated class syntax. The @dataclass decorator is an alternative when you do not need tuple behavior.
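Several of the points above can be tried in a few lines. The sketch below shows rename=True, the defaults argument (available since Python 3.7), and the typing.NamedTuple class syntax. Row and Employee are hypothetical names chosen for illustration.

```python
from collections import namedtuple
from typing import NamedTuple

# rename=True replaces invalid field names with positional ones
Row = namedtuple('Row', 'name class def', rename=True)
print(Row._fields)  # ('name', '_1', '_2')

# defaults= applies to the rightmost fields: here age and job
Person = namedtuple('Person', 'name age job', defaults=[None, 'Unknown'])
print(Person('Alice'))  # Person(name='Alice', age=None, job='Unknown')
print(Person._field_defaults)  # {'age': None, 'job': 'Unknown'}

# typing.NamedTuple: class syntax with annotations and defaults
class Employee(NamedTuple):
    name: str
    age: int
    job: str = 'Unknown'

bob = Employee('Bob', 25)
print(bob)                     # Employee(name='Bob', age=25, job='Unknown')
print(isinstance(bob, tuple))  # True
```

The annotations on Employee are not enforced at runtime; they are there for static type checkers and readers.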