45.7 collections.abc: Abstract Base Classes for Container Types
The collections.abc module provides a set of abstract base classes (ABCs) that define and enforce the API contracts for various container types in Python. These ABCs serve a dual purpose: they act as a blueprint for building custom containers that integrate seamlessly with the Python ecosystem, and they provide a robust, high-level way to perform type checking based on an object’s capabilities rather than its concrete class. This “duck typing” approach is a cornerstone of Python’s design philosophy.
The Role of Abstract Base Classes
Abstract Base Classes exist to formalize the interfaces that Python’s built-in containers implement. Before their introduction, checking if an object was a sequence typically involved verifying it was an instance of list or tuple, which was fragile and broke polymorphism. The ABCs allow you to ask a more fundamental question: “Does this object behave like a sequence?” by checking isinstance(obj, abc.Sequence). This check will return True for list, tuple, str, and any custom class that correctly implements the __len__ and __getitem__ methods. This makes code more flexible and future-proof.
Key ABCs and Their Hierarchies
The module defines a hierarchy of ABCs, each specifying a required set of methods. Understanding this hierarchy is crucial.
Container: The most basic, requiring only the__contains__method (for theinoperator).Iterable: Requires__iter__; the foundation for all objects that can be looped over.Sized: Requires__len__; for objects that can report their size.Collection: A useful base class that combinesIterable,Sized, andContainer.Sequence: Inherits fromReversibleandCollection. Requires__getitem__and__len__. Represents immutable, ordered containers liketupleandstr. Mutable sequences likelistare also considered sequences.MutableSequence: Inherits fromSequenceand adds required methods for mutation:__setitem__,__delitem__,insert,append, etc.Mapping: Inherits fromCollection. Requires__getitem__,__iter__,__len__, and__contains__. Represents a read-only key-value store like thetypes.MappingProxyTypeof adict.MutableMapping: Inherits fromMappingand adds__setitem__,__delitem__,pop,clear, etc., defining the interface for mutable dictionaries.Set: Inherits fromCollection. Requires__iter__,__len__, and the methods that implement mathematical set operations:__le__(subset),__contains__, etc.MutableSet: Inherits fromSetand adds methods for in-place mutation:add,discard,clear, etc.
Using ABCs for Type Checking and Registration
The primary use of these ABCs is in type checking with isinstance() and issubclass(). This is far superior to checking against concrete types.
from collections.abc import Sequence, MutableMapping
def process_items(items):
if not isinstance(items, Sequence):
raise TypeError("items must be a sequence")
for item in items:
print(item)
def update_config(config, new_settings):
if not isinstance(config, MutableMapping):
raise TypeError("config must be a mutable mapping")
config.update(new_settings)
# These calls work correctly:
process_items([1, 2, 3]) # A list is a Sequence
process_items((1, 2, 3)) # A tuple is a Sequence
process_items("hello") # A string is a Sequence
my_dict = {}
update_config(my_dict, {'theme': 'dark'}) # A dict is a MutableMapping
print(my_dict) # Output: {'theme': 'dark'}
You can also register your own custom classes with an ABC, even if they don’t explicitly inherit from it, using the register() method. This declares that your class provides the required interface, making it pass isinstance() checks.
from collections.abc import Sequence
class MyCustomContainer:
def __init__(self, data):
self.data = data
def __len__(self):
return len(self.data)
def __getitem__(self, index):
return self.data[index]
# Register MyCustomContainer as a virtual subclass of Sequence
Sequence.register(MyCustomContainer)
custom_obj = MyCustomContainer([10, 20, 30])
print(isinstance(custom_obj, Sequence)) # Output: True
Implementing a Custom Container Using ABCs
The most powerful application is to inherit from an ABC to build a custom container. This ensures you don’t forget to implement any required methods and provides default implementations for many others. For example, creating a custom list that only accepts integers.
from collections.abc import MutableSequence
class IntList(MutableSequence):
def __init__(self, initial_data=None):
self._data = []
if initial_data is not None:
self.extend(initial_data) # Use our own extend which uses our append
def _validate(self, value):
if not isinstance(value, int):
raise TypeError("IntList only supports integer values.")
def __getitem__(self, index):
return self._data[index]
def __setitem__(self, index, value):
self._validate(value)
self._data[index] = value
def __delitem__(self, index):
del self._data[index]
def __len__(self):
return len(self._data)
def insert(self, index, value):
self._validate(value)
self._data.insert(index, value)
# __contains__, index, count, etc. are automatically inherited with efficient defaults.
# Usage
try:
my_ints = IntList([1, 2, 3])
my_ints.append(5) # Works
my_ints.append('a') # Raises TypeError
except TypeError as e:
print(e)
Common Pitfalls and Best Practices
- Inheritance vs. Registration: Prefer explicit inheritance from an ABC when building a new container from scratch. Use registration only for integrating existing classes that you cannot alter.
- Performance of Default Methods: The ABCs provide default implementations for many methods (e.g.,
Sequence.__contains__does a linear scan using__getitem__). For large custom containers, you should override these with more efficient implementations if possible (e.g., by implementing__contains__directly for a tree-based structure). isinstanceovertype: Always useisinstance(obj, abc.SomeABC)instead oftype(obj) is listto check an object’s type. This makes your code compatible with a wider range of objects, including subclasses and virtual subclasses.- Understanding Mutability: Be precise. A
listis aMutableSequence, but atupleis only aSequence. Passing atupleto a function that expects aMutableSequenceand tries to mutate it will correctly raise an error at the type-checking stage, preventing a more cryptic error later.