28.7 Container Protocol: __len__, __getitem__, __setitem__, __delitem__, __contains__
The container protocol in Python allows objects to emulate the behavior of built-in container types like lists, dictionaries, and sets. By implementing a specific set of dunder methods, you can create custom classes that support operations such as indexing, slicing, membership testing, and determining length. This protocol is fundamental to making your objects “Pythonic”—they work intuitively with the language’s built-in functions and idioms.
Implementing len for Size Reporting
The __len__ method should return a non-negative integer representing the number of items in the container. This method is called by the built-in len() function. A key requirement is that the length must be a non-negative integer; returning a negative number will raise a ValueError.
class Playlist:
def __init__(self, *songs):
self._songs = list(songs)
def __len__(self):
return len(self._songs)
def add_song(self, song):
self._songs.append(song)
my_playlist = Playlist('Song A', 'Song B', 'Song C')
print(len(my_playlist)) # Output: 3
my_playlist.add_song('Song D')
print(len(my_playlist)) # Output: 4
It’s crucial that __len__ is efficient. If calculating the length requires traversing the entire data structure (e.g., a linked list), this can lead to performance issues. The length should also be consistent; it should not change unless the container itself is modified.
Mastering getitem for Indexing and Slicing
The __getitem__ method enables object indexing (e.g., obj[0]) and slicing (e.g., obj[1:5]). It receives the key or index as its argument. To fully support the Python data model, your implementation should handle both integer indices and slice objects.
class Playlist:
def __init__(self, *songs):
self._songs = list(songs)
def __getitem__(self, index):
if isinstance(index, slice):
# Handle slicing by creating a new Playlist instance
return Playlist(*self._songs[index])
elif isinstance(index, int):
# Handle single index
return self._songs[index]
else:
raise TypeError(f"Index must be int or slice, not {type(index).__name__}")
my_playlist = Playlist('Song A', 'Song B', 'Song C', 'Song D')
print(my_playlist[1]) # Output: 'Song B'
print(my_playlist[1:3]) # Output: <__main__.Playlist object at ...>
print(my_playlist[1:3][0]) # Output: 'Song B' (works because the slice returns a new Playlist)
A common pitfall is forgetting to handle slice objects, which will cause slicing operations to fail. When implementing slicing, consider whether to return a new instance of your container type (as shown) or a simpler built-in type like a list. The former is often more intuitive.
Implementing setitem and delitem for Mutability
To create a mutable container that supports assignment (e.g., obj[2] = new_value) and deletion (e.g., del obj[2]), you need to implement __setitem__ and __delitem__ respectively.
class Playlist:
def __init__(self, *songs):
self._songs = list(songs)
def __setitem__(self, index, value):
if isinstance(index, slice):
# Handle slice assignment
self._songs[index] = value
else:
self._songs[index] = value
def __delitem__(self, index):
del self._songs[index]
my_playlist = Playlist('Song A', 'Song B', 'Song C')
my_playlist[1] = 'New Song B'
print(my_playlist[1]) # Output: 'New Song B'
del my_playlist[0]
# Playlist now contains: ['New Song B', 'Song C']
When implementing __setitem__ for slice assignment, remember that the right-hand side of the assignment must be an iterable if you’re replacing multiple elements. A critical best practice is to ensure these methods properly handle edge cases like out-of-bounds indices, which should raise an IndexError.
Optimizing Membership Testing with contains
The __contains__ method implements membership testing with the in operator. While Python can fall back to iterating through the container using __getitem__ if __contains__ isn’t defined, providing a custom implementation can significantly improve performance for large collections.
class Playlist:
def __init__(self, *songs):
self._songs = list(songs)
def __contains__(self, song):
# Optimized membership test using a set for faster lookups
return song in set(self._songs)
my_playlist = Playlist('Song A', 'Song B', 'Song C')
print('Song B' in my_playlist) # Output: True
print('Song D' in my_playlist) # Output: False
Without a custom __contains__ implementation, x in obj would sequentially check each element via __getitem__, which for large collections has O(n) time complexity. By implementing __contains__, you can optimize this operation—in the example above, using a set converts the membership test to approximately O(1) average case complexity.
Integration with Iteration and Other Protocols
Implementing the container protocol often naturally leads to supporting other Python protocols. A class with __getitem__ automatically becomes iterable: Python will repeatedly call __getitem__ with indices starting from 0 until an IndexError is raised. However, for more control and efficiency, it’s recommended to also implement __iter__.
my_playlist = Playlist('Song A', 'Song B', 'Song C')
for song in my_playlist: # Works due to __getitem__
print(song)
Additionally, implementing these methods makes your objects compatible with many built-in functions like reversed(), sorted(), and list comprehensions, further integrating them into the Python ecosystem. The key to a robust implementation is maintaining consistency with built-in types in terms of error handling (raising appropriate exceptions like IndexError and TypeError) and expected behavior.