31.4 Writing a Reusable Validator Descriptor
While Python’s property decorator is excellent for adding managed access to a single attribute in a single class, it lacks reusability. If you need the same validation logic across multiple attributes or multiple classes, copying and pasting the property definition is a maintenance nightmare. This is where writing a descriptor, specifically a reusable validator descriptor, becomes a powerful tool. A descriptor is a class that implements at least one of the __get__, __set__, or __delete__ methods. These methods are the “machinery” that properties are built upon, and by creating your own, you can encapsulate validation logic into a single, reusable object.
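To make the duplication concrete, here is a minimal sketch of the property-based approach that the paragraph above warns against. The Account class and its two attributes are hypothetical, chosen only for illustration.

```python
class Account:
    """Hypothetical class: two string attributes that need the same check."""

    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    @property
    def first_name(self):
        return self._first_name

    @first_name.setter
    def first_name(self, value):
        if not isinstance(value, str):
            raise TypeError(f"Expected a string, got {type(value).__name__}")
        self._first_name = value

    # The same getter/setter pair has to be repeated for every attribute.
    @property
    def last_name(self):
        return self._last_name

    @last_name.setter
    def last_name(self, value):
        if not isinstance(value, str):
            raise TypeError(f"Expected a string, got {type(value).__name__}")
        self._last_name = value


account = Account("Ada", "Lovelace")
print(account.first_name, account.last_name)
```

Every additional validated attribute adds another near-identical pair of methods, and any change to the rule has to be repeated in each of them.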
The Anatomy of a Validator Descriptor
A reusable validator descriptor is a class designed to be instantiated and assigned as a class attribute. Its __set__ method is where the core validation logic resides. The key to its reusability is that the descriptor does not hold the validated value itself; it stores the value on the instance of the owning class, under a unique attribute name to avoid collisions.
class ValidatedString:
    def __init__(self, min_length=0, max_length=None):
        self.min_length = min_length
        self.max_length = max_length
        # Create a unique, private-looking storage name based on the descriptor's own id
        self.storage_name = f"_{id(self)}_value"

    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise TypeError(f"Expected a string, got {type(value).__name__}")
        if len(value) < self.min_length:
            raise ValueError(f"Value must be at least {self.min_length} characters long")
        if self.max_length is not None and len(value) > self.max_length:
            raise ValueError(f"Value must be at most {self.max_length} characters long")
        # Store the validated value on the instance itself
        setattr(instance, self.storage_name, value)

    def __get__(self, instance, owner):
        if instance is None:
            return self  # Accessed on the class, not an instance
        # Return the value stored on the instance, or a default if not set
        return getattr(instance, self.storage_name, None)
Why the Storage Name is Crucial
Notice the use of self.storage_name. A common pitfall is to store the value on the descriptor instance itself (e.g., self._value). This is incorrect because the descriptor is a class attribute shared by all instances of the owning class, so a value stored on self would be shared by every instance. Storing the data on the instance under a dynamically generated, unique name gives each instance its own separate storage. Using f"_{id(self)}_value" is a simple way to guarantee uniqueness among descriptors that exist at the same time, but the resulting names change from run to run, which makes them awkward for debugging and pickling. In practice it is also common to derive the storage name from the public attribute name with a prefix (e.g., f"__{self.public_name}") if the descriptor knows that name.
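The pitfall is easy to reproduce. The following sketch uses a deliberately incorrect descriptor, here called BrokenString, and a hypothetical Account class; neither name comes from the code above.

```python
class BrokenString:
    """Deliberately incorrect: keeps the value on the descriptor itself."""

    def __set__(self, instance, value):
        self._value = value  # one slot, shared by every owner instance

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self._value


class Account:  # hypothetical owner class
    name = BrokenString()


first = Account()
second = Account()
first.name = "alice"
second.name = "bob"  # overwrites the single shared slot

print(first.name)  # bob
```

Because there is only one BrokenString object per class attribute, the second assignment silently replaces the value seen by the first instance.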
Implementing the Descriptor in a Class
Using the descriptor is straightforward. You instantiate it as a class attribute. Each instance of the descriptor can be configured independently, making the validation logic highly reusable and configurable.
class UserProfile:
    # Reusable descriptor for a non-empty username
    username = ValidatedString(min_length=1, max_length=20)
    # Reusable descriptor for a bio with a configurable max length
    bio = ValidatedString(max_length=500)

    def __init__(self, username, bio):
        self.username = username  # This assignment triggers ValidatedString.__set__
        self.bio = bio  # This assignment triggers the other descriptor's __set__

# Example usage
try:
    profile = UserProfile("ValidUser", "This is my bio.")
    print(profile.username)  # Output: ValidUser
    invalid_profile = UserProfile("", "This will fail.")  # Raises ValueError
except ValueError as e:
    print(e)
Handling Defaults and the __get__ Method
The __get__ method must handle two cases: access through an instance (returning the stored value) and access through the class, where instance is None (typically returning the descriptor object itself, which is useful for introspection). You also need to decide what happens if the attribute has not been set yet. Our example returns None via getattr(instance, self.storage_name, None), which avoids an AttributeError; the trade-off is that a missing assignment can go unnoticed, so some designs prefer to let the AttributeError propagate.
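Both access paths can be observed directly. This is a minimal sketch with a stripped-down descriptor; the names Name and Person, and the fixed storage name, are assumptions made for brevity.

```python
class Name:
    """Minimal descriptor used only to show the two access paths of __get__."""

    storage_name = "_name_value"  # fixed name, for brevity in this sketch

    def __set__(self, instance, value):
        setattr(instance, self.storage_name, value)

    def __get__(self, instance, owner):
        if instance is None:
            return self  # class access: hand back the descriptor object
        return getattr(instance, self.storage_name, None)


class Person:  # hypothetical owner class
    name = Name()


person = Person()
print(Person.name is Person.__dict__["name"])  # True
print(person.name)  # None
person.name = "Ada"
print(person.name)  # Ada
```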
Advanced Best Practices: Leveraging the Descriptor Protocol Fully
For a truly robust validator, consider these enhancements:
__delete__ method: Implement __delete__ to control what happens when del obj.attr is called, often by removing the stored value from the instance.
__set_name__ (Python 3.6+): This method is called automatically when the owning class is created, and it receives both the owning class and the name the descriptor was assigned to. This is the modern, cleaner way to set the storage name, eliminating the need for the id(self) trick.
class ModernValidatedString:
    def __set_name__(self, owner, name):
        self.storage_name = f"_{name}"  # e.g., assigned to 'username', store as '_username'

    def __set__(self, instance, value):
        ...  # validation logic remains the same
        setattr(instance, self.storage_name, value)
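Putting the pieces together, one possible complete version might look like the sketch below. It assumes the same validation rules as ValidatedString; the public_name attribute, the wording of the error messages, and the choice to let __delete__ raise AttributeError when nothing is stored are illustrative decisions, not requirements of the protocol.

```python
class ModernValidatedString:
    def __init__(self, min_length=0, max_length=None):
        self.min_length = min_length
        self.max_length = max_length

    def __set_name__(self, owner, name):
        # Called once, when the owning class body has been executed.
        self.public_name = name
        self.storage_name = f"_{name}"

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.storage_name, None)

    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise TypeError(
                f"{self.public_name} must be a string, got {type(value).__name__}"
            )
        if len(value) < self.min_length:
            raise ValueError(
                f"{self.public_name} must be at least {self.min_length} characters long"
            )
        if self.max_length is not None and len(value) > self.max_length:
            raise ValueError(
                f"{self.public_name} must be at most {self.max_length} characters long"
            )
        setattr(instance, self.storage_name, value)

    def __delete__(self, instance):
        # delattr raises AttributeError if no value has been stored yet.
        delattr(instance, self.storage_name)


class UserProfile:
    username = ModernValidatedString(min_length=1, max_length=20)
    bio = ModernValidatedString(max_length=500)

    def __init__(self, username, bio):
        self.username = username
        self.bio = bio


profile = UserProfile("ValidUser", "This is my bio.")
print(vars(profile))  # {'_username': 'ValidUser', '_bio': 'This is my bio.'}
del profile.username
print(profile.username)  # None
```

Because the storage name is derived from the attribute name, the instance dictionary stays readable and the error messages can say which attribute failed validation.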
By crafting a reusable validator descriptor, you move beyond one-off properties and create a robust, configurable, and maintainable system for enforcing data integrity across your entire codebase.