Inheritance is a powerful mechanism for creating hierarchies of related classes, and data classes fully support this paradigm. When a data class inherits from another data class, it inherits all fields from the parent class and can define its own additional fields. This allows for the creation of specialized data models while maintaining a common structure and reusing boilerplate code.

Basic Inheritance Syntax and Field Ordering

When creating a subclass of a data class, the subclass automatically inherits all the fields defined in its parent. The fields are ordered with the parent’s fields first, followed by the child’s fields. This ordering matters because it determines the parameter order of the automatically generated __init__ method, the order in which fields appear in __repr__, and the order in which fields are compared by __eq__ and, when order=True is passed to the decorator, by __lt__ and the other ordering methods.

from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

@dataclass
class Employee(Person):
    employee_id: str
    department: str

# The __init__ signature is: __init__(self, name: str, age: int, employee_id: str, department: str)
emp = Employee("Alice", 30, "E12345", "Engineering")
print(emp)  # Output: Employee(name='Alice', age=30, employee_id='E12345', department='Engineering')

This fixed ordering keeps the generated methods predictable. For example, two Employee instances are compared as if they were the tuples (name, age, employee_id, department), so the parent’s fields always take precedence over the child’s. Note that the generated comparison methods only compare instances of exactly the same class: a Person never compares equal to an Employee, even when the shared fields match, and ordering comparisons between the two raise a TypeError.
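
The comparison behavior can be checked with a minimal sketch. It assumes order=True is added to both classes (the example above does not set it), and the employee IDs are made up for illustration.

```python
from dataclasses import dataclass, astuple

# Assumption: order=True is added here so that __lt__ and friends are generated.
@dataclass(order=True)
class Person:
    name: str
    age: int

@dataclass(order=True)
class Employee(Person):
    employee_id: str
    department: str

a = Employee("Alice", 30, "E1", "Engineering")
b = Employee("Alice", 30, "E2", "Engineering")

# Same class: fields are compared in declaration order, parent fields first.
same_class_result = a < b
tuple_result = astuple(a) < astuple(b)
print(same_class_result, tuple_result)  # True True

# Different classes: equality is False even though name and age match.
p = Person("Alice", 30)
cross_class_equal = (p == a)
print(cross_class_equal)  # False

# Ordering across classes is not supported and raises TypeError.
cross_class_order_failed = False
try:
    p < a
except TypeError:
    cross_class_order_failed = True
print(cross_class_order_failed)  # True
```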

Providing Default Values in a Hierarchy

A common challenge arises when a parent data class defines a field with a default value and a child data class defines a field without a default. Fields without default values cannot come after fields with default values in the parameter list of an __init__ method. Since the child’s fields are appended to the parent’s, this rule must be respected across the entire inheritance chain.

@dataclass
class Vehicle:
    wheels: int = 4  # Default value

# This will cause a TypeError!
@dataclass
class Car(Vehicle):
    make: str  # No default value -> Must come before fields with defaults

# The generated __init__ would be: __init__(self, wheels=4, make: str)
# This is invalid in Python because a non-default argument follows a default argument.
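
The error is raised when the class is defined, not when an instance is created. A small sketch that wraps the class definition in try/except makes this visible; the exact wording of the message varies between Python versions, so only the beginning is relied on here.

```python
from dataclasses import dataclass

@dataclass
class Vehicle:
    wheels: int = 4

error_message = None
try:
    # The decorator runs as soon as the class body has been evaluated,
    # so the TypeError surfaces here, before any Car() call.
    @dataclass
    class Car(Vehicle):
        make: str
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # starts with: non-default argument 'make' follows default argument
```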

A child cannot fix this by reordering. Inherited fields always keep their position ahead of the child’s new fields, and redefining a parent field in the child only replaces its type and default, not its position. There are three practical options: remove the default from the parent field, give every new child field a default as well, or, on Python 3.10 and later, mark the child’s new fields as keyword-only with kw_only=True. Keyword-only parameters are placed after all positional parameters in the generated __init__, so they may omit a default even when an earlier field has one.

@dataclass
class Vehicle:
    wheels: int = 4  # The default stays in the parent

# kw_only=True (Python 3.10+) makes the fields declared in Car keyword-only.
@dataclass(kw_only=True)
class Car(Vehicle):
    make: str  # No default needed: keyword-only parameters follow all positional ones

# The generated __init__ is: __init__(self, wheels: int = 4, *, make: str)
car = Car(make="Toyota")  # wheels uses its default value of 4
print(car)  # Output: Car(wheels=4, make='Toyota')
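
When in doubt about the combined field order, dataclasses.fields() can be used to inspect it. The sketch below uses hypothetical Base and Child classes (not part of the examples above) to show that a field redefined in a child keeps the position it had in the parent and only picks up the new type and default.

```python
from dataclasses import dataclass, fields

# Hypothetical classes for illustration only.
@dataclass
class Base:
    x: float = 15.0
    y: int = 0

@dataclass
class Child(Base):
    z: int = 10
    x: int = 15  # Redefines x: new type and default, same position as in Base

field_names = [f.name for f in fields(Child)]
print(field_names)  # ['x', 'y', 'z']

child = Child()
print(child)  # Child(x=15, y=0, z=10)
```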

Overriding Methods and the @dataclass Decorator

The @dataclass decorator generates methods for the class it decorates. When you apply it to a child class, it does not re-generate methods for the parent class; it only generates methods for the child class based on the total set of fields collected from the data classes in its MRO (Method Resolution Order). This is why you must re-apply the @dataclass decorator on every subclass that declares new fields. Without it, the subclass simply inherits the parent’s generated methods, and its new annotations are treated as ordinary class attributes rather than fields.
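
The effect of leaving the decorator off can be seen in a short sketch. Point, Point3D, and DecoratedPoint3D are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass, fields

@dataclass
class Point:
    x: int
    y: int

# No decorator: z is an ordinary class attribute, not a dataclass field.
class Point3D(Point):
    z: int = 0

# With the decorator, z becomes a field and __init__ is regenerated.
@dataclass
class DecoratedPoint3D(Point):
    z: int = 0

undecorated_fields = [f.name for f in fields(Point3D)]
decorated_fields = [f.name for f in fields(DecoratedPoint3D)]
print(undecorated_fields)  # ['x', 'y']
print(decorated_fields)    # ['x', 'y', 'z']

# The undecorated subclass still uses the __init__ generated for Point,
# so it rejects a third positional argument.
extra_argument_rejected = False
try:
    Point3D(1, 2, 3)
except TypeError:
    extra_argument_rejected = True

decorated = DecoratedPoint3D(1, 2, 3)
print(decorated)  # DecoratedPoint3D(x=1, y=2, z=3)
```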

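You can define auto-generated methods such as __init__, __repr__, or __eq__ yourself; the decorator does not overwrite a method that is defined explicitly in the class body. Be aware that writing your own __init__ means no __init__ is generated for that class, so you become responsible for assigning every field, including the inherited ones (super().__init__() reaches only the parent’s generated __init__). For custom initialization or validation logic, it is usually simpler to keep the generated __init__ and implement __post_init__, which the generated __init__ calls after all fields have been assigned.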

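import time

@dataclass
class AuditableEntity:
    created_by: str

    def __post_init__(self):
        # Plain instance attribute set after __init__; it is not a dataclass field,
        # so it does not appear in __repr__ or take part in comparisons.
        self.creation_timestamp = time.time()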

@dataclass
class User(AuditableEntity):
    username: str
    email: str

    def __post_init__(self):
        # It's critical to call the parent's __post_init__ if it exists
        super().__post_init__()
        # Custom validation for the child class
        if "@" not in self.email:
            raise ValueError("Invalid email address")

# The __init__ generated for User includes created_by, username, and email.
# The __post_init__ chain ensures both the parent's and child's custom logic runs.
try:
    user = User(created_by="admin", username="johndoe", email="invalid-email")
except ValueError as e:
    print(f"Error: {e}") # Output: Error: Invalid email address
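
As a possible alternative to assigning the timestamp in __post_init__, it could be declared as a real field that is excluded from __init__. This is only a sketch of that design choice: the use of field(init=False, compare=False, default_factory=time.time) is an assumption of this example, not something the classes above do, and the email address is made up.

```python
import time
from dataclasses import dataclass, field, fields

@dataclass
class AuditableEntity:
    created_by: str
    # init=False keeps the field out of __init__, so it does not affect the
    # default-ordering rule for child fields. compare=False keeps the timestamp
    # from affecting equality between otherwise identical records.
    creation_timestamp: float = field(
        default_factory=time.time, init=False, compare=False
    )

@dataclass
class User(AuditableEntity):
    username: str
    email: str

first = User("admin", "johndoe", "john@example.com")
second = User("admin", "johndoe", "john@example.com")

field_names = [f.name for f in fields(User)]
print(field_names)      # ['created_by', 'creation_timestamp', 'username', 'email']
print(first == second)  # True, because the timestamp is excluded from comparison
```

Declared this way, the timestamp shows up in __repr__ and in fields(), which the __post_init__ attribute does not.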

Key Pitfalls and Best Practices

  1. Field Ordering is Absolute: Always be mindful of the combined field order (parent first, then child). A child field without a default that follows a parent field with a default is the most common source of errors.
  2. Explicit is Better: If your inheritance hierarchy becomes complex with many default values, consider making defaults explicit at every level or using a different pattern, like composition with a shared base @dataclass object.
  3. Call super() in __post_init__: If you override __post_init__ in a subclass and the parent also has one, you must call super().__post_init__() to ensure the parent’s initialization logic is not skipped.
  4. Inheritance from Non-Data Classes: A data class can inherit from a regular class. The @dataclass decorator will only process the fields it finds on the data class and its data class parents. Any attributes defined in the regular parent class are ignored by the decorator’s code generation but are still inherited normally. The generated __init__ does not call the regular parent’s __init__; if that initialization is needed, call it explicitly, for example from __post_init__.
  5. Frozen Inheritance: If a parent data class is frozen (frozen=True), any child data class must also be frozen, and a frozen data class cannot inherit from a non-frozen one. Mixing the two in either direction raises a TypeError when the child class is defined. This is enforced to maintain the immutable contract of the parent class.
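
Points 4 and 5 can be illustrated with a brief sketch. All class names here (LegacyBase, Record, Money, MutableMoney, TaggedMoney) are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, FrozenInstanceError

# Point 4: a regular (non-data) parent class.
class LegacyBase:
    def __init__(self):
        self.ready = True

@dataclass
class Record(LegacyBase):
    value: int

record = Record(1)
# The generated __init__ never called LegacyBase.__init__, so "ready" is unset.
base_init_ran = hasattr(record, "ready")
print(base_init_ran)  # False

# Point 5: frozen and non-frozen data classes cannot be mixed.
@dataclass(frozen=True)
class Money:
    amount: int
    currency: str

frozen_error = None
try:
    @dataclass  # Not frozen, but the parent is: rejected at class definition
    class MutableMoney(Money):
        note: str = ""
except TypeError as exc:
    frozen_error = str(exc)
print(frozen_error)

@dataclass(frozen=True)
class TaggedMoney(Money):
    tag: str = ""

tagged = TaggedMoney(5, "EUR", "refund")
mutation_blocked = False
try:
    tagged.amount = 10  # Assignment to a frozen instance is refused
except FrozenInstanceError:
    mutation_blocked = True
print(mutation_blocked)  # True
```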