13.6 Merging Dicts: | Operator and ** Unpacking
Merging dictionaries is a fundamental operation for combining data from multiple sources. Historically, developers relied on dict.update() or explicit loops, both of which modify the target dictionary in place, so producing a new dictionary meant copying one first. Modern Python provides two expressive, non-destructive alternatives: ** unpacking in dict literals (Python 3.5+, PEP 448) and the | operator (Python 3.9+, PEP 584). Both resolve duplicate keys the same way, but they differ in the operand types they accept and in how they read, so it is worth understanding each.
The | Merge Operator
Introduced in Python 3.9, the | operator provides a clean, intuitive syntax for merging two dictionaries. It creates a new dictionary containing the combined key-value pairs of both operands. This operation is non-destructive; the original dictionaries remain unchanged.
dict1 = {'a': 1, 'b': 2}
dict2 = {'b': 99, 'c': 3}
merged_dict = dict1 | dict2
print(merged_dict) # Output: {'a': 1, 'b': 99, 'c': 3}
print(dict1) # Output: {'a': 1, 'b': 2} (unchanged)
print(dict2) # Output: {'b': 99, 'c': 3} (unchanged)
The operator is right-biased: when both dictionaries contain a key, the value from the right-hand dictionary (dict2) replaces the value from the left-hand dictionary (dict1), while the key keeps the position it had in the left-hand dictionary. This is the same last-value-wins rule that dict.update() follows. The |= operator is the corresponding in-place version: it mutates the left-hand dictionary, as dict.update() does.
dict1 = {'a': 1, 'b': 2}
dict2 = {'b': 99, 'c': 3}
dict1 |= dict2 # In-place update
print(dict1) # Output: {'a': 1, 'b': 99, 'c': 3}
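One difference between the two operators is worth a short sketch. Because |= behaves like dict.update(), it accepts an iterable of key-value pairs as well as a dict, while | expects a dict on the other side. The names settings, pairs, and rejected are illustrative only, and the example assumes Python 3.9 or later.

```python
settings = {'a': 1, 'b': 2}
pairs = [('b', 99), ('c', 3)]  # an iterable of key-value pairs, not a dict

# |= accepts the same inputs as dict.update()
settings |= pairs
print(settings)  # {'a': 1, 'b': 99, 'c': 3}

# | raises TypeError when the other operand is a plain list of pairs
try:
    rejected = {'a': 1} | pairs
except TypeError as exc:
    rejected = None
    print(f"TypeError: {exc}")
```

The stricter rule for | means a non-dict operand fails with an error instead of being silently converted.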
The ** Unpacking Method
Before the | operator, the primary method for creating a new merged dictionary was using the ** unpacking syntax within a dictionary literal. This technique, available since Python 3.5, unpacks the key-value pairs of the source dictionaries into a new dictionary literal.
dict1 = {'a': 1, 'b': 2}
dict2 = {'b': 99, 'c': 3}
merged_dict = {**dict1, **dict2}
print(merged_dict) # Output: {'a': 1, 'b': 99, 'c': 3}
The unpacking order defines the conflict resolution. The dictionary unpacked last takes precedence. In the example above, **dict2 comes after **dict1, so its value for key 'b' wins. You can unpack more than two dictionaries in a single expression.
default_config = {'theme': 'dark', 'language': 'en'}
user_config = {'language': 'fr', 'notifications': True}
session_config = {'theme': 'light'}
final_config = {**default_config, **user_config, **session_config}
print(final_config)
# Output: {'theme': 'light', 'language': 'fr', 'notifications': True}
# session_config (last) overrides 'theme', user_config (middle) overrides 'language'
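A dict literal needs every source written out, so it cannot express a merge when the number of layers is only known at runtime. A minimal sketch of one way to apply the same last-wins rule to any number of dictionaries follows; merge_layers is a hypothetical helper name, not a standard-library function.

```python
def merge_layers(*layers):
    """Merge dicts left to right; later layers override earlier ones."""
    merged = {}  # fresh dict, so none of the inputs is modified
    for layer in layers:
        merged.update(layer)
    return merged


default_config = {'theme': 'dark', 'language': 'en'}
user_config = {'language': 'fr', 'notifications': True}
session_config = {'theme': 'light'}

layers = [default_config, user_config, session_config]
final_config = merge_layers(*layers)
print(final_config)
# {'theme': 'light', 'language': 'fr', 'notifications': True}
```

For a fixed, small number of sources, the literal form {**a, **b, **c} gives the same result.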
Key Differences and Best Practices
While both methods often produce the same result, they differ in a few ways that matter in practice.
- Explicitness: The | operator is designed solely for dictionary merging, whereas ** unpacking is a general-purpose feature used inside a dict literal. The intent of the | operator is often clearer to readers.
- Readability and Chaining: The | operator chains cleanly when merging several dictionaries, which many readers find easier to scan than a long run of unpackings.
# With | operator (cleaner for chaining)
result = dict1 | dict2 | dict3 | dict4
# With ** unpacking (can become unwieldy)
result = {**dict1, **dict2, **dict3, **dict4}
- Type Flexibility: ** unpacking accepts any mapping object (anything that provides keys() and __getitem__) and always produces a plain dict. The dict | operator is stricter: the other operand must be a dict, or a type that implements the operator itself; an arbitrary mapping or a list of pairs raises TypeError. Other types can define | through the __or__ dunder method, but the meaning is theirs to choose. collections.OrderedDict and collections.defaultdict use it for merging, while collections.Counter uses | for multiset union, keeping the larger count for each key rather than the right-hand value.
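The type rules in the last point can be checked with a short sketch. ReadOnlySettings is a hypothetical minimal mapping written for this illustration (it is not a dict subclass and defines no | support), and the example assumes Python 3.9 or later.

```python
from collections import Counter
from collections.abc import Mapping


class ReadOnlySettings(Mapping):
    """Hypothetical minimal mapping that is not a dict subclass."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


base = {'a': 1, 'b': 2}
overrides = ReadOnlySettings({'b': 99})

# ** unpacking accepts any mapping and yields a plain dict
unpacked = {**base, **overrides}
print(unpacked)  # {'a': 1, 'b': 99}

# dict's | does not accept an arbitrary mapping
try:
    base | overrides
    or_supported = True
except TypeError:
    or_supported = False
print(or_supported)  # False

# Counter's | keeps the larger count per key, not the right-hand value
counts = Counter(a=3, b=1) | Counter(a=1, b=2)
print(dict(counts))  # {'a': 3, 'b': 2}
```

Converting the mapping first, as in base | dict(overrides), is one way to use | with such a type.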
Best Practice: For new code targeting Python 3.9 and above, prefer the | operator for its clarity and purpose-built design. Reserve ** unpacking for more complex scenarios, like merging dictionaries with additional literal key-value pairs in the same expression.
# Easily adding a new key during the merge with **
base_settings = {'timeout': 30}
extra_settings = {**base_settings, 'retries': 3, 'mode': 'verbose'}
print(extra_settings) # Output: {'timeout': 30, 'retries': 3, 'mode': 'verbose'}
# Equivalent with | requires creating a temporary dict
extra_settings = base_settings | {'retries': 3, 'mode': 'verbose'}
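When literal keys are mixed with unpacking, their position decides whether they act as overrides or as defaults, because entries are still applied left to right. A small sketch with illustrative values:

```python
base_settings = {'timeout': 30, 'mode': 'quiet'}

# Literal key after the unpacking: the literal overrides the base value
override = {**base_settings, 'timeout': 60}
print(override)  # {'timeout': 60, 'mode': 'quiet'}

# Literal keys before the unpacking: the literals act as defaults
fallback = {'timeout': 60, 'retries': 3, **base_settings}
print(fallback)  # {'timeout': 30, 'retries': 3, 'mode': 'quiet'}
```

In the second case 'timeout' keeps its first position in the result even though its value comes from base_settings.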
Common Pitfalls and Edge Cases
Non-String Keys: Both methods handle non-string keys. This is an advantage over passing a dictionary to a function call with **, where the keys become keyword arguments and must therefore be strings.
dict1 = {1: 'one'}
dict2 = {2: 'two'}
merged = dict1 | dict2 # Works correctly: {1: 'one', 2: 'two'}
Shallow Copy: Both techniques perform a shallow copy. Only references to the keys and values are copied into the new dictionary. If a value is a mutable object (like a list or another dict), the original and the merged dictionary share that object, so mutating it through one is visible through the other.
dict1 = {'list': [1, 2, 3]}
dict2 = {'a': 1}
merged = dict1 | dict2
merged['list'].append(4)
print(dict1['list']) # Output: [1, 2, 3, 4] (the original's list is modified too)
To avoid this, use a deep copy (copy.deepcopy) if you need complete independence from the original dictionaries.
Performance: For very large dictionaries, both methods do the same kind of work: they create a new dictionary and copy every key-value pair, so their cost grows with the total number of items. Any difference between | and ** is a small constant factor that can vary between interpreter versions and is usually negligible in practice; measure with timeit if it matters. The in-place |= operator will generally be faster when the left-hand dictionary is large, because it inserts only the right-hand items into the existing dictionary instead of also copying the left-hand ones.
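The deep-copy advice from the Shallow Copy note can be sketched as follows, reusing the names from that example. Whether to copy the merged result or the inputs is a design choice; this version copies the result and assumes Python 3.9 or later for the | operator.

```python
import copy

dict1 = {'list': [1, 2, 3]}
dict2 = {'a': 1}

# Deep-copy the merged result so nested values are no longer shared
merged = copy.deepcopy(dict1 | dict2)
merged['list'].append(4)

print(merged['list'])  # [1, 2, 3, 4]
print(dict1['list'])   # [1, 2, 3]
```

copy.deepcopy walks every nested object, so it is best reserved for cases where the merged dictionary really must be independent of its sources.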