17.6 Walrus Operator in Comprehensions
The walrus operator (:=), formally called an assignment expression and introduced in Python 3.8, lets you bind a value to a name and use that same value within a single expression. In comprehensions, its most compelling use is avoiding redundant computation, which can improve both performance and clarity. Instead of calling an expensive function once in the filter clause and again in the output expression, you compute the value once, bind it with the walrus, and reuse the result.
Basic Syntax and a Motivating Example
Suppose you have a list of strings and want a list of their lengths, but only for strings longer than three characters. Without the walrus operator, you might write:
words = ["hi", "hello", "world", "python", "a"]
# Inefficient approach: len(word) is called twice for each qualifying word
long_words = [len(word) for word in words if len(word) > 3]
print(long_words) # Output: [5, 5, 6]
Here, len(word) is computed twice for every word that passes the filter: once in the condition (len(word) > 3) and once in the output expression (len(word)). For a cheap operation like len, this is negligible, but for a slow function or complex calculation, this duplication becomes costly. The walrus operator elegantly solves this:
words = ["hi", "hello", "world", "python", "a"]
# Efficient approach: len(word) is called once, assigned to 'n', and reused
long_words = [n for word in words if (n := len(word)) > 3]
print(long_words) # Output: [5, 5, 6]
The expression (n := len(word)) does two things: it binds the result of len(word) to the name n, and the parenthesized expression as a whole evaluates to that same value, which is then compared with 3. Because the filter is evaluated before the output expression for each item, n already holds the current word's length when the output expression runs.
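To make the saving visible, the following sketch swaps len for a hypothetical slow_square function that records each call. The function, the call_log list, and the sample values are illustrative assumptions, not part of the example above.

```python
# Hypothetical stand-in for an expensive function; it logs every call it receives.
call_log = []

def slow_square(x):
    call_log.append(x)
    return x * x

values = [1, 2, 3, 4, 5]

# Without the walrus: one call per item in the filter,
# plus one more in the output expression for each item that passes.
call_log.clear()
without_walrus = [slow_square(v) for v in values if slow_square(v) > 5]
calls_without = len(call_log)

# With the walrus: exactly one call per item.
call_log.clear()
with_walrus = [sq for v in values if (sq := slow_square(v)) > 5]
calls_with = len(call_log)

print(without_walrus, calls_without)  # [9, 16, 25] 8
print(with_walrus, calls_with)        # [9, 16, 25] 5
```

Both versions produce the same list; only the number of calls differs, and the gap grows with the share of items that pass the filter.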
Scope and Variable Lifetime
A crucial detail often overlooked is the scope of the variable assigned by the walrus operator. The variable (n in our example) is bound in the scope containing the comprehension, not a scope local to the comprehension itself. This means the variable persists after the comprehension has finished executing.
words = ["hello", "world"]
result = [n for word in words if (n := len(word)) > 2]
print(result) # Output: [5, 5]
print(n) # Output: 5 - 'n' leaks into the outer scope!
This “leaking” is a common pitfall. It can lead to subtle bugs if an existing variable is accidentally overwritten. It is considered a best practice to use descriptive variable names for walrus assignments to minimize the risk of naming collisions and to make it obvious that the variable is being set in the outer scope.
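As a rough illustration of the overwrite risk, the sketch below starts with an unrelated n already defined, then shows that the loop variable stays private while the walrus target does not. The long_lengths helper and the name size are hypothetical and only demonstrate that, inside a function, the containing scope is the function itself.

```python
n = 100  # existing name that happens to collide with the walrus target

words = ["hello", "world", "hi"]
lengths = [n for word in words if (n := len(word)) > 2]

print(lengths)  # [5, 5]
print(n)        # 2: overwritten with the length of the last word tested ("hi")

# The loop variable, by contrast, stays private to the comprehension.
try:
    word
    loop_variable_leaked = True
except NameError:
    loop_variable_leaked = False
print(loop_variable_leaked)  # False

def long_lengths(items):
    # Here the containing scope is the function, so 'size' is a local
    # of long_lengths and disappears when the function returns.
    return [size for item in items if (size := len(item)) > 2]

print(long_lengths(words))  # [5, 5]
```

Note that the leaked value comes from the last item tested, not the last item that passed the filter.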
Use in Dictionary and Set Comprehensions
The walrus operator is equally valuable in dictionary and set comprehensions. The same principle applies: compute a value once and reuse it.
data = ["apple:5", "banana:12", "cherry:0", "date:8"]
# Create a dict of fruit: count for items with count > 0
fruit_counts = {
    parts[0]: count
    for item in data
    if (count := int((parts := item.split(':'))[1])) > 0
}
print(fruit_counts) # Output: {'apple': 5, 'banana': 12, 'date': 8}
This example chains two assignment expressions. The inner one, (parts := item.split(':')), splits the string and binds the resulting list to parts. The outer one indexes that list, converts the second field to an integer, binds it to count, and compares it with 0 for the filter. Because the filter runs before the key and value expressions for each item, both parts and count are already bound when parts[0]: count is evaluated. A trailing clause such as for name in [parts[0]] would only give the key a name; it is not required to make walrus-assigned variables available to the output expression.
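The same idea carries over to set comprehensions. In this minimal sketch, raw_tags is made-up sample data: each string is normalized once, the normalized value doubles as the filter (empty strings are falsy), and it then becomes the set element.

```python
# Made-up sample data with inconsistent casing, padding, and blank entries.
raw_tags = ["  Python ", "python", "", "   ", "Walrus", "WALRUS  "]

# Normalize once, drop blanks, and collect the distinct results.
unique_tags = {tag for raw in raw_tags if (tag := raw.strip().lower())}

print(sorted(unique_tags))  # ['python', 'walrus']
```

Without the walrus, raw.strip().lower() would have to appear in both the filter and the output expression.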
Pitfalls and Best Practices
Parentheses are Mandatory: The walrus operator has very low precedence, lower than the comparison operators, and an unparenthesized assignment expression is not permitted in a comprehension's filter condition. Wrap the entire assignment expression in parentheses; forgetting them leads to a SyntaxError.
# Wrong: SyntaxError
result = [n for x in range(5) if n := x > 2]
# Correct
result = [n for x in range(5) if (n := x) > 2]
Avoid Overuse and Unreadability: The primary goal is clarity and efficiency. Don't use the walrus operator to cram overly complex logic into a single line. If an expression becomes hard to read, it's often better to refactor the comprehension into a traditional for loop for maintainability.
Beware of Leaking Variables: As shown earlier, the variable assigned by the walrus persists after the comprehension finishes. Be mindful of this and choose names accordingly. If you want to avoid the leak, del the name after the comprehension, or put the comprehension inside a function so that the name is local to that function.
Take Care with Generator Expressions: The scoping behavior is identical in generator expressions. Because generators are lazy, however, the name is not bound until the generator is first advanced, and it is rebound each time another item is tested. Once the generator has been fully consumed, the leaked variable holds the value from the last item tested, which can be confusing and error-prone.
gen = (n for word in ["hello", "world"] if (n := len(word)) > 2)
print(list(gen)) # Output: [5, 5]
print(n) # Output: 5 (after consumption)
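Stepping through a generator by hand makes the lazy binding easier to see. This sketch assumes a fresh interpreter in which size has not been defined; the word list and variable names are illustrative.

```python
words = ["hello", "walrus", "hi"]
gen = (size for word in words if (size := len(word)) > 2)

# Creating the generator runs none of its body, so 'size' does not exist yet.
try:
    size
    defined_before = True
except NameError:
    defined_before = False
print(defined_before)  # False

first = next(gen)
print(first, size)   # 5 5

second = next(gen)
print(second, size)  # 6 6

# Draining the generator tests "hi", which fails the filter but still rebinds size.
remaining = list(gen)
print(remaining, size)  # [] 2
```

The final value of size is 2, a length the generator never yielded, which is why code that reads the leaked name after partial or full consumption is easy to get wrong.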