6.3 id(), is, and ==: Identity vs Equality
In Python, understanding the distinction between identity and equality is fundamental to avoiding subtle bugs and writing robust, predictable code. This distinction is embodied in the is operator and the == operator, respectively. While they can sometimes appear to yield the same result, they test for two completely different concepts.
The Core Concepts: Identity and Equality
Identity refers to whether two variables reference the exact same object. Every object in Python has an identity: an integer that is guaranteed to be unique and constant for as long as the object exists. In CPython this integer happens to be the object's memory address, but that is an implementation detail, and an id may be reused once the original object has been garbage-collected. The identity can be retrieved using the built-in id() function. The is operator compares identities; for two objects that are both alive, a is b is equivalent to id(a) == id(b).
Equality, on the other hand, refers to whether two objects contain the same data or have the same value, as defined by the object’s type. The == operator calls the __eq__() method of the left operand; if that method returns NotImplemented, Python tries the right operand’s __eq__() instead (the right operand is tried first when its type is a subclass of the left operand’s type). If neither side defines a result, the comparison falls back to identity. The __eq__() method is therefore what defines “value” for that object. For example, for two lists, == returns True if they have the same length and all corresponding elements are equal.
list_a = [1, 2, 3]
list_b = [1, 2, 3] # Same value, different object
list_c = list_a # Another name for the same object
print(id(list_a)) # Output: e.g., 140245920384000 (varies from run to run)
print(id(list_b)) # Output: e.g., 140245920387904 (different from list_a)
print(id(list_c)) # Output: e.g., 140245920384000 (same as list_a)
print(list_a is list_b) # False: Different identities
print(list_a is list_c) # True: Same identity
print(list_a == list_b) # True: Same value
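The way a type defines its own notion of "value" through __eq__() can be sketched with a small class. Point is a hypothetical type introduced only for this illustration; it is a minimal sketch, not a recommended design for production value types.

```python
class Point:
    """Hypothetical value type used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Returning NotImplemented lets Python try the other operand's __eq__
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
q = Point(1, 2)

print(p == q)              # True: __eq__ compares the coordinates
print(p is q)              # False: two separate objects
print(p == "not a point")  # False: both sides decline, so Python falls back to identity
```

Note that a class which defines __eq__() without also defining __hash__() becomes unhashable, so instances of this sketch cannot be used as dictionary keys or set members.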
When is and == Seem to Behave the Same
The most common point of confusion arises with small integers and the None object. As a performance optimization, CPython creates the integers from -5 to 256 once at startup and reuses them (this small-integer cache is often loosely described as interning). As a result, every variable assigned, say, the number 42 refers to the same cached object.
a = 42
b = 42
print(a is b) # True: Because of interning for small integers
print(a == b) # True
c = 1000
d = 1000
print(c is d) # Implementation-dependent: often False in an interactive session, but may be True in a script, where the compiler can merge equal constants
print(c == d) # True
This behavior is an implementation detail of CPython and should never be relied upon. Always use == for comparing integer values.
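Whether two equal integers outside the cached range share an object depends on how they were created, which is why only == gives a dependable answer. A minimal sketch, assuming a hypothetical helper parse_count that builds its result at runtime:

```python
def parse_count(text):
    """Hypothetical helper: int() constructs the value at runtime."""
    return int(text)


first = parse_count("1000")
second = parse_count("1000")

print(first == second)  # True: the values are equal
print(first is second)  # Implementation-dependent; do not rely on either result
```

Only the == line has a result the language guarantees; the is line may print True or False depending on the interpreter and version.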
The Singleton Pattern: None, True, False
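The advice to use == for values has one important exception: comparisons against None. None is a singleton: it is the only instance of its type (NoneType), so every None in a program is the same object. Using is (or is not) to check whether a variable is None is not merely acceptable; it is the practice recommended by PEP 8. True and False are likewise unique objects, the only two instances of bool, but explicit comparison to them is rarely needed: prefer if flag: over if flag is True:, and reserve is True or is False for the uncommon case where the bool itself must be distinguished from other truthy or falsy values.
Why use is for None? It is more robust, more explicit, and slightly faster. The is operator cannot be overloaded; it only compares two object identities, a very cheap operation. The == operator dispatches to an __eq__ method, and the class of the object being tested may override __eq__ in a way that reports equality with None even though the object is not None. Writing is None also clearly communicates the intent: “I am checking for the one and only None object.”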
def check_status(result):
    # Correct and idiomatic
    if result is None:
        print("No result was calculated.")
    # Incorrect and unpythonic (though it might work)
    # if result == None:
    #     print("No result was calculated.")

check_status(None)
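The risk of an overloaded __eq__ can be made concrete. AlwaysEqual below is a hypothetical class, deliberately written with an overly permissive __eq__ to show how == None can give a misleading answer while is None cannot.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()

print(value == None)  # True: the custom __eq__ answers, misleadingly
print(value is None)  # False: identity cannot be overridden
```

Real classes are rarely this extreme, but array-like and proxy types do override __eq__ in ways that make == None return something other than a plain boolean.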
Common Pitfalls and Best Practices
A frequent bug occurs when programmers use is to compare values instead of identity. This often happens with strings. While string interning sometimes makes is seem to work, it is unreliable and a major source of errors.
# Pitfall: Using `is` for string value comparison
name = "hello world"
greeting = "hello world"
print(name is greeting) # Output: Might be True or False (implementation-dependent)
print(name == greeting) # Output: Always True
# Best Practice: Always use `==` for value comparison of all types except singletons.
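To see what interning actually means for strings, the standard library's sys.intern() can be used to request it explicitly. The sketch below builds one of the strings at runtime so that the compiler cannot merge it with the literal; it is shown to illustrate the mechanism, not as a suggested way to compare strings.

```python
import sys

parts = ["hello", " ", "world"]
built = "".join(parts)     # constructed at runtime
literal = "hello world"

print(built == literal)    # True: same characters
print(built is literal)    # Implementation-dependent; often False in CPython

# sys.intern() returns the single canonical copy of a string,
# so two equal strings map to the same object.
print(sys.intern(built) is sys.intern(literal))  # True
```

Even with explicit interning available, == remains the right tool for comparing string values; interning is an optimization, not a comparison strategy.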
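Another pitfall involves mutable default arguments. A default value is evaluated once, when the def statement runs, not each time the function is called. The same list object is therefore reused for every call that doesn’t provide an argument, leading to unexpected shared state.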
# Pitfall: Mutable default argument
def add_to_list(item, my_list=[]):  # the default [] is evaluated ONCE, at definition time
    my_list.append(item)
    return my_list

list1 = add_to_list(1)  # returns [1]
list2 = add_to_list(2)  # returns [1, 2], NOT [2]! The same list object was used.
print(list1 is list2)   # True: They are the same object.
print(list1)            # [1, 2]: list1 changed too, because it is that same object
# Best Practice: Use None as a default for mutable arguments
def add_to_list_fixed(item, my_list=None):
    if my_list is None:  # Correct use of `is` for singleton
        my_list = []     # Create a new list object each time
    my_list.append(item)
    return my_list
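A short check of the fixed version, repeated here in full so the snippet runs on its own, confirms that calls without an argument no longer share state, while a list supplied by the caller is still used as given:

```python
def add_to_list_fixed(item, my_list=None):
    if my_list is None:
        my_list = []  # a fresh list for each call that omits the argument
    my_list.append(item)
    return my_list


first = add_to_list_fixed(1)
second = add_to_list_fixed(2)
print(first)            # [1]
print(second)           # [2]
print(first is second)  # False: each call created its own list

shared = []
add_to_list_fixed("a", shared)
add_to_list_fixed("b", shared)
print(shared)           # ['a', 'b']: the caller's list is used as given
```

Testing with is None rather than a truthiness check such as if not my_list matters here: an empty list passed by the caller is falsy, so a truthiness check would silently replace it with a new list.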
In summary, the choice between is and == is a choice between comparing object identity and object value. Use is only to check whether two names refer to the same object, most importantly when checking for None. For value comparisons, use the == operator, and for truth tests rely on a value's truthiness rather than comparing it to True or False. Adhering to these rules will make your code more correct, predictable, and idiomatic.