The Nature of Binary Representation

Before manipulating bits, it helps to understand that these operations act on the binary representation of integers. Python integers have arbitrary precision: the interpreter uses as many bits as a value needs, abstracting away the fixed widths found in lower-level languages like C. Bitwise operators treat every integer as if it were stored in two’s complement with an unlimited number of sign bits (CPython’s internal storage differs, but results are defined to match this model). Conceptually, a negative number extends to the left with an infinite run of ‘1’s and a non-negative number with an infinite run of ‘0’s. This model explains why bitwise operations behave consistently across different numbers, regardless of their sign.
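
To make the “infinite sign bits” model concrete, the sketch below prints fixed-width views of a few integers. The helper name to_twos_complement and the 8-bit width are illustrative choices, not part of Python itself.

```python
def to_twos_complement(value: int, width: int) -> str:
    """Return the low `width` bits of `value` as a two's complement bit string."""
    mask = (1 << width) - 1          # e.g. width=8 -> 0b11111111
    return format(value & mask, f"0{width}b")

print(to_twos_complement(5, 8))      # 00000101
print(to_twos_complement(-5, 8))     # 11111011
print(bin(-5))                       # -0b101 (bin() shows a sign, not the sign bits)
print((1 << 100).bit_length())       # 101 (integers grow as needed)
```

Masking with a run of 1s is what cuts the conceptually infinite sign extension down to a fixed width.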

Bitwise AND (&)

The bitwise AND operator compares each bit of its first operand to the corresponding bit of its second operand. If both bits are 1, the corresponding result bit is set to 1. Otherwise, the result bit is 0. This operation is fundamental for masking, a technique used to extract specific bits from a number. A mask is a value that has 1s in the positions you wish to keep and 0s in the positions you wish to ignore. For instance, to check if the least significant bit (LSB) of a number is set (which indicates if the number is odd), you can AND it with 1.

x = 13  # Binary: 1101
mask = 1 # Binary: 0001
result = x & mask
print(f"Is {x} odd? {result == 1}")  # Output: Is 13 odd? True
print(f"Result: {result}")           # Output: Result: 1

# Extracting the third bit from the right (bit index 2 when counting from 0)
y = 22  # Binary: 10110
third_bit_mask = 0b100 # Decimal: 4
extracted_bit = (y & third_bit_mask) >> 2
print(f"The third bit of {y} is: {extracted_bit}") # Output: The third bit of 22 is: 1
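
The masking pattern above generalizes to any bit position or group of bits. The following is a minimal sketch; the helper names get_bit and get_field are hypothetical and chosen only for illustration.

```python
def get_bit(value: int, index: int) -> int:
    """Return bit `index` of `value` (index 0 is the least significant bit)."""
    return (value >> index) & 1

def get_field(value: int, offset: int, width: int) -> int:
    """Return `width` bits of `value`, starting at bit `offset`."""
    mask = (1 << width) - 1          # `width` ones, e.g. width=8 -> 0xFF
    return (value >> offset) & mask

print(get_bit(22, 2))                # 1  (22 is 0b10110)
print(get_bit(22, 3))                # 0
print(hex(get_field(0xABCD, 4, 8)))  # 0xbc
```

Shifting first and masking second keeps the mask independent of the position being read.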

Bitwise OR (|)

The bitwise OR operator compares each bit of its operands. If either bit is 1, the corresponding result bit is set to 1. This is primarily used for setting specific bits to 1. A common use case is to combine multiple flags into a single value. You create a value where only the bits representing the flags you want to enable are set to 1 and then OR it with the original value.

flags = 0    # Start with all flags off (binary: ...0000)
FLAG_A = 0b0001
FLAG_B = 0b0010
FLAG_C = 0b0100

# Enable FLAG_A and FLAG_C
flags = flags | FLAG_A | FLAG_C
print(bin(flags))  # Output: 0b101

# Check if FLAG_B is enabled
if flags & FLAG_B:
    print("FLAG_B is on")
else:
    print("FLAG_B is off")  # Output: FLAG_B is off
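
Setting flags with OR pairs naturally with clearing them (AND with an inverted mask) and testing them. One way to package this is the standard library's enum.IntFlag, sketched below; the Permission class and its member names are made up for this example.

```python
from enum import IntFlag

class Permission(IntFlag):
    # Hypothetical flag names, one bit each.
    READ = 0b001
    WRITE = 0b010
    EXECUTE = 0b100

perms = Permission.READ | Permission.EXECUTE   # set two flags
print(Permission.WRITE in perms)               # False

perms |= Permission.WRITE                      # set a flag with OR
perms &= ~Permission.EXECUTE                   # clear a flag with AND NOT
print(Permission.EXECUTE in perms)             # False
print(int(perms))                              # 3 (READ and WRITE)
```

The same three operations work on plain integers; IntFlag mainly adds readable names and membership tests.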

Bitwise XOR (^)

The exclusive OR (XOR) operator sets the result bit to 1 only if the corresponding bits in the operands are different. If they are the same, the result bit is 0. A key property of XOR is that it is reversible: (a ^ b) ^ b = a. This makes it invaluable for simple encryption algorithms, toggling bits, and finding the unique item in a sequence of pairs.

a = 0b1100
b = 0b1010
result = a ^ b
print(bin(result))  # Output: 0b110

# Toggling a specific bit (bit index 2, counting from 0 at the right)
x = 9  # Binary: 1001
toggle_mask = 0b0100 # Decimal: 4
x_toggled = x ^ toggle_mask
print(f"Original: {bin(x)}")      # Output: Original: 0b1001
print(f"Toggled:  {bin(x_toggled)}") # Output: Toggled:  0b1101 (13)

# Simple "swap" without a temporary variable
# (a classic trick; in everyday Python, x, y = y, x is clearer)
x = 5
y = 3
x = x ^ y
y = x ^ y  # Now y equals original x (5)
x = x ^ y  # Now x equals original y (3)
print(f"x: {x}, y: {y}")  # Output: x: 3, y: 5
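
Two of the uses mentioned above, finding the unpaired item and reversible "encryption", can be sketched in a few lines. The function names find_unique and xor_bytes are hypothetical, and the single-byte key is an assumption made to keep the example short; this kind of XOR cipher is not secure.

```python
from functools import reduce
from operator import xor

def find_unique(values):
    """Return the one value that appears an odd number of times.

    Assumes every other value appears exactly twice: equal pairs
    cancel because v ^ v == 0, and 0 ^ v == v.
    """
    return reduce(xor, values, 0)

def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key (0-255). Applying it twice restores data."""
    return bytes(b ^ key for b in data)

print(find_unique([4, 7, 9, 7, 4]))      # 9

secret = xor_bytes(b"hello", 0x5A)
print(xor_bytes(secret, 0x5A))           # b'hello'
```

Both functions rely on the same identity, (a ^ b) ^ b == a.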

Bitwise NOT (~)

The bitwise NOT operator is a unary operator that flips every bit of its operand, turning 0s into 1s and 1s into 0s. Because Python treats integers as two’s complement values with unlimited sign bits, ~x always equals -x - 1. This often surprises newcomers who expect ~0b0011 to be 0b1100, as it would be in a 4-bit unsigned system. In Python the result is -4: NOT also flips the infinite run of leading zeros of a non-negative number into leading ones, and a value with infinite leading ones is negative in two’s complement.

x = 5        # Binary: ...000101
not_x = ~x   # Binary: ...111010 (which is -6 in two's complement)
print(not_x) # Output: -6

# To get the unsigned-like behavior for a fixed number of bits (e.g., 4 bits):
mask = 0b1111
result = ~x & mask
print(bin(result)) # Output: 0b1010 (which is 10)
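
The relationship between ~x, a fixed-width inversion, and the signed reading of that bit pattern can be checked directly. This is a small sketch; invert_bits and to_signed are hypothetical helper names, and the 4-bit width is only an example.

```python
def invert_bits(value: int, width: int) -> int:
    """Flip the low `width` bits of `value` and discard everything above them."""
    return ~value & ((1 << width) - 1)

def to_signed(value: int, width: int) -> int:
    """Interpret the low `width` bits of `value` as a two's complement number."""
    sign_bit = 1 << (width - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)

print(~0b0011)                        # -4
print(bin(invert_bits(0b0011, 4)))    # 0b1100
print(to_signed(0b1100, 4))           # -4 (same pattern, read as signed)
print(all(~n == -n - 1 for n in range(-100, 100)))  # True
```

The masked result 0b1100 and Python's -4 are the same bit pattern, viewed at 4 bits and at unlimited width respectively.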

Bit Shifts (<<, >>)

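The left shift operator (<<) moves all bits to the left by a specified number of positions, shifting zeros in on the right. Shifting left by n is equivalent to multiplying by 2 ** n; because Python integers grow as needed, no bits are lost to overflow. The right shift operator (>>) moves all bits to the right and discards the bits shifted out. For non-negative numbers, zeros are shifted in on the left; for negative numbers, ones are. In both cases x >> n equals floor division, x // (2 ** n), which rounds toward negative infinity. A negative shift count raises a ValueError.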

# Left Shift: Multiplication by powers of two
value = 3
left_shifted = value << 3  # 3 * (2**3) = 24
print(left_shifted) # Output: 24

# Right Shift: Division by powers of two
value = 20
right_shifted = value >> 2 # 20 // (2**2) = 5
print(right_shifted) # Output: 5

# For negative numbers, the sign is preserved due to two's complement.
neg_value = -16
right_shifted_neg = neg_value >> 2 # -16 // 4 = -4
print(right_shifted_neg) # Output: -4
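
The examples above divide evenly, which hides how right shifts round. The sketch below shows the rounding direction, the absence of overflow, and one way to emulate a logical (zero-filling) right shift at a fixed width. The helper logical_right_shift and its 32-bit default are assumptions for illustration.

```python
# Right shift rounds toward negative infinity, like floor division.
print(-17 >> 2)        # -5
print(-17 // 4)        # -5
print(int(-17 / 4))    # -4 (truncation toward zero gives a different answer)

# Left shift never overflows; the integer simply gets wider.
print(1 << 70)         # 1180591620717411303424

def logical_right_shift(value: int, shift: int, width: int = 32) -> int:
    """Shift right with zero fill, treating `value` as a `width`-bit unsigned number."""
    return (value & ((1 << width) - 1)) >> shift

print(hex(logical_right_shift(-16, 2)))   # 0x3ffffffc
```

Masking to a fixed width before shifting is what turns the arithmetic shift into a logical one.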

Common Pitfalls and Best Practices

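  1. Operator Precedence: Bitwise operators bind less tightly than arithmetic operators but more tightly than comparison operators. Among themselves, shifts bind tightest, followed by &, then ^, then |. So 1 << 2 + 3 is interpreted as 1 << (2 + 3), not (1 << 2) + 3, while x & 1 == 0 is interpreted as (x & 1) == 0 (unlike in C, where == binds more tightly than &). Always use parentheses to make your intentions explicit and avoid subtle bugs.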
  2. Not for Floats: Bitwise operations are only defined for integers. Attempting to use them on a float will raise a TypeError.
  3. Readability vs. Performance: In Python, a shift is rarely meaningfully faster than multiplying or dividing by a power of two, because interpreter overhead dominates either way. Prioritize readability (x * 8) unless profiling shows a real bottleneck or you are actually working with bit-level data (like flags or packed binary data).
  4. Sign Extension in Right Shifts: Remember that right-shifting a negative number performs an arithmetic shift (preserving the sign), not a logical shift (always shifting in zeros). This is the correct behavior for two’s complement but can be unexpected if you are used to working with unsigned integers.
  5. Use of Constants: When working with masks or flags, define them as constants with clear names (e.g., READ_PERMISSION = 0b001) and use binary notation (0b1010) or hex notation (0xA) to make the bit patterns obvious. This is far more maintainable than using decimal numbers.
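
The first two pitfalls are easy to verify interactively. The following sketch does so; try_shift is a hypothetical helper that exists only to capture the exception for display.

```python
# Pitfall 1: arithmetic binds tighter than shifts.
print(1 << 2 + 3)        # 32, parsed as 1 << (2 + 3)
print((1 << 2) + 3)      # 7

# Comparisons bind looser than bitwise operators in Python.
x = 6
print(x & 1 == 0)        # True, parsed as (x & 1) == 0
print(x & (1 == 0))      # 0, because 6 & False is 0

# Pitfall 2: bitwise operators reject floats.
def try_shift(value):
    """Return value << 1, or a description of the TypeError it raises."""
    try:
        return value << 1
    except TypeError as exc:
        return f"TypeError: {exc}"

print(try_shift(5))      # 10
print(try_shift(5.0))    # a TypeError message; floats do not support <<
```

Even where precedence already gives the intended result, explicit parentheses spare the reader from having to recall the rules.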