Integers: Arbitrary Precision and Literals

Arbitrary Precision

Unlike integers in many other programming languages, which are constrained by fixed bit sizes (e.g., 32-bit or 64-bit), Python’s int is a variable-length data structure that can represent numbers limited only by the available memory of the host machine. This design is known as arbitrary precision. Internally, CPython stores the value as a sequence of digits in base 2³⁰ on typical 64-bit builds (2¹⁵ on some others), a base chosen for efficient arithmetic on common hardware, and allocates more digits as the number grows. This means you can perform calculations with astronomically large numbers without fear of overflow errors, a common pitfall in languages with fixed-precision integers.

# Calculating 1000! (factorial of 1000) is trivial in Python
import math
huge_number = math.factorial(1000)
print(f"1000! has {len(str(huge_number))} digits")
# Output: 1000! has 2568 digits
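
To see the absence of overflow directly, the following minimal sketch crosses the signed 64-bit boundary and checks how many bits the result needs. The variable names are illustrative only.

```python
# Largest value of a signed 64-bit integer in fixed-width languages
max_int64 = 2**63 - 1

# In Python the value simply keeps growing past that boundary
beyond = max_int64 + 1
print(beyond)                # 9223372036854775808
print(beyond.bit_length())   # 64 bits are needed for the magnitude

# Much larger values work the same way
googol = 10**100
print(googol.bit_length())   # 333
```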

Literal Forms

Integer literals can be written in four different bases: decimal, binary, hexadecimal, and octal. Whatever base the literal uses, the interpreter produces the same int object; the base affects only how the value is written in source code. This flexibility is invaluable for low-level programming, bitwise operations, and when working with constants defined in other bases.

  • Decimal: The most common form, using digits 0-9 with no prefix.
  • Binary: Prefixed with 0b or 0B, using digits 0 and 1. This is extremely useful for defining bit masks and flags.
  • Hexadecimal: Prefixed with 0x or 0X, using digits 0-9 and letters a-f (case-insensitive). This is ubiquitous in computing for memory addresses, color codes, and other values where a compact representation is desired.
  • Octal: Prefixed with 0o or 0O (the old leading-zero-only syntax, e.g., 0755, was removed in Python 3 for clarity). Used less frequently but still relevant for Unix file permission settings.

decimal_literal = 42
binary_literal = 0b101010    # 42 in decimal
hex_literal = 0x2A           # 42 in decimal
octal_literal = 0o52         # 42 in decimal

print(decimal_literal, binary_literal, hex_literal, octal_literal)
# Output: 42 42 42 42

# Underscores for readability (Python 3.6+)
large_number = 1_000_000
bytes_in_gibibyte = 0x4000_0000  # 1,073,741,824 (2**30)
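
Going the other way, from an int back to a string in a given base, can be sketched with the built-in bin(), hex(), oct(), and format() functions. The value 42 is reused from the example above.

```python
value = 42

# Built-in helpers return strings that carry the matching literal prefix
print(bin(value))   # 0b101010
print(hex(value))   # 0x2a
print(oct(value))   # 0o52

# Format specifiers give control over prefix, padding, and case
print(format(value, "08b"))   # 00101010
print(format(value, "#06x"))  # 0x002a
print(f"{value:X}")           # 2A
```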

Type Coercion and the int() Constructor

The int() constructor is the primary tool for converting other data types to an integer. Its behavior depends on the input type. For floating-point numbers (float), it performs truncation towards zero. For strings, it parses the string for a valid integer representation, optionally allowing you to specify the base. It’s crucial to understand that converting a float to an int does not perform rounding; it simply discards the fractional component. This is a common source of off-by-one errors for those new to the language.

# Conversion from float (truncation)
print(int(3.999))  # Output: 3 (not 4!)
print(int(-2.75))  # Output: -2

# Conversion from string
print(int("12345"))     # Output: 12345
print(int("0x2A", 16))  # Output: 42 (must specify base 16 for hex string)
print(int("1010", 2))   # Output: 10 (must specify base 2 for binary string)

# Handling invalid conversions
try:
    int("123.45") # This will fail, it's a float string
except ValueError as e:
    print(e) # Output: invalid literal for int() with base 10: '123.45'

try:
    int("Hello World")
except ValueError as e:
    print(e) # Output: invalid literal for int() with base 10: 'Hello World'

Common Pitfalls and Best Practices

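  1. Float Truncation: Always be mindful that int(3.9) results in 3. If you need rounding to the nearest integer, use round() instead (round(3.9) yields 4 and already returns an int), keeping in mind that round() sends exact halves to the nearest even integer (round(2.5) is 2). Floor division // is another option, provided you account for its behavior with negatives.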
  2. String Conversion Errors: When converting user input or data from files, always use try/except blocks to handle potential ValueError exceptions caused by invalid input. Never assume the input string will be perfectly formatted.
  3. Readability with Large Literals: For code maintainability, use underscores to separate groups of digits in large integer literals (e.g., 1_000_000). This makes the code significantly easier to read and prevents errors from miscounting zeros.
  4. Memory Usage: While arbitrary precision is powerful, be aware that operations on truly massive integers (thousands of digits) are computationally expensive and can consume substantial memory. For most applications, this is not a concern, but it is an important consideration for specialized numerical computing.
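
The advice on string conversion errors can be sketched as a small helper. The name parse_int and its default parameter are hypothetical, chosen for illustration; this is one possible shape, not a standard-library function.

```python
def parse_int(text, default=None):
    """Return int(text), or `default` when text is not a valid integer."""
    try:
        return int(text)
    except (TypeError, ValueError):
        # ValueError: malformed string; TypeError: unsupported type such as None
        return default

print(parse_int("42"))        # 42
print(parse_int(" 7\n"))      # 7 (surrounding whitespace is accepted)
print(parse_int("12.5"))      # None
print(parse_int("1_000"))     # 1000 (underscores are accepted in strings too)
print(parse_int(None, 0))     # 0
```

Returning a default keeps call sites short; code that must distinguish bad input from a legitimate value may prefer to let the exception propagate instead.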

Floats: IEEE 754 and Floating-Point Pitfalls

Floating-point numbers, as defined by the IEEE 754 standard, are the primary data type for representing real numbers in most programming languages, including Python. While incredibly useful, their representation in a finite amount of memory leads to inherent limitations and surprising behaviors that every developer must understand to avoid subtle and critical bugs.

The IEEE 754 Double-Precision Format

At the heart of Python’s float type is the IEEE 754 standard for double-precision binary floating-point arithmetic. This format uses 64 bits to represent a number: 1 bit for the sign, 11 bits for the exponent, and 52 bits for the significand (or mantissa). This structure allows it to represent an enormous range of values, from incredibly small to astronomically large, but not with perfect precision. The key thing to understand is that it represents numbers in base-2 (binary), not base-10 (decimal). Many simple decimal numbers, like 0.1, have an infinite repeating representation in base-2, just as 1/3 (0.333...) does in base-10. Since the number of bits is finite, these values must be rounded to the nearest representable binary fraction, leading to tiny rounding errors.
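
The 1/11/52 bit layout can be inspected from Python with the struct module. The helper name float_fields is an assumption made for this sketch; it reinterprets the eight bytes of a double as an integer and slices out the three fields.

```python
import struct

def float_fields(x):
    """Split a float into its IEEE 754 sign, exponent, and fraction bit fields."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF      # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)    # 52 explicitly stored significand bits
    return sign, exponent, fraction

print(float_fields(1.0))    # (0, 1023, 0)
print(float_fields(-2.0))   # (1, 1024, 0)

sign, exponent, fraction = float_fields(0.1)
print(exponent - 1023)      # -4, because 0.1 is about 1.6 * 2**-4
print(f"{fraction:052b}")   # the repeating 1001... pattern, cut off and rounded
```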

The Classic Representation Error

The most common pitfall arises from this base-2 representation. The decimal value 0.1 cannot be represented exactly as a binary fraction. The stored value is an extremely close approximation, but it is not exact. This becomes visible when performing arithmetic.

# The classic example of floating-point imprecision
result = 0.1 + 0.2
print(result)          # Output: 0.30000000000000004
print(result == 0.3)  # Output: False

# This is not a bug in Python; it's a fundamental property of IEEE 754 floats.
# Let's see the exact value stored (format() is a built-in, so no import is needed):
print(format(0.1, '.20f'))  # Output: 0.10000000000000000555
print(format(0.2, '.20f'))  # Output: 0.20000000000000001110
print(format(0.3, '.20f'))  # Output: 0.29999999999999998890

Comparing Floats for Equality

Because of these representation errors, directly comparing two floats for exact equality using == is often a logical error. The correct practice is to check if the two numbers are “close enough” within a defined tolerance. This tolerance is often called an “epsilon” and should be chosen based on the magnitude of the numbers you are comparing and the required precision of your application.

# The WRONG way to compare floats
a = 0.1 + 0.2
b = 0.3
if a == b:
    print("Equal (You probably won't see this)")
else:
    print("Not equal (You will see this)")

# The RIGHT way: check if the absolute difference is within a tolerance
tolerance = 1e-10
if abs(a - b) < tolerance:
    print("Essentially equal")  # This will print
else:
    print("Not equal")

# Using math.isclose() (Python 3.5+)
import math
if math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9):
    print("math.isclose says they are close")  # This will print

The math.isclose() function is the modern best practice, as it handles both relative (rel_tol) and absolute (abs_tol) tolerances, making it robust across different scales of numbers.
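
Why a single fixed epsilon is not enough can be shown with two extremes. In this sketch the specific magnitudes (1e16 and 1e-12) are arbitrary choices for illustration.

```python
import math

# Two neighbouring floats near 1e16 differ by 2.0, far more than a small epsilon
big = 1e16
big_neighbour = big + 2.0
print(abs(big - big_neighbour) < 1e-10)                 # False
print(math.isclose(big, big_neighbour, rel_tol=1e-9))   # True

# Near zero a relative tolerance alone is too strict, so abs_tol takes over
tiny = 1e-12
print(math.isclose(tiny, 0.0))                  # False
print(math.isclose(tiny, 0.0, abs_tol=1e-9))    # True
```

The default abs_tol is 0.0, which is why comparisons against zero need it set explicitly.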

The Perils of Accumulating Errors

When performing a large number of sequential floating-point operations, these tiny rounding errors can accumulate, leading to significant inaccuracies. This is a major concern in scientific computing, financial modeling, and graphics. For example, summing a large list of floats can yield a result that drifts from the true mathematical sum.

# Demonstrating error accumulation
total = 0.0
# Add 0.1 ten thousand times
for _ in range(10_000):
    total += 0.1

# Mathematically, this should be 1000.0
print(total)                   # Output: 1000.0000000001588
print(1000.0 - total)          # Output: about -1.588e-10
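
One common mitigation is math.fsum(), which keeps track of the low-order bits that plain addition discards. The sketch below compares it with an explicit accumulation loop; exact printed digits are omitted because they are not the point of the comparison.

```python
import math

values = [0.1] * 10_000

# Plain left-to-right float addition, one rounding per step
naive_total = 0.0
for v in values:
    naive_total += v

# fsum tracks the lost low-order bits and rounds only once at the end
exact_total = math.fsum(values)

print(naive_total)
print(exact_total)
print(abs(exact_total - 1000.0) <= abs(naive_total - 1000.0))  # True
```

Note that fsum() returns the correctly rounded sum of the stored float values; it cannot undo the fact that each 0.1 was already an approximation.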

Special Values: inf, -inf, and nan

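The IEEE 754 standard defines special values to handle edge cases. Under IEEE 754, dividing a nonzero number by zero produces a signed infinity, but Python raises ZeroDivisionError for float division by zero instead; infinities come from float('inf') or from some overflowing operations such as 1e308 * 10. Operations that have no mathematical definition (e.g., inf - inf) produce a “Not a Number” value, nan. These values propagate through calculations.

# Working with special values
positive_inf = float('inf')
negative_inf = float('-inf')
not_a_number = float('nan')

# Unlike many languages, Python raises an exception for float division by zero
try:
    10 / 0.0
except ZeroDivisionError:
    print("ZeroDivisionError")  # This will print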
print(positive_inf * 5)  # Output: inf
print(positive_inf + negative_inf)  # Output: nan (indeterminate form)

# A crucial note: nan is not equal to itself. This is by definition.
print(not_a_number == not_a_number)  # Output: False
# To check for nan, you MUST use math.isnan()
import math
print(math.isnan(not_a_number))  # Output: True

When to Use Alternatives (Decimal/Fraction)

For use cases where precision is paramount and decimal representation is assumed (e.g., financial calculations), the float type is inappropriate. Python’s decimal module, with its Decimal type, exists for this purpose. It represents numbers in base-10, avoiding the binary conversion issues of float. For exact rational arithmetic (e.g., working with fractions), the fractions.Fraction type is the correct choice.

# Solving the 0.1 + 0.2 problem with Decimal
from decimal import Decimal
decimal_result = Decimal('0.1') + Decimal('0.2')
print(decimal_result)          # Output: 0.3
print(decimal_result == Decimal('0.3'))  # Output: True

# Using Fraction for exact rational math
from fractions import Fraction
fraction_result = Fraction('1/10') + Fraction('2/10')
print(fraction_result)        # Output: 3/10
print(float(fraction_result)) # Output: 0.3

The key takeaway is that float is a tool for approximate arithmetic over a wide range of values, optimized for speed and memory efficiency. For exact decimal or rational arithmetic, one must consciously choose the more specialized Decimal or Fraction types. Understanding the IEEE 754 pitfalls is essential for writing robust numerical code.

The decimal Module: Exact Decimal Arithmetic

The decimal module provides support for fast correctly-rounded decimal floating-point arithmetic, addressing several critical limitations of binary floating-point types like float. While float offers excellent performance by leveraging hardware-based binary arithmetic, it cannot precisely represent many decimal fractions (like 0.1) due to the fundamental mismatch between base-2 and base-10 representation. This leads to subtle rounding errors that accumulate over financial and monetary calculations, where exact decimal representation is paramount. The decimal module exists to provide arithmetic that is “what you see is what you get,” making it the preferred choice for applications requiring exact decimal representation, such as accounting, financial technology, and any context where rounding must adhere to strict standards (e.g., GAAP, EU VAT rules).

The Decimal Data Type

The core of the module is the Decimal class, which constructs a decimal object from an integer, string, float, or tuple. The most crucial and recommended way to create a Decimal is from a string. This is because a string represents the exact decimal number you intend, whereas initializing with a float immediately imports the float’s inherent imprecision.

from decimal import Decimal

# Correct: Initializing from a string for exact representation
exact_value = Decimal('0.1')
print(exact_value)  # Output: 0.1

# Incorrect (often): Initializing from a float imports its imprecision
inexact_value = Decimal(0.1)
print(inexact_value) # Output: 0.1000000000000000055511151231257827021181583404541015625

The internal representation of a Decimal number is not base-2 but base-10. It consists of a sign, a coefficient (stored as an integer), and an exponent. This allows it to represent numbers like 0.1, 1.25, or 3.14159 exactly, without the rounding errors inherent in their binary floating-point approximations.
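
The sign, coefficient, and exponent can be inspected with the as_tuple() method, as in this short sketch.

```python
from decimal import Decimal

# as_tuple() exposes the sign, the coefficient digits, and the exponent
print(Decimal("0.1").as_tuple())
# DecimalTuple(sign=0, digits=(1,), exponent=-1)

print(Decimal("-1.25").as_tuple())
# DecimalTuple(sign=1, digits=(1, 2, 5), exponent=-2)

# Trailing zeros are part of the stored value: 1.50 and 1.5 compare equal
# but keep different exponents
print(Decimal("1.50").as_tuple().exponent)   # -2
print(Decimal("1.50") == Decimal("1.5"))     # True
```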

The Context: Precision and Rounding

Arithmetic operations are governed by a context, which is a set of parameters defining precision, rounding rules, and limits. The context is thread-local, meaning each thread can have its own settings. You can get the current context using getcontext() and modify its attributes.

from decimal import Decimal, getcontext

# Get the current context
ctx = getcontext()

# Set precision to 3 significant digits (not decimal places)
ctx.prec = 3

# Set rounding mode to ROUND_HALF_UP (standard rounding)
ctx.rounding = 'ROUND_HALF_UP'

num = Decimal('1.23456')
result = num * Decimal('2')  # Calculation happens within the context
print(result)  # Output: 2.47 (2.46912 rounded to 3 significant digits: 2.47)

The prec attribute controls significant digits, not digits after the decimal point. This is a common point of confusion. With a precision of 3, a result such as 123.45, which has 5 significant digits, is rounded to 3 (123). Precision is applied to the results of arithmetic, not when a Decimal is constructed.
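
A minimal sketch of how prec behaves, assuming the default rounding mode: a temporary context (via localcontext()) keeps the change from leaking into other code, and unary plus is used as the simplest operation that triggers context rounding.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 3

    # Construction from a string ignores the context precision
    stored = Decimal("123.45")
    print(stored)        # 123.45

    # Any arithmetic, including unary plus, rounds to 3 significant digits
    rounded = +stored
    print(rounded)       # 123

    # Leading zeros are not significant, so three digits survive here too
    small = Decimal("0.00123456") * 1
    print(small)         # 0.00123
```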

Rounding Modes

The context’s rounding mode provides fine-grained control over how results are rounded, which is essential for compliance with different financial or scientific standards. Common modes include:

  • ROUND_CEILING: Towards Infinity.
  • ROUND_FLOOR: Towards -Infinity.
  • ROUND_DOWN: Towards zero.
  • ROUND_UP: Away from zero.
  • ROUND_HALF_UP: Rounds to the nearest number; if equidistant, rounds up (common commercial rounding).
  • ROUND_HALF_DOWN: Rounds to the nearest number; if equidistant, rounds down.
  • ROUND_HALF_EVEN: Rounds to the nearest number; if equidistant, rounds to make the final digit even (aka “banker’s rounding,” minimizes total bias).

ctx.prec = 1
ctx.rounding = 'ROUND_HALF_UP'
print(Decimal('1.5') * Decimal('1')) # Output: 2

ctx.rounding = 'ROUND_HALF_EVEN'
print(Decimal('1.5') * Decimal('1')) # Output: 2
print(Decimal('2.5') * Decimal('1')) # Output: 2 (rounds to even)

Quantize: Rounding to a Specific Decimal Place

While the context controls rounding for arithmetic operations, you often need to round a final result to a specific number of decimal places (e.g., two places for currency). The quantize() method is designed for this purpose. It returns a new Decimal value rounded to a specified exponent, defined by another Decimal instance.

from decimal import Decimal, ROUND_HALF_UP

price = Decimal('13.9499999999')
# Create a Decimal that defines the rounding target: 2 decimal places
twoplaces = Decimal('0.01')

# Round the price to two decimal places using standard rounding
rounded_price = price.quantize(twoplaces, rounding=ROUND_HALF_UP)
print(rounded_price)  # Output: 13.95

# You can also use a Decimal with the desired exponent directly
rounded_to_three = price.quantize(Decimal('0.001'))
print(rounded_to_three) # Output: 13.950
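
In a currency setting, quantize() is often wrapped in a small helper that is applied once, at the end of a calculation. The names to_money and CENT and the 8.25% rate below are hypothetical, chosen only to make the sketch concrete.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def to_money(amount):
    """Round a Decimal (or numeric string) to two places, halves rounding up."""
    return Decimal(amount).quantize(CENT, rounding=ROUND_HALF_UP)

subtotal = Decimal("19.99") * 3               # 59.97
tax = to_money(subtotal * Decimal("0.0825"))  # 4.947525 before rounding
print(tax)                # 4.95
print(to_money("2.675"))  # 2.68
print(subtotal + tax)     # 64.92
```

Keeping intermediate values unrounded and quantizing only the final figures avoids compounding rounding steps.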

Special Values and traps

The decimal module gracefully handles special values like Infinity (Decimal('Inf')), Negative Infinity (Decimal('-Inf')), and Not a Number (Decimal('NaN')). The context also defines how to handle exceptional conditions like division by zero, overflow, or underflow. In the default context, the DivisionByZero, InvalidOperation, and Overflow traps are enabled, so those conditions raise exceptions; other signals only set a flag. Disabling a trap makes the operation return a special value (like NaN or Infinity) instead, so for robust applications it is usually best to leave the traps on.

import decimal
from decimal import Decimal, getcontext

ctx = getcontext()

# The DivisionByZero trap is on in the default context; set it explicitly here
ctx.traps[decimal.DivisionByZero] = True

try:
    Decimal('1') / Decimal('0')
except decimal.DivisionByZero:
    print("Cannot divide by zero!")

# Disable the trap to return Infinity
ctx.traps[decimal.DivisionByZero] = False
result = Decimal('1') / Decimal('0')
print(result)  # Output: Infinity

Best Practices and Common Pitfalls

  1. Always Initialize from Strings: The single most important practice is to initialize Decimal instances from strings or integers to avoid importing float imprecision.
  2. Understand Precision vs. Decimal Places: Remember that prec in the context refers to significant digits, not decimal places. Use quantize() for controlling decimal places.
  3. Set Context Explicitly: Don’t assume the default context. For applications, explicitly set the precision and rounding mode at the start to ensure consistent behavior.
  4. Be Mindful of Performance: Decimal arithmetic is significantly slower than float arithmetic because it is software-based and handles more complex precision rules. Use it where decimal exactness is required, not for high-performance scientific computing.
  5. Use Local Context for Temporary Changes: The localcontext() manager allows you to temporarily change the context for a block of code, which is safer than modifying the global context.

from decimal import Decimal, localcontext, getcontext

with localcontext() as ctx:
    ctx.prec = 5
    ctx.rounding = 'ROUND_DOWN'
    # Calculations in this block use the local context
    result = Decimal('1.23456789') / Decimal('1')
    print(result)  # Output: 1.2345

# Outside the block, the original context is restored
print(getcontext().prec) # Back to default (usually 28)

The fractions Module: Rational Numbers

The fractions module provides support for rational number arithmetic through the Fraction class, allowing precise representation of numbers as numerator/denominator pairs. This is invaluable in domains requiring exact fractional arithmetic, such as financial calculations, symbolic mathematics, and physical measurements, where the inherent floating-point representation errors of float are unacceptable. A Fraction instance automatically reduces itself to its lowest terms, ensuring canonical representation and preventing equivalent fractions from being treated as distinct values.

Creating Fraction Objects

There are several ways to instantiate a Fraction, offering flexibility for different input types. The most common method is by passing two integers representing the numerator and denominator.

from fractions import Fraction

# From numerator and denominator
f1 = Fraction(3, 4)
print(f1)  # Output: 3/4

# From a single number (numerator, denominator becomes 1)
f2 = Fraction(8)
print(f2)  # Output: 8

# From a string representation
f3 = Fraction('22/7')
print(f3)  # Output: 22/7

# From a float (but caution advised, see pitfalls)
f4 = Fraction(0.125)
print(f4)  # Output: 1/8

# From another Fraction
f5 = Fraction(f1)
print(f5)  # Output: 3/4

When a float is provided, the Fraction constructor converts it using its exact float value, which can sometimes lead to unexpected results due to the way floats are stored in binary. The string constructor is often more intuitive for human-inputted values.

Automatic Reduction and Properties

A fundamental characteristic of the Fraction class is that it automatically reduces the fraction to its lowest terms upon creation. This is achieved by finding the greatest common divisor (GCD) of the numerator and denominator using the math.gcd() function. This ensures that Fraction(10, 20) is stored and printed as 1/2, not 10/20.

You can access the individual components of the reduced fraction using the numerator and denominator attributes.

f = Fraction(10, 20)
print(f)              # Output: 1/2
print(f.numerator)   # Output: 1
print(f.denominator) # Output: 2

# The reduction uses the GCD
import math
gcd_val = math.gcd(10, 20) # gcd_val = 10
reduced_num = 10 // gcd_val # 1
reduced_den = 20 // gcd_val # 2

Arithmetic Operations

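Fraction objects seamlessly integrate with Python’s arithmetic operators (+, -, *, /, //, %, **). Operations between two Fraction objects, or between a Fraction and an int, stay exact: +, -, *, /, and % return another Fraction in reduced form, and // returns an int. The exceptions are ** with a non-integer exponent, which falls back to float, and any operation that mixes in a float. Within those limits, this allows for the creation of complex expressions with perfect precision.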

f1 = Fraction(1, 2)
f2 = Fraction(1, 3)

# Addition
result_add = f1 + f2  # (1/2 + 1/3) = 5/6
print(result_add)     # Output: 5/6

# Multiplication
result_mul = f1 * f2  # (1/2 * 1/3) = 1/6
print(result_mul)     # Output: 1/6

# Division
result_div = f1 / f2  # (1/2) / (1/3) = 3/2
print(result_div)     # Output: 3/2

# Mixed operations with integers
result_mixed = f1 * 4 + Fraction(1, 8)
print(result_mixed)   # Output: 17/8
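
The result type depends on the other operand and on the operator. A short sketch of the cases worth remembering:

```python
from fractions import Fraction

half = Fraction(1, 2)

print(half + 1)                 # 3/2  (int operand: still a Fraction)
print(half ** 2)                # 1/4  (integer exponent: still a Fraction)
print(type(half + 0.5))         # <class 'float'>  (float operand: exactness is lost)
print(Fraction(7, 2) // 1)      # 3    (floor division returns an int)
print(Fraction(1, 4) ** Fraction(1, 2))   # 0.5  (fractional exponent: a float)
```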

Comparison and Other Methods

Fractions can be compared using standard comparison operators (<, <=, >, >=, ==, !=). The Fraction class also provides useful methods like limit_denominator(), which finds the closest fraction to the current value with a denominator less than or equal to a specified maximum. This is particularly useful for approximating irrational numbers or floats that can’t be represented exactly.

f_large = Fraction(1, 12345)

# Find a simpler approximation
approx = f_large.limit_denominator(1000)
print(approx)  # Output: 0 (It finds 0/1 as the closest fraction with denom <= 1000)

# More practical use: approximating pi
import math
f_pi = Fraction(math.pi)
approx_pi = f_pi.limit_denominator(1000)
print(approx_pi)  # Output: 355/113 (A famous approximation of π)
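
limit_denominator() is also a convenient way to recover the "intended" rational from a float that was only ever an approximation. A minimal sketch:

```python
from fractions import Fraction

noisy = Fraction(0.1)                   # exact value of the float nearest 0.1
print(noisy.limit_denominator(1000))    # 1/10

# With no argument the maximum denominator defaults to 1_000_000
print(Fraction(1.1).limit_denominator())   # 11/10
```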

Common Pitfalls and Best Practices

The primary pitfall involves constructing a Fraction directly from a float. Because most floats are approximations, the resulting Fraction will represent the exact binary value of that float, not the decimal number the programmer might have intended.

# Pitfall: Constructing from a float
f_bad = Fraction(0.1) # 0.1 cannot be represented exactly as a float
print(f_bad) # Output: 3602879701896397/36028797018963968

# Best Practice: Construct from a string for decimal numbers
f_good = Fraction('0.1')
print(f_good) # Output: 1/10

Best Practice: Always prefer the string constructor (Fraction('3/4')) when the value originates from user input or a known decimal literal. This bypasses the intermediate and often imprecise float representation entirely.

Another consideration is performance. Arithmetic operations with Fraction are significantly slower than with int or float because each operation involves integer arithmetic for both numerator and denominator, followed by a GCD calculation for reduction. For performance-critical numerical code where absolute precision isn’t required, float or Decimal might be more appropriate. The Fraction class is the ideal tool when the problem domain is inherently rational and exactness is paramount.

Complex Numbers: Literals and the cmath Module

Complex numbers in Python are first-class citizens, represented by the complex data type. They are an essential tool for advanced mathematics, engineering, and scientific computing, providing a native way to work with two-dimensional numbers consisting of a real and an imaginary part.

Literal Syntax and the complex() Constructor

The most straightforward way to create a complex number is by using a literal. You append a j or J to a numeric literal to denote the imaginary part. The real part can be omitted if it is zero. This syntax is compact and mirrors common mathematical notation.

# Literal syntax examples
z1 = 3 + 4j      # Real part: 3, Imaginary part: 4
z2 = -2.5 - 1.7J # Real part: -2.5, Imaginary part: -1.7
z3 = 7j          # Real part: 0, Imaginary part: 7
z4 = 5           # This is an integer, not a complex number (real part: 5, imag: 0)
z5 = 5 + 0j      # This is a complex number

print(f"Type of z1: {type(z1)}")  # Output: Type of z1: <class 'complex'>
print(f"Type of z4: {type(z4)}")  # Output: Type of z4: <class 'int'>

Alternatively, you can use the complex(real, imag) constructor function. This is particularly useful when the real and imaginary parts are stored in variables or generated programmatically. The arguments can be any numeric type (int, float, etc.), and both default to 0.0.

# Using the complex constructor
real_part = 10
imag_part = -3
z6 = complex(real_part, imag_part) # Creates 10-3j

# Converting a string representation (no spaces allowed)
z7 = complex("5+2j")  # Creates 5+2j
# z8 = complex("5 + 2j") # This would raise a ValueError due to the space

Accessing Parts and Basic Attributes

A complex number object has two read-only attributes, .real and .imag, which return the real and imaginary components as floats. The .conjugate() method returns the complex conjugate, which is a new complex number with the same real part and a negated imaginary part. This is a fundamental operation in complex arithmetic.

z = 3 - 4j
print(f"Real part: {z.real}")      # Output: 3.0
print(f"Imaginary part: {z.imag}") # Output: -4.0
print(f"Conjugate: {z.conjugate()}") # Output: (3+4j)
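
Two related operations are worth seeing next to the attributes: abs() gives the magnitude, and multiplying by the conjugate gives the squared magnitude. The sketch also shows what "read-only" means in practice.

```python
z = 3 - 4j

# abs() returns the magnitude (distance from the origin) as a float
print(abs(z))                 # 5.0

# A number times its conjugate is the squared magnitude, with zero imaginary part
print(z * z.conjugate())      # (25+0j)

# .real and .imag are read-only: assigning to them raises AttributeError
try:
    z.real = 10
except AttributeError:
    print("complex objects are immutable")
```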

The cmath Module: Advanced Operations

While the built-in complex type handles basic arithmetic, the cmath module provides a comprehensive suite of mathematical functions for complex numbers. It mirrors the standard math module but is designed specifically for the complex plane. Using math functions on complex numbers will raise a TypeError, as most are undefined for complex inputs.

Why a separate module? The algorithms for functions like square roots, logarithms, and trigonometric functions are fundamentally different for complex numbers. For example, the square root of a negative real number is a complex number, an operation the math.sqrt function cannot perform.

import math
import cmath

z = -4
# math.sqrt(z)  # This would raise a ValueError: math domain error

# Correct way: use cmath
complex_sqrt = cmath.sqrt(z)
print(complex_sqrt) # Output: 2j

# Other essential cmath functions
z = 1 + 1j
print(f"Phase (angle in radians): {cmath.phase(z)}") # ~0.7854
print(f"Polar coordinates: {cmath.polar(z)}")        # (~1.4142, ~0.7854)
print(f"Rectangular from polar: {cmath.rect(1.414, 0.785)}") # ~(1+1j)

# Exponential and logarithmic functions
print(f"e^z: {cmath.exp(z)}")
print(f"Natural log: {cmath.log(z)}")
print(f"Base-10 log: {cmath.log10(z)}")

# Trigonometric functions
print(f"Sine: {cmath.sin(z)}")
print(f"Inverse Sine: {cmath.asin(z)}")
print(f"Hyperbolic Sine: {cmath.sinh(z)}")
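
Because the results are floating-point, cmath outputs are best compared with cmath.isclose() rather than ==. The following sketch checks a polar round trip, Euler's identity, and a 90-degree rotation.

```python
import cmath

z = 1 + 1j

# polar() and rect() are inverses of each other, up to rounding
r, phi = cmath.polar(z)
restored = cmath.rect(r, phi)
print(cmath.isclose(restored, z))          # True

# Euler's identity: e^(i*pi) is -1, with a tiny imaginary rounding residue
euler = cmath.exp(1j * cmath.pi)
print(cmath.isclose(euler, -1))            # True

# Multiplying by a unit-magnitude number rotates; here by 90 degrees
rotated = z * cmath.rect(1, cmath.pi / 2)
print(cmath.isclose(rotated, -1 + 1j))     # True
```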

Common Pitfalls and Best Practices

  1. The j Literal Requirement: The imaginary unit must be directly appended to the number without a space. 5j is correct; 5 j is a syntax error. This is a common source of typos for those new to the syntax.
  2. Type Confusion: Be mindful that 5 is an int and 5 + 0j is a complex. Mixed-type arithmetic promotes the result to the wider type (int to float to complex), so a single complex operand makes the whole result complex, which can be surprising if you are not careful.
  3. Prefer cmath over math for Complex Numbers: Always use cmath when you know you are dealing with complex numbers. Attempting to use math functions will result in errors.
  4. Understanding Branch Cuts: Functions in the complex plane, like cmath.sqrt and cmath.log, often have “branch cuts” – curves in the complex plane across which the function is discontinuous. The cmath module implements standard branch cuts as defined in relevant mathematical standards. For most applications, you can use the functions without worrying about this, but it is a critical concept for advanced work to avoid unexpected results.
  5. Floating-Point Precision: Just like the float type, the complex type (which uses two floats internally) is subject to the limitations of floating-point arithmetic, including rounding errors and precision limits. For scenarios requiring exact decimal representation, consider using the Decimal type for the real and imaginary parts and building your own complex number class, though this sacrifices the performance and convenience of the built-in type.
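
The branch-cut behavior can be observed directly on platforms that support signed zeros (which is the usual case). In this sketch the sign of a zero imaginary part decides which side of the negative real axis the input is treated as lying on.

```python
import cmath

above = complex(-4.0, 0.0)    # treated as just above the negative real axis
below = complex(-4.0, -0.0)   # treated as just below it

print(cmath.sqrt(above))   # 2j
print(cmath.sqrt(below))   # -2j

print(cmath.phase(above))  # 3.141592653589793
print(cmath.phase(below))  # -3.141592653589793
```

The complex() constructor is used here because a literal such as -4.0 - 0.0j does not preserve the negative zero.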

Numeric Operators: //, %, **, divmod()

Floor Division (//)

The floor division operator (//) performs division and then applies the floor function to the result, returning the largest integer less than or equal to the result. It is crucial to understand that “floor” means rounding towards negative infinity, not towards zero. This distinction is critical when working with negative numbers.

For positive numbers, // behaves identically to truncating the decimal part.

# With positive numbers
print(10 // 3)  # Output: 3 (since 10/3 ≈ 3.333, floor is 3)
print(5.9 // 2) # Output: 2.0 (5.9/2=2.95, floor is 2.0. Result is float if an operand is float)

With negative numbers, the behavior clarifies why it’s floor division and not truncation. Truncation would round towards zero, but floor always rounds down on the number line.

# With negative numbers: Floor rounds towards -∞
print(-10 // 3)  # Output: -4
# Why? -10/3 ≈ -3.333. The largest integer less than or equal to -3.333 is -4.
# Truncation would give -3, but floor gives -4.

print(10 // -3)  # Output: -4 (10/-3 ≈ -3.333, floor is -4)
print(-10 // -3) # Output: 3 (-10/-3 ≈ 3.333, floor is 3)
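
The difference between flooring and truncating can be put side by side. A minimal sketch using the same operands:

```python
import math

a, b = -10, 3

print(a // b)               # -4: floor, rounds toward negative infinity
print(math.trunc(a / b))    # -3: truncation, rounds toward zero
print(int(a / b))           # -3: int() truncates as well
print(math.floor(a / b))    # -4: matches // for these operands
```

For very large integers, a / b goes through float and can lose precision, so the float-based variants are only suitable for moderate magnitudes.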

Modulo (%)

The modulo operator returns the remainder of a division operation. Its behavior is intrinsically linked to floor division through the fundamental identity: x == (x // y) * y + (x % y). This identity must always hold true, which dictates how % behaves with negative numbers.

For positive operands, it’s straightforward.

print(10 % 3)  # Output: 1 (10 - (3*3) = 1)
print(12.5 % 3.2) # Output: 2.8999999999999995 (mathematically 12.5 - 3.2 * 3 = 2.9, but 3.2 is not exactly representable)

The result always takes the sign of the divisor (or is zero): a positive divisor gives a non-negative remainder even when the dividend is negative, and a negative divisor gives a non-positive remainder. This ensures the core identity remains valid.

# The result has the same sign as the divisor (y)
print(10 % -3)   # Output: -2. Why? To satisfy the identity: (10 // -3) is -4. (-4 * -3) + ? = 10 -> 12 + ? = 10 -> ? must be -2.
print(-10 % 3)   # Output: 2. Identity: (-10 // 3) is -4. (-4 * 3) + ? = -10 -> -12 + ? = -10 -> ? must be 2.
print(-10 % -3)  # Output: -1. Identity: (-10 // -3) is 3. (3 * -3) + ? = -10 -> -9 + ? = -10 -> ? must be -1.
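
The sign rule is what makes % convenient for wrap-around indexing, since a negative offset still lands inside the valid range. The helper name day_after is hypothetical.

```python
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def day_after(index, offset):
    """Return the day `offset` steps from `index`, wrapping in both directions."""
    return days[(index + offset) % len(days)]

print(day_after(0, -1))    # Sun: (0 - 1) % 7 == 6, not -1
print(day_after(5, 3))     # Tue: (5 + 3) % 7 == 1
print(day_after(2, -16))   # Mon: (2 - 16) % 7 == 0
```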

Exponentiation (**)

The exponentiation operator raises the left operand to the power of the right operand. It is right-associative, meaning a ** b ** c is evaluated as a ** (b ** c).

print(2 ** 3)    # Output: 8
print(4 ** 0.5)  # Output: 2.0 (square root)
print(8 ** (1/3)) # Output: 2.0 (cube root, note floating-point approximation)

# Right-sided associativity
result = 2 ** 3 ** 2 # Calculated as 2 ** (3 ** 2) = 2 ** 9 = 512
print(result) # Output: 512
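
Precedence is a related source of surprises: ** binds more tightly than a unary minus on its left. A short sketch:

```python
print(-2 ** 2)        # -4: parsed as -(2 ** 2)
print((-2) ** 2)      # 4
print(2 ** -1)        # 0.5: a negative exponent on ints produces a float
print(2 ** 3 ** 2)    # 512: right-associative, 2 ** (3 ** 2)
print((2 ** 3) ** 2)  # 64
```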

A common pitfall is negative bases with fractional exponents, which often result in a complex number.

# This results in a complex number
print((-9) ** 0.5) # Output: (1.8369701987210297e-16+3j), i.e. 3j plus a tiny rounding residue
# For a clean imaginary result, handle the negative case explicitly (cmath.sqrt does the same job).
import math
def safe_sqrt(x):
    if x >= 0:
        return math.sqrt(x)
    else:
        return complex(0, math.sqrt(-x))

The divmod() Function

The divmod() function is a convenience function that takes two arguments and returns a tuple (quotient, remainder), where quotient is x // y and remainder is x % y. It computes both results in a single call, which is typically cheaper than two separate operations, and for integers the pair always satisfies the identity x == quotient * y + remainder.

# Basic usage
result = divmod(10, 3)
print(result)        # Output: (3, 1)
print(3 * 3 + 1 == 10) # True, verifying the identity

# With negative numbers
result = divmod(-10, 3)
print(result)        # Output: (-4, 2)
print(-4 * 3 + 2)   # Output: -10 (-12 + 2 = -10)

result = divmod(10, -3)
print(result)        # Output: (-4, -2)
print(-4 * -3 - 2)  # Output: 10 (12 - 2 = 10)
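
A typical use of divmod() is splitting a quantity into mixed units. The function name format_duration is an assumption for this sketch.

```python
def format_duration(total_seconds):
    """Render a non-negative number of whole seconds as H:MM:SS."""
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"

print(format_duration(3725))    # 1:02:05
print(format_duration(59))      # 0:00:59
print(format_duration(86400))   # 24:00:00
```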

Best Practices and Pitfalls

  1. Clarity with Negative Numbers: Always be mindful of the floor direction with // and the sign of the result with % when negatives are involved. If your logic requires truncation toward zero, consider using math.trunc() or converting to an int after standard division.

  2. Floating-Point Precision: Remember that float operations are subject to rounding errors. This can lead to surprising results with these operators.

    # Floating-point imprecision
    print(0.1 % 0.03) # Output: 0.010000000000000009 instead of 0.01
    

    For financial or precise decimal calculations, always use the decimal.Decimal type.

    from decimal import Decimal
    a = Decimal('0.1')
    b = Decimal('0.03')
    result = a % b
    print(result) # Output: 0.01
    
  3. Type Coercion: The result type of // and % depends on the input types. If both operands are integers, the result is an integer. If either operand is a float, the result is a float. divmod() follows the same rules, returning a tuple of the type that // and % would return individually.

    print(type(10 // 3))    # <class 'int'>
    print(type(10.0 // 3))  # <class 'float'>
    quotient, remainder = divmod(10.5, 3)
    print(type(quotient))   # <class 'float'>
    print(type(remainder))  # <class 'float'>
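
The difference between flooring and truncating from the first item above can be seen side by side. This is a minimal sketch using only built-ins and the math module; the variable names a and b are illustrative.

```python
import math

a, b = -7, 2

print(a // b)             # Output: -4 (floors toward negative infinity)
print(math.trunc(a / b))  # Output: -3 (truncates toward zero)
print(int(a / b))         # Output: -3 (int() also truncates)

print(a % b)              # Output: 1 (sign follows the divisor)
print(math.fmod(a, b))    # Output: -1.0 (sign follows the dividend)
```

math.fmod() is shown only for contrast: it pairs with truncating division, while % pairs with floor division.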
    

Bit Manipulation: &, |, ^, ~, <<, >>

Bit manipulation involves direct control of the bits that make up integer values. In Python, these operations are primarily performed on integers (int). Understanding them requires a grasp of binary representation. For bitwise operations, Python treats negative integers as if they were stored in two’s complement with an unlimited number of sign bits: a non-negative number has an infinite run of leading 0s and a negative number an infinite run of leading 1s. (CPython actually stores a sign and a magnitude internally, but the operators behave as though the representation were two’s complement.)
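
Because Python integers have no fixed width, bin() cannot print a two’s complement pattern directly. The following sketch shows one way to inspect the low bits of a negative number; the 8-bit mask is an arbitrary choice for illustration.

```python
n = -6

# bin() shows a sign and a magnitude, not a two's complement bit pattern.
print(bin(n))                   # Output: -0b110

# Masking to 8 bits exposes the low 8 bits of the two's complement pattern.
print(format(n & 0xFF, "08b"))  # Output: 11111010

# bit_length() counts the bits needed for the magnitude, ignoring the sign.
print(n.bit_length())           # Output: 3
```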

The Bitwise Operators

Python provides six fundamental bitwise operators. They are true bitwise operators, meaning the operation is performed on each corresponding pair of bits from the two operands.

  • & (Bitwise AND): For each bit position, the result is 1 only if both corresponding bits are 1.

    # 5 is 0101, 3 is 0011
    result = 5 & 3  # 0101 & 0011 = 0001
    print(result)  # Output: 1
    

    A common use case is to check if a specific bit (or flag) is set. For example, to check whether the third bit from the right (bit index 2, counting from 0) is set in a number: if number & (1 << 2):.

  • | (Bitwise OR): For each bit position, the result is 1 if at least one of the corresponding bits is 1.

    # 5 is 0101, 3 is 0011
    result = 5 | 3  # 0101 | 0011 = 0111
    print(result)  # Output: 7
    

    This is often used to set a specific bit. To set the second bit from the right (bit index 1): number = number | (1 << 1).

  • ^ (Bitwise XOR): For each bit position, the result is 1 only if the corresponding bits are different.

    # 5 is 0101, 3 is 0011
    result = 5 ^ 3  # 0101 ^ 0011 = 0110
    print(result)  # Output: 6
    

    A useful property of XOR is that it can be used to toggle bits. number ^ (1 << n) will flip the n-th bit. Another property is that a ^ a == 0 and a ^ 0 == a, which is the basis for some clever algorithms like finding a single unique number in a list.

  • ~ (Bitwise NOT): This is a unary operator that inverts all the bits of its operand. Due to two’s complement representation, the result is -(x + 1).

    # 5 is ...00000101
    result = ~5  # becomes ...11111010, which is -6 in two's complement
    print(result)  # Output: -6
    

    This is a common source of confusion. Programmers expecting ~5 to be 2 (if they assume a fixed 3-bit width) must remember Python integers are conceptually infinite-width, and ~ flips an infinite number of leading zeros to ones, resulting in a negative number.

  • << (Bitwise Left Shift): Shifts the bits of the first operand to the left by the number of positions specified by the second operand. New bits on the right are filled with 0.

    # 5 is 0101
    result = 5 << 1  # becomes 1010
    print(result)  # Output: 10
    

    Shifting left by n is equivalent to multiplying by 2 ** n, so a << n computes a * (2 ** n).

  • >> (Bitwise Right Shift): Shifts the bits of the first operand to the right by the number of positions specified by the second operand. For non-negative numbers, new bits on the left are filled with 0. For negative numbers, the sign bit (1) is preserved, a process known as sign extension or an arithmetic right shift.

    # 10 is 1010
    result_pos = 10 >> 1  # becomes 0101
    print(result_pos)  # Output: 5
    
    # -6 is ...11111010
    result_neg = -6 >> 1  # The sign bit (1) is extended, becomes ...11111101, which is -3
    print(result_neg)  # Output: -3
    

    Shifting right by n is equivalent to floor division by 2 ** n. This holds for negative operands as well: -7 >> 1 is -4, the same as -7 // 2.
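
The flag idioms described in the list above (test, set, toggle) and the XOR cancellation property can be collected into small helpers. This is a sketch: the function names are hypothetical, and clear_bit combines & with ~ in a way the list only implies.

```python
from functools import reduce
from operator import xor

def is_bit_set(number, n):
    """Return True if bit n (0-indexed from the right) is set."""
    return bool(number & (1 << n))

def set_bit(number, n):
    return number | (1 << n)

def clear_bit(number, n):
    # AND with a mask that has every bit set except bit n.
    return number & ~(1 << n)

def toggle_bit(number, n):
    return number ^ (1 << n)

def find_unique(values):
    """Return the one value that appears once when all others appear twice."""
    # Pairs cancel because a ^ a == 0, and a ^ 0 == a leaves the odd one out.
    return reduce(xor, values, 0)

flags = 0b0101
print(is_bit_set(flags, 2))          # Output: True
print(bin(set_bit(flags, 1)))        # Output: 0b111
print(bin(clear_bit(flags, 0)))      # Output: 0b100
print(bin(toggle_bit(flags, 3)))     # Output: 0b1101
print(find_unique([4, 7, 4, 9, 7]))  # Output: 9
```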

Common Pitfalls and Best Practices

  1. Operator Precedence: Bitwise operators have lower precedence than arithmetic operators, which can lead to unexpected results. Unlike in C, they bind more tightly than comparison operators, so 5 & 3 == 1 evaluates as (5 & 3) == 1. Use parentheses to make your intent clear.

    # Common mistake
    value = 1 << 2 + 1  # Evaluates as 1 << (2 + 1) -> 1 << 3 -> 8
    print(value) # Output: 8
    
    # Correct approach
    value = (1 << 2) + 1 # Evaluates as 4 + 1 -> 5
    print(value) # Output: 5
    
  2. Infinite Width and Negative Numbers: Remember that Python integers are not fixed-width. The result of ~0 is not 0xFF (255 in an 8-bit system) but -1. When working with hardware or file formats that expect fixed-width integers (e.g., a 32-bit unsigned integer), you must manually mask the results to simulate the desired bit length.

    # Simulating a 16-bit unsigned integer after a NOT operation
    original = 0x55AA
    simulated_result = ~original & 0xFFFF
    print(hex(simulated_result)) # Output: 0xaa55
    
  3. Applicability to Other Types: Among the numeric types, the bitwise operators are defined for integers only (including bool, which is a subclass of int). Using them on float, complex, Decimal, or Fraction will raise a TypeError. If you need to manipulate the underlying bits of a float, you must use the struct module to interpret its bytes as an integer first.

    import struct
    a_float = 3.14
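    # Pack the float as a 32-bit IEEE 754 single, then unpack those 4 bytes as an unsigned int.
    # Python floats are 64-bit doubles; use '!d' and '!Q' to inspect the full representation.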
    float_bits = struct.unpack('!I', struct.pack('!f', a_float))[0]
    print(f"Bits of {a_float} as hex: {hex(float_bits)}")
    
  4. Clarity vs. Cleverness: While bit manipulation can be extremely efficient, it often sacrifices code readability. Use it when performance is critical (e.g., in tight loops, cryptographic functions, or compression algorithms), but always favor clarity over cleverness unless there’s a proven need. A well-placed comment explaining the bit logic is essential.
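
Masking gives an unsigned view of a fixed-width value. When the target format is signed, the masked result also has to be reinterpreted. The helpers below are a sketch for a 32-bit width; the names to_unsigned32 and to_signed32 are hypothetical.

```python
def to_unsigned32(value):
    """Keep only the low 32 bits, as an unsigned 32-bit register would."""
    return value & 0xFFFFFFFF

def to_signed32(value):
    """Reinterpret the low 32 bits as a two's complement signed integer."""
    value &= 0xFFFFFFFF
    # If bit 31 (the 32-bit sign bit) is set, the value is negative.
    return value - 0x100000000 if value & 0x80000000 else value

print(hex(to_unsigned32(-1)))       # Output: 0xffffffff
print(to_signed32(0xFFFFFFFF))      # Output: -1
print(to_signed32(0x7FFFFFFF + 1))  # Output: -2147483648
```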

The math and statistics Modules

The math and statistics modules are indispensable tools in the Python standard library, designed to provide efficient and well-tested implementations of common mathematical operations and statistical functions. While the built-in functions and operators can handle basic arithmetic, these modules offer a more comprehensive, precise, and often faster suite of tools for scientific, engineering, and data analysis applications. Understanding their capabilities and limitations is crucial for writing robust numerical code.

Constants and Basic Functions

The math module provides fundamental mathematical constants at full double precision. math.pi and math.e are stored as the closest floating-point approximations to these irrational numbers. The module also offers dedicated alternatives to some operator-based idioms. For example, math.sqrt(x) is generally preferred over x ** 0.5 for square roots: it states the intent explicitly, always returns a float, and handles edge cases differently (for a negative argument it raises a ValueError instead of producing a complex number).

import math

# Using mathematical constants
radius = 5
circle_area = math.pi * radius ** 2
print(f"Area of circle: {circle_area}")  # Output: Area of circle: 78.53981633974483

# Comparing square root methods
x = 2
sqrt_math = math.sqrt(x)
sqrt_power = x ** 0.5
print(sqrt_math, sqrt_power)  # Output: 1.4142135623730951 1.4142135623730951
# They are functionally identical for positive numbers, but math.sqrt is more explicit.
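
The two approaches diverge for negative input. The sketch below contrasts them and adds cmath.sqrt from the standard library as the explicit option when a complex result is actually wanted; the variable names are illustrative.

```python
import cmath
import math

x = -4

# math.sqrt only works on the real line and rejects negative input.
try:
    real_root = math.sqrt(x)
except ValueError:
    real_root = None
print(real_root)       # Output: None

# The power operator silently switches to a complex result.
print(type(x ** 0.5))  # Output: <class 'complex'>

# cmath.sqrt is the explicit choice when complex results are wanted.
print(cmath.sqrt(x))   # Output: 2j
```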

Exponentiation, Logarithms, and Factorials

This category includes functions for exponential growth, decay, and multiplicative processes. math.exp(x) computes e**x, which is critical in many natural processes and financial calculations. The logarithm functions are equally important: math.log(x[, base]) computes the logarithm of x to the given base (natural logarithm if base is omitted). math.log10(x) is a dedicated base-10 logarithm that is usually more accurate than math.log(x, 10). math.factorial(n) is essential for combinatorial calculations but should be used with caution for large n: the result grows so quickly that computing, storing, and printing it becomes expensive. The result is an exact integer, so it never overflows.

import math

# Exponential and logarithmic functions
print(math.exp(1))  # ~2.718, e^1
print(math.log(100, 10))  # 2.0, log10(100)
print(math.log10(1000))   # 3.0

# Factorial
n = 5
arrangements = math.factorial(n)  # Number of ways to arrange 5 distinct items
print(arrangements)  # Output: 120
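
As an illustration of the combinatorial use mentioned above, the following sketch builds a binomial coefficient from factorials and compares it with math.comb, which is available from Python 3.8. The helper name n_choose_k is hypothetical. The last two lines contrast the two base-10 logarithm spellings.

```python
import math

def n_choose_k(n, k):
    """Binomial coefficient from the definition n! / (k! * (n - k)!)."""
    # Integer division is exact here because the denominator always divides n!.
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

print(n_choose_k(5, 2))    # Output: 10
print(math.comb(5, 2))     # Output: 10

print(math.log10(1000))    # Output: 3.0
print(math.log(1000, 10))  # May differ from 3.0 in the last digit
```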

Trigonometric and Angular Conversion Functions

The math module provides a full suite of trigonometric functions (sin, cos, tan, asin, acos, atan, etc.) that operate in radians. This is a critical detail and a common pitfall for those accustomed to working in degrees. The module provides math.radians(deg) and math.degrees(rad) for conversion. For the inverse tangent, math.atan2(y, x) is recommended over math.atan(y/x) because it uses the signs of both x and y to pick the correct quadrant. It also avoids a division by zero when x is 0 and returns an angle in the range [-pi, pi].

import math

# Working in degrees? Convert to radians first.
angle_deg = 60
angle_rad = math.radians(angle_deg)
sin_val = math.sin(angle_rad)
print(f"sin(60°) = {sin_val:.2f}")  # Output: sin(60°) = 0.87

# Why atan2 is superior to atan
x, y = -1, -1
angle_atan = math.degrees(math.atan(y / x))  # Problem: y/x is positive, result is 45°
angle_atan2 = math.degrees(math.atan2(y, x)) # Correctly returns -135°
print(angle_atan, angle_atan2)  # Output: 45.0 -135.0
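
The division-by-zero case deserves its own illustration. In this sketch the point lies on the positive y-axis, so y / x cannot be evaluated at all, while atan2 handles it directly. The variable names are illustrative.

```python
import math

x, y = 0.0, 1.0  # a point straight up the y-axis

try:
    angle = math.degrees(math.atan(y / x))
except ZeroDivisionError:
    angle = None  # atan(y / x) cannot even be evaluated here

print(angle)                           # Output: None
print(math.degrees(math.atan2(y, x)))  # Output: 90.0
```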

Working with Special Values (inf, nan)

Floating-point arithmetic can produce special values like infinity (inf) and “Not a Number” (nan). The math module provides safe and portable ways to check for these values. math.isinf(x) returns True if x is positive or negative infinity, and math.isnan(x) returns True if x is a nan value. Using x == float('nan') to check for nan is unreliable because a nan value, by definition, is not equal to itself.

import math

# Generating special values
result = 1e1000  # This will exceed the float max, resulting in inf
negative_inf = -math.inf
not_a_number = float('nan')

# Checking for special values correctly
print(math.isinf(result))        # True
print(math.isinf(negative_inf))  # True
print(math.isnan(not_a_number))  # True

# Incorrect check for nan (this will not work)
if not_a_number == float('nan'):
    print("This will never print.")
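
The self-inequality of nan can be observed directly, and math.isfinite offers a single check that rejects nan and both infinities. The sketch below uses it to filter a list; the names readings and clean are illustrative.

```python
import math

nan = float("nan")

print(nan == nan)          # Output: False
print(nan != nan)          # Output: True
print(math.isfinite(nan))  # Output: False (also False for inf and -inf)

readings = [1.5, float("nan"), 2.5, math.inf]
clean = [r for r in readings if math.isfinite(r)]
print(clean)               # Output: [1.5, 2.5]
```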

Introduction to the statistics Module

While the math module focuses on deterministic functions, the statistics module is designed for calculating descriptive statistics from data, which involves dealing with sample sets and the inherent uncertainty they represent. It provides functions to compute measures of central tendency (mean, median, mode) and measures of spread (variance, standard deviation). A key best practice is to be aware of the difference between the population variance (pvariance) and the sample variance (variance). The population variance divides by N (the total number of data points), while the sample variance divides by N-1 (Bessel’s correction) to provide an unbiased estimator of the population variance from a sample.

import statistics

data = [1, 2, 2, 3, 4, 5, 5, 5, 6]

# Measures of central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
print(f"Mean: {mean}, Median: {median}, Mode: {mode}")
# Output: Mean: 3.666..., Median: 4, Mode: 5

# Measures of spread (using sample variance/STDEV by default)
variance = statistics.variance(data)  # Uses N-1
stdev = statistics.stdev(data)        # Uses N-1
print(f"Sample Variance: {variance:.2f}, Sample STDEV: {stdev:.2f}")
# Output: Sample Variance: 3.00, Sample STDEV: 1.73

# For the entire population, use:
pop_variance = statistics.pvariance(data) # Uses N
pop_stdev = statistics.pstdev(data)      # Uses N
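
To make the N versus N-1 distinction concrete, the following sketch recomputes both variances by hand from the sum of squared deviations and prints them next to the library results. It assumes the same data list as above; the name ss is illustrative.

```python
import statistics

data = [1, 2, 2, 3, 4, 5, 5, 5, 6]
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean.
ss = sum((x - mean) ** 2 for x in data)

# Population variance divides by N.
print(round(ss / n, 4), round(statistics.pvariance(data), 4))       # Output: 2.6667 2.6667
# Sample variance divides by N - 1 (Bessel's correction).
print(round(ss / (n - 1), 4), round(statistics.variance(data), 4))  # Output: 3.0 3.0
```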

Best Practices and Pitfalls

  1. Radians, Not Degrees: Always convert angles from degrees to radians before using trigonometric functions in the math module. This is the most common mistake.
  2. Precision of Constants: Remember that math.pi and math.e are finite floating-point approximations. They are extremely precise but not exact.
  3. Handling nan and inf: Use math.isnan() to check for nan, because any equality comparison involving nan is False. Use math.isinf() to check for either infinity in one call.
  4. Integer vs. Float Input: Some math functions, such as math.factorial, require integer arguments. In Python 3.10 and later, passing a float raises a TypeError; passing a negative integer raises a ValueError.
  5. Choosing the Right Variance: In the statistics module, consciously choose between variance/stdev (for a sample of a larger population) and pvariance/pstdev (for the entire population). Using the wrong one is a conceptual error in statistical analysis.
  6. Performance: For large datasets, third-party libraries like NumPy and SciPy are vastly more performant than the statistics module, which is designed for simplicity and correctness on smaller datasets rather than raw speed.