8.3 String Operators: Concatenation, Repetition, and in
The Concatenation Operator (+)
The + operator, when used with string operands, performs concatenation. This is the process of joining two or more strings end-to-end to create a new, combined string. It is one of the most fundamental and frequently used operations in string manipulation.
Crucially, the + operator creates a new string object in memory. This is because strings in Python are immutable; their contents cannot be changed after creation. Therefore, any operation that appears to modify a string is, in fact, creating an entirely new one. This has important performance implications, especially within loops.
greeting = "Hello, "
name = "Alice"
full_greeting = greeting + name
print(full_greeting) # Output: Hello, Alice
# Concatenating multiple strings in one expression
sentence = "The answer is " + str(42) + "."
print(sentence) # Output: The answer is 42.
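To make the immutability point concrete, the short sketch below shows that concatenation leaves its operands untouched and binds the result to a new object. The variable names are illustrative only.

```python
# Illustrative names only: a sketch of how + leaves its operands unchanged.
original = "Hello"
alias = original                  # Both names refer to the same string object

original = original + ", world"   # Builds a new string and rebinds 'original'

print(alias)              # Output: Hello
print(original)           # Output: Hello, world
print(alias is original)  # Output: False
```

The string that alias refers to was never modified; only the name original was pointed at a newly created object.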
A common pitfall arises when attempting to concatenate a string with a non-string value. The + operator is overloaded: it means addition for numbers and concatenation for strings. Python does not convert operands implicitly, so mixing a string with an integer raises a TypeError rather than guessing which behavior you intended.
age = 30
# This would raise a TypeError if uncommented:
# message = "I am " + age + " years old." # TypeError: can only concatenate str (not "int") to str
# The correct approach is to convert the non-string to a string first
correct_message = "I am " + str(age) + " years old."
print(correct_message) # Output: I am 30 years old.
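As a hedged illustration of the pitfall above, the following sketch catches the TypeError instead of letting it stop the program, and then shows one common alternative to explicit str() calls: an f-string, which converts the embedded value as part of formatting. The variable names are chosen for illustration.

```python
age = 30

# Attempt the mixed-type concatenation and catch the resulting error
try:
    message = "I am " + age + " years old."
except TypeError as exc:
    message = None
    print("Concatenation failed:", exc)

# An f-string formats the value for you, so no explicit str() call is needed
formatted = f"I am {age} years old."
print(formatted)  # Output: I am 30 years old.
```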
For joining a large number of strings, using the + operator in a loop can be very inefficient because strings are immutable. Each concatenation may copy everything accumulated so far into a new string, which leads to quadratic time complexity in the worst case. (CPython can sometimes optimize this pattern, but you should not rely on it.) The best practice is to collect the pieces and use the str.join() method, which builds the result in a single pass.
Inefficient (in a loop):
result = ""
for num in range(1000):
    result += str(num) # Creates a new string each time!
Efficient (best practice):
parts = [] # A list to collect the string parts
for num in range(1000):
    parts.append(str(num))
result = "".join(parts) # Single, efficient concatenation operation
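The sketch below checks that the two approaches produce the same string and shows that join() can take a generator expression directly, which avoids the explicit list. It also shows join() with a separator. The names result_loop, result_join, and csv_line are illustrative.

```python
# Build the string with repeated += (the approach to avoid for many pieces)
result_loop = ""
for num in range(1000):
    result_loop += str(num)

# Build the same string with a single join() over a generator expression
result_join = "".join(str(num) for num in range(1000))
print(result_join == result_loop)  # Output: True

# The string that join() is called on is placed between the pieces
csv_line = ",".join(str(num) for num in range(5))
print(csv_line)  # Output: 0,1,2,3,4
```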
The Repetition Operator (*)
The * operator, when used with a string and an integer, performs repetition. It creates a new string consisting of the original string repeated the specified number of times. The string is conventionally written on the left and the integer on the right, although Python accepts either order. The integer operand is often referred to as the repetition count.
cheer = "Hip hip hooray! "
triple_cheer = cheer * 3
print(triple_cheer) # Output: Hip hip hooray! Hip hip hooray! Hip hip hooray!
# Creating a visual separator
separator = "-" * 40
print(separator) # Output: ----------------------------------------
The repetition count must be an integer. A count of zero produces an empty string, and a negative count is treated as zero, so it also produces an empty string instead of raising an error. Using a float, even one with a whole-number value such as 3.0, raises a TypeError.
empty_string = "hello" * 0 # Value: '' (empty string)
also_empty = "hello" * -5 # Value: '' (empty string)
# This will cause a TypeError
# error = "test" * 3.5
Like concatenation, repetition creates a new string object due to immutability. In CPython it is implemented in C, so it is much faster than building the same string with a loop. The time and memory it needs still grow with the length of the result, so very large repetition counts are not free.
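The following sketch gathers a few properties of repetition in one place: Python also accepts the integer on the left, the length of the result is the original length times the count, and a float count is rejected. The variable names are illustrative.

```python
word = "ab"

print(word * 3)  # Output: ababab
print(3 * word)  # Output: ababab

# The result length is the original length multiplied by the count
print(len(word * 1000) == len(word) * 1000)  # Output: True

# A float count is rejected even when it holds a whole number
float_rejected = False
try:
    word * 3.0
except TypeError as exc:
    float_rejected = True
    print("Repetition failed:", exc)
```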
The Membership Operators (in and not in)
The in and not in operators test for membership within a string. They return a boolean value (True or False) indicating whether a given substring exists anywhere within the target string. This operation is also known as a substring search.
quote = "The quick brown fox jumps over the lazy dog."
# Checking for the presence of a substring
print("fox" in quote) # Output: True
print("cat" in quote) # Output: False
print("lazy" not in quote) # Output: False
# Case sensitivity is important
print("The" in quote) # Output: True
print("the" in quote) # Output: True (there is a lowercase 'the')
print("THE" in quote) # Output: False
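When case should not matter, a common approach is to normalize both sides before testing membership. The sketch below uses str.casefold() for this; contains_ignore_case is a hypothetical helper name, not a built-in.

```python
def contains_ignore_case(needle: str, haystack: str) -> bool:
    """Hypothetical helper: case-insensitive substring test."""
    return needle.casefold() in haystack.casefold()

quote = "The quick brown fox jumps over the lazy dog."

print("THE" in quote)                               # Output: False
print(contains_ignore_case("THE", quote))           # Output: True
print(contains_ignore_case("QUICK brown", quote))   # Output: True
print(contains_ignore_case("cat", quote))           # Output: False
```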
In CPython, the in operator is backed by an optimized substring search written in C, which borrows ideas from the Boyer-Moore family of algorithms. It is typically much faster than a naive character-by-character search you might write yourself in Python. It is the recommended and idiomatic way to check for a substring.
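For comparison, here is a minimal sketch of the kind of naive search the paragraph above refers to. The function naive_contains is hypothetical and exists only to show what in computes; it is not how Python implements the operator.

```python
def naive_contains(needle: str, haystack: str) -> bool:
    """Hypothetical helper: try every start position in turn."""
    n, h = len(needle), len(haystack)
    for start in range(h - n + 1):
        # Compare the slice at this position with the search term
        if haystack[start:start + n] == needle:
            return True
    return False

quote = "The quick brown fox jumps over the lazy dog."
for term in ["fox", "cat", "the", "THE", ""]:
    # The naive version agrees with the built-in operator on every term
    print(repr(term), naive_contains(term, quote) == (term in quote))
```

Each line printed by the loop ends in True, which shows the two approaches agree; the built-in operator simply reaches the same answer with far less work per character.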
A common point of confusion arises when checking for the existence of an empty string ''. An empty string is technically considered a substring of every string, so this check will always return True.
# The empty string is always a substring
print('' in "anything") # Output: True
print('' in "") # Output: True
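This behavior matters when the search term comes from user input, because an empty term would match everything. One possible guard, sketched below with a hypothetical helper named contains_nonempty, is to treat an empty term as no match.

```python
def contains_nonempty(needle: str, haystack: str) -> bool:
    """Hypothetical helper: an empty search term never matches."""
    return bool(needle) and needle in haystack

print(contains_nonempty("", "anything"))     # Output: False
print(contains_nonempty("any", "anything"))  # Output: True
print(contains_nonempty("xyz", "anything"))  # Output: False
```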
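This operator is invaluable for conditional checks, filtering data, and validating input. For example, you can check whether a file name contains a certain extension or whether user input contains a forbidden word. Keep in mind that in matches the substring anywhere in the string, not only at the end or on word boundaries.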
filename = "report.pdf"
if ".pdf" in filename:
    print("This is a PDF file.")
user_comment = "This is a great product!"
if "spam" not in user_comment.lower():
    print("Comment accepted.")
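Because in finds a substring at any position, the checks above can match more than intended. The sketch below, using made-up sample values, contrasts in with str.endswith() for extensions and with a whole-word test for forbidden words. Note that in on a list tests for an equal element instead of a substring.

```python
# Sample values are made up for illustration
backup_name = "report.pdf.bak"
print(".pdf" in backup_name)          # Output: True (matches in the middle)
print(backup_name.endswith(".pdf"))   # Output: False

comment = "I met a spammer yesterday."
print("spam" in comment.lower())      # Output: True (matches inside 'spammer')

# Split into words so that only a whole-word match counts
words = comment.lower().replace(".", " ").split()
print("spam" in words)                # Output: False
```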