Sets are unordered, mutable collections of unique, hashable objects. They are used mainly for membership testing, removing duplicates from sequences, and mathematical set operations such as union, intersection, difference, and symmetric difference. Python offers two primary ways to create a set: a set literal and the set() constructor. Knowing how the two differ, and when each is appropriate, is fundamental.
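
As a quick illustration of those three uses, here is a minimal sketch. The variable names and sample values are hypothetical and chosen only for demonstration.

```python
# Hypothetical sample data, chosen only for illustration.
evens = {2, 4, 6, 8}
primes = {2, 3, 5, 7}

# Membership testing
four_is_even = 4 in evens      # True
four_is_prime = 4 in primes    # False

# Removing duplicates from a sequence (the original order is not preserved)
unique_readings = set([3, 1, 3, 2, 1])   # {1, 2, 3}

# Mathematical set operations
union = evens | primes                   # {2, 3, 4, 5, 6, 7, 8}
intersection = evens & primes            # {2}
difference = evens - primes              # {4, 6, 8}
symmetric_difference = evens ^ primes    # {3, 4, 5, 6, 7, 8}
```

Each operator also has a method form (for example, evens.union(primes)), which accepts any iterable rather than only another set.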

Using Set Literals

The most common and Pythonic way to create a set with known elements is to use a set literal. A set literal is defined by enclosing a comma-separated sequence of elements within curly braces {}.

# Creating a set of integers
prime_numbers = {2, 3, 5, 7, 11, 13}
print(prime_numbers)  # Output: {2, 3, 5, 7, 11, 13} (order not guaranteed)

# Creating a set of strings
vowels = {'a', 'e', 'i', 'o', 'u'}
print(vowels)  # Output: {'i', 'u', 'e', 'a', 'o'} (order not guaranteed)

# Creating a set with mixed, hashable types
mixed_set = {42, 'hello', 3.14, (1, 2, 3)}
print(mixed_set)

A crucial behavior of sets is the automatic removal of duplicate values. This happens because a set, by its mathematical definition, cannot contain the same element more than once. When a literal contains duplicates, the set creation process silently deduplicates the data.

# Duplicates are automatically removed
duplicate_data = {1, 2, 2, 3, 3, 3, 4, 4, 4, 4}
print(duplicate_data)  # Output: {1, 2, 3, 4}

Why it works this way: The interpreter evaluates the elements inside the curly braces from left to right and adds each one to the set. Each element’s hash value determines where it is placed in the hash table that backs the set. If an incoming element has the same hash as an element already in the table and the two compare as equal, the incoming element is discarded and the existing one is kept, so the result is a collection of unique items.
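
The role of hashing and equality can be seen in a small sketch. The Tag class below is hypothetical and exists only for this illustration; the comment about which object is kept describes current CPython behavior, not something every implementation is certain to share.

```python
# 1, 1.0 and True all hash to the same value and compare equal,
# so the set treats them as a single element.
numbers = {1, 1.0, True}
print(len(numbers))  # Output: 1


class Tag:
    """Hypothetical class whose equality and hash ignore letter case."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Tag) and self.name.lower() == other.name.lower()

    def __hash__(self):
        # Equal objects must produce equal hashes, so hash the same
        # normalized value that __eq__ compares.
        return hash(self.name.lower())


first = Tag("Python")
second = Tag("PYTHON")
tags = {first, second}

print(len(tags))  # Output: 1
# In current CPython the element added first is the one that stays.
kept = next(iter(tags))
print(kept is first)
```

Defining __eq__ without a matching __hash__ makes instances unhashable, which is why the class defines both.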

Using the set() Constructor

The set() constructor is a built-in function that creates a set from an iterable object. This is the primary method for creating a set from an existing data structure like a list, tuple, or string, or for creating an empty set.

# Create a set from a list (common use case for removing duplicates)
fruit_list = ['apple', 'banana', 'apple', 'orange', 'banana']
fruit_set = set(fruit_list)
print(fruit_set)  # Output: {'orange', 'banana', 'apple'} (order not guaranteed)

# Create a set from a tuple
coordinates = (1, 1, 2, 2, 3)
coordinate_set = set(coordinates)
print(coordinate_set)  # Output: {1, 2, 3}

# Create a set from a string (each character becomes an element)
hello_chars = set('hello')
print(hello_chars)  # Output: {'h', 'e', 'l', 'o'} (order not guaranteed)

Creating an empty set is another major reason to use the set() constructor. Empty curly braces {} do not create a set; they create an empty dictionary, which makes this a common pitfall.

# Correct way to create an empty set
empty_set = set()
print(type(empty_set))  # Output: <class 'set'>
print(empty_set)        # Output: set()

# INCORRECT: This creates a dictionary, not a set.
not_a_set = {}
print(type(not_a_set))  # Output: <class 'dict'>

Why it works this way: The set() constructor is designed to accept a single, optional iterable argument. It iterates over the provided iterable, adds each yielded element to the new set (applying the same deduplication logic as the literal), and returns the result. The syntax {} for an empty dictionary predates the existence of the set type in Python, so for backward compatibility, it was left to mean an empty dict.
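
The single-iterable rule can be sketched as follows. The variable names are illustrative only, and the exact wording of the error messages may vary between Python versions.

```python
# No argument: an empty set
empty = set()

# Any iterable is accepted: generators, ranges, and dicts (which yield keys)
squares = set(n * n for n in range(-3, 4))   # {0, 1, 4, 9}
digits = set(range(3))                       # {0, 1, 2}
keys = set({'a': 1, 'b': 2})                 # {'a', 'b'}

# A string is iterable, so the constructor splits it into characters,
# while a literal keeps the string as one element.
chars = set('hello')     # {'h', 'e', 'l', 'o'}
single = {'hello'}       # {'hello'}

# A non-iterable argument raises TypeError
try:
    set(5)
    non_iterable_rejected = False
except TypeError:
    non_iterable_rejected = True

# More than one positional argument also raises TypeError
try:
    set(1, 2, 3)
    extra_args_rejected = False
except TypeError:
    extra_args_rejected = True

print(non_iterable_rejected, extra_args_rejected)  # Output: True True
```

To build a set from several separate values, either use a literal such as {1, 2, 3} or wrap the values in an iterable, as in set([1, 2, 3]).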

Key Differences and Best Practices

  1. Literals for Known Elements, Constructor for Conversions: Use the literal syntax {elem1, elem2} when you are directly specifying all the elements in code. Use the set(iterable) constructor when you need to convert another data structure into a set or create an empty set.
  2. Hashable Requirement: Both methods require all elements to be hashable. This means you cannot have mutable objects like lists, sets, or dictionaries as elements within a set. Attempting to do so will raise a TypeError.
    # This will cause a TypeError: unhashable type: 'list'
    invalid_set = {[1, 2, 3], 4}
    
  3. Performance: For creating a set with pre-defined elements, the literal is generally faster than creating a tuple or list and then passing it to set(), as it avoids the intermediate data structure.
  4. Frozensets are Hashable: While a regular set is mutable and unhashable, its immutable counterpart, frozenset, is hashable. This means you can have a frozenset as an element inside another set or use it as a dictionary key.
    # A set containing frozensets
    set_of_frozensets = {frozenset([1, 2, 3]), frozenset([4, 5, 6])}
    print(set_of_frozensets)  # Output: {frozenset({1, 2, 3}), frozenset({4, 5, 6})} (order not guaranteed)
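
The hashability, performance, and frozenset points above can be explored with a rough sketch like the one below. The data and names (rows, team_scores) are hypothetical, and the timing figures depend on the machine and Python version, so treat them as an informal comparison, not a benchmark.

```python
import timeit

# Hashable requirement: lists cannot be set elements, but tuples can.
rows = [[1, 2], [3, 4], [1, 2]]
try:
    set(rows)
    list_elements_rejected = False
except TypeError:
    list_elements_rejected = True

# Converting each inner list to a tuple makes it hashable.
unique_rows = set(tuple(row) for row in rows)   # {(1, 2), (3, 4)}

# Frozensets as dictionary keys: element order does not affect lookup.
team_scores = {frozenset({'ana', 'ben'}): 3}
score = team_scores[frozenset({'ben', 'ana'})]  # 3

# Performance: time a literal against set() applied to a list.
literal_time = timeit.timeit("{1, 2, 3, 4, 5}", number=200_000)
constructor_time = timeit.timeit("set([1, 2, 3, 4, 5])", number=200_000)
print(f"literal: {literal_time:.4f}s  set(list): {constructor_time:.4f}s")
```

The tuple conversion only works when the inner values are themselves hashable; nested lists would need to be converted at every level.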