Skip to content

Set Operations Complexity

The set type is an unordered collection of unique items. It's implemented as a hash table similar to dictionaries in CPython.

Time Complexity

Operation Time Notes
len() O(1) Direct count
add(x) O(1) avg, O(n) worst Amortized; hash collisions cause O(n)
remove(x) O(1) avg, O(n) worst Hash lookup + delete
discard(x) O(1) avg, O(n) worst Hash lookup + delete
pop() O(1) avg Remove arbitrary element
clear() O(n) Deallocate all
x in set O(1) avg, O(n) worst Hash lookup; collisions cause O(n)
copy() O(n) Shallow copy
union(other) O(n+m) n, m = set lengths
intersection(other) O(min(n,m)) Iterate smaller set
difference(other) O(n) n = set length
symmetric_diff(other) O(n+m) Combined set ops
issubset() O(n) Check all elements
issuperset() O(m) m = other length
isdisjoint() O(min(n,m)) Early termination

Space Complexity

Operation Space
add() O(1) amortized, O(n) worst
copy() O(n)
union() O(n+m) for result
intersection() O(min(n,m)) for result
difference() O(n) for result

Implementation Details

Hash Table Implementation

Sets use the same hash table design as dictionaries, but:

  • Only stores keys (no values)
  • More memory efficient than dict
  • Same O(1) average case lookup

Set Operations

# Union: combines both sets
{1, 2} | {2, 3}  # {1, 2, 3} - O(len(s1) + len(s2))

# Intersection: common elements
{1, 2, 3} & {2, 3, 4}  # {2, 3} - O(min(len(s1), len(s2)))

# Difference: elements in first but not second
{1, 2, 3} - {2, 4}  # {1, 3} - O(len(s1))

# Symmetric difference: elements in either but not both
{1, 2} ^ {2, 3}  # {1, 3} - O(len(s1) + len(s2))

Membership Testing

# Very fast - O(1) hash lookup
s = {1, 2, 3, 4, 5}
if 3 in s:  # O(1), not O(n)
    pass

Comparison with Lists

# List membership: O(n) - must scan entire list
numbers_list = [1, 2, 3, 4, 5]
3 in numbers_list  # O(n)

# Set membership: O(1) - hash lookup
numbers_set = {1, 2, 3, 4, 5}
3 in numbers_set  # O(1) - much faster for large collections!

Version Notes

  • All Python 3 versions: Core complexity unchanged
  • Python 3.9+: New set union/intersection operators

Implementation Comparison

CPython

Standard hash table implementation.

PyPy

JIT compilation may provide additional optimization.

Jython

Underlying Java HashSet, same O(1) characteristics.

Best Practices

Do:

  • Use sets for membership testing with large collections
  • Use set operations (|, &, -, ^) for combining sets
  • Use sets to remove duplicates: set(list_with_dups)
  • Use frozenset for hashable unique items

Avoid:

  • Using lists for frequent membership checks
  • Relying on set order (not guaranteed)
  • Unhashable types (lists, dicts) in sets

Common Patterns

Remove Duplicates

# Bad: preserves list, but O(n²)
unique = []
for item in items:
    if item not in unique:
        unique.append(item)

# Good: O(n), but loses order
unique = list(set(items))

# Best: O(n) and preserves order (Python 3.7+)
unique = list(dict.fromkeys(items))

Fast Filtering

# Bad: O(n*m) - checks membership in list for each element
large_list = list(range(1000000))
exclusions = [1, 2, 3, ...]
filtered = [x for x in large_list if x not in exclusions]

# Good: O(n) - fast set lookup
exclusions_set = set(exclusions)
filtered = [x for x in large_list if x not in exclusions_set]