set() Function Complexity¶

The set() function creates sets from iterables or creates empty sets.

Complexity Analysis¶

Case	Time	Space	Notes
Empty set	O(1)	O(1)	set()
From iterable	O(n) avg, O(n²) worst	O(n)	n = iterable length; worst case with hash collisions
From string	O(n)	O(n)	n = string length
Duplicate removal	O(n)	O(n)	Automatic deduplication
Shallow copy	O(n)	O(n)	n = set size

Basic Usage¶

Create Empty Set¶

# O(1)
s = set()  # set()

From List¶

# O(n) - where n = list length
s = set([1, 2, 3])           # {1, 2, 3}
s = set([1, 2, 2, 3, 3, 3])  # {1, 2, 3} - duplicates removed

From String¶

# O(n) - where n = string length
s = set("hello")      # {'h', 'e', 'l', 'o'}
s = set("aabbcc")     # {'a', 'b', 'c'} - each char once

From Other Iterables¶

# O(n) - where n = iterable length
s = set((1, 2, 3))          # {1, 2, 3} (from tuple)
s = set({1, 2, 3})          # {1, 2, 3} (from set - copy)
s = set(range(5))           # {0, 1, 2, 3, 4}
s = set(map(str, [1, 2]))   # {'1', '2'} (from generator)

From Dictionary¶

# O(n) - uses dict keys
d = {"a": 1, "b": 2, "c": 3}
s = set(d)  # {'a', 'b', 'c'} - only keys, not values

Complexity Details¶

Hashing and Insertion¶

# O(n) - hash each item and insert
# Each item:
# 1. Hash - O(1) average
# 2. Check for duplicates - O(1) average
# 3. Insert - O(1) average
# Total: O(n)

items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
s = set(items)  # O(10)

Duplicate Removal¶

# O(n) - automatic deduplication
items = [1, 1, 2, 2, 3, 3]
s = set(items)  # O(6) - creates {1, 2, 3}

# Useful for removing duplicates while preserving unordered
unique = set("mississippi")  # {'m', 'i', 's', 'p'}

From String¶

# O(n) - iterate through string
short = set("ab")     # O(2) - {'a', 'b'}
long = set("a" * 1000)  # O(1000) - {'a'} (one item!)

Common Patterns¶

Remove Duplicates¶

# O(n) - fastest way to remove duplicates
items = [1, 2, 2, 3, 3, 3, 4]
unique = set(items)  # O(7)
# {1, 2, 3, 4}

# Back to list if needed
unique_list = list(set(items))  # O(n)

Find Unique Items¶

# O(n) - check membership in set
items = [1, 2, 2, 3, 3, 3]
unique_count = len(set(items))  # O(n) - 3 unique items

Set Operations¶

# O(n) - after creating sets
s1 = set([1, 2, 3])
s2 = set([2, 3, 4])

intersection = s1 & s2      # {2, 3} - O(n)
union = s1 | s2            # {1, 2, 3, 4} - O(n)
difference = s1 - s2       # {1} - O(n)
symmetric_diff = s1 ^ s2   # {1, 4} - O(n)

Membership Testing¶

# O(1) - fast lookup
s = set([1, 2, 3, 4, 5])

if 3 in s:  # O(1)
    print("Found")

# vs list - O(n)
lst = [1, 2, 3, 4, 5]
if 3 in lst:  # O(n) - must scan
    print("Found")

Performance Patterns¶

vs List Creation¶

# O(n) - set creation
s = set([1, 2, 3, 4, 5])

# O(n) - list creation
lst = [1, 2, 3, 4, 5]

# Similar complexity, but set has better lookup: O(1) vs O(n)

Batch Operations¶

# O(n * m) - n items, m = avg hash time (usually O(1))
items = range(1000)
s = set(items)  # O(n)

# Then operations are fast
if 500 in s:    # O(1)
    pass

Practical Examples¶

Find Unique Words¶

# O(n) - get unique words from text
text = "the quick brown fox jumps over the lazy dog"
words = text.split()
unique_words = set(words)  # O(n)
# {'the', 'quick', 'brown', 'fox', ...}

Common Elements¶

# O(n) - find common items
list1 = [1, 2, 3, 4, 5]
list2 = [4, 5, 6, 7, 8]

set1 = set(list1)  # O(n)
set2 = set(list2)  # O(m)

common = set1 & set2  # {4, 5} - O(min(n,m))

Unique Count¶

# O(n) - count unique items
data = [1, 1, 2, 2, 2, 3, 3, 3, 3, 4]
unique_count = len(set(data))  # O(n) - 4 unique

# Useful for analytics
log_entries = ["error", "error", "warning", "error"]
unique_types = set(log_entries)  # O(n)

Validate Uniqueness¶

# O(n) - check if all unique
def has_duplicates(items):
    return len(items) != len(set(items))  # O(n)

assert not has_duplicates([1, 2, 3])     # True - unique
assert has_duplicates([1, 2, 2])         # False - duplicates

Edge Cases¶

Empty Set¶

# O(1)
s = set()       # set()
s = set([])     # set()
s = set("")     # set()

Single Item¶

# O(1)
s = set([1])    # {1}
s = set("a")    # {'a'}

All Duplicates¶

# O(n) - still must check all
items = [1, 1, 1, 1, 1]
s = set(items)  # O(5) - results in {1}

Unhashable Items¶

# O(n) - error for unhashable
try:
    s = set([[1, 2], [3, 4]])  # TypeError - lists not hashable
except TypeError:
    pass

# Use frozenset for nested immutable sets
s = set([frozenset([1, 2]), frozenset([3, 4])])  # OK

Set vs frozenset¶

# O(n) - both create from iterable
s = set([1, 2, 3])           # Mutable
fs = frozenset([1, 2, 3])    # Immutable

# frozenset can be in set, set cannot
s = {frozenset([1, 2]), frozenset([3, 4])}  # OK
# s = {{1, 2}, {3, 4}}  # TypeError

Memory Considerations¶

# O(n) - memory proportional to set size
small_set = set(range(10))      # 10 items
medium_set = set(range(10**4))  # 10,000 items
large_set = set(range(10**6))   # 1,000,000 items - uses significant memory

Set Literal vs set()¶

# Both O(n) but different syntax

# Literal - faster, no function call
s1 = {1, 2, 3, 4, 5}  # O(n)

# Function - more flexible
s2 = set([1, 2, 3, 4, 5])  # O(n)
s3 = set(range(5))  # O(n)

# Literal usually preferred for constants

Best Practices¶

✅ Do:

Use sets for membership testing (O(1) vs O(n) for lists)
Use sets to remove duplicates
Use set operations for intersection/union/difference
Use {} literal for constants, set() for iterables

❌ Avoid:

Using sets when order matters (they're unordered)
Putting unhashable items in sets (lists, dicts)
Creating sets of mutable objects (won't work)
Assuming set order is consistent

frozenset() - Immutable set
list() - Convert to list
dict() - Convert to dictionary
tuple() - Convert to tuple

Version Notes¶

Python 2.x: set() available, {} creates dict
Python 3.x: set() and {} syntax available
All versions: Unordered collection with O(1) lookup

set() Function Complexity¶

Complexity Analysis¶

Basic Usage¶

Create Empty Set¶

From List¶

From String¶

From Other Iterables¶

From Dictionary¶

Complexity Details¶

Hashing and Insertion¶

Duplicate Removal¶

From String¶

Common Patterns¶

Remove Duplicates¶

Find Unique Items¶

Set Operations¶

Membership Testing¶

Performance Patterns¶

vs List Creation¶

Batch Operations¶

Practical Examples¶

Find Unique Words¶

Common Elements¶

Unique Count¶

Validate Uniqueness¶

Edge Cases¶

Empty Set¶

Single Item¶

All Duplicates¶

Unhashable Items¶

Set vs frozenset¶

Memory Considerations¶

Set Literal vs set()¶

Best Practices¶

Related Functions¶

Version Notes¶