Skip to content

defaultdict - Dictionary with Default Values Complexity

The defaultdict class from collections provides a dictionary that returns a default value for missing keys instead of raising KeyError.

Complexity Reference

Operation Time Space Notes
defaultdict() O(1) O(1) Create dict
Lookup existing key O(1) avg O(1) O(n) worst case due to hash collisions
Lookup missing key O(1) avg O(1) Creates default; O(n) worst case
Insert O(1) avg O(1) O(n) worst case due to hash collisions
Delete O(1) avg O(1) O(n) worst case due to hash collisions

Basic Usage

from collections import defaultdict

# Create with default factory - O(1)
dd = defaultdict(list)  # O(1)

# Missing key returns default - O(1)
dd['key'].append(1)  # O(1) - creates empty list

# Regular dict behavior for existing keys - O(1)
dd['key'].append(2)  # O(1) - appends to list

# Access - O(1)
print(dd['key'])  # [1, 2]
print(dd['missing'])  # []

Default Factories

Common Factories

from collections import defaultdict

# Default to list - O(1)
dd_list = defaultdict(list)
dd_list['items'].append('a')  # Creates []

# Default to int - O(1)
dd_int = defaultdict(int)
dd_int['count'] += 1  # Creates 0, then += 1

# Default to set - O(1)
dd_set = defaultdict(set)
dd_set['members'].add('alice')  # Creates set()

# Default to dict - O(1)
dd_dict = defaultdict(dict)
dd_dict['nested']['key'] = 'value'  # Creates {}

Custom Factories

from collections import defaultdict

# Lambda function - O(1)
dd_lambda = defaultdict(lambda: 'default')
print(dd_lambda['missing'])  # 'default'

# Callable returning list - O(1)
def default_list():
    return []

dd_custom = defaultdict(default_list)
dd_custom['items'].append(1)

# Callable with arguments (use lambda) - O(1)
dd_tuple = defaultdict(lambda: (0, 0))
print(dd_tuple['point'])  # (0, 0)

Common Patterns

Counting Items

from collections import defaultdict

# Count occurrences - O(n)
words = ['apple', 'banana', 'apple', 'cherry', 'banana', 'apple']

word_count = defaultdict(int)
for word in words:  # O(n)
    word_count[word] += 1  # O(1)

print(word_count)  # {'apple': 3, 'banana': 2, 'cherry': 1}

# Compare with regular dict - more code
word_count_dict = {}
for word in words:  # O(n)
    if word in word_count_dict:  # O(1)
        word_count_dict[word] += 1
    else:
        word_count_dict[word] = 1

Grouping Items

from collections import defaultdict

# Group by key - O(n)
students = [
    ('Alice', 'A'),
    ('Bob', 'B'),
    ('Charlie', 'A'),
    ('David', 'C'),
]

grades = defaultdict(list)
for name, grade in students:  # O(n)
    grades[grade].append(name)  # O(1)

print(grades)  # {'A': ['Alice', 'Charlie'], 'B': ['Bob'], 'C': ['David']}

Building Graphs

from collections import defaultdict

# Build adjacency list - O(E)
edges = [('A', 'B'), ('B', 'C'), ('A', 'C')]

graph = defaultdict(list)
for u, v in edges:  # O(E)
    graph[u].append(v)  # O(1)

print(graph)  # {'A': ['B', 'C'], 'B': ['C']}

Inverse Mapping

from collections import defaultdict

# One-to-many mapping - O(n)
data = {'a': 1, 'b': 2, 'c': 1, 'd': 3, 'e': 2}

inverse = defaultdict(list)
for key, value in data.items():  # O(n)
    inverse[value].append(key)  # O(1)

print(inverse)  # {1: ['a', 'c'], 2: ['b', 'e'], 3: ['d']}

Compared to Regular Dict

from collections import defaultdict

# Regular dict - KeyError on missing key
d = {'a': 1}
# d['missing']  # KeyError!

# defaultdict - returns default
dd = defaultdict(int)
dd['missing']  # Returns 0

# Or handle in regular dict
value = d.get('missing', 0)  # Same result, more verbose

Counter vs defaultdict(int)

from collections import defaultdict, Counter

# defaultdict(int) - O(n)
dd = defaultdict(int)
for x in [1, 2, 2, 3, 3, 3]:  # O(n)
    dd[x] += 1
# Result: {1: 1, 2: 2, 3: 3}

# Counter - O(n) but with more methods
c = Counter([1, 2, 2, 3, 3, 3])  # O(n)
# Same result, but Counter has most_common(), etc.

# Use Counter for frequency counting
# Use defaultdict(int) for general counting

When to Use defaultdict

Good For:

  • Counting occurrences
  • Grouping items
  • Building graphs/networks
  • Nested structures
  • Avoiding KeyError checks

Not Good For:

  • One-to-one mappings (use dict)
  • Frequency counting only (use Counter)
  • When default creation has side effects
  • Performance critical code (dict is slightly faster)

Performance Comparison

from collections import defaultdict
import time

# Regular dict with get - O(n)
d = {}
start = time.time()
for i in range(1000000):
    d[i % 1000] = d.get(i % 1000, 0) + 1  # O(1)
dict_time = time.time() - start

# defaultdict - O(n)
dd = defaultdict(int)
start = time.time()
for i in range(1000000):
    dd[i % 1000] += 1  # O(1)
dd_time = time.time() - start

# defaultdict is typically faster due to C optimization

Version Notes

  • Python 2.x: Available in collections
  • Python 3.x: Same functionality
  • All versions: O(1) average dict operations

Best Practices

Do:

  • Use for grouping/counting
  • Use for nested structures
  • Use with appropriate default factory
  • Provide default factory explicitly

Avoid:

  • Complex default factory logic
  • Side effects in default factory
  • When regular dict with get() is clearer
  • When Counter is more appropriate