Recipes with itertools
The itertools module is one of Python’s most powerful standard library tools. It provides fast, memory-efficient functions for working with iterators. This guide shows real-world patterns you can use immediately.
Why itertools Matters
Almost every function in itertools returns a lazy iterator (tee() returns a tuple of them). Values are generated on demand rather than built up front, which is essential for large data sets and infinite sequences: you can process a stream of data without loading all of it into RAM.
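To make the on-demand behavior concrete, here is a small sketch. The noisy_square helper is a hypothetical name used only for illustration; it records each call so you can see that no work happens until a value is requested.

```python
import itertools

calls = []  # records every input that actually gets processed

def noisy_square(n):
    """Hypothetical helper: square n and log that the work happened."""
    calls.append(n)
    return n * n

# Building the pipeline does no work: count() is infinite and map() is lazy.
squares = map(noisy_square, itertools.count(1))
calls_before = list(calls)

# Only the three requested values are computed.
first_three = list(itertools.islice(squares, 3))
print(calls_before)  # []
print(first_three)   # [1, 4, 9]
print(calls)         # [1, 2, 3]
```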
Infinite Sequences
Need a counter that goes forever? Use count(). It is handy for generating IDs, sequence numbers, or any other sequential numbering:
import itertools
for i in itertools.count(start=100):
    print(i)
    if i >= 105:
        break
# Output: 100, 101, 102, 103, 104, 105
The step parameter controls the increment. You can use floats too:
import itertools
# Generate points for a line
for x in itertools.count(start=0.0, step=0.5):
    if x > 3:
        break
    print(f"x = {x}")
# Output: x = 0.0, x = 0.5, x = 1.0, x = 1.5, x = 2.0, x = 2.5, x = 3.0
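A step of 0.5 is exact in binary floating point, but count() adds the step repeatedly, so with a step such as 0.1 the rounding error can build up. The itertools documentation suggests computing each value by multiplication instead. A minimal sketch, where float_range is a name chosen here for illustration:

```python
import itertools

def float_range(start, step):
    """Yield start, start + step, start + 2 * step, ... forever.

    Each value is computed from the index, so rounding error from
    earlier values is not carried forward.
    """
    return (start + step * i for i in itertools.count())

points = list(itertools.islice(float_range(0.0, 0.5), 4))
print(points)  # [0.0, 0.5, 1.0, 1.5]
```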
The cycle() function repeats an iterable forever. Useful for cycling through colors, states, or any repeating pattern:
import itertools
states = ['idle', 'processing', 'complete']
state_cycle = itertools.cycle(states)
for _ in range(5):
    print(next(state_cycle))
# Output: idle, processing, complete, idle, processing
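A frequent use of cycle() is pairing a finite list with a repeating one. The sketch below hands out tasks to workers in rotation; the task and worker names are made up for the example. Keep in mind that cycle() saves a copy of every item it has seen, so it is best suited to small inputs.

```python
import itertools

tasks = ['parse', 'validate', 'transform', 'load', 'report']
workers = ['w1', 'w2']  # hypothetical worker names

# zip() stops when the shorter input (tasks) runs out, so pairing it
# with an infinite cycle is safe.
assignments = list(zip(tasks, itertools.cycle(workers)))
print(assignments)
# [('parse', 'w1'), ('validate', 'w2'), ('transform', 'w1'), ('load', 'w2'), ('report', 'w1')]
```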
Need the same value repeatedly? repeat() handles that efficiently:
import itertools
# Create a constant iterator
ones = itertools.repeat(1)
print(next(ones)) # 1
print(next(ones)) # 1
# Or limit the repetitions
five_twos = list(itertools.repeat(2, times=5))
print(five_twos) # [2, 2, 2, 2, 2]
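repeat() is most useful as a constant argument stream for map() or zip(). A short sketch of that pattern, supplying the same exponent to every call:

```python
import itertools

# map() stops when range(5) is exhausted, so the endless repeat(2) is safe.
squares = list(map(pow, range(5), itertools.repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]
```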
Limiting Infinite Iterators
Infinite iterators need limits. The islice() function works like list slicing but for iterators:
import itertools
# First 10 even numbers
evens = list(itertools.islice(itertools.count(step=2), 10))
print(evens) # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
# Skip first 5, take next 5
numbers = range(100)
subset = list(itertools.islice(numbers, 5, 10))
print(subset) # [5, 6, 7, 8, 9]
# Every third element, first 10
every_third = list(itertools.islice(range(100), 0, 30, 3))
print(every_third) # [0, 3, 6, 9, 12, 15, 18, 21, 24, 27]
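Two differences from list slicing are worth seeing in code: islice() consumes the underlying iterator as it goes, and it rejects negative indices. A small sketch of both:

```python
import itertools

stream = iter(range(10))

# Taking the head advances the shared iterator...
head = list(itertools.islice(stream, 3))
# ...so whatever is left starts where the slice stopped.
rest = list(stream)
print(head)  # [0, 1, 2]
print(rest)  # [3, 4, 5, 6, 7, 8, 9]

# Negative indices are not supported, unlike list slicing.
try:
    itertools.islice(range(10), -1)
    negative_allowed = True
except ValueError:
    negative_allowed = False
print(negative_allowed)  # False
```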
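takewhile() yields elements until the predicate first returns false, then stops. A thin wrapper can put the iterable first if you prefer that argument order:
import itertools
def take_while(iterable, predicate):
    """Take elements while predicate returns True."""
    return itertools.takewhile(predicate, iterable)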
numbers = range(20)
small = list(take_while(numbers, lambda x: x < 10))
print(small) # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
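Conversely, dropwhile() skips elements while the predicate is true, then yields everything from the first failure onward, including later elements that would have matched (note the 1 that survives below):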
import itertools
data = [1, 1, 1, 2, 3, 1, 4]
remaining = list(itertools.dropwhile(lambda x: x == 1, data))
print(remaining) # [2, 3, 1, 4]
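One place this behavior is handy is skipping a leading header. The sketch below drops the comment lines at the top of some made-up config text while keeping a later comment intact:

```python
import itertools

lines = [
    '# generated file',
    '# do not edit',
    'name = demo',
    '# a later comment that should survive',
    'debug = true',
]

# Only the leading run of comment lines is dropped.
body = list(itertools.dropwhile(lambda line: line.startswith('#'), lines))
print(body)
# ['name = demo', '# a later comment that should survive', 'debug = true']
```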
Combining Iterables
The chain() function connects multiple iterables into one:
import itertools
list1 = [1, 2, 3]
list2 = [4, 5, 6]
tuple1 = ('a', 'b')
combined = list(itertools.chain(list1, list2, tuple1))
print(combined) # [1, 2, 3, 4, 5, 6, 'a', 'b']
Use chain.from_iterable() to flatten nested iterables:
import itertools
nested = [[1, 2], [3, 4], [5]]
flat = list(itertools.chain.from_iterable(nested))
print(flat) # [1, 2, 3, 4, 5]
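chain.from_iterable() flattens exactly one level, and it treats strings as iterables of characters. A short sketch of both edge cases:

```python
import itertools

# Deeper nesting is left alone: only the outer level is removed.
nested = [[1, [2, 3]], [4], []]
flat = list(itertools.chain.from_iterable(nested))
print(flat)  # [1, [2, 3], 4]

# Strings are iterables too, so they are split into characters.
letters = list(itertools.chain.from_iterable(['ab', 'cd']))
print(letters)  # ['a', 'b', 'c', 'd']
```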
For uneven-length iterables, zip_longest() pads with a fill value:
import itertools
a = [1, 2, 3]
b = ['a', 'b']
paired = list(itertools.zip_longest(a, b, fillvalue='?'))
print(paired) # [(1, 'a'), (2, 'b'), (3, '?')]
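zip_longest() also powers a well-known recipe from the itertools documentation for splitting data into fixed-length chunks. The trick is to pass the same iterator n times, so each output tuple pulls n consecutive items:

```python
import itertools

def grouper(iterable, n, fillvalue=None):
    """Collect items into fixed-length chunks, padding the last one."""
    # n references to ONE iterator: each tuple consumes n consecutive items.
    iterators = [iter(iterable)] * n
    return itertools.zip_longest(*iterators, fillvalue=fillvalue)

chunks = list(grouper('ABCDEFG', 3, fillvalue='x'))
print(chunks)  # [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
```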
Accumulation Patterns
The accumulate() function computes running totals by default, but accepts any binary function:
import itertools
data = [1, 2, 3, 4, 5]
# Running sum
print(list(itertools.accumulate(data)))
# Output: [1, 3, 6, 10, 15]
# Running maximum
print(list(itertools.accumulate(data, func=max)))
# Output: [1, 2, 3, 4, 5]
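Two more variations, sketched with made-up numbers: a running product using operator.mul, and a running balance that starts from an opening value through the initial keyword (available since Python 3.8). With initial, the output has one more element than the input.

```python
import itertools
import operator

# Running product: 1, 1*2, 1*2*3, ...
factors = [1, 2, 3, 4, 5]
products = list(itertools.accumulate(factors, operator.mul))
print(products)  # [1, 2, 6, 24, 120]

# Running balance starting from an opening balance of 100
transactions = [10, -3, 5]
balances = list(itertools.accumulate(transactions, initial=100))
print(balances)  # [100, 110, 107, 112]
```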
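When the running state is more than a single value, a closure is a simple alternative to accumulate(). This one tracks the running mean and standard deviation with Welford's online algorithm and does not need itertools:
def running_stats():
    """Return an update(value) function that tracks running mean and std dev."""
    count = 0
    mean = 0
    m2 = 0  # running sum of squared deviations from the mean
    def update(value):
        nonlocal count, mean, m2
        count += 1
        delta = value - mean
        mean += delta / count
        delta2 = value - mean
        m2 += delta * delta2
        # Population variance: divide by count, not count - 1
        variance = m2 / count if count > 1 else 0
        return mean, variance ** 0.5
    return update
stats = running_stats()
for val in [10, 20, 30, 40]:
    mean, std = stats(val)
    print(f"value={val}, mean={mean:.2f}, std={std:.2f}")
# Output:
# value=10, mean=10.00, std=0.00
# value=20, mean=15.00, std=5.00
# value=30, mean=20.00, std=8.16
# value=40, mean=25.00, std=11.18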
Grouping with groupby
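The groupby() function groups consecutive elements that share a key. It does not sort: if equal keys are not adjacent, they end up in separate groups (note that 'a' appears twice in the first example). To get exactly one group per key, sort the input by the same key first, as in the second example: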
import itertools
data = ['a', 'a', 'b', 'b', 'b', 'c', 'a', 'a']
for key, group in itertools.groupby(data):
    print(f"'{key}': {list(group)}")
# Output:
# 'a': ['a', 'a']
# 'b': ['b', 'b', 'b']
# 'c': ['c']
# 'a': ['a', 'a']
# Group by length (must sort first)
words = sorted(['hi', 'hello', 'hey', 'there', 'yo'], key=len)
for length, group in itertools.groupby(words, key=len):
    print(f"length {length}: {list(group)}")
# Output:
# length 2: ['hi', 'yo']
# length 3: ['hey']
# length 5: ['hello', 'there']
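The same sort-then-group pattern works for records. The sketch below totals orders per customer; the order data and field names are invented for the example. Each group is an iterator tied to the underlying data, so consume it (here with sum()) before moving on to the next group.

```python
import itertools
from operator import itemgetter

# Hypothetical order records
orders = [
    {'customer': 'bo', 'total': 5},
    {'customer': 'al', 'total': 12},
    {'customer': 'bo', 'total': 7},
    {'customer': 'al', 'total': 3},
]

by_customer = itemgetter('customer')

# Sort and group with the SAME key so each customer forms one group.
totals = {
    customer: sum(order['total'] for order in group)
    for customer, group in itertools.groupby(
        sorted(orders, key=by_customer), key=by_customer
    )
}
print(totals)  # {'al': 15, 'bo': 12}
```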
A practical use: detecting changes in a sequence:
import itertools
def runs(sequence):
    """Yield (value, count) for each consecutive run."""
    for key, group in itertools.groupby(sequence):
        yield key, sum(1 for _ in group)
data = [1, 1, 1, 2, 2, 1, 1, 1, 1]
print(list(runs(data)))
# Output: [(1, 3), (2, 2), (1, 4)]
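The (value, count) pairs amount to a run-length encoding, and the inverse is a short composition of repeat() and chain.from_iterable(). A minimal sketch, with expand_runs as an illustrative name:

```python
import itertools

def expand_runs(pairs):
    """Turn (value, count) pairs back into the original sequence."""
    return itertools.chain.from_iterable(
        itertools.repeat(value, count) for value, count in pairs
    )

encoded = [(1, 3), (2, 2), (1, 4)]
decoded = list(expand_runs(encoded))
print(decoded)  # [1, 1, 1, 2, 2, 1, 1, 1, 1]
```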
Combinatorics for Testing
Use product() to generate every combination of input values (the Cartesian product) for test cases:
import itertools
# All combinations of two dice
dice = [1, 2, 3, 4, 5, 6]
rolls = list(itertools.product(dice, repeat=2))
print(f"Total rolls: {len(rolls)}") # 36
# All possible boolean flags (4 flags = 16 combinations)
flags = [False, True]
states = list(itertools.product(flags, repeat=4))
print(f"Total states: {len(states)}") # 16
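For test matrices it is often more convenient to get each case as a dict of named options. A sketch of that idea follows; the option names and values are made up, and the approach relies on dicts preserving insertion order (Python 3.7+).

```python
import itertools

# Hypothetical configuration options to test
grid = {
    'cache': [False, True],
    'workers': [1, 4],
    'mode': ['fast', 'safe'],
}

# product() yields one tuple per combination; zip() re-attaches the names.
cases = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]
print(len(cases))  # 8
print(cases[0])    # {'cache': False, 'workers': 1, 'mode': 'fast'}
print(cases[-1])   # {'cache': True, 'workers': 4, 'mode': 'safe'}
```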
Generate permutations for ordering problems:
import itertools
# All orderings of 4 items
items = ['a', 'b', 'c', 'd']
perms = list(itertools.permutations(items))
print(f"Total permutations: {len(perms)}") # 24
Use combinations for selection problems:
import itertools
# All 3-rank selections from the 13 ranks (suits ignored)
ranks = ['2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K', 'A']
hands = list(itertools.combinations(ranks, 3))
print(f"Total 3-rank selections: {len(hands)}") # 286
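To work with a full 52-card deck, combine product() for building the cards with combinations() for dealing hands. The sketch below also uses math.comb() (Python 3.8+) to get the number of hands without generating them; the suit names are chosen here for illustration.

```python
import itertools
import math

ranks = ['2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K', 'A']
suits = ['clubs', 'diamonds', 'hearts', 'spades']

# Every (rank, suit) pair is one card.
deck = list(itertools.product(ranks, suits))
print(len(deck))  # 52

# Count 3-card hands without materializing them.
hand_count = math.comb(len(deck), 3)
print(hand_count)  # 22100

# combinations() is lazy, so you can pull hands one at a time.
hands = itertools.combinations(deck, 3)
first_hand = next(hands)
print(first_hand)  # (('2', 'clubs'), ('2', 'diamonds'), ('2', 'hearts'))
```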
Real-World Recipes
Pagination
import itertools
def paginate(items, page_size):
    """Yield pages of items."""
    it = iter(items)
    while True:
        page = list(itertools.islice(it, page_size))
        if not page:
            break
        yield page
data = range(25)
for i, page in enumerate(paginate(data, 7), 1):
    print(f"Page {i}: {page}")
# Output:
# Page 1: [0, 1, 2, 3, 4, 5, 6]
# Page 2: [7, 8, 9, 10, 11, 12, 13]
# Page 3: [14, 15, 16, 17, 18, 19, 20]
# Page 4: [21, 22, 23, 24]
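Python 3.12 added itertools.batched(), which does the same chunking but yields tuples instead of lists. A sketch of a wrapper that uses it when available and otherwise falls back to the islice() loop; batched_compat is a hypothetical name, not a standard function.

```python
import itertools

def batched_compat(iterable, size):
    """Yield tuples of up to `size` items (hypothetical helper name)."""
    if size < 1:
        raise ValueError('size must be at least 1')
    if hasattr(itertools, 'batched'):
        # Python 3.12+: delegate to the standard library.
        yield from itertools.batched(iterable, size)
        return
    # Older versions: the same islice() loop as paginate(), with tuples.
    it = iter(iterable)
    while True:
        batch = tuple(itertools.islice(it, size))
        if not batch:
            return
        yield batch

batches = list(batched_compat(range(7), 3))
print(batches)  # [(0, 1, 2), (3, 4, 5), (6,)]
```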
Sliding Window
import itertools
def sliding_window(sequence, size):
    """Create sliding windows of given size."""
    it = iter(sequence)
    window = list(itertools.islice(it, size))
    if len(window) == size:
        yield tuple(window)
    for item in it:
        window = window[1:] + [item]
        yield tuple(window)
numbers = [1, 2, 3, 4, 5]
for window in sliding_window(numbers, 3):
    print(window)
# Output: (1, 2, 3), (2, 3, 4), (3, 4, 5)
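The list-based version rebuilds the window list on every step. An alternative, adapted from the recipes in the itertools documentation, keeps the window in a collections.deque with maxlen set, so appending a new item pushes the oldest one out automatically. The function name below is chosen to avoid clashing with the version above.

```python
from collections import deque
from itertools import islice

def sliding_window_deque(iterable, size):
    """Yield overlapping tuples of `size` consecutive items."""
    it = iter(iterable)
    # Pre-fill with size - 1 items; the loop adds the final one.
    window = deque(islice(it, size - 1), maxlen=size)
    for item in it:
        window.append(item)  # drops the oldest item once the deque is full
        yield tuple(window)

windows = list(sliding_window_deque([1, 2, 3, 4, 5], 3))
print(windows)  # [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
```

For windows of exactly two items, itertools.pairwise() (Python 3.10+) covers the same need.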
Round-Robin Scheduling
import itertools
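def round_robin(*teams):
    """Schedule round-robin games; assumes an even number of teams."""
    teams = list(teams)
    n = len(teams)
    for _ in range(n - 1):
        # Pair first with last, second with second-to-last, and so on
        yield [(teams[j], teams[n - 1 - j]) for j in range(n // 2)]
        # Circle method: keep the first team fixed and rotate the others,
        # so every pair of teams meets exactly once
        teams = [teams[0], teams[-1]] + teams[1:-1]
teams = ['A', 'B', 'C', 'D']
for i, games in enumerate(round_robin(*teams), 1):
    print(f"Round {i}: {games}")
# Output:
# Round 1: [('A', 'D'), ('B', 'C')]
# Round 2: [('A', 'C'), ('D', 'B')]
# Round 3: [('A', 'B'), ('C', 'D')]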
Filtering Duplicates
import itertools
def unique_everseen(iterable):
    """Remove duplicates, keeping first occurrence."""
    seen = set()
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item
data = [1, 2, 2, 3, 1, 4, 3, 5]
print(list(unique_everseen(data))) # [1, 2, 3, 4, 5]
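A common extension is deduplicating by a derived key, for example ignoring case. The sketch below is one way to do it; unique_by is an illustrative name, and the key results must be hashable because they are stored in a set.

```python
def unique_by(iterable, key):
    """Yield items whose key has not been seen before, keeping the first."""
    seen = set()
    for item in iterable:
        marker = key(item)  # must be hashable
        if marker not in seen:
            seen.add(marker)
            yield item

names = ['Ada', 'ada', 'Bob', 'ADA', 'bob', 'Cy']
unique_names = list(unique_by(names, key=str.lower))
print(unique_names)  # ['Ada', 'Bob', 'Cy']
```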
Performance Notes
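Almost all itertools functions return iterators, not lists, so you must consume values with next(), a loop, or a conversion such as list(). For streaming functions such as count(), islice(), chain(), and takewhile(), memory usage stays constant regardless of input size. Others have to buffer data: cycle() keeps a copy of every item, tee() stores items that one branch has not yet read, and product() reads its inputs fully before yielding anything. Chaining several itertools calls is efficient because no intermediate lists are built.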
For very large datasets, itertools can be the difference between running out of memory and successful processing. The lazy evaluation is the key.
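As a rough illustration of chaining, the sketch below sums the first 1,000 perfect squares that are divisible by 3. Each stage pulls one value at a time from the stage before it, so no intermediate list is ever created even though the source is infinite.

```python
import itertools

# Stage 1: an endless stream of squares
squares = (n * n for n in itertools.count(1))
# Stage 2: keep only the squares divisible by 3
divisible = filter(lambda s: s % 3 == 0, squares)
# Stage 3: stop after 1,000 matches and add them up
total = sum(itertools.islice(divisible, 1000))
print(total)
```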
See Also
- itertools-module — Full reference for all itertools functions
- generators — Python generator functions and yield
- functional-python — Functional programming patterns in Python