heapq
The heapq module provides an implementation of the heap queue algorithm, also known as the priority queue algorithm. A heap is a binary tree where every parent node has a value less than or equal to any of its children (min-heap). The smallest element is always at the root (index 0), making it efficient to access the minimum value in O(1) time while insertions and deletions take O(log n).
Python’s heapq implements min-heaps using lists where heap[k] <= heap[2*k+1] and heap[k] <= heap[2*k+2]. This makes heap[0] the smallest item. The module is useful for priority queues, task scheduling, finding k smallest/largest elements, and merging sorted streams.
Syntax
import heapq
# Create a heap from a list
heap = [3, 1, 4, 1, 5, 9, 2, 6]
heapq.heapify(heap)
# Push and pop
heapq.heappush(heap, 0)
smallest = heapq.heappop(heap)
# Get k smallest/largest
smallest_3 = heapq.nsmallest(3, data)
largest_5 = heapq.nlargest(5, data)
heappush()
Pushes an item onto the heap while maintaining the heap invariant. The heap size increases by one.
Signature
heapq.heappush(heap, item)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
heap | list | — | The list to push the item onto (must be a valid heap) |
item | any | — | The item to push onto the heap |
Examples
Basic push operation
import heapq
# Start with an empty heap
heap = []
# Push items one by one
heapq.heappush(heap, 5)
heapq.heappush(heap, 2)
heapq.heappush(heap, 8)
heapq.heappush(heap, 2)
print(heap)
# [2, 2, 8, 5]
# The smallest is always at index 0
print(heap[0])
# 2
Using with tuples for priority queues
import heapq
# Priority queue: (priority, task)
tasks = []
heapq.heappush(tasks, (3, 'write documentation'))
heapq.heappush(tasks, (1, 'fix critical bug'))
heapq.heappush(tasks, (2, 'code review'))
# Pop returns highest priority (lowest number) first
while tasks:
priority, task = heapq.heappop(tasks)
print(f"Priority {priority}: {task}")
# Output:
# Priority 1: fix critical bug
# Priority 2: code review
# Priority 3: write documentation
heappop()
Pops and returns the smallest item from the heap. Raises IndexError if the heap is empty.
Signature
heapq.heappop(heap)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
heap | list | — | The heap to pop from (must be a valid heap) |
Examples
Basic pop operation
import heapq
heap = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
heapq.heapify(heap)
# Pop all items in sorted order
while heap:
item = heapq.heappop(heap)
print(item, end=' ')
# 0 1 2 3 4 5 6 7 8 9
Peeking without removing
import heapq
heap = [5, 2, 8, 1, 9]
heapq.heapify(heap)
# Access smallest without popping
print(heap[0])
# 1
# Heap remains unchanged
print(len(heap))
# 5
heapify()
Transforms a list into a valid min-heap in-place. This operates in O(n) time, which is more efficient than pushing items one by one.
Signature
heapq.heapify(x)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
x | list | — | The list to transform into a heap (modified in-place) |
Examples
Converting a list to a heap
import heapq
# Random order list
data = [9, 1, 7, 3, 5, 2, 8, 4, 6]
# Convert to heap in-place
heapq.heapify(data)
print(data)
# [1, 3, 2, 4, 5, 7, 8, 9, 6]
# Verify: smallest at index 0
print(data[0])
# 1
Efficient alternative to repeated heappush
import heapq
import time
# Method 1: Push one by one
data1 = list(range(1000, 0, -1))
start = time.time()
heap1 = []
for item in data1:
heapq.heappush(heap1, item)
time1 = time.time() - start
# Method 2: heapify
data2 = list(range(1000, 0, -1))
start = time.time()
heapq.heapify(data2)
time2 = time.time() - start
print(f"Heappush loop: {time1:.4f}s")
print(f"Heapify: {time2:.4f}s")
# heapify is significantly faster for bulk operations
heappushpop()
Pushes an item onto the heap and then pops the smallest item. This combined operation is more efficient than calling heappush() and heappop() separately.
Signature
heapq.heappushpop(heap, item)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
heap | list | — | The heap to operate on |
item | any | — | The item to push before popping |
Examples
More efficient than separate push and pop
import heapq
heap = [1, 2, 3, 4, 5]
heapq.heapify(heap)
# Push 0, then pop smallest (which is 0)
result = heapq.heappushpop(heap, 0)
print(f"Returned: {result}")
# Returned: 0
print(heap)
# [1, 2, 3, 4, 5]
# If item is larger than smallest, returns the smaller
heap2 = [1, 2, 3]
result2 = heapq.heappushpop(heap2, 10)
print(f"Returned: {result2}")
# Returned: 1
print(heap2)
# [2, 3, 10]
Use case: maintaining top-k elements
import heapq
def keep_top_k(stream, k):
"""Keep only the k largest items from a stream"""
heap = []
for item in stream:
heapq.heappushpop(heap, item)
return sorted(heap, reverse=True)
data = [5, 1, 3, 9, 2, 8, 4, 7, 6]
print(keep_top_k(data, 3))
# [9, 8, 7]
heapreplace()
Pops the smallest item from the heap and pushes a new item. The heap size remains the same. Unlike heappushpop(), this pops first, then pushes.
Signature
heapq.heapreplace(heap, item)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
heap | list | — | The heap to operate on (must not be empty) |
item | any | — | The item to push after popping |
Examples
Fixed-size sliding window
import heapq
# Sliding window of size 3 - keep track of 3 smallest
window = [9, 5, 7]
heapq.heapify(window)
new_values = [3, 8, 1, 4, 6]
for val in new_values:
popped = heapq.heapreplace(window, val)
print(f"Replaced {popped} with {val}, window now: {window}")
# Output:
# Replaced 9 with 3, window now: [3, 5, 7]
# Replaced 3 with 8, window now: [5, 7, 8]
# Replaced 5 with 1, window now: [1, 7, 8]
# Replaced 1 with 4, window now: [4, 7, 8]
# Replaced 4 with 6, window now: [6, 7, 8]
Difference from heappushpop
import heapq
heap = [1, 2, 3]
# heappushpop: pushes first, then pops smallest of all
result1 = heapq.heappushpop(heap, 0)
print(f"heappushpop({0}): {result1}, heap: {heap}")
# heappushpop(0): 0, heap: [1, 2, 3]
# heapreplace: pops first, then pushes
heap2 = [1, 2, 3]
result2 = heapq.heapreplace(heap2, 0)
print(f"heapreplace({0}): {result2}, heap: {heap2}")
# heapreplace(0): 1, heap: [0, 2, 3]
nlargest()
Returns a list with the n largest elements from the iterable. Uses a heap internally for efficiency.
Signature
heapq.nlargest(n, iterable, key=None)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
n | int | — | Number of largest elements to return |
iterable | iterable | — | The iterable to search |
key | callable | None | Function to extract comparison key from each element |
Examples
Finding top elements
import heapq
data = [1, 5, 2, 8, 3, 9, 4, 7, 6]
# Get 3 largest
print(heapq.nlargest(3, data))
# [9, 8, 7]
# Get top 5
print(heapq.nlargest(5, data))
# [9, 8, 7, 6, 5]
Using key function
import heapq
# List of dictionaries
products = [
{'name': 'laptop', 'price': 999},
{'name': 'phone', 'price': 699},
{'name': 'tablet', 'price': 449},
{'name': 'watch', 'price': 299},
{'name': 'earbuds', 'price': 149},
]
# Get 2 most expensive
top_expensive = heapq.nlargest(2, products, key=lambda p: p['price'])
print(top_expensive)
# [{'name': 'laptop', 'price': 999}, {'name': 'phone', 'price': 699}]
nsmallest()
Returns a list with the n smallest elements from the iterable. Uses a heap internally for efficiency.
Signature
heapq.nsmallest(n, iterable, key=None)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
n | int | — | Number of smallest elements to return |
iterable | iterable | — | The iterable to search |
key | callable | None | Function to extract comparison key from each element |
Examples
Finding smallest elements
import heapq
data = [9, 1, 4, 7, 3, 2, 8, 5, 6]
# Get 3 smallest
print(heapq.nsmallest(3, data))
# [1, 2, 3]
# Get 5 smallest
print(heapq.nsmallest(5, data))
# [1, 2, 3, 4, 5]
Using key with complex data
import heapq
# Students with grades
students = [
{'name': 'Alice', 'grade': 85},
{'name': 'Bob', 'grade': 92},
{'name': 'Carol', 'grade': 78},
{'name': 'Dave', 'grade': 88},
{'name': 'Eve', 'grade': 95},
]
# Get bottom 2 by grade
bottom_students = heapq.nsmallest(2, students, key=lambda s: s['grade'])
print(bottom_students)
# [{'name': 'Carol', 'grade': 78}, {'name': 'Alice', 'grade': 85}]
Common Patterns
Priority queue with task ordering
When multiple tasks have the same priority, use a counter to maintain insertion order:
import heapq
import itertools
class PriorityQueue:
def __init__(self):
self._heap = []
self._counter = itertools.count()
def push(self, priority, task):
count = next(self._counter)
heapq.heappush(self._heap, (priority, count, task))
def pop(self):
if not self._heap:
raise IndexError("pop from empty queue")
priority, count, task = heapq.heappop(self._heap)
return priority, task
pq = PriorityQueue()
pq.push(2, 'task A')
pq.push(1, 'task B')
pq.push(1, 'task C')
pq.push(3, 'task D')
print(pq.pop()) # (1, 'task B')
print(pq.pop()) # (1, 'task C')
print(pq.pop()) # (2, 'task A')
Merging sorted streams
The merge function efficiently combines multiple sorted iterables:
import heapq
# Multiple sorted log files
log1 = [1, 4, 7, 10]
log2 = [2, 5, 8, 11]
log3 = [3, 6, 9, 12]
# Merge in sorted order
merged = list(heapq.merge(log1, log2, log3))
print(merged)
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
# Works with different sources - sorted timestamps from multiple sources
import itertools
timestamps = [
(2024, 1, 15), (2024, 1, 20), (2024, 1, 25),
(2024, 1, 18), (2024, 1, 22),
]
# Simulate sorted iterables
sorted_streams = [iter(ts) for ts in timestamps]
for item in heapq.merge(*sorted_streams):
print(item)
Running median with two heaps
import heapq
def running_median(stream):
"""Track running median using two heaps"""
low = [] # max-heap (negative values)
high = [] # min-heap
for x in stream:
# Push to low heap (negate for max-heap behavior)
heapq.heappush(low, -heapq.heappushpop(high, x))
# Balance heaps
if len(low) > len(high):
heapq.heappush(high, -heapq.heappop(low))
# Calculate median
if len(high) > len(low):
median = float(high[0])
else:
median = (-low[0] + high[0]) / 2
yield median
data = [5, 15, 1, 3, 8, 7, 9, 2]
print(list(running_median(data)))
# [5.0, 10.0, 5.0, 4.0, 5.0, 6.0, 7.0, 5.0]