Understanding Python Generators

· 4 min read · Updated March 6, 2026 · intermediate
generators iterators lazy-evaluation yield

Generators are one of Python’s most powerful features for working with large datasets and building memory-efficient pipelines.

What Is a Generator?

A generator function uses yield instead of return. Calling it does not run the body; it returns a generator object, and each call to next() resumes execution from where the function last yielded.

def countdown(n: int):
    """Yield integers from n down to 1."""
    while n > 0:
        yield n
        n -= 1

for number in countdown(5):
    print(number)

Unlike a list comprehension that builds the entire result in memory, a generator produces values one at a time.
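You can see this lazy, one-at-a-time behavior by driving the generator manually with next(). A minimal sketch, reusing the countdown function above; once the body finishes, the generator signals exhaustion by raising StopIteration:

```python
def countdown(n: int):
    """Yield integers from n down to 1."""
    while n > 0:
        yield n
        n -= 1

gen = countdown(2)
print(next(gen))  # 2
print(next(gen))  # 1

# The body has finished, so a third call raises StopIteration.
try:
    next(gen)
except StopIteration:
    print("exhausted")
```

A for loop does exactly this under the hood: it calls next() repeatedly and stops cleanly when StopIteration is raised.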

Generator Expressions

Just as list comprehensions create lists, generator expressions create generators:

# List comprehension — builds entire list in memory
squares_list = [x ** 2 for x in range(1_000_000)]

# Generator expression — produces values lazily
squares_gen = (x ** 2 for x in range(1_000_000))

The generator expression allocates only a small generator object, so its memory footprint stays constant no matter how large the range, while the list's memory grows with the number of elements.
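You can measure the difference with sys.getsizeof. A quick sketch (exact byte counts vary by Python version, but the shape of the comparison holds):

```python
import sys

# Builds all million values up front.
squares_list = [x ** 2 for x in range(1_000_000)]

# Builds nothing until iterated.
squares_gen = (x ** 2 for x in range(1_000_000))

print(sys.getsizeof(squares_list))  # several megabytes
print(sys.getsizeof(squares_gen))   # a few hundred bytes at most
```

Note that sys.getsizeof reports only the container itself (the list's internal pointer array, or the generator object), not the integers it refers to, so the list's true footprint is even larger.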

The yield Keyword

When Python reaches a yield, it suspends the function, saving its local state, and hands the yielded value to the caller. On the next call to next(), execution resumes immediately after the yield statement.

def fibonacci():
    """Generate an infinite Fibonacci sequence."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

fib = fibonacci()
first_ten = [next(fib) for _ in range(10)]
print(first_ten)
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
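For taking a bounded slice of an infinite generator, itertools.islice is a tidy alternative to the list-comprehension-with-next pattern above:

```python
from itertools import islice

def fibonacci():
    """Generate an infinite Fibonacci sequence."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# islice consumes only the first 10 values; the generator
# is never asked for more than that.
print(list(islice(fibonacci(), 10)))
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

islice is itself lazy, so chaining it onto an infinite generator is safe: nothing past the requested slice is ever computed.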

Building Data Pipelines

Generators shine when chained together to process data in stages:

def read_lines(filename: str):
    """Yield lines from a file one at a time."""
    with open(filename) as f:
        for line in f:
            yield line.strip()

def filter_comments(lines):
    """Skip lines that start with #."""
    for line in lines:
        if not line.startswith('#'):
            yield line

def parse_csv(lines):
    """Split each line by comma."""
    for line in lines:
        yield line.split(',')

# Chain the pipeline
pipeline = parse_csv(filter_comments(read_lines('data.csv')))
for row in pipeline:
    print(row)

Each stage processes one item at a time. The entire file is never loaded into memory.
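To experiment with the pipeline without a data.csv on disk, the same stages can read from an in-memory file-like object. A self-contained sketch, substituting io.StringIO for the open() call (the sample data here is invented for illustration):

```python
import io

def read_lines(source):
    """Yield stripped lines from any file-like object."""
    for line in source:
        yield line.strip()

def filter_comments(lines):
    """Skip lines that start with #."""
    for line in lines:
        if not line.startswith('#'):
            yield line

def parse_csv(lines):
    """Split each line by comma."""
    for line in lines:
        yield line.split(',')

sample = io.StringIO("# header comment\na,1\nb,2\n")
pipeline = parse_csv(filter_comments(read_lines(sample)))
print(list(pipeline))
# [['a', '1'], ['b', '2']]
```

Because each stage only pulls one item at a time from the stage before it, swapping the source from a string to a multi-gigabyte file changes nothing about the memory profile.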

send() and Generator Communication

Generators can also receive values from the caller using send(). The value passed to send() becomes the result of the yield expression inside the generator:

def accumulator():
    """Accept values and yield running totals."""
    total = 0
    while True:
        value = yield total
        if value is None:
            break
        total += value

acc = accumulator()
next(acc)        # prime the generator (advance to first yield)
print(acc.send(10))   # 10
print(acc.send(20))   # 30
print(acc.send(5))    # 35

The first call must be next() (or send(None)) to advance the generator to the first yield. After that, each send(value) resumes the generator with value as the result of the yield expression.
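One detail worth seeing in action: when the accumulator hits its break, the generator function returns, which surfaces as StopIteration in the caller. A short sketch continuing the example above:

```python
def accumulator():
    """Accept values and yield running totals."""
    total = 0
    while True:
        value = yield total
        if value is None:
            break
        total += value

acc = accumulator()
next(acc)             # prime: advance to the first yield
print(acc.send(7))    # 7

# Sending None takes the break branch; the generator returns,
# and that return raises StopIteration in the caller.
try:
    acc.send(None)
except StopIteration:
    print("accumulator finished")
```

If you prefer a clean shutdown without the sentinel, generators also have a close() method, which raises GeneratorExit inside the generator at its current yield.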

Generator vs Iterator Protocol

Generators automatically implement the iterator protocol — they have __iter__() and __next__() methods. Writing a class-based iterator for the same behavior requires substantially more boilerplate:

# Class-based iterator — verbose
class Countdown:
    def __init__(self, n: int):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

# Generator — concise
def countdown(n: int):
    while n > 0:
        yield n
        n -= 1

Both produce identical results, but the generator is shorter and easier to read. Prefer generators unless you need to maintain complex internal state or implement additional methods beyond the iterator protocol.
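A quick sanity check confirms both claims: the two versions yield the same values, and the generator object really does carry the iterator protocol methods itself. This sketch reuses the two definitions above:

```python
class Countdown:
    def __init__(self, n: int):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

def countdown(n: int):
    while n > 0:
        yield n
        n -= 1

print(list(Countdown(3)))   # [3, 2, 1]
print(list(countdown(3)))   # [3, 2, 1]

# A generator object is its own iterator — iter() returns it unchanged.
gen = countdown(2)
print(iter(gen) is gen)     # True
```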

yield from (Delegating Generators)

Python 3.3 introduced yield from, which delegates to a sub-generator and transparently passes values through:

def flatten(nested):
    """Recursively flatten a nested list."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

data = [1, [2, 3], [4, [5, 6]], 7]
print(list(flatten(data)))
# [1, 2, 3, 4, 5, 6, 7]

Without yield from, you would need an explicit loop: for x in flatten(item): yield x. The yield from syntax is cleaner and also correctly propagates send() and throw() calls to the sub-generator.
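One more thing yield from does that the explicit loop cannot: the yield from expression evaluates to the sub-generator's return value. A small sketch (the function names here are invented for illustration):

```python
def sub():
    yield 1
    yield 2
    return "done"   # becomes the value of `yield from sub()`

def outer():
    result = yield from sub()
    yield f"sub returned: {result}"

print(list(outer()))
# [1, 2, 'sub returned: done']
```

With a plain for loop, the sub-generator's return value is discarded along with its StopIteration; yield from captures it, which is one reason it was essential to the coroutine patterns that predated async/await.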

When to Use Generators

Use generators when:

  • Processing large files or datasets that don’t fit in memory
  • Building pipelines that transform data in stages
  • Creating infinite sequences (Fibonacci, prime numbers, sensor readings)
  • You need lazy evaluation — computing values only when requested

Use regular functions and lists when:

  • You need to access elements by index
  • You need to iterate over the data multiple times
  • The dataset is small enough to fit comfortably in memory
  • You need the len() of the result
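The second and fourth points above come down to generators being single-use: once consumed, a generator stays empty, and it never supports len() or indexing. A quick sketch of the contrast:

```python
gen = (x * x for x in range(3))
print(list(gen))   # [0, 1, 4]
print(list(gen))   # [] — exhausted; a generator is single-use

data = [x * x for x in range(3)]
print(data[1])     # 1 — lists support indexing...
print(len(data))   # 3 — ...and len(), and repeated iteration
```

When you need both laziness and repeated iteration, the usual compromise is to materialize once with list(gen) at the end of the pipeline, paying the memory cost only for the final result.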