Python Generators

Python generators offer a simple way to create iterators. A generator yields values one at a time as they are produced, instead of computing and storing all of them up front. This saves memory, especially when dealing with large datasets or infinite sequences, and lets the consumer start working before the whole sequence has been computed.

Key Characteristics of Generators

  1. Lazy Evaluation: Generators produce values only when required, making them memory-efficient.
  2. Yield Statement: Generators use yield instead of return to produce values one at a time.
  3. State Retention: Unlike regular functions, generators retain their state between successive calls so that they can “pause” and “resume” execution.
  4. Iterators: All generators are iterators, meaning they implement the iterator protocol methods __iter__() and __next__(), so they work with the built-in iter() and next() functions and with for loops.
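
As a rough illustration of the iterator protocol mentioned above, the sketch below uses a hypothetical two-value generator function named letters (not part of the original text). It shows that iter() returns the generator itself, that next() and __next__() are equivalent, and that StopIteration signals exhaustion.

```python
def letters():
    # Hypothetical generator used only for this illustration.
    yield "a"
    yield "b"

gen = letters()

# A generator is its own iterator: iter() hands back the same object.
same_object = iter(gen) is gen

first = next(gen)         # built-in next() calls gen.__next__()
second = gen.__next__()   # calling the protocol method directly also works

# Once the function body finishes, further calls raise StopIteration.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

print(same_object, first, second, exhausted)
```

A for loop performs these same steps behind the scenes and stops quietly when StopIteration is raised.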

Making a Generator

Generators can be created in two basic ways:

  1. Generator Functions: Functions that use the yield keyword.
  2. Generator Expressions: Similar to list comprehensions but with parentheses instead of square brackets.

1. Generator Functions

Generator functions are defined like regular functions but use the yield keyword to produce values.

Example:

def count_up_to(n):
    count = 1
    while count <= n:
        yield count
        count += 1

Explanation:

  • yield: Pauses the function and sends a value to the caller.
  • State Retention: When resumed, execution starts right after the last yield.
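
To make the pause-and-resume behaviour more concrete, here is a minimal sketch that records when each part of a generator body runs. The function name traced and the log list are assumptions made for this illustration only.

```python
def traced(log):
    # Hypothetical generator that records each step in the given list.
    log.append("started")
    yield 1
    log.append("resumed after first yield")
    yield 2
    log.append("finished")

log = []
gen = traced(log)
log.append("generator created")   # the body has not started running yet

first = next(gen)                 # runs the body up to the first yield
log.append(f"got {first}")

second = next(gen)                # resumes right after the first yield
log.append(f"got {second}")

remaining = list(gen)             # runs the rest of the body; nothing more is yielded

for entry in log:
    print(entry)
```

The log shows that calling the generator function does not run any of its body; the code executes only in response to next().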

Usage:

gen = count_up_to(5)
print(next(gen))  # Outputs: 1
print(next(gen))  # Outputs: 2
for number in gen:  # Outputs remaining values: 3, 4, 5
    print(number)

2. Generator Expressions

Generator expressions are concise and look like list comprehensions but generate values lazily.

Syntax:

(expression for item in iterable if condition)

Example:

gen = (x**2 for x in range(5))
print(next(gen))  # Outputs: 0
print(next(gen))  # Outputs: 1
for value in gen:  # Outputs remaining values: 4, 9, 16
    print(value)
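
Two practical details about generator expressions may be worth a short sketch: they can be consumed only once, and when one is the sole argument to a function call the extra parentheses can be dropped. The variable names below are illustrative.

```python
squares = (x**2 for x in range(5))

total = sum(squares)         # consumes the generator: 0 + 1 + 4 + 9 + 16
second_pass = sum(squares)   # the generator is already exhausted, so this sums nothing

# As the only argument to a call, the expression needs no extra parentheses.
total_inline = sum(x**2 for x in range(5))

print(total, second_pass, total_inline)
```

If the values are needed more than once, either build a list or create a fresh generator for each pass.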

Benefits of Generators

  1. Memory Efficiency: Only one value is stored in memory at a time, unlike lists which store all values simultaneously.
  2. Infinite Sequences: Generators can represent infinite sequences (e.g., Fibonacci series) without using too much memory.
  3. Pipelining: Generators may be chained to process data in stages.
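
The memory benefit can be checked informally with sys.getsizeof. The sketch below compares a list comprehension with the equivalent generator expression; the size of 100,000 items is an arbitrary choice, and the exact byte counts depend on the Python version and platform.

```python
import sys

n = 100_000  # arbitrary size chosen for the illustration

as_list = [x * 2 for x in range(n)]   # all values are built and stored now
as_gen = (x * 2 for x in range(n))    # values are produced on demand

# getsizeof reports the container itself; for the list this excludes the
# integer objects it refers to, so the real difference is even larger.
list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)

print(list_size > gen_size)
```

The generator's size stays the same no matter how large n is, because it holds only its current state.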

Example of Infinite Sequence:

def infinite_counter():
    count = 0
    while True:
        yield count
        count += 1

gen = infinite_counter()
print(next(gen))  # Outputs: 0
print(next(gen))  # Outputs: 1
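
The Fibonacci series mentioned above can be sketched the same way. Since the generator never ends, this example uses itertools.islice from the standard library to take a finite slice; the function name fibonacci is an illustrative choice.

```python
from itertools import islice

def fibonacci():
    # Yields Fibonacci numbers indefinitely, starting from 0.
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# islice stops after 10 items, so the infinite loop is never run to completion.
first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Calling list() directly on an infinite generator would never finish, so some limit such as islice or a break condition is needed.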

Generator Methods

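Generators provide a few methods for controlling execution:

  1. __next__(): Retrieve the next value, or raise StopIteration if the generator is exhausted. It is normally called through the built-in next().

print(next(gen))

  2. send(value): Resume the generator and pass a value into it; the value becomes the result of the yield expression where the generator was paused. The generator must first be advanced to a yield with next().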

def custom_gen():
    value = yield "start"
    yield f"received: {value}"

g = custom_gen()
print(next(g))        # Outputs: start
print(g.send(42))     # Outputs: received: 42

  3. close(): Stop the generator by raising GeneratorExit inside it at the point where it is paused. Later calls to next() raise StopIteration.

g.close()
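
As a slightly fuller sketch of send() and close() working together, the hypothetical running_total generator below accumulates the values sent to it and uses a finally block to record that it was closed. The names and the log list are assumptions made for this example.

```python
def running_total(log):
    # Hypothetical accumulator that receives numbers through send().
    total = 0
    try:
        while True:
            # The value passed to send() becomes the result of this yield.
            received = yield total
            total += received
    finally:
        # Runs when close() raises GeneratorExit at the paused yield.
        log.append(f"closed with total {total}")

log = []
acc = running_total(log)

initial = next(acc)        # prime the generator: runs to the first yield
after_five = acc.send(5)   # resumes, adds 5, yields the new total
after_ten = acc.send(10)   # resumes, adds 10, yields the new total
acc.close()                # triggers the finally block

print(initial, after_five, after_ten, log)
```

The try/finally pattern is a common way to release resources such as open files when a generator is closed early.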

When to Use Generators

  • Processing large data files line by line.
  • Implementing infinite or large sequences.
  • Breaking complex computations into smaller chunks.
  • Pipelining data processing tasks.
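
The first and last use cases can be combined in one small sketch: a pipeline of generators that processes a file line by line. To keep the example self-contained, io.StringIO stands in for a real file, and the stage names read_lines, non_empty, and word_counts are hypothetical.

```python
import io

def read_lines(handle):
    # Stage 1: yield lines one at a time without the trailing newline.
    for line in handle:
        yield line.rstrip("\n")

def non_empty(lines):
    # Stage 2: drop blank lines.
    for line in lines:
        if line.strip():
            yield line

def word_counts(lines):
    # Stage 3: yield the number of words on each line.
    for line in lines:
        yield len(line.split())

# io.StringIO stands in for an open file; a real file object iterates the same way.
fake_file = io.StringIO("alpha beta\n\ngamma\ndelta epsilon zeta\n")

counts = list(word_counts(non_empty(read_lines(fake_file))))
print(counts)  # [2, 1, 3]
```

Each line flows through all three stages before the next one is read, so only one line needs to be held in memory at a time.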