[AI Readability Summary]
This article explains how iterators and generators in Python produce data lazily, one item at a time, instead of loading everything into memory at once. It clarifies the relationship between Iterable, Iterator, and Generator, and shows why iter(), next(), and yield matter for memory-efficient traversal and streaming workflows.
Technical Specification Snapshot
| Parameter | Description |
|---|---|
| Language | Python |
| Core Topics | Iterator, Generator, Iterable |
| Protocol | __iter__() / __next__() iteration protocol |
| Core Dependencies | Python built-ins: iter(), next(), yield |
| Applicable Scenarios | Large-scale data traversal, stream processing, lazy evaluation |
Iterators are the foundation of lazy traversal in Python
The core value of an iterator is not simply that it can traverse data, but that it can traverse data on demand. It does not load all data into memory at once. Instead, it returns only one element each time you request it. This model works especially well for long sequences, file streams, and network data streams.
From an interface perspective, any object that implements both __iter__() and __next__() behaves as an iterator. The former usually returns the object itself, while the latter produces the next value. When the data is exhausted, it must raise StopIteration.
nums = [1, 2, 3]
it = iter(nums) # Convert an iterable object into an iterator
print(next(it)) # Get the 1st element
print(next(it)) # Get the 2nd element
print(next(it)) # Get the 3rd element
This code shows how iter() and next() work together to perform the most basic item-by-item retrieval.
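A for loop is built on the same two calls. A rough sketch of what the interpreter does behind the scenes (collecting into a list here so the result is easy to see):

```python
nums = [1, 2, 3]
it = iter(nums)                   # the for statement obtains an iterator first
result = []
while True:
    try:
        result.append(next(it))   # then it calls next() repeatedly
    except StopIteration:         # and stops when the iterator is exhausted
        break
print(result)                     # [1, 2, 3]
```

Understanding this desugaring explains why any object that supports iter() and next() automatically works with for.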
A custom iterator reveals the protocol more clearly
When you implement your own class, it becomes much easier to understand what the iterator protocol actually does: store state, advance step by step, and raise an exception when the sequence ends. In essence, it is a sequential reader that can pause between values.
class MyIterator:
    def __init__(self, n):
        self.n = n
        self.cur = 0

    def __iter__(self):
        return self  # An iterator usually returns itself

    def __next__(self):
        if self.cur < self.n:
            self.cur += 1
            return self.cur  # Return the current value and advance the state
        raise StopIteration  # Explicitly stop when the data is exhausted

it = MyIterator(3)
for i in it:
    print(i)
This example implements a custom iterator from 1 to n and fully demonstrates state progression and termination control.
(Figure: console output of the custom iterator, showing the for loop calling __next__() until StopIteration is raised and producing the values one by one.)
Generators provide a higher-level and easier way to implement iterators
You can think of a generator as an iterator whose state machine Python builds for you automatically. Instead of writing __next__() manually, you only need to use yield inside a function, and the interpreter handles pausing, resuming, and state preservation.
This means generators keep the lazy-evaluation behavior of iterators while significantly reducing implementation complexity. In most application code, generators are preferred over hand-written iterator classes.
def gen():
    yield 1  # Yield the first value and pause
    yield 2  # Resume and continue execution here
    yield 3  # Resume again and return the final value
g = gen()
print(next(g))
print(next(g))
print(next(g))
This code shows that yield pauses the function after producing a value and continues from that exact point on the next next() call.
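The pause-and-resume cycle can be observed directly with the standard-library inspect module, which reports a generator's internal state:

```python
import inspect

def gen():
    yield 1
    yield 2

g = gen()
print(inspect.getgeneratorstate(g))  # GEN_CREATED: not started yet
next(g)
print(inspect.getgeneratorstate(g))  # GEN_SUSPENDED: paused at the first yield
list(g)                              # drain the remaining values
print(inspect.getgeneratorstate(g))  # GEN_CLOSED: exhausted
```

GEN_SUSPENDED is exactly the "paused between values" state that a hand-written iterator would have to track with its own instance attributes.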
Generator expressions fit lightweight streaming computation
In addition to generator functions, Python also provides generator expressions. Their syntax looks similar to list comprehensions, but their behavior is completely different: a list creates all results immediately, while a generator computes values only when needed.
lst = [x * x for x in range(5)] # Generate the full list at once
gen = (x * x for x in range(5)) # Generate results on demand
print(lst)
print(next(gen)) # Compute only when a value is requested
This example clearly illustrates the difference between eager evaluation and lazy evaluation.
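The memory difference can be measured with sys.getsizeof. The exact byte counts vary by Python version and platform, but the shape of the result is stable: the list grows with the input, the generator object does not.

```python
import sys

lst = [x * x for x in range(1_000_000)]   # all one million results exist at once
gen = (x * x for x in range(1_000_000))   # only a small generator object exists

print(sys.getsizeof(lst))   # several megabytes
print(sys.getsizeof(gen))   # a couple of hundred bytes, regardless of range size
```

Note that sys.getsizeof(lst) measures only the list structure itself, not the integer objects it references, so the true footprint of the list is even larger.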
Iterable, Iterator, and Generator form a progressively narrower relationship
Many beginners confuse iterable objects with iterators. Objects such as list, dict, and str can be traversed with for, but they are not iterators themselves; they are iterable objects.
The key property of an iterable object is that it can produce an iterator through iter(). A generator goes one step further: it is both an iterable object and an iterator itself, so you can call next() on it directly.
items = [10, 20, 30] # Iterable object
it = iter(items) # Convert to an iterator
print(next(it)) # Read values one by one
print(hasattr(it, '__next__')) # Check whether it supports iterator behavior
This code demonstrates the conversion path from Iterable to Iterator, which is also the key entry point for understanding Python’s traversal model.
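The distinction can also be checked explicitly with the abstract base classes in collections.abc, which is more reliable than probing individual attributes with hasattr:

```python
from collections.abc import Iterable, Iterator

items = [10, 20, 30]
it = iter(items)
gen = (x for x in items)

print(isinstance(items, Iterable), isinstance(items, Iterator))  # True False
print(isinstance(it, Iterable), isinstance(it, Iterator))        # True True
print(isinstance(gen, Iterable), isinstance(gen, Iterator))      # True True
```

A list is only Iterable; the object returned by iter() and the generator are both Iterable and Iterator, matching the narrowing relationship described above.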
Comparing the three builds a clear mental model quickly
| Comparison Point | Iterable | Iterator | Generator |
|---|---|---|---|
| Can be traversed with for | Yes | Yes | Yes |
| Supports direct next() | No | Yes | Yes |
| Generates values on demand | Depends on the implementation | Yes | Yes |
| Definition method | Implement __iter__() | Implement __iter__() and __next__() | Use yield or a generator expression |
| Implementation complexity | Low | High | Low |
This table helps define the boundary of responsibility for each concept: an iterable object is responsible for being traversable, an iterator is responsible for item-by-item retrieval, and a generator is responsible for low-cost lazy generation.
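One practical consequence of this division of responsibility: an iterable can be traversed any number of times, while an iterator is one-shot and is consumed by its first traversal.

```python
items = [1, 2, 3]
print(list(items), list(items))   # an iterable can be traversed repeatedly

it = iter(items)
first = list(it)    # the first pass consumes the iterator
second = list(it)   # the second pass finds nothing left
print(first, second)   # [1, 2, 3] []
```

This is why passing an iterator (or a generator) to two different consumers usually gives the second one an empty sequence.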
The difference between yield and return determines whether a function can behave like a data stream
The semantics of return are to return once and end the function immediately. In contrast, the semantics of yield are to return a value, pause execution, and preserve the current state. Because it preserves local variables and the execution position, a generator naturally fits data-stream processing.
If a function needs to produce results in batches and avoid occupying too much memory at once, yield is usually a better choice than return. On the other hand, if you only need to return one final result, a regular function is more direct.
def normal_func():
    return "Step 1"  # The function ends immediately after execution

def generator_func():
    yield "Step 1"  # Yield and pause
    yield "Step 2"  # Continue on the next call
    yield "Step 3"  # Continue until all data has been produced
print(normal_func())
gen = generator_func()
print(next(gen))
print(next(gen))
This example shows the fundamental difference between a regular function that returns once and a generator function that produces values in stages.
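The data-stream pattern mentioned above is easiest to see with file processing. The sketch below uses a hypothetical helper named read_in_chunks and a temporary file for the demonstration; the point is that each chunk is produced only when the consumer asks for it, so the whole file never sits in memory.

```python
import os
import tempfile

def read_in_chunks(path, chunk_size=1024):
    """Yield a file's contents chunk by chunk instead of reading it whole."""
    with open(path, 'r', encoding='utf-8') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:   # an empty string means end of file
                return      # return ends the stream
            yield chunk     # yield hands one chunk to the caller and pauses

# Demonstration with a small temporary file of 2500 characters
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as f:
    f.write('x' * 2500)
    path = f.name

sizes = [len(c) for c in read_in_chunks(path)]
print(sizes)   # [1024, 1024, 452]
os.remove(path)
```

The same shape works for network responses or database cursors: produce one batch per yield and let return mark the end of the stream.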
Since Python 3.3, return can also carry a value out of a generator
Inside a generator, yield continuously produces values, while return terminates the flow completely. Since Python 3.3, you can also write return value, which effectively raises StopIteration(value). That return value is not visible in a normal for loop, but it can be read from the StopIteration exception or received through yield from.
So when you think about generators, keep one sentence in mind: yield determines how data flows, and return determines when the data flow ends.
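To make the StopIteration(value) mechanism concrete, here is a small sketch (produce is an illustrative name) that drives a generator manually and catches the carried return value:

```python
def produce(n):
    for i in range(n):
        if i == 3:
            return "stopped early"   # becomes StopIteration.value
        yield i

g = produce(10)
values = []
try:
    while True:
        values.append(next(g))
except StopIteration as exc:
    result = exc.value               # the return value rides on the exception

print(values)   # [0, 1, 2]
print(result)   # stopped early
```

A plain for loop would simply end after yielding 0, 1, 2 and silently discard "stopped early"; yield from is the idiomatic way to capture it inside another generator.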
FAQ
1. Why do generators use less memory than lists?
Because a list loads all elements into memory at once, while a generator computes only the current value when next() is called. That makes generators demand-driven by design.
2. What exactly is the relationship between a generator and an iterator?
A generator is essentially a special kind of iterator. It naturally supports __iter__() and __next__(), but Python manages its internal state automatically.
3. When should you prefer a generator?
You should prefer a generator when the dataset is large, the results can be consumed incrementally, or you need to stream files and network responses. It reduces memory usage and keeps the code simpler.
Core summary: This article systematically explains the relationship among iterable objects, iterators, and generators in Python. It clarifies the differences between __iter__, __next__, yield, and return, and uses examples to show their value in memory savings, on-demand computation, and data-stream processing.