Python itertools Module
last modified April 2, 2025
The itertools module provides a set of fast, memory-efficient tools for working with iterators. These functions are inspired by constructs from functional programming languages and are designed to work seamlessly with Python's iterator protocol. This guide covers the core itertools functions with practical examples, performance considerations, and real-world applications.
Infinite Iterators
itertools provides three functions for creating infinite iterators: count, cycle, and repeat. These generate values indefinitely until explicitly stopped. This example demonstrates their basic usage patterns and common applications.
import itertools

# 1. count(start=0, step=1) - infinite arithmetic sequence
counter = itertools.count(start=5, step=3)
print("Count:", [next(counter) for _ in range(5)])
# [5, 8, 11, 14, 17]

# 2. cycle(iterable) - infinitely cycle through an iterable
cycler = itertools.cycle('ABC')
print("Cycle:", [next(cycler) for _ in range(6)])
# ['A', 'B', 'C', 'A', 'B', 'C']

# 3. repeat(object[, times]) - repeat object indefinitely or a fixed number of times
repeater = itertools.repeat('hello', 3)
print("Repeat:", list(repeater))
# ['hello', 'hello', 'hello']

# Additional example: numbering sliding windows with count
data = [10, 20, 30, 40, 50]
windows = zip(itertools.count(), data, data[1:], data[2:])
print("Sliding windows:", list(windows))
# [(0, 10, 20, 30), (1, 20, 30, 40), (2, 30, 40, 50)]
count generates an infinite arithmetic sequence with optional start and step values. cycle endlessly repeats the elements of a finite iterable. repeat yields the same value either indefinitely or a specified number of times.
These infinite iterators are memory-efficient because they generate values on demand. They are often combined with zip or islice to produce finite sequences, or used with functions that need an indefinite stream of values.
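As a sketch of pairing an infinite iterator with zip, cycle can distribute items round-robin across a fixed set of workers (the worker and task names here are made up for illustration):

```python
import itertools

# Round-robin assignment: zip stops when the finite task list is
# exhausted, so the infinite cycle is safely truncated.
tasks = ['build', 'test', 'deploy', 'document', 'review']
workers = itertools.cycle(['alice', 'bob'])
assignments = list(zip(workers, tasks))
print(assignments)
# [('alice', 'build'), ('bob', 'test'), ('alice', 'deploy'),
#  ('bob', 'document'), ('alice', 'review')]
```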
Combinatoric Iterators
The combinatoric iterators (product, permutations, combinations, and combinations_with_replacement) generate complex sequences from input iterables. These functions are invaluable for solving problems involving combinations, permutations, or Cartesian products.
import itertools

# 4. product(*iterables, repeat=1) - Cartesian product
dice = itertools.product([1, 2, 3], ['a', 'b'])
print("Product:", list(dice))
# [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')]

# 5. permutations(iterable, r=None) - r-length permutations
letters = itertools.permutations('ABC', 2)
print("Permutations:", list(letters))
# [('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), ('C', 'B')]

# 6. combinations(iterable, r) - r-length combinations, no repeats
cards = itertools.combinations(['♥A', '♦K', '♣Q'], 2)
print("Combinations:", list(cards))
# [('♥A', '♦K'), ('♥A', '♣Q'), ('♦K', '♣Q')]

# 7. combinations_with_replacement(iterable, r) - with repeats
dice_rolls = itertools.combinations_with_replacement([1, 2, 3], 2)
print("Combinations w/replacement:", list(dice_rolls))
# [(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)]

# Additional example: generating truth tables
variables = [False, True]
truth_table = itertools.product(variables, repeat=2)
print("Truth table:")
for a, b in truth_table:
    print(f"{a} AND {b} = {a and b}")
product computes the Cartesian product of input iterables, equivalent to nested for loops. permutations generates all possible orderings with no repeated elements. combinations produces subsequences where order doesn't matter, while combinations_with_replacement allows repeated elements.
These functions are particularly useful in probability, statistics, game development, and algorithm design. They can generate large result sets, so they're often used with other itertools to limit output.
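Because combinatoric output grows factorially, capping it with islice is the usual pattern; a minimal sketch:

```python
import itertools

# 10! = 3,628,800 permutations exist, but islice generates only the
# first three; the rest are never computed.
first_three = list(itertools.islice(itertools.permutations(range(10)), 3))
print(len(first_three))   # 3
print(first_three[0])     # (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
```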
Iterators Terminating on Shortest Input
This group includes functions such as chain, zip_longest, filterfalse, and islice that run until their input is exhausted; when given several iterables, they stop at the shortest one (except zip_longest, which continues to the longest). These are essential for working with multiple data streams.
import itertools

# 8. chain(*iterables) - concatenate iterables
merged = itertools.chain('ABC', [1, 2, 3], (True, False))
print("Chain:", list(merged))
# ['A', 'B', 'C', 1, 2, 3, True, False]

# 9. zip_longest(*iterables, fillvalue=None) - zip to the longest iterable
names = ['Alice', 'Bob']
scores = [85, 92, 78]
zipped = itertools.zip_longest(names, scores, fillvalue='N/A')
print("Zip longest:", list(zipped))
# [('Alice', 85), ('Bob', 92), ('N/A', 78)]

# 10. filterfalse(predicate, iterable) - elements where predicate is False
numbers = [0, 1, 0, 2, 3, 0, 4]
non_zeros = itertools.filterfalse(lambda x: x == 0, numbers)
print("Filterfalse:", list(non_zeros))
# [1, 2, 3, 4]

# 11. islice(iterable, stop) or islice(iterable, start, stop[, step])
infinite = itertools.count()
first_5_evens = itertools.islice(infinite, 0, 10, 2)
print("Islice:", list(first_5_evens))
# [0, 2, 4, 6, 8]

# Additional example: processing a sequence in batches
# Note: islice restarts from index `start` each time, so this requires a
# re-iterable sequence (like range or a list), not a one-shot iterator.
data = range(100)
batch_size = 10
for start in range(0, len(data), batch_size):
    batch = list(itertools.islice(data, start, start + batch_size))
    print("Batch:", batch)
chain is particularly useful for combining disparate data sources. zip_longest handles iterables of uneven length gracefully. filterfalse provides the inverse of the built-in filter. islice enables efficient slicing of iterators without converting them to lists.
These functions shine in data processing pipelines where you need to combine, filter, or window streams of data without loading everything into memory. They're often used with file processing and database queries.
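A closely related helper worth knowing in such pipelines is chain.from_iterable, which flattens one level of nesting lazily:

```python
import itertools

# Flatten a list of lists without building intermediate copies;
# each inner iterable is consumed only as the result is iterated.
rows = [[1, 2], [3, 4, 5], [], [6]]
flat = list(itertools.chain.from_iterable(rows))
print(flat)  # [1, 2, 3, 4, 5, 6]
```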
Grouping and Filtering
The groupby and takewhile/dropwhile functions provide powerful tools for organizing and filtering sequential data. These are particularly valuable for data analysis and preprocessing tasks.
import itertools

# 12. groupby(iterable, key=None) - group consecutive elements
animals = ['ant', 'bee', 'cat', 'dog', 'eagle', 'flamingo']
grouped = itertools.groupby(animals, key=lambda x: x[0])
print("Groupby:")
for key, group in grouped:
    print(f"{key}: {list(group)}")
# a: ['ant']
# b: ['bee']
# c: ['cat']
# d: ['dog']
# e: ['eagle']
# f: ['flamingo']

# 13. takewhile(predicate, iterable) - take until predicate fails
numbers = [1, 4, 6, 8, 2, 5, 3]
taken = itertools.takewhile(lambda x: x < 7, numbers)
print("Takewhile:", list(taken))
# [1, 4, 6]

# 14. dropwhile(predicate, iterable) - drop until predicate fails
dropped = itertools.dropwhile(lambda x: x < 7, numbers)
print("Dropwhile:", list(dropped))
# [8, 2, 5, 3]

# Additional example: processing log files
log_lines = [
    "INFO: System started",
    "INFO: User logged in",
    "ERROR: File not found",
    "INFO: Request processed",
    "ERROR: Database timeout"
]

# Group by log level
get_level = lambda line: line.split(':')[0]
for level, lines in itertools.groupby(log_lines, key=get_level):
    print(f"\n{level} messages:")
    for line in lines:
        print("  ", line.split(':', 1)[1].strip())
groupby groups consecutive elements sharing a key, so the input must be sorted by that key for complete grouping. takewhile yields items until the predicate first fails, while dropwhile skips items until the predicate first fails and then yields the rest.
These functions are invaluable for processing sequential data like logs, time series, or any grouped records. They enable efficient processing without loading entire datasets into memory.
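The sorted-input requirement mentioned above can be demonstrated directly: when equal keys are not adjacent, groupby emits the same key more than once.

```python
import itertools

words = ['apple', 'cherry', 'avocado', 'cranberry']
first_letter = lambda w: w[0]

# Unsorted input: 'a' and 'c' each appear twice because their members
# are not adjacent in the input.
unsorted_keys = [k for k, _ in itertools.groupby(words, key=first_letter)]
print(unsorted_keys)  # ['a', 'c', 'a', 'c']

# Sorting by the same key first makes each group complete.
sorted_keys = [k for k, _ in
               itertools.groupby(sorted(words, key=first_letter), key=first_letter)]
print(sorted_keys)    # ['a', 'c']
```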
Performance Considerations
While itertools functions are memory-efficient, their performance characteristics vary. This section compares common operations and demonstrates optimization techniques for working with large datasets.
import itertools
import timeit
import random

# 15. Comparing chain with list concatenation
def test_chain():
    list(itertools.chain(range(1000), range(1000, 2000)))

def test_concat():
    list(range(1000)) + list(range(1000, 2000))

print("Chain vs concat:")
print("itertools.chain:", timeit.timeit(test_chain, number=10000))
print("list concatenation:", timeit.timeit(test_concat, number=10000))

# 16. Memory efficiency demonstration
large_range = itertools.count()  # Infinite, uses almost no memory
# Compare with list(range(1000000)), which would consume significant memory

# 17. Early termination with islice
def process_data():
    data = itertools.count()  # Infinite stream
    processed = (x**2 for x in itertools.islice(data, 1000000))
    return sum(processed)  # Doesn't store all squared values

print("\nProcessing 1M numbers:", process_data())

# Additional example: filtering large datasets
def large_dataset():
    return (random.random() for _ in range(1000000))

# Memory-efficient filtering: filterfalse keeps values where the predicate is False
high_values = itertools.filterfalse(lambda x: x < 0.5, large_dataset())
print("\nCount >= 0.5:", sum(1 for _ in itertools.islice(high_values, 0, 100000)))
The benchmark compares itertools.chain with list concatenation: chain avoids building intermediate lists, though for small inputs plain concatenation is often just as fast, so measure with your own data. The memory efficiency example demonstrates how itertools can handle theoretically infinite sequences. The early termination example processes a large range without materializing it in memory.
Key takeaways: itertools functions excel at memory efficiency and lazy evaluation. They're particularly advantageous when working with large or infinite sequences, but for small datasets, built-in functions may be simpler and equally performant.
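The memory point can be made concrete with sys.getsizeof: an iterator's footprint stays constant no matter how many values it will yield, while a materialized list grows with its length.

```python
import itertools
import sys

materialized = list(range(1_000_000))  # stores every element
lazy = itertools.count()               # stores only its current state

print(sys.getsizeof(materialized))  # several megabytes (container alone)
print(sys.getsizeof(lazy))          # a few dozen bytes
```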
Real-World Applications
These examples demonstrate practical applications of itertools in common programming scenarios, from data analysis to algorithm implementation.
import itertools
import operator

# 18. Running averages
def running_avg(data):
    it = itertools.accumulate(data, operator.add)
    for i, total in enumerate(it, 1):
        yield total / i

print("Running averages:", list(running_avg([10, 20, 30, 40])))
# [10.0, 15.0, 20.0, 25.0]

# 19. Pairwise iteration (Python 3.10+ has itertools.pairwise)
def pairwise(iterable):
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)

print("Pairwise differences:", [(y - x) for x, y in pairwise([1, 3, 6, 10])])
# [2, 3, 4]

# 20. Pagination with islice
# Note: this re-slices `items` from the start on each pass, so it requires
# a re-iterable sequence (like range or a list), not a one-shot iterator.
def paginate(items, page_size):
    page_start = 0
    while True:
        page = list(itertools.islice(items, page_start, page_start + page_size))
        if not page:
            break
        yield page
        page_start += page_size

data = range(0, 10)
print("Paginated data:")
for page in paginate(data, 3):
    print(page)
# [0, 1, 2]
# [3, 4, 5]
# [6, 7, 8]
# [9]

# Additional example: Cartesian product for parameter grids
params = {
    'learning_rate': [0.01, 0.1],
    'batch_size': [32, 64],
    'optimizer': ['adam', 'sgd']
}

param_grid = itertools.product(*params.values())
print("\nParameter combinations:")
for combo in param_grid:
    print(dict(zip(params.keys(), combo)))
The running averages example shows how accumulate can simplify stateful calculations. The pairwise iteration demonstrates a common pattern in time series analysis. The pagination example illustrates handling large datasets in chunks. The parameter grid example is useful in machine learning hyperparameter tuning.
These patterns are widely applicable in data processing, scientific computing, and web development. The itertools functions help keep the code concise and memory-efficient.
Best Practices
Use itertools for memory-efficient processing of large or infinite sequences. Combine multiple itertools functions into complex pipelines. Prefer itertools over manual implementations for common iteration patterns. Remember that itertools functions consume their input iterators, so an iterator can't be reused after processing; use tee to duplicate one when needed. Document complex itertools pipelines for maintainability. Consider generator expressions for simple cases where they're more readable.
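The tee caveat deserves a concrete illustration: after calling tee, work only with the returned copies, because consuming the original iterator directly would make both copies silently skip values.

```python
import itertools

source = iter([1, 2, 3, 4])
a, b = itertools.tee(source)

# Advancing one copy buffers values internally so the other still sees them.
first_two = [next(a), next(a)]
remaining_b = list(b)

print(first_two)    # [1, 2]
print(remaining_b)  # [1, 2, 3, 4]
# After tee(), do not touch `source` again: values it yields would bypass
# the internal buffer and be lost to both a and b.
```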
Source References
Learn more from these resources: the Python itertools Documentation and the more-itertools Library.
Author
My name is Jan Bodnar, and I am a passionate programmer with extensive programming experience. I have been writing programming articles since 2007. To date, I have authored over 1,400 articles and 8 e-books. I possess more than ten years of experience in teaching programming.