Python ThreadPoolExecutor
last modified April 19, 2025
This tutorial explores concurrent programming in Python using ThreadPoolExecutor, a powerful tool for managing threads efficiently.
Concurrent programming aims to enhance code efficiency by executing tasks simultaneously. This can be achieved through threading, parallelism, or asynchronous operations. Here, we focus on threading with ThreadPoolExecutor.
A thread is a separate flow of execution within a program. Threads excel at I/O-bound tasks, like file downloads or database queries. Python's Global Interpreter Lock (GIL) limits true parallelism, making threads suitable for I/O-bound tasks rather than CPU-bound ones, where multiprocessing is preferred.
The Global Interpreter Lock (GIL) is a mechanism in Python that ensures only one thread executes Python bytecode at a time, even on multi-core systems. This prevents concurrency issues but restricts parallel execution.
The threading module offers a foundational interface for creating and managing threads in Python, enabling concurrent task execution.
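As a minimal sketch of the threading module, independent of ThreadPoolExecutor, two threads can be created and joined by hand:

```python
import threading

seen = []  # (name, thread id) pairs recorded by the workers

def work(name):
    # Each thread runs this function independently
    seen.append((name, threading.get_ident()))

# Create and start two threads manually, then wait for both
threads = [threading.Thread(target=work, args=(f"task-{i}",)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen)
```

Managing threads by hand like this quickly becomes tedious; ThreadPoolExecutor takes over exactly this bookkeeping.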
ThreadPoolExecutor
The concurrent.futures module provides a high-level interface for executing tasks concurrently. ThreadPoolExecutor, part of this module, simplifies thread management by maintaining a pool of reusable worker threads, streamlining thread creation, execution, and cleanup.
Creating new threads incurs significant overhead. ThreadPoolExecutor's worker threads are reused after tasks complete, improving performance and efficiency.
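Worker reuse can be observed directly; in this small sketch a pool restricted to a single worker reports the same thread identifier for every task:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def which_thread(_):
    # Return the identifier of the worker thread running this task
    return threading.get_ident()

# With one worker, the same thread is reused for all five tasks
with ThreadPoolExecutor(max_workers=1) as executor:
    idents = list(executor.map(which_thread, range(5)))

print(idents)
```

All five identifiers are equal: the pool ran every task on its single reused worker rather than spawning a new thread per task.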
Future
A Future object represents the eventual outcome of an asynchronous operation, which may result in a value or an exception. Its result method retrieves the operation's outcome once complete.
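A short sketch of a Future's lifecycle; slow_square is a made-up helper used only for illustration:

```python
from time import sleep
from concurrent.futures import ThreadPoolExecutor

def slow_square(n):
    sleep(0.2)  # simulate a slow I/O-bound operation
    return n * n

with ThreadPoolExecutor() as executor:
    future = executor.submit(slow_square, 6)
    result = future.result()  # blocks until the task finishes

print(result)         # 36
print(future.done())  # True once the result is available
```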
Python ThreadPoolExecutor submit
The submit method schedules a callable for execution and returns a Future object, which tracks the task's progress and result.
#!/usr/bin/python

from time import sleep
from concurrent.futures import ThreadPoolExecutor
import threading

def task(id, n):

    print(f"thread {id} started")
    print(f"thread {id} : {threading.get_ident()}")
    sleep(n)
    print(f"thread {id} completed")

with ThreadPoolExecutor() as executor:

    executor.submit(task, 1, 4)
    executor.submit(task, 2, 3)
    executor.submit(task, 3, 2)
This example demonstrates submitting three tasks for concurrent execution using ThreadPoolExecutor's submit method.
def task(id, n):

    print(f"thread {id} started")
    print(f"thread {id} : {threading.get_ident()}")
    sleep(n)
    print(f"thread {id} completed")
The task function prints the thread's ID, its unique identifier, and status messages. It uses sleep to simulate a time-consuming operation, such as I/O-bound work.
with ThreadPoolExecutor() as executor:
We create a ThreadPoolExecutor instance as a context manager, ensuring it shuts down automatically when tasks are complete.
executor.submit(task, 1, 4)
executor.submit(task, 2, 3)
executor.submit(task, 3, 2)
Three tasks are submitted using submit, each with a unique ID and sleep duration, enabling concurrent execution.
$ ./submitfun.py
thread 1 started
thread 1 : 140563097032256
thread 2 started
thread 2 : 140563088639552
thread 3 started
thread 3 : 140563005306432
thread 3 completed
thread 2 completed
thread 1 completed
Python ThreadPoolExecutor map
The map method applies a function to each item in one or more iterables, executing tasks concurrently and returning results in order.
#!/usr/bin/python

from time import sleep
from concurrent.futures import ThreadPoolExecutor
import threading

def task(id, n):

    print(f"thread {id} started")
    print(f"thread {id} : {threading.get_ident()}")
    sleep(n)
    print(f"thread {id} completed")

with ThreadPoolExecutor() as executor:

    executor.map(task, [1, 2, 3], [4, 3, 2])
This example reimplements the previous program using map, passing lists of thread IDs and sleep durations for concurrent execution.
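Unlike submit, map also hands back the return values. In this sketch the results come back in input order even though the first task sleeps longest:

```python
from time import sleep
from concurrent.futures import ThreadPoolExecutor

def task(tid, n):
    sleep(n)
    return f"task {tid} slept {n}s"

# map pairs items from both iterables and preserves input order
with ThreadPoolExecutor() as executor:
    results = list(executor.map(task, [1, 2, 3], [0.3, 0.2, 0.1]))

print(results)
```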
Python ThreadPoolExecutor Future.result
A Future object encapsulates the result of a concurrent task. The result method retrieves the task's return value, blocking until the task completes.
#!/usr/bin/python

from time import sleep, perf_counter
import random
from concurrent.futures import ThreadPoolExecutor

def task(tid):

    r = random.randint(1, 5)
    print(f'task {tid} started, sleeping {r} secs')
    sleep(r)

    return f'finished task {tid}, slept {r}'

start = perf_counter()

with ThreadPoolExecutor() as executor:

    t1 = executor.submit(task, 1)
    t2 = executor.submit(task, 2)
    t3 = executor.submit(task, 3)

    print(t1.result())
    print(t2.result())
    print(t3.result())

finish = perf_counter()

print(f"It took {finish-start} second(s) to finish.")
This example runs tasks that sleep for random durations, retrieves their results, and measures total execution time using perf_counter.
return f'finished task {tid}, slept {r}'
The task returns a string describing its completion, accessible via the result method of the Future object.
start = perf_counter()
The perf_counter function provides high-precision timing to measure the total duration of the concurrent tasks.
t1 = executor.submit(task, 1)
t2 = executor.submit(task, 2)
t3 = executor.submit(task, 3)

print(t1.result())
print(t2.result())
print(t3.result())
Three tasks are submitted, and their results are retrieved using result, which blocks until each task completes, preserving the submission order.
$ ./resultfun.py
task 1 started, sleeping 3 secs
task 2 started, sleeping 4 secs
task 3 started, sleeping 1 secs
finished task 1, slept 3
finished task 2, slept 4
finished task 3, slept 1
It took 4.005295900977217 second(s) to finish.
The program takes as long as the longest task due to concurrent execution. The result method's blocking nature ensures results appear in submission order. The next example addresses this limitation.
Python ThreadPoolExecutor as_completed
The as_completed function provides an iterator over Future objects, yielding results as tasks complete, regardless of submission order.
Note that map cannot be combined with as_completed, because map returns an iterator of results rather than Future objects; its results always arrive in the order of the input iterables.
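The completion-order behavior is easy to demonstrate with fixed sleep times instead of random ones; a minimal sketch:

```python
from time import sleep
from concurrent.futures import ThreadPoolExecutor, as_completed

def task(n):
    sleep(n)
    return n

# Three workers start the tasks together; as_completed yields
# the futures in the order the tasks finish, shortest sleep first
with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(task, n) for n in (0.3, 0.1, 0.2)]
    completed = [f.result() for f in as_completed(futures)]

print(completed)  # [0.1, 0.2, 0.3]
```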
#!/usr/bin/python

from time import sleep, perf_counter
import random
from concurrent.futures import ThreadPoolExecutor, as_completed

def task(tid):

    r = random.randint(1, 5)
    print(f'task {tid} started, sleeping {r} secs')
    sleep(r)

    return f'finished task {tid}, slept {r}'

start = perf_counter()

with ThreadPoolExecutor() as executor:

    tids = [1, 2, 3]
    futures = []

    for tid in tids:
        futures.append(executor.submit(task, tid))

    for res in as_completed(futures):
        print(res.result())

finish = perf_counter()

print(f"It took {finish-start} second(s) to finish.")
This example retrieves task results in the order they complete, using as_completed to handle Futures dynamically.
$ ./as_completed.py
task 1 started, sleeping 3 secs
task 2 started, sleeping 4 secs
task 3 started, sleeping 2 secs
finished task 3, slept 2
finished task 1, slept 3
finished task 2, slept 4
It took 4.00534593896009 second(s) to finish.
Multiple concurrent HTTP requests
This example uses ThreadPoolExecutor to perform multiple HTTP requests concurrently, leveraging the requests library to fetch web page status codes.
#!/usr/bin/python

import requests
import concurrent.futures
import time

def get_status(url):

    resp = requests.get(url=url)
    return resp.status_code

urls = ['http://webcode.me', 'https://httpbin.org/get',
        'https://google.com', 'https://stackoverflow.com',
        'https://github.com', 'https://clojure.org',
        'https://fsharp.org']

tm1 = time.perf_counter()

with concurrent.futures.ThreadPoolExecutor() as executor:

    futures = []

    for url in urls:
        futures.append(executor.submit(get_status, url=url))

    for future in concurrent.futures.as_completed(futures):
        print(future.result())

tm2 = time.perf_counter()
print(f'elapsed {tm2-tm1:0.2f} seconds')
The program concurrently checks the HTTP status codes of multiple websites, demonstrating ThreadPoolExecutor's efficiency for I/O-bound tasks.
$ ./web_requests.py
200
200
200
200
200
200
200
elapsed 0.81 seconds
Concurrent pinging
This example uses ThreadPoolExecutor to ping multiple websites concurrently, executing the external ping command via the subprocess module.
#!/usr/bin/python

from time import perf_counter
from concurrent.futures import ThreadPoolExecutor, as_completed
import subprocess

def task(url):

    # getstatusoutput runs the command through the shell,
    # so it takes a single string
    ok, _ = subprocess.getstatusoutput(f'ping -c 3 -w 10 {url}')
    return ok == 0, url

urls = ['webcode.me', 'clojure.org', 'fsharp.org', 'www.perl.org',
        'python.org', 'go.dev', 'raku.org']

start = perf_counter()

with ThreadPoolExecutor() as executor:

    futures = []

    for url in urls:
        futures.append(executor.submit(task, url))

    for future in as_completed(futures):

        r, u = future.result()

        if r:
            print(f'OK -> {u}')
        else:
            print(f'failed -> {u}')

finish = perf_counter()

print(f"elapsed {finish-start} second(s)")
The program concurrently pings multiple websites, reporting their availability based on the ping command's exit status.
ok, _ = subprocess.getstatusoutput(f'ping -c 3 -w 10 {url}')
The getstatusoutput function captures the exit code and output of the ping command, which sends three ICMP packets with a 10-second deadline to the specified host.
$ ./pinging.py
OK -> go.dev
OK -> fsharp.org
OK -> www.perl.org
OK -> python.org
OK -> raku.org
OK -> clojure.org
OK -> webcode.me
elapsed 2.384801392967347 second(s)
Concurrent File Reading
This example uses ThreadPoolExecutor to read multiple text files concurrently, demonstrating its use for I/O-bound file operations.
#!/usr/bin/python

from concurrent.futures import ThreadPoolExecutor, as_completed
from time import perf_counter

def read_file(filename):

    try:
        with open(filename, 'r') as f:
            content = f.read()
        return f"Read {filename}: {len(content)} chars"
    except Exception as e:
        return f"Error reading {filename}: {e}"

files = ['file1.txt', 'file2.txt', 'file3.txt']

start = perf_counter()

with ThreadPoolExecutor() as executor:

    futures = [executor.submit(read_file, f) for f in files]

    for future in as_completed(futures):
        print(future.result())

finish = perf_counter()

print(f"Elapsed {finish-start:.2f} seconds")
This program reads three text files concurrently, reporting the number of characters read from each or any errors encountered.
Concurrent Database Queries
This example demonstrates ThreadPoolExecutor for executing multiple SQLite database queries concurrently, ideal for I/O-bound database tasks.
#!/usr/bin/python

import sqlite3
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import perf_counter

def query_db(query):

    # each thread opens its own connection; sqlite3 connections
    # must not be shared across threads
    with sqlite3.connect('example.db') as conn:
        cursor = conn.cursor()
        cursor.execute(query)
        result = cursor.fetchall()
        return f"Query '{query}' returned {len(result)} rows"

queries = [
    "SELECT * FROM users WHERE age > 20",
    "SELECT * FROM orders WHERE total > 100",
    "SELECT * FROM products"
]

start = perf_counter()

with ThreadPoolExecutor() as executor:

    futures = [executor.submit(query_db, q) for q in queries]

    for future in as_completed(futures):
        print(future.result())

finish = perf_counter()

print(f"Elapsed {finish-start:.2f} seconds")
The program executes three SQLite queries concurrently, reporting the number of rows returned by each query, showcasing efficient database access.
Concurrent Image Downloads
This example uses ThreadPoolExecutor to download multiple images from URLs concurrently, leveraging the requests library for HTTP requests.
#!/usr/bin/python

import requests
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import perf_counter

def download_image(url):

    try:
        resp = requests.get(url)
        filename = url.split('/')[-1]

        with open(filename, 'wb') as f:
            f.write(resp.content)

        return f"Downloaded {filename}"
    except Exception as e:
        return f"Error downloading {url}: {e}"

urls = [
    'https://example.com/image1.jpg',
    'https://example.com/image2.png',
    'https://example.com/image3.jpg'
]

start = perf_counter()

with ThreadPoolExecutor() as executor:

    futures = [executor.submit(download_image, u) for u in urls]

    for future in as_completed(futures):
        print(future.result())

finish = perf_counter()

print(f"Elapsed {finish-start:.2f} seconds")
The program downloads three images concurrently, saving them locally and reporting success or errors, ideal for network-bound tasks.
Concurrent Text Processing
This example uses ThreadPoolExecutor to process multiple text strings concurrently, counting words in each, suitable for lightweight text tasks.
#!/usr/bin/python

from concurrent.futures import ThreadPoolExecutor, as_completed
from time import perf_counter

def count_words(text):

    words = text.split()
    return f"Text '{text[:20]}...' has {len(words)} words"

texts = [
    "Python is a versatile programming language",
    "Concurrency improves performance in I/O tasks",
    "ThreadPoolExecutor simplifies thread management"
]

start = perf_counter()

with ThreadPoolExecutor() as executor:

    futures = [executor.submit(count_words, t) for t in texts]

    for future in as_completed(futures):
        print(future.result())

finish = perf_counter()

print(f"Elapsed {finish-start:.2f} seconds")
The program counts words in three text strings concurrently, reporting the word count for each, demonstrating simple concurrent text processing.
Source
Python ThreadPoolExecutor - language reference
This tutorial covered concurrent programming with ThreadPoolExecutor, showcasing its effectiveness for I/O-bound tasks in Python.