Python Multithreading
This article provides an introduction to multithreading and the concurrent.futures
module in Python, along with examples of how to use them effectively.
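Multithreading is a technique that allows multiple threads of execution to run concurrently within a single program. In Python, the threading module provides a way to create and manage threads. Threads are lighter than processes because they share the memory of the process that created them, and they can improve the performance of certain types of programs, particularly those with I/O-bound tasks, where one thread can run while another waits. The following example starts two threads and waits for both to finish; the numbers and letters may appear interleaved, and the exact order can vary between runs.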
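import threading

def print_numbers():
    # Print the integers 0 through 9, one per line.
    for i in range(10):
        print(i)

def print_letters():
    # Print the letters a through j, one per line.
    for letter in 'abcdefghij':
        print(letter)

# Create one thread per function; nothing runs until start() is called.
thread1 = threading.Thread(target=print_numbers)
thread2 = threading.Thread(target=print_letters)
thread1.start()
thread2.start()
# join() blocks the main thread until the corresponding thread has finished.
thread1.join()
thread2.join()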
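To illustrate why threads help with I/O-bound work, here is a minimal sketch in which time.sleep stands in for a blocking I/O call. The function name fake_download, the task names, and the 0.2-second delay are assumptions made for this illustration. Because the waits overlap, the three tasks should finish in roughly the time of one, not three.

```python
import threading
import time


def fake_download(name, delay, results):
    # time.sleep stands in for a blocking I/O call such as a network request
    # (hypothetical workload, for illustration only).
    time.sleep(delay)
    # Each thread writes to its own key, so no lock is needed here.
    results[name] = f"{name} done"


results = {}
threads = [
    threading.Thread(target=fake_download, args=(f"task{i}", 0.2, results))
    for i in range(3)
]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(sorted(results))
print(f"elapsed: {elapsed:.2f}s")
```

Run sequentially, the same three calls would take about 0.6 seconds; the exact threaded timing depends on the machine and is not guaranteed.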
Choosing between ThreadPoolExecutor and ProcessPoolExecutor
When deciding whether to use ThreadPoolExecutor or ProcessPoolExecutor, consider the nature of the tasks you want to parallelize:
If your tasks are I/O-bound, such as reading from a file or downloading from the internet, ThreadPoolExecutor is generally more appropriate. Threads handle I/O-bound tasks efficiently because other threads can execute while one thread is waiting for I/O.
If your tasks are CPU-bound, such as performing complex computations or processing large amounts of data, ProcessPoolExecutor is often the better choice. This is because the Global Interpreter Lock (GIL) in CPython, the standard Python implementation, prevents multiple threads from executing Python bytecode simultaneously in the same process, limiting the parallelism of CPU-bound tasks. Each process has its own memory space and interpreter, so using processes can circumvent this limitation.
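As a rough sketch of the I/O-bound case, the following uses ThreadPoolExecutor from concurrent.futures to run a simulated I/O task over several inputs. The function fetch, its 0.1-second sleep, and the choice of four workers are assumptions for illustration, not a recommendation for real workloads.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(item):
    # Simulated I/O wait (hypothetical stand-in for a file read or download).
    time.sleep(0.1)
    return item * 2


# The with-block waits for all submitted work and then shuts the pool down.
with ThreadPoolExecutor(max_workers=4) as executor:
    # map() returns results in the same order as the inputs.
    doubled = list(executor.map(fetch, range(8)))

print(doubled)  # [0, 2, 4, 6, 8, 10, 12, 14]
```

ProcessPoolExecutor exposes the same submit and map interface, so for CPU-bound work the executor class can usually be swapped in. In that case the task function and its arguments must be picklable, and on platforms that start workers in a fresh interpreter the pool should be created under an if __name__ == "__main__": guard.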
Synchronization and Locks
When working with threads, it’s important to ensure that shared resources are accessed in a thread-safe manner. One common technique for achieving this is by using locks. A lock is a synchronization primitive provided by the threading module in Python, which can be used to enforce mutual exclusion when accessing shared resources.
Lock Example
import threading

counter = 0
lock = threading.Lock()

def increment_counter():
    global counter
    # Hold the lock for the whole loop so that no other thread can
    # modify counter while this thread is updating it.
    with lock:
        for _ in range(100000):
            counter += 1

thread1 = threading.Thread(target=increment_counter)
thread2 = threading.Thread(target=increment_counter)
thread1.start()
thread2.start()
thread1.join()
thread2.join()
print(counter)  # 200000
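The example above holds the lock for the entire loop, so one thread completes all of its increments before the other can begin. A possible variation, sketched here under the assumption that finer-grained locking is wanted, wraps the counter in a small class and takes the lock once per increment. The names SafeCounter, worker, and shared are hypothetical and chosen only for this illustration.

```python
import threading


class SafeCounter:
    """Counter whose updates are protected by a lock (illustrative sketch)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock is held only for a single read-modify-write step.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def worker(counter, n):
    # Call increment() n times on the shared counter.
    for _ in range(n):
        counter.increment()


shared = SafeCounter()
workers = [threading.Thread(target=worker, args=(shared, 10000)) for _ in range(4)]
for t in workers:
    t.start()
for t in workers:
    t.join()

print(shared.value)  # 40000
```

Acquiring the lock per increment lets the threads interleave, at the cost of more lock operations. Which granularity is preferable depends on how long each protected operation takes.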