Python Multithreading

This article provides an introduction to multithreading and the concurrent.futures module in Python, along with examples of how to use them effectively.

Multithreading is a technique that allows multiple threads of execution to run concurrently within a single program. In Python, the threading module provides a way to create and manage threads. Threads are lightweight compared to processes because they share the memory of the process that created them. They can improve the performance of certain types of programs, most notably those dominated by I/O-bound tasks, where a thread spends much of its time waiting rather than computing.

import threading

def print_numbers():
    for i in range(10):
        print(i)

def print_letters():
    for letter in 'abcdefghij':
        print(letter)

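# Create one thread per function; nothing runs until start() is called.
thread1 = threading.Thread(target=print_numbers)
thread2 = threading.Thread(target=print_letters)

# Start both threads. Their output may interleave, and the order can vary between runs.
thread1.start()
thread2.start()

# Wait for both threads to finish before the main thread continues.
thread1.join()
thread2.join()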
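
To make the benefit for I/O-bound work more concrete, here is a minimal sketch in which time.sleep stands in for a blocking I/O call. The function fake_download, the task names, and the 0.2 second delay are assumptions chosen for illustration and are not part of any library. Because the three waits overlap, the elapsed time should be close to one delay rather than the sum of all three.

```python
import threading
import time

def fake_download(name, delay, results):
    # Hypothetical workload: time.sleep stands in for a blocking I/O call.
    time.sleep(delay)
    # Each thread writes to its own key, so no lock is needed here.
    results[name] = f"{name} done"

results = {}
threads = [
    threading.Thread(target=fake_download, args=(f"task{i}", 0.2, results))
    for i in range(3)
]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(sorted(results))
print(f"elapsed: {elapsed:.2f}s")
```

Running the same three calls one after another would take at least 0.6 seconds, since each call would have to finish before the next one starts.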

Choosing between ThreadPoolExecutor and ProcessPoolExecutor

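The concurrent.futures module provides two executor classes, ThreadPoolExecutor and ProcessPoolExecutor, which share the same interface but run tasks in a pool of threads or a pool of processes, respectively. When deciding which one to use, consider the nature of the tasks you want to parallelize: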

  1. If your tasks are I/O-bound, such as reading from a file or downloading from the internet, ThreadPoolExecutor is generally more appropriate. A thread that is blocked waiting for I/O releases the Global Interpreter Lock (GIL), so other threads can run in the meantime.

  2. If your tasks are CPU-bound, such as performing complex computations or processing large amounts of data, ProcessPoolExecutor is often the better choice. This is because Python’s Global Interpreter Lock (GIL) prevents multiple threads from executing Python bytecode simultaneously in the same process, limiting the parallelism of CPU-bound tasks. Using processes, which have their own memory space and interpreter, can circumvent this limitation.
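
The two cases above can be sketched as follows. This is an illustration under stated assumptions, not a benchmark: io_task, cpu_task, run_io, and run_cpu are hypothetical names, time.sleep stands in for real I/O, and the worker counts are arbitrary. Both executors expose the same map method, so switching between them is mostly a matter of changing the class.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import time

def io_task(n):
    # Hypothetical I/O-bound task: sleep stands in for a network or disk wait.
    time.sleep(0.1)
    return n

def cpu_task(n):
    # Hypothetical CPU-bound task: pure-Python arithmetic that holds the GIL.
    return sum(i * i for i in range(n))

def run_io(items):
    # Threads suit I/O-bound work: the waits overlap.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(io_task, items))

def run_cpu(items):
    # Processes suit CPU-bound work: each worker has its own interpreter and GIL.
    with ProcessPoolExecutor(max_workers=2) as pool:
        return list(pool.map(cpu_task, items))

if __name__ == "__main__":
    # The guard matters for ProcessPoolExecutor on platforms that start
    # worker processes by re-importing the main module.
    print(run_io([1, 2, 3, 4]))
    print(run_cpu([10, 100, 1000]))
```

In both cases map returns results in the order of the inputs, regardless of which worker finishes first. Functions and arguments passed to ProcessPoolExecutor must be picklable, which is why cpu_task is defined at module level.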

Synchronization and Locks

When working with threads, it’s important to ensure that shared resources are accessed in a thread-safe manner. One common technique is to use locks. A lock is a synchronization primitive provided by the threading module that enforces mutual exclusion: only one thread can hold the lock at a time, so code that runs while holding it cannot be interleaved with other code guarded by the same lock.

Lock Example

import threading

counter = 0
lock = threading.Lock()

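def increment_counter():
    global counter
    # The lock is held for the whole loop, so one thread completes all of
    # its increments before the other thread can begin.
    with lock:
        for _ in range(100000):
            counter += 1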

thread1 = threading.Thread(target=increment_counter)
thread2 = threading.Thread(target=increment_counter)

thread1.start()
thread2.start()

thread1.join()
thread2.join()

print(counter)  # 200000: each of the two threads adds 100000
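
A variation on the same idea is to hold the lock only for each individual update, which lets threads take turns instead of one running to completion first. The sketch below wraps the counter and its lock in a small class and drives it from a ThreadPoolExecutor. SafeCounter and work are hypothetical names introduced for illustration, and the worker and iteration counts are arbitrary.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class SafeCounter:
    """A counter whose updates are protected by its own lock (illustrative)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock is held only for the single read-modify-write step.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value

def work(counter, n):
    for _ in range(n):
        counter.increment()

shared = SafeCounter()
# Leaving the with block waits for all submitted tasks to finish.
with ThreadPoolExecutor(max_workers=4) as pool:
    for _ in range(4):
        pool.submit(work, shared, 10_000)

print(shared.value)  # 40000
```

Keeping the lock next to the data it protects makes it harder to update the counter without acquiring the lock. The trade-off is that acquiring the lock on every increment adds overhead compared with holding it once for the whole loop.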
Chengqi (William) Li