Multiprocessing in Python
Python’s multiprocessing module lets you run code in parallel by spawning separate processes. Unlike threads, each process has its own Python interpreter and memory space, bypassing the Global Interpreter Lock (GIL). This makes multiprocessing ideal for CPU-bound tasks like mathematical computations, data processing, and machine learning.
When to Use Multiprocessing
Multiprocessing works best for CPU-intensive work: number crunching, image processing, running simulations, or training models. Each process runs in its own interpreter, so multiple processes can actually execute Python bytecode in parallel.
For I/O-bound tasks—downloading files, calling APIs, reading databases—threading or async often works better because processes have higher overhead. Threads share memory, while processes don’t, which affects how you share data between workers.
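As a rough comparison, I/O-bound work scales well with threads because they release the GIL while waiting. A minimal sketch, with a simulated download (the `fake_download` helper and 0.2-second sleep stand in for real network I/O):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_download(url):
    # Simulated I/O wait; threads release the GIL while sleeping
    time.sleep(0.2)
    return f"done: {url}"

if __name__ == "__main__":
    urls = [f"https://example.com/{i}" for i in range(8)]
    start = time.time()
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(fake_download, urls))
    # All 8 waits overlap, so this takes ~0.2s rather than ~1.6s
    print(f"{len(results)} downloads in {time.time() - start:.2f}s")
```

Eight overlapping waits finish in roughly the time of one, with none of the startup and pickling overhead a process pool would add.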
Creating Processes
The simplest way to create a process is with the Process class:
import multiprocessing
import time
def cpu_task(n):
    """Simulate CPU-intensive work"""
    result = sum(i * i for i in range(n))
    return result
if __name__ == "__main__":
    start = time.time()

    # Create processes
    p1 = multiprocessing.Process(target=cpu_task, args=(5_000_000,))
    p2 = multiprocessing.Process(target=cpu_task, args=(5_000_000,))

    # Start them
    p1.start()
    p2.start()

    # Wait for completion
    p1.join()
    p2.join()

    elapsed = time.time() - start
    print(f"Completed in {elapsed:.2f} seconds")
Output:
Completed in 0.72 seconds
Both tasks run on separate CPU cores, cutting the total time roughly in half.
Using Process Pools
Creating processes manually is powerful but verbose. Pool manages a pool of workers for you:
import multiprocessing
def square(n):
    return n * n

if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5, 6, 7, 8]
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(square, numbers)
    print(results)  # [1, 4, 9, 16, 25, 36, 49, 64]
The pool distributes the work across 4 processes and collects results. This is much easier than managing processes yourself.
Sharing Data Between Processes
Processes don’t share memory by default. You have several options for sharing data:
Using Manager
A Manager creates shared objects that multiple processes can access:
import multiprocessing
def worker(lock, shared_dict, shared_list):
    # Hold the lock for the whole read-modify-write,
    # otherwise concurrent increments can be lost
    with lock:
        shared_dict["processed"] = shared_dict.get("processed", 0) + 1
        shared_list.append(shared_dict["processed"])

if __name__ == "__main__":
    manager = multiprocessing.Manager()
    lock = manager.Lock()
    shared_dict = manager.dict()
    shared_list = manager.list()

    processes = [
        multiprocessing.Process(target=worker, args=(lock, shared_dict, shared_list))
        for _ in range(4)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()

    print(f"Dict: {dict(shared_dict)}")  # Dict: {'processed': 4}
    print(f"List: {list(shared_list)}")  # List: [1, 2, 3, 4]
Managers are flexible but slower than other approaches: every access goes through a proxy to a separate manager process.
Using Shared Memory
For simple values, Value and Array provide faster shared memory:
import multiprocessing
def worker(counter, arr):
    with counter.get_lock():
        counter.value += 1
    with arr.get_lock():
        # Doubling is a read-then-write, so it needs the lock too
        arr[0] = arr[0] * 2

if __name__ == "__main__":
    counter = multiprocessing.Value("i", 0)
    arr = multiprocessing.Array("i", [1, 2, 3, 4])

    processes = [
        multiprocessing.Process(target=worker, args=(counter, arr))
        for _ in range(4)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()

    print(f"Counter: {counter.value}")  # Counter: 4
    print(f"Array: {list(arr)}")  # Array: [16, 2, 3, 4]
The get_lock() method returns the lock that protects the shared value; holding it during a read-modify-write keeps concurrent updates from being lost.
Process Pools with map and starmap
Pool.map() applies a function to each item in a list. Pool.starmap() does the same but passes multiple arguments:
import multiprocessing
def square(x):
    return x ** 2

def power(base, exponent):
    return base ** exponent

if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        # map: one argument per call (the function must be defined at
        # module level; lambdas can't be pickled)
        squares = pool.map(square, [1, 2, 3])
        print(squares)  # [1, 4, 9]

        # starmap: each tuple is unpacked into multiple arguments
        results = pool.starmap(power, [(2, 3), (3, 2), (10, 2)])
        print(results)  # [8, 9, 100]
Using apply_async for More Control
For fire-and-forget or custom result handling, use apply_async:
import multiprocessing
import time
def slow_task(n):
    time.sleep(n)
    return n * n

if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        # Submit multiple async tasks
        results = [
            pool.apply_async(slow_task, args=(i,))
            for i in [3, 1, 2]
        ]
        # get() blocks until that particular task finishes,
        # so results come back in submission order:
        for r in results:
            print(f"Result: {r.get()}")
    # Result: 9
    # Result: 1
    # Result: 4
This lets you submit many jobs up front and collect the results when you need them; each get() call blocks until its task is done.
Pool with Initializer
Use an initializer to set up each worker process once:
import multiprocessing
# Global variable in each worker
worker_config = None

def init(debug_mode):
    global worker_config
    worker_config = {"debug": debug_mode, "initialized": True}

def process(item):
    return f"{item} (debug={worker_config['debug']})"

if __name__ == "__main__":
    with multiprocessing.Pool(
        processes=2,
        initializer=init,
        initargs=(True,)
    ) as pool:
        results = pool.map(process, ["a", "b", "c"])
    print(results)  # ['a (debug=True)', 'b (debug=True)', 'c (debug=True)']
This is useful for loading expensive resources once per worker.
ProcessPoolExecutor
The concurrent.futures module provides a higher-level API:
from concurrent.futures import ProcessPoolExecutor
import math
def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

if __name__ == "__main__":
    numbers = [10**6 + i for i in range(100)]
    with ProcessPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(is_prime, numbers))
    prime_count = sum(results)
    print(f"Found {prime_count} primes")
ProcessPoolExecutor is similar to ThreadPoolExecutor but uses processes instead of threads.
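Beyond executor.map, submit returns a Future per task, and as_completed yields those futures as each one finishes. A short sketch (the `cube` helper is made up for illustration):

```python
from concurrent.futures import ProcessPoolExecutor, as_completed

def cube(n):
    return n ** 3

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as executor:
        # submit() returns a Future immediately; map the futures
        # back to their inputs so we can label the results
        futures = {executor.submit(cube, n): n for n in [1, 2, 3, 4]}
        for future in as_completed(futures):
            n = futures[future]
            print(f"cube({n}) = {future.result()}")
```

Unlike executor.map, which yields results in input order, as_completed hands you whichever task finished first.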
Common Pitfalls
- Forgetting if __name__ == "__main__": Required on platforms that use the spawn start method (Windows always, macOS by default since Python 3.8)
- Pickling issues: Functions passed to Pool must be picklable, so lambdas and nested functions won't work
- Too many processes: More processes mean more overhead; start with the CPU count
- Shared state: Avoid sharing objects between processes when possible
- Deadlocks: Don't call methods that wait on the calling process
- Copying overhead: Large objects get pickled and copied to each worker; use shared memory for big data
Best Practices
- Use Pool for batch work: It's easier than managing processes manually
- Let the pool size itself: Pool() defaults to the machine's CPU count, which is a sensible starting point
- Chunk large data: Split big inputs into chunks (or pass chunksize to map) for better performance
- Avoid shared state: Pass data through arguments and return values
- Handle exceptions: Wrap get() calls in try/except; a worker's exception is re-raised there
- Know your start method: Windows and macOS default to spawn while Linux defaults to fork, so behavior differs across platforms
See Also
- multiprocessing module — Full module reference
- concurrent.futures — High-level process pool API
- threading module — Thread-based concurrency
- Threading in Python — When to use threads instead of processes