Multiprocessing in Python

· 5 min read · Updated March 13, 2026 · intermediate
multiprocessing concurrency parallelism processes pool

Python’s multiprocessing module lets you run code in parallel by spawning separate processes. Unlike threads, each process has its own Python interpreter and memory space, bypassing the Global Interpreter Lock (GIL). This makes multiprocessing ideal for CPU-bound tasks like mathematical computations, data processing, and machine learning.

When to Use Multiprocessing

Multiprocessing works best for CPU-intensive work: number crunching, image processing, running simulations, or training models. Each process runs in its own interpreter, so multiple processes can actually execute Python bytecode in parallel.

For I/O-bound tasks—downloading files, calling APIs, reading databases—threading or async often works better because processes have higher overhead. Threads share memory, while processes don’t, which affects how you share data between workers.
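As a rule of thumb, pick the worker type to match the workload. Here is a hedged sketch using concurrent.futures (the crunch and fetch functions are illustrative, not from a specific library):

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import urllib.request

def crunch(n):
    # CPU-bound: executes Python bytecode the whole time, so use processes
    return sum(i * i for i in range(n))

def fetch(url):
    # I/O-bound: mostly waits on the network, so threads are cheaper
    with urllib.request.urlopen(url) as resp:
        return resp.status

if __name__ == "__main__":
    with ProcessPoolExecutor() as ex:
        print(list(ex.map(crunch, [10**6] * 4)))
    with ThreadPoolExecutor() as ex:
        print(list(ex.map(fetch, ["https://example.com"] * 2)))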

Creating Processes

The simplest way to create a process is with the Process class:

import multiprocessing
import time

def cpu_task(n):
    """Simulate CPU-intensive work"""
    result = sum(i * i for i in range(n))
    return result

if __name__ == "__main__":
    start = time.time()
    
    # Create processes
    p1 = multiprocessing.Process(target=cpu_task, args=(5_000_000,))
    p2 = multiprocessing.Process(target=cpu_task, args=(5_000_000,))
    
    # Start them
    p1.start()
    p2.start()
    
    # Wait for completion
    p1.join()
    p2.join()
    
    elapsed = time.time() - start
    print(f"Completed in {elapsed:.2f} seconds")

Output:

Completed in 0.72 seconds

Both tasks run on separate CPU cores, cutting the total time roughly in half.
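Note that cpu_task's return value is discarded here: Process doesn't capture it. A minimal sketch of sending results back to the parent through a multiprocessing.Queue:

import multiprocessing

def cpu_task(n, queue):
    result = sum(i * i for i in range(n))
    queue.put(result)  # send the result back to the parent

if __name__ == "__main__":
    queue = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=cpu_task, args=(5_000_000, queue))
        for _ in range(2)
    ]
    for p in workers:
        p.start()
    # Drain the queue before joining; a child blocked on a full queue
    # can deadlock against a parent blocked in join()
    results = [queue.get() for _ in workers]
    for p in workers:
        p.join()
    print(results)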

Using Process Pools

Creating processes manually is powerful but verbose. Pool manages a pool of workers for you:

import multiprocessing

def square(n):
    return n * n

if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5, 6, 7, 8]
    
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(square, numbers)
    
    print(results)  # [1, 4, 9, 16, 25, 36, 49, 64]

The pool distributes the work across 4 processes and collects results. This is much easier than managing processes yourself.

Sharing Data Between Processes

Processes don’t share memory by default. You have several options for sharing data:

Using Manager

A Manager creates shared objects that multiple processes can access:

import multiprocessing

def worker(shared_dict, shared_list, lock):
    # The read-modify-write below isn't atomic, so guard it with a lock
    with lock:
        shared_dict["processed"] = shared_dict.get("processed", 0) + 1
        shared_list.append(shared_dict["processed"])

if __name__ == "__main__":
    manager = multiprocessing.Manager()
    shared_dict = manager.dict()
    shared_list = manager.list()
    lock = manager.Lock()
    
    processes = [
        multiprocessing.Process(target=worker, args=(shared_dict, shared_list, lock))
        for _ in range(4)
    ]
    
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    
    print(f"Dict: {dict(shared_dict)}")  # Dict: {'processed': 4}
    print(f"List: {list(shared_list)}")  # List: [1, 2, 3, 4]

Managers are flexible but slower than other approaches because they use proxies.

Using Shared Memory

For simple values, Value and Array provide faster shared memory:

import multiprocessing

def worker(counter, arr):
    with counter.get_lock():
        counter.value += 1
    with arr.get_lock():
        arr[0] *= 2  # read-modify-write, so it needs the lock too

if __name__ == "__main__":
    counter = multiprocessing.Value("i", 0)
    arr = multiprocessing.Array("i", [1, 2, 3, 4])
    
    processes = [
        multiprocessing.Process(target=worker, args=(counter, arr))
        for _ in range(4)
    ]
    
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    
    print(f"Counter: {counter.value}")  # Counter: 4
    print(f"Array: {list(arr)}")         # Array: [16, 2, 3, 4]

The get_lock() method ensures atomic updates: both updates are read-modify-write operations, so without the lock two processes could read the same old value and one increment would be lost.

Process Pools with map and starmap

Pool.map() applies a function to each item of an iterable. Pool.starmap() does the same but unpacks each item into multiple arguments:

import multiprocessing

def square(n):
    return n * n

def power(base, exponent):
    return base ** exponent

if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        # map: one argument per call
        squares = pool.map(square, [1, 2, 3])
        print(squares)  # [1, 4, 9]
        
        # starmap: each tuple is unpacked into multiple arguments
        results = pool.starmap(power, [(2, 3), (3, 2), (10, 2)])
        print(results)  # [8, 9, 100]

The worker functions must live at module level: Pool pickles them to send to the workers, and lambdas or nested functions can't be pickled.

Using apply_async for More Control

For fire-and-forget or custom result handling, use apply_async:

import multiprocessing
import time

def slow_task(n):
    time.sleep(n)
    return n * n

if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        # Submit tasks without blocking; each call returns an AsyncResult
        results = [
            pool.apply_async(slow_task, args=(i,))
            for i in [3, 1, 2]
        ]
        
        # get() blocks until that task's result is ready
        for r in results:
            print(f"Result: {r.get()}")

# Results come back in submission order, since get() waits on each in turn:
# Result: 9
# Result: 1
# Result: 4

This lets you submit many jobs up front and keep working while they run. Looping over the list and calling get() yields results in submission order, not completion order; to process results as they finish, use imap_unordered, as in the sketch below.
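A short sketch of completion-order processing with Pool.imap_unordered, reusing slow_task from above:

import multiprocessing
import time

def slow_task(n):
    time.sleep(n)
    return n * n

if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        # Yields each result as soon as its task finishes
        for result in pool.imap_unordered(slow_task, [2, 1, 3]):
            print(f"Result: {result}")

# Output (completion order):
# Result: 1   (after ~1s)
# Result: 4   (after ~2s)
# Result: 9   (after ~4s)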

Pool with Initializer

Use an initializer to set up each worker process once:

import multiprocessing

# Global variable in each worker
worker_config = None

def init(debug_mode):
    global worker_config
    worker_config = {"debug": debug_mode, "initialized": True}

def process(item):
    return f"{item} (debug={worker_config['debug']})"

if __name__ == "__main__":
    with multiprocessing.Pool(
        processes=2,
        initializer=init,
        initargs=(True,)
    ) as pool:
        results = pool.map(process, ["a", "b", "c"])
    
    print(results)  # ['a (debug=True)', 'b (debug=True)', 'c (debug=True)']

This is useful for loading expensive resources once per worker.
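For example, here is a hedged sketch where each worker compiles a regex once instead of once per task (the pattern and names are illustrative):

import multiprocessing
import re

pattern = None  # set once per worker by the initializer

def init_worker(regex_source):
    global pattern
    # Compile once per worker instead of once per task
    pattern = re.compile(regex_source)

def count_matches(text):
    return len(pattern.findall(text))

if __name__ == "__main__":
    texts = ["spam and eggs", "spam spam", "eggs"]
    with multiprocessing.Pool(2, initializer=init_worker, initargs=(r"spam",)) as pool:
        print(pool.map(count_matches, texts))  # [1, 2, 0]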

ProcessPoolExecutor

The concurrent.futures module provides a higher-level API:

from concurrent.futures import ProcessPoolExecutor
import math

def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

if __name__ == "__main__":
    numbers = [10**6 + i for i in range(100)]
    
    with ProcessPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(is_prime, numbers))
    
    prime_count = sum(results)
    print(f"Found {prime_count} primes")

ProcessPoolExecutor is similar to ThreadPoolExecutor but uses processes instead of threads.
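Its Future objects also make completion-order processing and per-task error handling straightforward. A sketch reusing is_prime from above:

from concurrent.futures import ProcessPoolExecutor, as_completed
import math

def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as executor:
        futures = {executor.submit(is_prime, n): n for n in range(10**6, 10**6 + 10)}
        for future in as_completed(futures):  # yields futures as they finish
            n = futures[future]
            try:
                print(n, future.result())  # result() re-raises worker exceptions
            except Exception as exc:
                print(f"{n} failed: {exc}")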

Common Pitfalls

  1. Forgetting if __name__ == "__main__": Required whenever the start method is spawn (the default on Windows and macOS), because each worker re-imports the main module
  2. Pickling issues: Functions passed to Pool must be picklable, so define them at module level; lambdas and nested functions fail
  3. Too many processes: More processes mean more overhead; start with the CPU count
  4. Shared state: Avoid sharing objects between processes when possible
  5. Deadlocks: Don’t join() a child that is still blocked writing to a Queue you haven’t drained; read the results first
  6. Copying large objects: Arguments and results are pickled and copied for each process; use shared memory for big data, as in the sketch below
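For genuinely large buffers, Python 3.8+ offers multiprocessing.shared_memory, which lets processes attach to the same block by name instead of copying it. A minimal sketch:

from multiprocessing import Process, shared_memory

def double_first_byte(name):
    # Attach to the existing block by name -- no copy is made
    shm = shared_memory.SharedMemory(name=name)
    shm.buf[0] = shm.buf[0] * 2
    shm.close()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=4)
    shm.buf[0] = 21
    p = Process(target=double_first_byte, args=(shm.name,))
    p.start()
    p.join()
    print(shm.buf[0])  # 42
    shm.close()
    shm.unlink()  # free the block exactly once, in the owning process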

Best Practices

  1. Use Pool for batch work: It’s easier than managing processes manually
  2. Start with the CPU count: Pool() defaults to os.cpu_count() workers, a sensible starting point for CPU-bound work
  3. Chunk large data: Pass a chunksize to map() so pickling overhead is amortized over many items (see the sketch after this list)
  4. Avoid shared state: Pass data through arguments and return values
  5. Handle exceptions: Wrap get() calls in try/except; worker exceptions are re-raised there
  6. Set the start method explicitly: The default differs across platforms (spawn on Windows and macOS, fork on Linux), shown in the sketch below
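A short sketch of the last two practices together, assuming a CPU-bound work_item function (the name is illustrative):

import multiprocessing

def work_item(n):
    return n * n

if __name__ == "__main__":
    # Make the start method explicit instead of relying on the platform default
    multiprocessing.set_start_method("spawn")
    
    data = range(100_000)
    with multiprocessing.Pool() as pool:  # defaults to os.cpu_count() workers
        # A larger chunksize amortizes pickling overhead across many items
        results = pool.map(work_item, data, chunksize=1_000)
    print(results[:5])  # [0, 1, 4, 9, 16]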

See Also