47.8 Combining Multiprocessing with asyncio

Combining the process-based parallelism of the multiprocessing module with the cooperative concurrency of asyncio is a powerful technique for building highly scalable applications in Python. This hybrid approach allows you to bypass the Global Interpreter Lock (GIL) for CPU-intensive tasks while simultaneously managing thousands of I/O-bound operations. The core challenge lies in orchestrating communication between the synchronous, process-isolated world of multiprocessing and the asynchronous, single-threaded event loop of asyncio.

The Event Loop and Process Isolation

The fundamental reason these two worlds don’t seamlessly integrate is process isolation. An asyncio event loop exists within a single process and thread. When you launch a separate process using multiprocessing.Process, it receives a complete copy of the parent’s memory space and, crucially, its own Python interpreter and a separate event loop. The child process knows nothing about the parent’s event loop, and vice versa. Therefore, you cannot directly await a function running in another process; the event loops are not connected. The solution is to use the multiprocessing module’s communication primitives (like Queue, Pipe, or shared memory) to send messages and synchronize between the asynchronous parent and its synchronous worker processes, treating the workers as independent entities.

The run_in_executor Pattern for CPU-bound Tasks

The most straightforward and common method for combining asyncio with multiprocessing is to use the loop.run_in_executor() method. This allows you to offload a blocking, CPU-intensive function to a ProcessPoolExecutor, freeing the main event loop thread to handle other asynchronous tasks while the computation runs in the background.

import asyncio
import concurrent.futures
import multiprocessing
import time

def cpu_bound_task(n):
    """A simulated CPU-intensive task."""
    start = time.time()
    # e.g., Factorial calculation, image processing, etc.
    result = sum(i * i for i in range(n))
    duration = time.time() - start
    return (result, duration, multiprocessing.current_process().name)

async def main():
    n = 10_000_000
    loop = asyncio.get_running_loop()
    
    # Create a ProcessPoolExecutor with 2 workers
    with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
        # Offload the synchronous function to the process pool.
        # The await yields control back to the event loop until the result is ready.
        result = await loop.run_in_executor(executor, cpu_bound_task, n)
        print(f"Result: {result[0]}, computed in {result[1]:.2f}s by {result[2]}")
        
        # You can launch multiple tasks concurrently on the executor
        tasks = [loop.run_in_executor(executor, cpu_bound_task, n) for _ in range(4)]
        results = await asyncio.gather(*tasks)
        for res in results:
            print(f"Result from {res[2]}: {res[1]:.2f}s")

if __name__ == "__main__":
    asyncio.run(main())

Why this works: The event loop manages the Future object returned by run_in_executor. It registers the future with the executor and then suspends the coroutine. A thread within the event loop’s thread pool (separate from the processes) blocks, waiting for the result from the multiprocessing.Queue that the ProcessPoolExecutor uses internally. Once the worker process finishes and places the result on the queue, the waiting thread unblocks, and the event loop marks the future as complete, rescheduling the coroutine to process the result. This keeps the main thread responsive.

Direct Process Management with Asyncio

For more complex scenarios involving long-lived worker processes that need bidirectional communication, you can manage multiprocessing.Process objects directly and use an asyncio Queue to facilitate communication back to the event loop. This requires careful synchronization, often involving a dedicated listener thread to monitor a multiprocessing.Queue and relay messages to an asyncio Queue.

import asyncio
import multiprocessing
import queue
import threading
from multiprocessing import Queue as MpQueue

def worker_process(mp_queue: MpQueue, data):
    """Synchronous worker process."""
    result = data * 2  # Some processing
    # Send result back to parent via multiprocessing.Queue
    mp_queue.put(f"Processed by {multiprocessing.current_process().name}: {result}")

async def async_listener(async_queue: asyncio.Queue):
    """Async coroutine that listens for results on the asyncio queue."""
    while True:
        result = await async_queue.get()
        if result is None:  # Sentinel value to shut down
            break
        print(f"Main loop received: {result}")
        async_queue.task_done()

def queue_listener_thread(mp_queue: MpQueue, async_queue: asyncio.Queue):
    """Target for a thread that bridges the mp queue to the asyncio queue."""
    while True:
        try:
            item = mp_queue.get(timeout=0.1)
            # Schedule putting the item onto the async queue in the event loop thread.
            asyncio.run_coroutine_threadsafe(async_queue.put(item), loop)
        except queue.Empty:
            continue

async def main():
    mp_queue = MpQueue()
    async_queue = asyncio.Queue()
    loop = asyncio.get_running_loop()

    # Start the thread that bridges the multiprocessing and asyncio queues
    listener_thread = threading.Thread(target=queue_listener_thread, args=(mp_queue, async_queue), daemon=True)
    listener_thread.start()

    # Start the async consumer
    listener_task = asyncio.create_task(async_listener(async_queue))

    # Launch worker processes
    processes = []
    for i in range(3):
        p = multiprocessing.Process(target=worker_process, args=(mp_queue, i))
        p.start()
        processes.append(p)

    # Wait for all processes to finish
    for p in processes:
        p.join()

    # Signal the async listener to stop and wait for it
    await async_queue.put(None)
    await listener_task

if __name__ == "__main__":
    asyncio.run(main())

Common Pitfalls and Best Practices

Pickling Errors: Functions and objects passed to a ProcessPoolExecutor or multiprocessing.Process must be picklable. Lambda functions and locally defined functions are often not picklable. Define functions at the top level of your module.
Overhead: Creating processes has significant overhead. This pattern is only beneficial for tasks substantial enough to outweigh the cost of process creation and inter-process communication (IPC). Avoid using it for very small, frequent tasks.
Shared State: Avoid sharing state between processes. If you must share state, use explicit multiprocessing mechanisms like Value, Array, or Manager, not global variables. Remember, processes have independent memory spaces.
Deadlocks: Be extremely cautious with queues. If a child process puts data on a queue and the parent never reads it, the child will block forever once the queue fills up. Always ensure there is a consumer for every producer.
if __name__ == "__main__": Guard: This guard is mandatory for Windows and good practice on Unix when using multiprocessing. It prevents the child processes from recursively executing the main module code, which can cause runtime errors or infinite loops.
Resource Limits: The operating system imposes limits on the number of processes and threads. Creating thousands of processes is neither efficient nor possible. Use a ProcessPoolExecutor to manage a controlled pool of workers.