91.7 ARQ: Async Job Queue with Redis

Right, so you’ve built a lovely, fast async web server with FastAPI or Litestar. It hums along beautifully until you, the genius you are, decide you need to send a welcome email, process a giant image, or crunch a massive dataset. You slap that work straight into your request handler and… congratulations, the request now sits there for ten seconds doing something that has absolutely nothing to do with serving the user’s request, and if the work is CPU-bound it freezes the event loop for every other request too. The user gets a timeout, and your app’s latency graph now looks like the Himalayas.

This is why we have job queues. They let you say, “Hey, this needs to happen, but not right this millisecond. Someone else handle it.” You put a message in a bottle, throw it into the sea (Redis), and one of a fleet of dedicated boats (your workers) picks it up and gets the work done. For the async world, ARQ is your fleet of very fast, very modern boats.

ARQ (Asynchronous RQ) is a job queue built from the ground up for Python’s async/await syntax, using Redis as its backbone. It’s the spiritual async successor to the venerable RQ (Redis Queue), and it fits into the modern async ecosystem like a glove. No more monkey-patching or threading oddities; it’s just pure, unadulterated async.

The Basic Anatomy: Kicking Off a Job

First, you define your functions. These are the tasks your workers will know how to perform. No magic here, just plain old async functions.

# tasks.py
import asyncio

async def send_email(ctx, to_address, subject, body):
    # ctx is the job context, more on that in a bit
    fake_email_server = ctx["email_server"]
    print(f"Pretending to send '{subject}' to {to_address}...")
    await asyncio.sleep(2)  # Simulate IO delay
    print("Email sent! (Not really)")

async def process_image(ctx, image_path):
    print(f"Crunching pixels for {image_path}...")
    await asyncio.sleep(5)
    print("Image processed!")
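Because a task is just a coroutine that takes ctx first, you can exercise it without Redis or a worker by handing it a plain dict. The sketch below assumes a hypothetical FakeEmailServer stand-in and a simplified send_email that returns a string; neither is part of ARQ.

```python
import asyncio


class FakeEmailServer:
    """Hypothetical stand-in that records messages instead of sending them."""

    def __init__(self):
        self.sent = []

    async def send(self, to_address, subject, body):
        self.sent.append((to_address, subject, body))


async def send_email(ctx, to_address, subject, body):
    # Same signature as the task above, but it really uses the ctx resource
    await ctx["email_server"].send(to_address, subject, body)
    return f"sent to {to_address}"


async def main():
    server = FakeEmailServer()
    ctx = {"email_server": server}  # a plain dict plays the role of the worker's ctx
    result = await send_email(ctx, "ada@example.com", "Welcome!", "Thanks for signing up.")
    return result, server.sent


result, sent = asyncio.run(main())
print(result)  # sent to ada@example.com
```

This is handy in unit tests: the task logic gets checked on its own, and the queue plumbing stays out of the picture.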

Next, you need a client, something running inside your web app to enqueue these jobs. This is where you create the ArqRedis client.

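# main.py (e.g., in your FastAPI app)
from contextlib import asynccontextmanager

from arq import create_pool
from arq.connections import RedisSettings
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Create the pool once at startup; don't reconnect on every request
    app.state.arq = await create_pool(RedisSettings())  # Default: localhost:6379
    yield
    await app.state.arq.aclose()  # redis-py 5+; use close() on older versions

app = FastAPI(lifespan=lifespan)

@app.post("/signup")
async def sign_user_up(user_data: dict):
    # ... your logic to create a user ...
    await app.state.arq.enqueue_job(
        "send_email",  # Name of the function to call
        to_address=user_data["email"],
        subject="Welcome!",
        body="Thanks for signing up.",
    )
    return {"status": "ok", "message": "User created, email queued."}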

See? enqueue_job doesn’t wait for the email to be sent. It writes the job into Redis and returns as soon as that write completes. Your request handler is free to get on with its life.

The Worker: The Muscle Behind the Operation

The functions in tasks.py don’t run by magic. You need a worker process (or ten) that’s actively listening to the Redis queue for jobs. You run this in a separate terminal, or better yet, under a process supervisor like systemd or in its own Docker container.

$ arq tasks.WorkerSettings

Wait, what’s WorkerSettings? Ah, right. This is ARQ’s slightly opinionated way of doing things. You define a Settings class that tells the worker everything it needs to know.

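# tasks.py (appended)
from arq.connections import RedisSettings

class WorkerSettings:
    functions = [send_email, process_image]
    # ARQ expects its own RedisSettings object here, not a Redis client
    redis_settings = RedisSettings.from_dsn("redis://localhost:6379")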

The worker imports this class, finds the functions list, and now knows which jobs it’s allowed to run. This is a security feature; it prevents someone from maliciously enqueuing a job to os.system('rm -rf /') if you haven’t explicitly allowed it.
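The allowlist idea is easy to see in miniature. What follows is a toy dispatcher, not ARQ’s actual implementation: the names ALLOWED and run_job are made up for illustration, and the only point is that a job name is looked up in a registry built from the functions list, never evaluated.

```python
import asyncio


async def send_email(ctx, to_address):
    return f"email to {to_address}"


# Only functions listed here can ever run, mirroring WorkerSettings.functions
ALLOWED = {fn.__name__: fn for fn in [send_email]}


async def run_job(ctx, name, *args, **kwargs):
    """Look the job name up in the registry and refuse anything unregistered."""
    fn = ALLOWED.get(name)
    if fn is None:
        raise LookupError(f"function {name!r} not found")
    return await fn(ctx, *args, **kwargs)


ok = asyncio.run(run_job({}, "send_email", "ada@example.com"))

try:
    asyncio.run(run_job({}, "os.system", "rm -rf /"))
    rejected = False
except LookupError:
    rejected = True

print(ok)        # email to ada@example.com
print(rejected)  # True
```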

Why the ctx? The Context Object

You noticed the ctx (context) parameter in every job function. This is ARQ’s killer feature. It’s a dictionary that’s created once when the worker starts up and is passed to every job. This is the perfect place to put expensive-to-create resources like database connection pools, AI model clients, or authenticated SDK clients. You avoid the overhead of setting them up for every single job.

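You define what goes into the ctx by writing a plain async function that takes ctx and pointing on_startup at it in your settings (same idea for on_shutdown). ARQ calls these hooks with ctx as the only argument, so don’t write them as methods that take self.

# tasks.py (replacing the WorkerSettings above)
from arq.connections import RedisSettings

async def startup(ctx):
    # This runs once when the worker starts
    # (the get_* helpers are placeholders for your own setup code)
    ctx["db_pool"] = await get_database_connection_pool()
    ctx["s3_client"] = get_aws_client()
    ctx["email_server"] = get_email_client()  # read by send_email
    print("Worker has started up, resources ready.")

async def shutdown(ctx):
    # This runs once when the worker shuts down
    await ctx["db_pool"].close()
    await ctx["s3_client"].close()
    print("Worker shutting down, cleaned up resources.")

class WorkerSettings:
    functions = [send_email, process_image]
    redis_settings = RedisSettings.from_dsn("redis://localhost:6379")
    on_startup = startup
    on_shutdown = shutdown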

This pattern is brilliantly efficient and something the synchronous predecessors always struggled with.
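To make the lifecycle concrete, here is a toy worker loop (again, a sketch and not ARQ’s code) with a hypothetical FakePool resource. It builds the shared dict once, hands every job a copy with per-job keys merged on top, and cleans up once at the end. ARQ does something similar in spirit, adding per-job keys such as job_id and job_try to what the job receives.

```python
import asyncio


class FakePool:
    """Hypothetical stand-in for an expensive connection pool."""

    instances = 0

    def __init__(self):
        FakePool.instances += 1
        self.closed = False

    async def close(self):
        self.closed = True


async def startup(ctx):
    ctx["db_pool"] = FakePool()


async def shutdown(ctx):
    await ctx["db_pool"].close()


async def job(ctx, n):
    # Report which job this is and which pool object it was handed
    return ctx["job_id"], id(ctx["db_pool"])


async def toy_worker(job_count):
    shared = {}
    await startup(shared)  # runs once
    results = []
    try:
        for i in range(job_count):
            # Per-job keys are merged on top of the shared resources
            results.append(await job({**shared, "job_id": f"job-{i}"}, i))
    finally:
        await shutdown(shared)  # runs once
    return shared, results


shared_ctx, results = asyncio.run(toy_worker(3))
print(FakePool.instances)  # 1
```

Three jobs, one pool: every job sees the same db_pool object, and it is closed exactly once on the way out.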

Pitfalls and Sharp Edges

Now, let’s be honest. ARQ is fantastic, but it’s not magic fairy dust.

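1. The One-Way Ticket: Enqueuing a job is a fire-and-forget operation. enqueue_job hands you back a Job handle (or None if a job with that ID is already queued), but nothing tells you when the work finishes; if you want the status or the result, you have to ask, via await job.status() or await job.result(timeout=...), which polls Redis under the hood. This is by design (it’s a queue first, not an RPC system), but it catches people off guard. ARQ only keeps results for a limited time (an hour by default), so anything you need long-term belongs in a different channel, like a database row you write yourself.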

2. Serialization Shenanigans: Your job arguments and return values get pickled and stored in Redis. This means everything you pass must be pickleable. Say goodbye to passing fancy ORM model instances or database connections. Pass IDs and primary keys instead, and let the job re-fetch the data from the database using the ctx['db_pool'] you so wisely provided.

3. Watch Your Timeouts: ARQ has a default job timeout of 300 seconds. If your job runs longer than that, the worker cancels it and marks it as failed. This is a good thing: it prevents zombie jobs. But if you have legitimately long-running tasks, you must remember to raise the limit, either for the whole worker with job_timeout in WorkerSettings or for a single function by registering it as func(process_image, timeout=1800) in the functions list. It is not something you pass to enqueue_job. Conversely, don’t set it to 5 hours without understanding the consequences.

4. The Redis Single Point of Failure: This isn’t ARQ’s fault, but it’s your problem. Your entire queueing system is now tied to the health of your Redis instance. If Redis goes down, no new jobs are enqueued, and workers can’t process them. You need to have a highly available Redis setup (like Redis Sentinel or a managed cloud offering) for anything mission-critical.
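For pitfall 2, a cheap guard is to check that your arguments pickle before you enqueue them. The helper below is a hypothetical convenience, not part of ARQ, and the User class with a lock inside merely stands in for an ORM object that is holding on to a live session.

```python
import pickle
import threading


def picklable(value):
    """Return True if value survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(value)
        return True
    except Exception:
        return False


class User:
    def __init__(self, user_id):
        self.id = user_id
        self._lock = threading.Lock()  # stands in for a live session or connection


user = User(42)
print(picklable(user))     # False
print(picklable(user.id))  # True
```

Pass user.id, and let the job look the row up again on the other side.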

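For pitfall 3, it helps to see what a timeout does to a running coroutine. This is a toy built on asyncio.wait_for, not ARQ’s worker code, and run_with_timeout is a made-up name. The job is cancelled mid-sleep, and its finally block still runs, which is where cleanup for long tasks should live.

```python
import asyncio


async def slow_job(ctx):
    try:
        await asyncio.sleep(10)  # pretend this is a very long task
        return "done"
    finally:
        ctx["cleaned_up"] = True  # runs even when the job is cancelled


async def run_with_timeout(coro_fn, ctx, timeout_s):
    """Toy version of a worker enforcing a per-job time limit."""
    try:
        return await asyncio.wait_for(coro_fn(ctx), timeout=timeout_s)
    except asyncio.TimeoutError:
        return "timed out"


ctx = {}
outcome = asyncio.run(run_with_timeout(slow_job, ctx, 0.05))
print(outcome)  # timed out
```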