41.5 pg_cron: Scheduling Jobs Inside PostgreSQL

Right, so you’ve decided you need your database to do things on a schedule. Maybe you’re tired of setting up a separate cron daemon on your server, or you’re deploying to a managed service and don’t have low-level OS access. Enter pg_cron, the extension that lets you hand your recurring tasks directly to PostgreSQL. It’s like giving your butler a day planner. Sometimes this is brilliant; sometimes you’re just giving your butler more opportunities to fail spectacularly in new and exciting ways.

The core idea is simple: pg_cron runs a lightweight background worker inside your Postgres server that periodically checks a cron.job table. When the scheduled time for a job arrives, the worker simply connects to the database (like any other client) and executes the SQL command you’ve specified. This is both its greatest strength and its most significant weakness. The strength is that it’s incredibly simple and everything happens within the transaction log, making it easy to monitor. The weakness is that your job is now competing for resources with your actual database workload. You wouldn’t ask a Michelin-star chef to also scrub the floors during the dinner rush, so think carefully before scheduling a massive VACUUM ANALYZE for 12:00 PM on a Tuesday.

Getting pg_cron Running

First, you need to get this thing installed. On most systems, it’s a package (postgresql-16-cron or similar). The crucial step is adding pg_cron to the shared_preload_libraries in your postgresql.conf. This is non-negotiable, as it needs to start its background worker when the server boots.

# In postgresql.conf
shared_preload_libraries = 'pg_cron'

After a restart, connect to your database (typically the postgres database, as pg_cron metadata is stored there) and create the extension.

CREATE EXTENSION pg_cron;

And just like that, you’re in the scheduling business.

Scheduling Your First Job

The syntax is deliberately familiar to anyone who’s seen a crontab file. Let’s say you want to vacuum a specific table every night at 3 AM.

SELECT cron.schedule(
    'nightly-vacuum',  -- A name for the job. Useful for later management.
    '0 3 * * *',       -- The cron schedule: Minute 0, Hour 3, every day.
    'VACUUM ANALYZE my_important_table;'
);

This call returns a bigint which is your job ID. Hold onto that if you didn’t give it a name. The cron syntax is standard: minute hour day-of-month month day-of-week. You can use those asterisks for “every” or define ranges like '30 2 * * 1-5' for 2:30 AM every weekday.

Why It’s Not All Sunshine and Rainbows

Here’s where I have to be direct: pg_cron is fantastic for simple, fast, idempotent tasks. It is not a full-fledged job scheduler. Its error handling is… minimal. If your job fails (e.g., you have a syntax error, a table doesn’t exist, it deadlocks), pg_cron will log the error to postgresql.log and then just… run it again at the next scheduled time. There are no retry logic, no alerts, no failure callbacks. You absolutely must monitor your logs.

Another classic pitfall is getting the search_path wrong. The job runs with the permissions and search_path of the user who scheduled it, not the one who owns the table. This leads to a lot of “relation not found” errors that will make you question your sanity. Always use fully qualified names (schema_name.table_name) in your pg_cron commands. It’s a cheap insurance policy.

-- Do this
SELECT cron.schedule('vacuum-job', '0 4 * * *', 'VACUUM ANALYZE analytics.fact_sales');

-- Not this
SELECT cron.schedule('vacuum-job', '0 4 * * *', 'VACUUM ANALYZE fact_sales'); -- Will it work? Who knows!

Managing and Monitoring Jobs

You’re not flying blind. The extension provides a few ways to see what’s going on. The cron.job table shows you what you’ve scheduled.

SELECT * FROM cron.job;

More importantly, the cron.job_run_details table is your best friend. It records the last run of each job, its duration, and whether it succeeded or failed. You should build a monitoring check around this table to look for recent failures.

-- Check the status of recent job runs
SELECT jobid, jobname, status, return_message, start_time
FROM cron.job_run_details
ORDER BY start_time DESC
LIMIT 10;

When you inevitably screw up a schedule or need to stop a job, it’s easy.

-- Unschedule a job by name
SELECT cron.unschedule('nightly-vacuum');

-- Or by jobid (the bigint returned from cron.schedule)
SELECT cron.unschedule(42);

Best Practices: Don’t Be a Cowboy

Keep Jobs Short and Sweet: This is for maintenance tasks, not for hour-long ETL processes. If a job runs too long, it might overlap with its next run.
Embrace Idempotence: Design your SQL commands so that running them multiple times is safe. VACUUM, REFRESH MATERIALIZED VIEW, deleting old data—these are all great. An INSERT without conflict handling is not.
Use Fully Qualified Names: I said it before, I’ll say it again. Schema-qualify everything.
Monitor the Logs: The cron.job_run_details table is a godsend. Check it. Alert on it. Make it part of your morning routine, right after your coffee.

pg_cron is a sharp tool. It solves the “I need a simple cron” problem perfectly, but it will happily cut you if you try to use it for something it was never meant to do. Use it for its strengths, and you’ll wonder how you ever lived without it.