35.2 pg_dumpall: Dumping Globals and All Databases

Right, so you’ve mastered pg_dump for a single database. Good for you. But a PostgreSQL instance is more than just a collection of databases; it’s a little ecosystem with users, permissions, and settings that live outside any one database. This is where pg_dumpall comes in. Think of it as the over-caffeinated, slightly chaotic cousin of pg_dump that tries to back up everything in one go. It’s indispensable, but you have to understand its quirks, or it will happily give you a false sense of security.

The core job of pg_dumpall is to dump “globals”—the cluster-wide objects like roles (users) and tablespaces—and then sequentially dump every single database in the cluster. Its output is a single, massive SQL script. This is its greatest strength and its most glaring weakness. It’s a logical dump, meaning it’s meant for migration, cloning environments, or for those of you who just want a simple, human-readable SQL file of your entire universe. It is categorically not your tool for point-in-time recovery or large-scale, high-availability backups. We’ll get to the tools for that later.

What It Actually Backs Up (And What It Doesn’t)

Let’s be precise. When I say “globals,” I mean:

Roles and Role Properties: Every user, group, and their passwords (hashed, unless you force it otherwise), privileges, and ALTER ROLE settings.
Tablespaces: The definitions of your tablespaces (where they point on disk), but crucially, not the data files within them. You’ll have to ensure those directories exist and are accessible on the target system.

What it doesn’t* back up from the globals is just as important. It completely ignores the postgresql.conf, pg_hba.conf, and any other configuration file. These live on the filesystem, and it’s your job to back them up separately. Forgetting this is a classic rookie mistake that will leave you with a perfectly restored cluster that can’t start or accept connections.

The Basic, No-Frills Command

The most common invocation is painfully simple. You run it as a user with sufficient privileges (usually the postgres superuser) and redirect the output to a file.

pg_dumpall > /path/to/backup/my_full_cluster_backup.sql

That’s it. To restore, you’d feed that SQL file back into psql. But please, for the love of all that is holy, don’t do this on a live production system without reading the next part.

psql -f my_full_cluster_backup.sql postgres

Why You Should Almost Always Use `-g` and `--clean`

Running the raw command above is like moving houses by throwing all your possessions into one giant, unlabeled box. It’ll work, but you’ll probably break something. The biggest issue? The default output script does not include commands to drop existing objects. If you try to restore over an existing cluster that has roles or databases with the same names, you’ll get a waterfall of “already exists” errors.

This is where the --clean flag comes in. It tells pg_dumpall to include commands like DROP DATABASE and DROP ROLE before the CREATE commands. This is essential for creating an idempotent script—one you can run multiple times to get to the same clean state.

However, and this is a big however, mixing --clean with a full cluster restore on a live system is a terrifyingly bad idea. Imagine the script dropping the database your application is currently connected to. Chaos.

The professional move, and my strong recommendation, is to split the operation. Use the -g (or --globals-only) flag to dump just the globals, and then use individual pg_dump commands for each database. This gives you fine-grained control.

# Dump ONLY the globals (roles, tablespaces) with --clean
pg_dumpall -g --clean > /path/to/backup/globals_with_clean.sql

# Dump each database individually with pg_dump's better options
pg_dump -Fc -f myapp_db.backup myapp_db
pg_dump -Fc -f otherapp_db.backup otherapp_db

Now, to restore, you can first safely load the globals script to recreate your roles and tablespaces, and then use pg_restore on each custom-format dump for the databases. This is cleaner, safer, and more flexible.

The Password Problem and `--no-role-passwords`

Here’s a quirk that trips everyone up. By default, pg_dumpall includes ALTER ROLE ... ENCRYPTED PASSWORD '...'; for every role. If you’re restoring a backup to a staging or development environment, you probably don’t want to overwrite the passwords your team uses to connect to those environments with the production passwords. It’s a mess.

The --no-role-passwords flag is your friend here. It omits the ALTER ROLE commands that set passwords. The roles will be created, but their passwords will remain whatever they were on the target system before the restore (or undefined if it’s a new role). You can then manage passwords separately, which is almost always what you want for anything other than a disaster recovery scenario.

When To Use It (And When To Avoid It)

Use pg_dumpall -g for:

Capturing all roles and tablespace definitions as part of a broader backup strategy.
Migrating a small cluster to a new version or server.
Cloning a development environment from production.

Avoid the full pg_dumpall (without -g) for:

Large clusters: The resulting SQL file is monolithic and slow to restore.
Point-in-time recovery (PITR): It’s a snapshot of a single moment, not a continuous backup. You can’t rewind to 10:32 AM yesterday.
Minimizing downtime: The dump and restore process will take time, during which your data is changing.

So, in summary: lean on pg_dumpall -g --clean to reliably manage your globals. For your actual database data, you’re almost always better off with individual pg_dump jobs. It’s less convenient upfront but saves you from a world of pain later. Trust me, I’ve been there.

What It Actually Backs Up (And What It Doesn’t)

The Basic, No-Frills Command

Why You Should Almost Always Use -g and --clean

The Password Problem and --no-role-passwords

When To Use It (And When To Avoid It)

Why You Should Almost Always Use `-g` and `--clean`

The Password Problem and `--no-role-passwords`