62.8 Connection Pooling Strategies
Right, let’s talk about connection pooling. This is one of those things that separates the dabblers from the pros. You see, opening a new database connection is a shockingly expensive operation. It’s not just a network handshake; it’s process forking, memory allocation, authentication—it’s a whole dramatic opera just to say “hello.” If your app tries to do this on every single request, you’re going to spend more time introducing yourselves than actually getting work done. Connection pooling solves this by creating a pool of persistent, reusable connections that your application can just grab, use, and return. It’s the difference between building a new car for every errand and having a garage of cars ready to go.
The Brutal Cost of a New Connection
Let’s get specific, because “expensive” is a weasel word. When your app connects to PostgreSQL, for instance, here’s a simplified version of the horror show:
- Network Latency: The TCP handshake (
SYN,SYN-ACK,ACK). - SSL Negotiation: If you’re smart and using TLS, more back-and-forth.
- PostgreSQL Startup Packet: Your client sends its version, user, database, and parameters.
- Authentication: The server challenges, you respond. Maybe it’s SCRAM-SHA-256, which involves multiple rounds of hashing.
- Backend Process Fork: PostgreSQL
fork()s a new process (or on modern versions, might use a thread) to handle your connection. This allocates memory for the entire session’s state. - Parameter Setting:
timezone,search_path,client_encoding—all this gets set per connection.
This can easily take tens of milliseconds. For a single web request aiming for under 100ms, that’s a death sentence. A pool avoids this by doing all this nonsense once, upfront.
PgBouncer: The Specialist
For PostgreSQL, the gold standard is PgBouncer. It’s a lightweight, dedicated connection pooler that sits between your app and your database. Its sole job is to manage connections efficiently. You point your application to PgBouncer (which looks just like a PostgreSQL server), and PgBouncer talks to the real PostgreSQL server with a much smaller set of persistent connections.
The magic is in its pooling modes. This is critically important and often misunderstood:
- Session pooling: A client borrows a connection for the entire duration of its session. It’s simple but less efficient. This is what most application-level pools do.
- Transaction pooling: A client borrows a connection only for the duration of a single transaction. It returns to the pool immediately after a
COMMITorROLLBACK. This is where you get insane efficiency gains. Your application can have thousands of short-lived HTTP requests, but they all share a small pool of backend connections.
Warning: Transaction pooling is brilliant, but it breaks anything that relies on session state. You cannot use SET statements to change connection parameters for longer than a single transaction. PREPARE statements are also transaction-scoped. If your ORM or framework is sloppy with session state, it will explode. You’ve been warned.
Here’s a quick docker example to get PgBouncer running. You’ll configure it via a pgbouncer.ini file.
; pgbouncer.ini
[databases]
myapp = host=localhost port=5432 dbname=my_production_db
[pgbouncer]
pool_mode = transaction
listen_port = 6432
listen_addr = *
auth_type = md5
auth_file = userlist.txt
max_client_conn = 1000
default_pool_size = 20
You’d then point your application to localhost:6432 instead of localhost:5432.
Application-Level Pooling (The Built-In Kind)
Most drivers and ORMs have built-in pooling. In Node.js with the pg library, it’s the Pool class. In Python’s psycopg2, it’s ThreadedConnectionPool. These are fine, but they have a fundamental limitation: they’re local to your application process.
If you have 10 application servers, each with a pool of 20 connections, you’re looking at 200 connections to your database. PgBouncer, being a central point, lets you have 10 app servers each making thousands of short-lived connections to the pooler, which then only maintains, say, 50 actual connections to PostgreSQL. It’s a much more efficient multiplexing layer.
Here’s a Node.js example using the excellent pg pool. Note the config options—these are the knobs you need to understand.
// app.js
const { Pool } = require('pg');
// Create the pool. This is usually a global singleton.
const pool = new Pool({
host: 'localhost',
port: 6432, // Pointing to PgBouncer!
user: 'myuser',
password: 'supersecret',
database: 'myapp',
max: 20, // max connections in this app's pool
idleTimeoutMillis: 30000, // close idle connections after 30s
connectionTimeoutMillis: 2000, // fail fast if can't connect in 2s
});
// Using the pool for a query
const result = await pool.query('SELECT * FROM users WHERE id = $1', [userId]);
// The connection is automatically returned to the pool after the query.
MongoDB and Redis: The Simpler Reality
Now, for a palate cleanser. MongoDB and Redis are much simpler here, and it’s a testament to their design for modern workloads.
MongoDB: The driver (e.g., Node.js mongodb driver) is itself a connection pool. You don’t typically need a separate pooler. When you create a MongoClient, it manages a pool of connections under the hood. You just call client.db().collection().find() and the driver handles grabbing and releasing a connection from its internal pool. The connection string even has options like maxPoolSize and minPoolSize. It just works, and it works well.
Redis: Redis is famously single-threaded. It’s designed for an insane number of lightweight connections. While connection pooling can be used, it’s often less critical than with a heavy RDBMS like PostgreSQL. The Redis client libraries are so efficient that for most use cases, creating a new connection per request (if you had to) would be less punishing. But of course, you still reuse connections! The standard best practice is to maintain a connection pool (or a single persistent connection in a simple app) and just reuse it. The Redis client in Python (redis-py) has a connection pool built into its core that manages this for you automatically.
# redis_python_example.py
import redis
# This creates a connection pool under the hood
pool = redis.ConnectionPool(host='localhost', port=6379, max_connections=10, decode_responses=True)
r = redis.Redis(connection_pool=pool)
# Every command uses a connection from the pool
r.set('my_key', 'some_value')
value = r.get('my_key')
The bottom line? For PostgreSQL, respect the connection overhead and use PgBouncer in transaction mode for high-scale web workloads. For MongoDB and Redis, trust your driver—it’s probably got this covered. And always, always monitor your connection counts. It’s the first place you look when things get slow.