Right, so you want to talk to your PostgreSQL database from Python. You’ve probably heard of psycopg2. It’s the undisputed heavyweight champion for this job, the database adapter that’s been battle-tested for decades. It’s not the only one (asyncpg is a fantastic contender if you’re all-in on async), but it’s the most ubiquitous, stable, and feature-complete. Think of it as the trusty old Leatherman multitool in your backend kit: it might not be the shiniest, but it has every tool you’ll actually need, and it works.

First things first, let’s get it installed. Don’t just pip install psycopg2; that will likely fail miserably on your machine because it tries to compile the C bindings and you probably don’t have the libpq dev libraries just lying around. This is the first “gotcha,” and it’s a rite of passage. Instead, use the pre-compiled binary wheel:

pip install psycopg2-binary

“Yes, yes,” I hear the purists cry, “the binary distribution is not recommended for production!” They’re technically right. The -binary package is a trade-off: it’s slightly less secure and might not be optimized for your specific OS. But for 99.8% of development, testing, and even mid-scale production use, it’s absolutely fine and saves you a world of pain. If you’re running a massive, security-hardened financial system, by all means, install the system dependencies and use the pure psycopg2 package. For the rest of us, psycopg2-binary is the pragmatic choice. Let’s move on.

The Connection: Your Lifeline to the Database

Everything in psycopg2 starts with a connection. You can’t do squat without it. A connection is a relatively expensive thing to establish; it’s a network handshake, authentication, and a stateful session. This is why we use connection pools in production (more on that later). For now, let’s make one.

You’ll use a Connection String, a URI that contains all the details PostgreSQL needs. Never hardcode this. Use environment variables. I cannot stress this enough. You will accidentally commit your password to GitHub someday, and I’d rather you not spam all of our inboxes with API key alerts.

import psycopg2
import os

# Get your credentials from the environment, like a proper professional.
DB_HOST = os.getenv('DB_HOST', 'localhost')
DB_NAME = os.getenv('DB_NAME', 'my_database')
DB_USER = os.getenv('DB_USER', 'postgres')
DB_PASSWORD = os.getenv('DB_PASSWORD', 'password')

conn = None
try:
    # The classic way: with keyword arguments.
    conn = psycopg2.connect(
        host=DB_HOST,
        dbname=DB_NAME,
        user=DB_USER,
        password=DB_PASSWORD
    )
    print("Connection established!")
except psycopg2.OperationalError as e:
    print(f"Oops, couldn't connect: {e}")
finally:
    if conn:
        conn.close()

Alternatively, you can use a DSN string, which is often cleaner:

# Or use a full DSN string. Often easier for complex connections.
DSN = f"dbname={DB_NAME} user={DB_USER} password={DB_PASSWORD} host={DB_HOST}"
conn = psycopg2.connect(DSN)

The With Statement: Context is Everything

The previous example is fine, but manually closing the connection in a finally block is what our grandparents did. We’re modern developers. We use context managers (with statements). This is non-negotiable. It automatically handles closing the connection, committing transactions, or rolling back on errors. It’s cleaner and safer.

# The correct way to handle a connection in the modern era.
try:
    with psycopg2.connect(DSN) as conn:
        # Inside this block, the connection is open.
        # When this block ends, the connection is automatically closed.
        print(conn.get_dsn_parameters())
except psycopg2.OperationalError as e:
    print(f"Connection failed: {e}")

Cursors: The Workhorses

A connection is your lifeline, but a cursor is your mouthpiece. It’s the object you use to send SQL commands and retrieve results. The name is a holdover from database history, referring to the blinking cursor on an old terminal. You should almost always use a cursor as a context manager as well, ensuring it’s properly closed.

with psycopg2.connect(DSN) as conn:
    with conn.cursor() as cur:  # This creates a cursor
        cur.execute("SELECT version();")
        version = cur.fetchone()  # Fetches the next row
        print(f"You're connected to: {version[0]}")

Notice fetchone(). That’s because execute() just runs the query; it doesn’t automatically retrieve the data. You have to ask for it. Other options are fetchall() (grabs everything, careful with big results!) and fetchmany(size=10).

Parameterized Queries: The Hill I Will Die On

This is the single most important concept in database interaction. You must use parameterized queries. Never, ever, ever use string formatting (f"" or .format()) to insert variables into a SQL string. I’m not yelling, I’m just passionate about not getting your database pwned by SQL injection attacks.

psycopg2 uses the %s placeholder, regardless of the data type. It feels weird coming from other languages that use ?, but you get used to it.

# WRONG. BAD. DO NOT DO THIS.
user_id = "123'; DROP TABLE users; --"
cur.execute(f"SELECT * FROM users WHERE id = {user_id}") # Kiss your data goodbye.

# CORRECT. THIS IS THE WAY.
user_id = 123
cur.execute("SELECT * FROM users WHERE id = %s", (user_id,)) # Note the tuple!

# For multiple parameters:
username = "alice"
cur.execute(
    "SELECT * FROM users WHERE id = %s AND username = %s",
    (user_id, username)  # A tuple of parameters
)

Why the comma in (user_id,)? That’s Python syntax to create a single-element tuple. It’s a common trip-up. The second argument to execute() must be a sequence (like a tuple or list), even if it’s just one parameter.

Transactions: It’s All or Nothing

PostgreSQL transactions are explicit. This is a feature, not a bug. By default, every statement you execute is inside a transaction. If you don’t commit it, it gets rolled back. The with conn: context manager actually commits when it exits successfully or rolls back if an exception is raised. This is brilliant because it maps perfectly to the concept of a unit of work.

try:
    with psycopg2.connect(DSN) as conn:
        with conn.cursor() as cur:
            cur.execute("INSERT INTO accounts (user, balance) VALUES (%s, %s)", ('Bob', 100))
            cur.execute("INSERT INTO accounts (user, balance) VALUES (%s, %s)", ('Alice', 200))
            # At the end of this block, if no exceptions, the transaction is COMMITTED.
except Exception:
    # If any exception is raised above, the transaction is automatically ROLLED BACK.
    # Bob and Alice won't be inserted. The database remains consistent.
    print("Transaction failed!")

You can manually control this with conn.commit() and conn.rollback(), but the context manager is so foolproof it’s usually the better choice.

Connection Pooling: For the Real World

Opening a new connection for every web request or function call is a recipe for a slow, sad application. The overhead will kill you. In production, you use a connection pool. psycopg2 itself doesn’t include a pool, but it’s the foundation most pooling libraries are built on. The standard is psycopg2.pool for simple setups or a dedicated library like SQLAlchemy (which uses psycopg2 under the hood) for more complex scenarios. This is a topic for its own chapter, but know that it’s the next step after you get the basics down.

And that’s the tour. It’s not magic, it’s just a very well-designed, slightly quirky tool. Master these fundamentals—the connection, the cursor, parameterized queries, and transactions—and you’ll have a rock-solid foundation for everything else you want to build.