61.2 Parameterized Queries and Preventing SQL Injection
Right, let’s talk about the one security mistake that will get your app featured on the evening news for all the wrong reasons: SQL injection. It’s the digital equivalent of leaving your front door wide open with a sign that says “All my valuables are in the living room, please don’t steal them.” It’s absurdly common, devastatingly effective, and, frankly, embarrassingly easy to prevent. I’m going to show you how. Not because some security manual told me to, but because I’ve seen the cleanup, and it’s not pretty.
The core problem is simple: you’re trying to mix untrusted, user-provided data (like a search term from a form) with the trusted, structured language of SQL. The classic, naive way to do this is by string concatenation. It feels intuitive, but it’s a catastrophic error.
# DO NOT DO THIS. EVER.
user_id = input("Enter your user ID: ")
query = f"SELECT * FROM users WHERE id = {user_id}"
See that? We’re just smashing the user’s input directly into the SQL string. If a user enters 1, this works fine. It becomes SELECT * FROM users WHERE id = 1. But if a malicious user enters 1; DROP TABLE users; --, your query becomes:
SELECT * FROM users WHERE id = 1; DROP TABLE users; --
Congratulations, your users table is now gone. The -- comments out the rest of any trailing SQL you might have had, making the attack clean. This isn’t theoretical; it’s how countless systems have been pwned.
The Only Sanctioned Escape: Parameterized Queries
The solution is not to try and “sanitize” the input yourself with regex or string replacement. You will fail. The winning move is to never let the user’s data touch the SQL string in the first place. You use parameterized queries (or prepared statements).
You write your SQL query with placeholders (usually ? for sqlite3, %s for many others). This query is sent to the database first. The database parses it, understands the structure (“ah, this is a SELECT with a WHERE clause looking for an ID”), and prepares an execution plan. Then, and only then, do you send the user’s data values separately. The database knows these values are data, not SQL commands. It inserts them safely into the already-compiled query structure.
import sqlite3
conn = sqlite3.connect('app.db')
cursor = conn.cursor()
# The correct way. Note the ? placeholder.
user_id = input("Enter your user ID: ")
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))
# For multiple parameters, the order matters.
cursor.execute(
"INSERT INTO users (name, email) VALUES (?, ?)",
('Alice', 'alice@example.com')
)
The magic is that the database driver handles the escaping for you, in a way that is specific to the database’s own rules. It’s foolproof.
How This Looks in SQLAlchemy Core (and why it’s better)
While the ? syntax works, it’s a bit clunky. SQLAlchemy Core provides a much cleaner, more Pythonic interface for the exact same security benefit. It uses named parameters (:name) and expects a dictionary.
from sqlalchemy import create_engine, text
engine = create_engine('sqlite:///app.db')
with engine.connect() as conn:
# Using named parameters. Much more readable.
result = conn.execute(
text("SELECT * FROM users WHERE id = :user_id"),
{"user_id": 123}
)
# Or for an insert, with multiple values. Notice the clarity.
conn.execute(
text("INSERT INTO users (name, email) VALUES (:name, :email)"),
{"name": 'Bob', "email": 'bob@builder.com'}
)
conn.commit()
The reason this is objectively better isn’t just readability. If you need to use the same value multiple times in a query, with the ? style you have to pass it multiple times in the tuple. With SQLAlchemy’s named parameters, you define it once in the dictionary and reference it everywhere. It reduces errors.
The ORM Handles It For You (Mostly)
If you’re using the SQLAlchemy ORM, you’re almost certainly safe by default. When you say session.query(User).filter(User.name == user_input_name), the ORM is building a parameterized query behind the scenes. You’re not building strings, so you’re not vulnerable to injection through the standard query API.
The big, glaring “BUT” here is when you get clever and use text() constructs or raw SQL within your ORM queries. The moment you write session.query(User).filter(text(f"name = '{user_input_name}'")), you’ve just thrown all that protection out the window and are back to string concatenation. If you must use text(), use it with parameters: filter(text("name = :name"), {"name": user_input_name}).
Common Pitfalls and the “Almost” Edge Case
The biggest pitfall is thinking you’re smarter than the system. Don’t. Just use parameters. Always.
A more subtle edge case involves “IN” clauses. You might be tempted to dynamically build a string of placeholders. Don’t. Use the SQLAlchemy tuple_ function or a similar safe constructor provided by your toolkit.
The one thing parameterized queries can’t do is protect dynamic table or column names. You can’t parameterize an identifier (like a table name). If you need to make that dynamic (which is often a design smell), you must whitelist the possible values in your code and map the user input to a known-safe value.
# UNSAFE: Can't parameterize the table name.
table_name = input("Which table? ") # user enters 'users; DROP TABLE payments; --'
query = f"SELECT * FROM {table_name}"
# SAFE: Use a whitelist mapping.
allowed_tables = {'users': 'users', 'products': 'products'}
table_key = input("Which table? ")
table_name = allowed_tables.get(table_key, 'users') # defaults to 'users' on bad input
query = f"SELECT * FROM {table_name}"
It’s not elegant, but it’s safe. The rule is simple: if it’s data, use a parameter. If it’s part of the structure of the query itself, it must be hardcoded or rigorously whitelisted. There is no third option. Now go forth and don’t get hacked.