82.8 OWASP Python Security Cheat Sheet
Right, let’s talk about securing your Python applications. This isn’t about slapping a helmet on a hamster and calling it a day. Security is a process, a mindset, and frankly, it’s about understanding that the world is full of people with more free time and worse intentions than you can possibly imagine. The OWASP Cheat Sheet is a fantastic starting point, but I’m here to give you the color commentary—the “why” behind the “what.”
The Cardinal Sin: Rolling Your Own Crypto
I need to get this out of the way first because it’s the most important rule. Do not, under any circumstances, try to invent your own cryptography. I don’t care how clever your algorithm for shuffling bits seems after three coffees. You are not Auguste Kerckhoffs. You are not Claude Shannon. The world’s most brilliant cryptographers have dedicated their lives to creating and breaking these systems. Your custom SuperSecretEncoder class will be broken by a bored intern in under five minutes. Use the well-vetted, high-level libraries that have withstood decades of brutal public scrutiny. This is the one area where you absolutely must be a sheep and follow the herd.
Use cryptography, Not The Standard Library’s ssl or hashlib (Directly)
Python’s standard library has modules like hashlib and ssl. They’re… fine. But they’re also footguns waiting to happen because they’re too low-level. You have to know exactly what you’re doing to use them correctly. Instead, use the cryptography library (pip install cryptography). It’s a binding to OpenSSL, but it provides a fantastic “hazmat” (hazardous materials) layer for when you need raw power and, more importantly, lovely, safe high-level recipes for the things you actually need.
For example, need to hash a password? Don’t just call hashlib.sha256("mypassword".encode()).hexdigest(). That’s terrible. It’s fast, which is bad for passwords, and with no salt it offers no protection against rainbow tables. Here’s how you do it right with the cryptography library’s PBKDF2HMAC key derivation function. Note that it lives in the hazmat layer, not the recipes layer, so the parameters are yours to get right:
import os

from cryptography.exceptions import InvalidKey
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

SALT_LENGTH = 16
# Treat this as a floor; OWASP's 2023 guidance for PBKDF2-HMAC-SHA256 is 600,000.
ITERATIONS = 100_000


def _make_kdf(salt: bytes) -> PBKDF2HMAC:
    # Configure our Key Derivation Function: PBKDF2 over SHA256.
    # A KDF instance is single-use, so build a fresh one for every call.
    # (The backend argument is optional in current releases of cryptography.)
    return PBKDF2HMAC(
        algorithm=hashes.SHA256(),
        length=32,
        salt=salt,
        iterations=ITERATIONS,
    )


def hash_password(password: str) -> bytes:
    # Generate a fresh, random salt. This is non-negotiable.
    salt = os.urandom(SALT_LENGTH)
    # Derive the key (the hashed password)
    key = _make_kdf(salt).derive(password.encode())
    # Return the salt and the key concatenated so you can store them together
    return salt + key


def verify_password(password: str, hashed: bytes) -> bool:
    # Extract the salt from the stored value (first 16 bytes)
    salt = hashed[:SALT_LENGTH]
    stored_key = hashed[SALT_LENGTH:]
    # Recreate the KDF with the same parameters and the original salt,
    # then use verify() instead of derive() to avoid timing attacks
    try:
        _make_kdf(salt).verify(password.encode(), stored_key)
        return True
    except InvalidKey:
        return False


# Usage:
hashed = hash_password('my_secure_password')
# Store `hashed` in your database.
is_correct = verify_password('my_secure_password', hashed)
Why is this better? The random salt means identical passwords don’t hash to the same value, thwarting precomputed attacks. The high iteration count makes brute-forcing computationally expensive; treat 100,000 as a floor rather than a target, since OWASP’s Password Storage Cheat Sheet recommended 600,000 iterations for PBKDF2-HMAC-SHA256 as of 2023. And verify checks the password with a constant-time comparison, preventing timing attacks that could reveal information about the correct hash.
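To watch the salt and the constant-time comparison do their jobs without installing anything, here is a minimal sketch that uses only the standard library. It swaps in hashlib.pbkdf2_hmac and hmac.compare_digest as stand-ins for PBKDF2HMAC and verify(). The names demo_hash and demo_verify, the salt-then-key layout, and the deliberately low iteration count are assumptions for illustration, not a suggestion to abandon the cryptography library in production.

```python
import hashlib
import hmac
import os

SALT_LEN = 16
# Deliberately low so the demo runs instantly; not a production value.
DEMO_ITERATIONS = 10_000


def demo_hash(password: str) -> bytes:
    # A fresh random salt per password, stored in front of the derived key.
    salt = os.urandom(SALT_LEN)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, DEMO_ITERATIONS)
    return salt + key


def demo_verify(password: str, stored: bytes) -> bool:
    salt, expected = stored[:SALT_LEN], stored[SALT_LEN:]
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, DEMO_ITERATIONS)
    # compare_digest takes the same time whether the mismatch is early or late.
    return hmac.compare_digest(candidate, expected)


first = demo_hash("hunter2")
second = demo_hash("hunter2")
print(first != second)                 # same password, different stored values
print(demo_verify("hunter2", first))   # correct password
print(demo_verify("hunter3", first))   # wrong password
```

Because each call draws a new salt, the two stored values for the same password differ, which is exactly what makes a precomputed table useless.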
Secrets Management: Or, How to Stop Hardcoding Your Demise
You’ve seen it in terrible tutorials. You’ve probably done it yourself. password = "supersecret123" right there in the source code. This is a spectacularly bad idea. The moment you commit that to a git repository, your secret lives in the history, and deleting the line in a later commit does not remove it. Keep secrets out of the code entirely: read them from environment variables, and in production have a proper secrets manager such as AWS Secrets Manager, HashiCorp Vault, or Azure Key Vault supply them. For development, a .env file (load it with the python-dotenv package) that is explicitly listed in your .gitignore is the bare minimum.
# .gitignore
# Add this line to ensure you don't commit secrets
.env

# app.py
import os

from dotenv import load_dotenv

load_dotenv()  # Loads variables from .env into the environment

# Get your secret key from the environment
SECRET_KEY = os.getenv('MY_APP_SECRET_KEY')
if not SECRET_KEY:
    raise ValueError("No MY_APP_SECRET_KEY set for the application. Check your .env file.")
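Once an application needs more than one secret, the check above tends to get copy-pasted. A small helper keeps the fail-fast behaviour in one place. This is a sketch using only the standard library; require_env and the DEMO_DB_PASSWORD variable are hypothetical names chosen for the example, and the assignment to os.environ only simulates what your shell, .env loader, or secrets manager would do.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail loudly at startup."""
    value = os.environ.get(name, "")
    if not value.strip():
        # Name the variable but never echo a value: error messages end up in logs.
        raise RuntimeError(f"Required environment variable {name} is not set")
    return value


# Simulated here for the demo; in real use this is set outside the program.
os.environ["DEMO_DB_PASSWORD"] = "example-only"

db_password = require_env("DEMO_DB_PASSWORD")
```

Failing at import time, rather than on the first request that needs the secret, means a misconfigured deployment dies immediately instead of limping along.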
Input Validation: Your First and Best Line of Defense
Assume every single byte of data coming into your application (from a user, from an API, from a database, from a file you read) is malicious until proven otherwise. This isn’t paranoia; it’s professionalism. Use a robust validation library like Pydantic. It forces you to define a schema and will coerce and validate data for you, throwing clear errors when something doesn’t fit. This turns away a large share of malformed and hostile input before it gets anywhere near your business logic, though it complements rather than replaces parameterized queries against injection and output encoding against XSS.
# Written against the Pydantic v1 API. In Pydantic v2, constr takes pattern=
# instead of regex=, and @validator is replaced by @field_validator.
# EmailStr needs the email-validator package installed.
from pydantic import BaseModel, EmailStr, ValidationError, constr, validator


class UserCreate(BaseModel):
    # This defines the contract. Unknown fields are ignored by default.
    username: constr(min_length=3, max_length=20, regex="^[a-zA-Z0-9_]+$")  # Letters, digits, underscore
    email: EmailStr  # Built-in email validation
    password: constr(min_length=12)  # Enforce minimum length

    @validator('password')
    def password_complexity(cls, v):
        # Add your own complexity rules
        if not any(char.isdigit() for char in v):
            raise ValueError('Password must contain a digit')
        if not any(char.isupper() for char in v):
            raise ValueError('Password must contain an uppercase letter')
        return v


# Later, in your endpoint (Flask-style; `request` comes from your framework):
def create_user():
    try:
        user_data = UserCreate(**request.json)  # Validation happens here!
    except ValidationError as e:
        return {"errors": e.errors()}, 400
    # ... carry on with the validated user_data ...
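This code does the heavy lifting for you. It ensures the username is a safe format, the email is actually an email, and the password meets your length and character rules. An attacker can’t get a malicious username like {"username": "admin'; DROP TABLE users; --"} past it because the validation regex will reject it outright. That still doesn’t license building SQL by string concatenation; validated data should reach the database through parameterized queries.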
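The username rule can be checked in isolation with nothing but the standard library, which is handy for convincing yourself the pattern really rejects what you think it rejects. The sketch below is an illustration, not how Pydantic works internally; is_valid_username is a hypothetical helper. It also shows one subtlety of Python’s re module: with re.match, a trailing $ will match just before a final newline, so re.fullmatch is the stricter choice. Whether that matters for a given validation library depends on how it applies the pattern.

```python
import re

# Same character class as the schema: letters, digits, underscore.
USERNAME_PATTERN = re.compile(r"[a-zA-Z0-9_]+")


def is_valid_username(candidate: str) -> bool:
    if not 3 <= len(candidate) <= 20:
        return False
    # fullmatch requires the whole string to match, newline included.
    return USERNAME_PATTERN.fullmatch(candidate) is not None


print(is_valid_username("alice_01"))                       # True
print(is_valid_username("admin'; DROP TABLE users; --"))   # False
print(is_valid_username("admin\n"))                        # False

# The anchored pattern with re.match lets a trailing newline through:
print(re.match(r"^[a-zA-Z0-9_]+$", "admin\n") is not None)  # True
```

If you keep the ^...$ form, swapping $ for \Z is one way to get the strict end-of-string behaviour with re.match.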