24.3 Repeatable Read: Snapshot Isolation
Alright, let’s pull back the curtain on Repeatable Read, which is arguably the most misunderstood and misnamed isolation level in the whole bunch. The SQL standard calls it “Repeatable Read,” but what most modern databases (think PostgreSQL, MySQL with InnoDB, SQL Server) actually give you under this setting is something far more powerful and useful: Snapshot Isolation.
The name “Repeatable Read” suggests a modest superpower: the guarantee that once you read a value in a transaction, you’ll get that same value back if you read it again. No disappearing acts. And that is true. But Snapshot Isolation goes much further. It gives your entire transaction its own personal, frozen-in-time view of the database. The moment your transaction begins, it takes a snapshot of the committed data. From that point on, for the entire life of your transaction, it’s like you’re the only person in the world using the database. You are blissfully isolated from the chaotic, messy changes everyone else is making. This is the database equivalent of putting on noise-canceling headphones.
How It Actually Works: Your Private Universe
When you start a transaction under REPEATABLE READ (Snapshot Isolation), the database engine notes the current transaction ID, which acts as a timestamp. Any row that was committed by a transaction with an ID less than or equal to yours is visible to you. Any row committed by a transaction with a later ID, or not yet committed, is invisible. It’s your own private universe, built from the state of the universe the moment you arrived.
This is fundamentally different from READ COMMITTED. In READ COMMITTED, every new statement in your transaction might see a slightly different world because it picks up the latest committed changes. Under Snapshot Isolation, your world is static. This is why you get the “repeatable read”: your SELECT at the start of your transaction and your SELECT at the end will return the exact same data, even if another transaction has changed and committed those rows a hundred times in the interim.
Let’s see this in action. We’ll use a simple bank_accounts table.
-- Session 1: Your transaction
START TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM bank_accounts WHERE account_id = 101;
-- Returns: 500
-- Session 2: Someone else comes along and updates the balance
UPDATE bank_accounts SET balance = 400 WHERE account_id = 101;
COMMIT; -- The change is now committed and visible to new transactions
Now, back in your original transaction:
-- Session 1: Still in your transaction
SELECT balance FROM bank_accounts WHERE account_id = 101;
-- Still Returns: 500 (Not 400!)
-- The change in Session 2 is completely invisible to you.
UPDATE bank_accounts SET balance = balance - 100 WHERE account_id = 101;
-- This update operates on the *version* of the row you can see (500).
-- It will set the balance to 400.
COMMIT;
Wait, what? If the other transaction already set it to 400, and you subtracted 100 from your version (500), did you just overwrite their change and set it to 400, effectively losing their update? In a standard REPEATABLE READ implementation, absolutely not. This is where the magic of Multi-Version Concurrency Control (MVCC) kicks in.
The Write-Skew Anomaly: The Kryptonite
Here’s the part the SQL standard committee probably didn’t fully anticipate, and it’s the biggest “gotcha” with Snapshot Isolation. It prevents dirty reads, non-repeatable reads, and even lost updates (via row-level locking). But it can’t inherently prevent write-skew.
Write-skew is a classic concurrency anomaly. It occurs when two transactions read data based on a logical constraint (e.g., “there must always be at least one doctor on call”), each makes a change that seems valid in its own private snapshot, but when both commit, the constraint is violated globally.
Imagine a table for on-call doctors:
CREATE TABLE on_call (
doctor_id INT PRIMARY KEY,
on_call BOOLEAN NOT NULL
);
INSERT INTO on_call VALUES (1, true), (2, true);
-- Rule: At least one doctor must be on call at all times.
Now, two transactions run simultaneously:
-- Transaction 1 (Dr. 1 going off duty)
START TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM on_call WHERE on_call = true;
-- Sees 2 doctors on call. Okay for me to leave.
UPDATE on_call SET on_call = false WHERE doctor_id = 1;
-- Transaction 2 (Dr. 2 going off duty)
START TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM on_call WHERE on_call = true;
-- Also sees 2 doctors on call. Okay for me to leave.
UPDATE on_call SET on_call = false WHERE doctor_id = 2;
Both transactions commit. Now, SELECT * FROM on_call; shows both doctors are off duty. The rule is broken. Snapshot Isolation ensured each transaction had a consistent view, but that view was stale regarding the other’s uncommitted changes. The database’s built-in locking prevented a direct write-write conflict on the same row, but the logical constraint involved multiple rows, and that’s where it failed.
Best Practices and When to Use It
So, when do you wield this powerful but slightly dangerous tool?
For Long-Running Reporting Queries: This is the killer app. You need a consistent view of the data for a complex, multi-table report that takes minutes to run. You can’t have the numbers shifting underneath you. Snapshot Isolation is perfect for this.
When Your Business Logic Requires Consistency: If you have a workflow where you need to make decisions based on a set of values that must not change during your transaction, this is your isolation level.
And How to Handle Write-Skew: You have two main weapons:
- Pessimistic Locking: Use
SELECT ... FOR UPDATEon the rows that are part of the logical constraint. In the doctor example, both transactions would have doneSELECT ... FROM on_call WHERE on_call = true FOR UPDATE;. This would have forced the second transaction to wait for the first to commit, allowing it to see the new count of 1 and (hopefully) abort its change. - Serializable Isolation: For the most complex constraints, sometimes you just have to bite the bullet and move up to the
SERIALIZABLElevel, which is specifically designed to sniff out and prevent these sorts of anomalies, often using predicate locking.
- Pessimistic Locking: Use
Snapshot Isolation is a workhorse. It provides an incredible balance of performance and consistency for the vast majority of applications. Just remember it’s not a silver bullet. Understand its model—your private, frozen snapshot—and be hyper-aware of the edge cases where logical constraints between rows can be violated. It’s not a bug; it’s a fundamental characteristic of the isolation level. Your job is to code accordingly.