[BEE-8002] Isolation Levels and Their Anomalies
INFO
Isolation is the most nuanced ACID property. Understanding which anomalies each level prevents -- and which it allows -- is a prerequisite for writing correct concurrent code.
Context
The SQL standard defines four transaction isolation levels. Every database defaults to one of them. Most engineers accept the default without understanding what it means for concurrent behavior -- and then spend weeks debugging race conditions that the database was never asked to prevent.
The anomalies each level permits are not theoretical. Dirty reads have caused double charges. Non-repeatable reads have caused inventory oversells. Write skew has permitted users to take actions that should have been mutually exclusive. Understanding the matrix of "level vs. anomaly" is not academic preparation -- it is a prerequisite for writing correct code under concurrency.
Related
For the foundational ACID model, see BEE-8001. For distributed transaction coordination, see BEE-8003. For concurrency control strategies, see BEE-11006.
The Four SQL Standard Isolation Levels
The SQL-92 standard defines four levels in order of increasing strictness:
- Read Uncommitted -- weakest; no protection against any anomaly
- Read Committed -- prevents dirty reads; default in PostgreSQL, Oracle
- Repeatable Read -- prevents dirty and non-repeatable reads; default in MySQL InnoDB
- Serializable -- strongest; prevents all standard anomalies including phantoms
The Anomalies
Dirty Read
A transaction reads data written by a concurrent transaction that has not yet committed. If that transaction later rolls back, the first transaction has read data that never officially existed.
Example:
T1: BEGIN
T1: UPDATE accounts SET balance = 0 WHERE id = 1 -- not yet committed
T2: BEGIN
T2: SELECT balance FROM accounts WHERE id = 1
-- returns 0 (T1's uncommitted write)
T2: -- makes decision based on 0 balance
T1: ROLLBACK -- balance reverts to original
-- T2's decision was based on data that never existedRisk level: High. Can cause double charges, incorrect balance displays, wrong business decisions.
Non-Repeatable Read
A transaction reads the same row twice and gets different values because another transaction committed a change between the two reads.
Example:
T1: BEGIN
T1: SELECT price FROM products WHERE id = 42
-- returns 100
T2: BEGIN
T2: UPDATE products SET price = 200 WHERE id = 42
T2: COMMIT
T1: SELECT price FROM products WHERE id = 42
-- returns 200 (different from first read)
T1: COMMIT
-- T1 operated on two different values for the same rowRisk level: Medium. Breaks "read your premise" logic -- a transaction that reads a value, makes a decision, then re-reads to verify can be surprised.
Phantom Read
A transaction re-executes a query with a range condition and finds new rows that were inserted by another committed transaction.
Example:
T1: BEGIN
T1: SELECT COUNT(*) FROM bookings WHERE room_id = 5 AND date = '2026-04-07'
-- returns 0
T2: BEGIN
T2: INSERT INTO bookings (room_id, date, user_id) VALUES (5, '2026-04-07', 99)
T2: COMMIT
T1: SELECT COUNT(*) FROM bookings WHERE room_id = 5 AND date = '2026-04-07'
-- returns 1 (phantom row appeared)
T1: INSERT INTO bookings (room_id, date, user_id) VALUES (5, '2026-04-07', 42)
-- double booking!
T1: COMMITRisk level: Medium to high. Classic source of double-booking and inventory-count bugs.
Write Skew
Two transactions each read an overlapping set of rows, make decisions based on what they read, then write to disjoint rows -- producing a result that would not be allowed if either had run first. Neither transaction overwrites the other's data, so neither is detected as a conflict by simple locking.
Example: A system requires at least one doctor on call. Both doctors are currently on call.
T1 (Doctor Alice): BEGIN
T1: SELECT COUNT(*) FROM on_call WHERE shift = 'tonight'
-- returns 2 (Alice and Bob)
T1: -- "2 > 1, safe to remove myself"
T2 (Doctor Bob): BEGIN
T2: SELECT COUNT(*) FROM on_call WHERE shift = 'tonight'
-- returns 2 (Alice and Bob)
T2: -- "2 > 1, safe to remove myself"
T1: DELETE FROM on_call WHERE doctor = 'Alice' AND shift = 'tonight'
T2: DELETE FROM on_call WHERE doctor = 'Bob' AND shift = 'tonight'
T1: COMMIT
T2: COMMIT
-- No doctor on call. The invariant is broken.
-- Neither transaction wrote to the same row as the other.Risk level: High and insidious. Cannot be prevented by row-level locking alone. Requires Serializable isolation or explicit SELECT FOR UPDATE on all rows that inform the decision.
Isolation Levels vs. Anomalies: The Matrix
| Isolation Level | Dirty Read | Non-Repeatable Read | Phantom Read | Write Skew |
|---|---|---|---|---|
| Read Uncommitted | possible | possible | possible | possible |
| Read Committed | prevented | possible | possible | possible |
| Repeatable Read | prevented | prevented | prevented* | possible |
| Serializable | prevented | prevented | prevented | prevented |
* PostgreSQL's Repeatable Read (snapshot isolation) prevents phantom reads. MySQL's Repeatable Read prevents them via gap locks. The SQL-92 standard does not require this at this level.
How Real Databases Implement Isolation
PostgreSQL
PostgreSQL uses Multi-Version Concurrency Control (MVCC) throughout. Rather than locking rows for reads, it maintains multiple committed versions of each row and gives each transaction a consistent snapshot view.
| SQL Level | PostgreSQL Internal Behavior |
|---|---|
| Read Uncommitted | Treated as Read Committed (PostgreSQL never shows uncommitted data) |
| Read Committed | Default. Each statement gets a fresh snapshot of committed data. |
| Repeatable Read | Snapshot isolation. Transaction sees committed data as of BEGIN. |
| Serializable | Serializable Snapshot Isolation (SSI). Detects cycle conflicts. |
Key points:
- PostgreSQL's Repeatable Read is actually snapshot isolation -- stronger than the SQL-92 spec requires. Phantom reads do not occur.
- Write skew is still possible at Repeatable Read. Two transactions can both read a shared condition and write to non-overlapping rows without conflict.
- Serializable uses SSI (introduced in PostgreSQL 9.1): an optimistic algorithm that detects dangerous read-write dependency cycles and aborts one transaction. Readers do not block writers.
-- Set isolation for a single transaction
BEGIN ISOLATION LEVEL REPEATABLE READ;
-- or
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;MySQL InnoDB
MySQL InnoDB defaults to Repeatable Read and uses a combination of MVCC and locking.
| SQL Level | MySQL InnoDB Behavior |
|---|---|
| Read Uncommitted | Reads are non-locking; dirty reads allowed. |
| Read Committed | Each statement reads the latest committed snapshot. |
| Repeatable Read | Default. Consistent snapshot from first read. Gap locks prevent phantoms. |
| Serializable | All plain SELECTs become SELECT ... FOR SHARE. Full locking. |
Key points:
- MySQL prevents phantom reads at Repeatable Read via gap locks and next-key locks -- locks on the gap between index values, not just individual rows. This is a locking approach rather than MVCC-only.
- Gap locks can cause deadlocks under concurrent inserts. They are a common source of production incidents in MySQL workloads.
- MySQL Serializable uses two-phase locking (pessimistic), not SSI. Readers block writers.
-- Check current isolation level
SELECT @@transaction_isolation;
-- Set session-level
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;Snapshot Isolation and MVCC
Snapshot isolation is not a named SQL level -- it sits between Repeatable Read and Serializable in practice. Most modern databases implement their "Repeatable Read" as snapshot isolation.
How MVCC works:
- Each row version carries a creation transaction ID and a deletion transaction ID.
- When a transaction starts, it records a snapshot horizon: the highest committed transaction ID visible to this transaction.
- Reads return the most recent row version whose creation ID is within the snapshot and whose deletion ID (if any) is beyond it.
- Writes create new row versions rather than overwriting in place.
Benefits:
- Reads never block writes; writes never block reads.
- Long-running reads (e.g., analytics queries) get a stable, consistent view without locking other writers.
- Eliminates dirty reads and non-repeatable reads without acquiring any read locks.
Limitations:
- Write skew is still possible (two transactions read the same snapshot, then write to disjoint rows).
- MVCC requires periodic garbage collection of old row versions. In PostgreSQL this is
VACUUM; in MySQL it is the purge thread. Neglected maintenance under high write load causes table bloat and performance degradation.
Serializable Snapshot Isolation (SSI)
SSI, introduced in academic literature by Cahill et al. (2005) and shipped in PostgreSQL 9.1 (2011), extends snapshot isolation to full serializability without the heavy locking of two-phase locking (2PL).
The key insight: Every serialization anomaly (including write skew) arises from a specific pattern of read-write dependencies between concurrent transactions. SSI tracks these dependencies and aborts one transaction when a dangerous cycle is detected.
Trade-offs vs. 2PL:
- Readers never block writers (better read throughput than 2PL).
- False-positive aborts are possible -- a transaction may be aborted even if it would not have caused an anomaly. The application must retry.
- Lower throughput than snapshot isolation under high-conflict workloads, but usually much better than 2PL under read-heavy loads.
When to use Serializable/SSI:
- Financial systems where correctness is non-negotiable (account transfers, ledger operations).
- Inventory or reservation systems where double-allocation must be impossible.
- Any "check-then-act" pattern (read a condition, make a decision, write based on it).
Choosing the Right Isolation Level
| Scenario | Recommended Level | Rationale |
|---|---|---|
| Read-heavy analytics, reporting | Read Committed | No anomalies matter for reads; minimize overhead |
| User-facing reads, simple CRUD | Read Committed (default) | Sufficient for most cases; PostgreSQL and Oracle default |
| Read-then-write with same rows (e.g., counters) | Read Committed + SELECT FOR UPDATE | Prevents lost updates on specific rows without full serializable |
| Inventory check-and-reserve | Serializable or SELECT FOR UPDATE | Prevents phantom reads and write skew |
| Financial transfers (debit/credit same account set) | Serializable | Prevents write skew on account balance invariants |
| Consistent reporting snapshot (long read) | Repeatable Read | Snapshot isolation gives stable view across multiple statements |
Practical rule: Start with Read Committed. Upgrade to Serializable for any transaction that reads a condition and writes based on it, where a concurrent transaction could invalidate that condition.
Performance Cost of Stronger Isolation
Isolation is not free. The cost increases with the strictness of the level.
| Level | Overhead | Blocking behavior | Abort rate |
|---|---|---|---|
| Read Committed | Low | Writers block writers on same row only | Very low |
| Repeatable Read | Medium | + Gap locks in MySQL; snapshot tracking in PostgreSQL | Low |
| Serializable (SSI) | Medium-high | Readers never block writers (PostgreSQL) | Medium (retries needed) |
| Serializable (2PL) | High | Readers block writers, writers block readers | Low but high latency |
Key cost levers:
- At Serializable, SSI (PostgreSQL) has much better read throughput than 2PL (MySQL).
- Long-running transactions at any level hold their snapshot or locks longer, increasing conflict probability.
- At high concurrency with Serializable, expect a fraction of transactions to receive serialization errors (
SQLSTATE 40001) and require application-level retry.
Common Mistakes
1. Accepting the default isolation level without understanding it
PostgreSQL defaults to Read Committed. MySQL defaults to Repeatable Read. These have different anomaly profiles. Code written assuming one database's defaults can exhibit different bugs when migrated or when the team switches databases.
-- Always check what you are running on
SHOW transaction_isolation; -- PostgreSQL
SELECT @@transaction_isolation; -- MySQL2. Assuming Repeatable Read prevents all anomalies
Repeatable Read prevents dirty and non-repeatable reads. It does not prevent write skew. Two transactions can each read a shared condition, decide it is safe to proceed, and write to non-overlapping rows -- invalidating the invariant for both.
-- This write skew is possible even at Repeatable Read
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM on_call WHERE shift = 'tonight'; -- reads 2
-- concurrent transaction also reads 2 and removes the other doctor
DELETE FROM on_call WHERE doctor = 'me' AND shift = 'tonight';
COMMIT;
-- now zero doctors on callFix: use SELECT FOR UPDATE on all rows that inform the decision, or upgrade to Serializable.
3. Using Serializable everywhere without retry logic
Serializable prevents all anomalies but introduces serialization failures that the application must handle. An application that does not retry on SQLSTATE 40001 will surface database errors to users under concurrent load.
# Required pattern for Serializable isolation
import time
MAX_RETRIES = 3
for attempt in range(MAX_RETRIES):
try:
with db.transaction(isolation="serializable"):
# ... do the work
break
except SerializationFailure:
if attempt == MAX_RETRIES - 1:
raise
time.sleep(0.01 * (2 ** attempt)) # exponential backoff4. Testing with a single connection
Isolation anomalies only manifest with multiple concurrent transactions. Unit tests that open one connection and run sequentially will never reproduce dirty reads, phantom reads, or write skew -- even if the production code is vulnerable. Concurrency bugs require concurrent test clients.
5. Not knowing which level your ORM uses
ORMs often manage connection pools and transactions transparently. Some ORMs open a new transaction per-request; others reuse connections. The isolation level may be set at the connection, session, or transaction level, and the ORM's default may differ from the database's default.
Principle
Know your isolation level and know what it does not prevent. Default to Read Committed for ordinary reads. Upgrade to Serializable for any transaction that reads a condition and makes a write decision based on it. Implement retry logic whenever using Serializable, because serialization failures are not errors -- they are the mechanism working correctly.
Related BEPs
- BEE-8001: ACID Properties -- foundational transaction model
- BEE-8003: Distributed Transactions and Two-Phase Commit -- cross-service coordination
- BEE-11006: Optimistic vs. Pessimistic Concurrency -- choosing a locking strategy
References
- PostgreSQL Documentation: 13.2 Transaction Isolation -- authoritative source for PostgreSQL behavior including SSI
- MySQL 8.4 Reference Manual: 17.7.2.1 Transaction Isolation Levels -- InnoDB isolation and gap locking
- Martin Kleppmann, Designing Data-Intensive Applications, Chapter 7: Transactions, O'Reilly Media, 2017 -- comprehensive coverage of weak isolation, snapshot isolation, and SSI
- Jepsen: Consistency Models -- formal hierarchy of consistency and isolation models
- Wikipedia: Isolation (database systems) -- SQL standard anomaly definitions
- A. Fekete et al., "Making Snapshot Isolation Serializable", ACM TODS, 2005 -- SSI algorithm foundations