Technology & EngineeringSql226 lines

Transactions Isolation

Transaction management, ACID properties, isolation levels, and concurrency control in relational databases

Quick Summary27 lines

You are an expert in SQL transaction management, ACID guarantees, and isolation levels for building correct concurrent applications on relational databases.

## Key Points

- **Dirty read** — reading data written by an uncommitted transaction.
- **Non-repeatable read** — re-reading a row yields different values because another transaction committed an update.
- **Phantom read** — re-executing a query yields new rows because another transaction committed an insert.
- **Write skew** — two transactions read overlapping data, make disjoint updates, and the combined result violates a constraint that neither transaction alone would break.
- **Lost update** — two transactions read the same row, both compute a new value, and the second commit overwrites the first.
1. **Keep transactions short** — long transactions hold locks, increase contention, and bloat the MVCC version store.
2. **Use the lowest sufficient isolation level** — READ COMMITTED is the default in PostgreSQL and suitable for most OLTP workloads. Use SERIALIZABLE only when write skew is a real risk.
3. **Always handle serialization failures** — transactions at REPEATABLE READ or SERIALIZABLE can fail with error 40001. Your application must retry.
4. **Lock in a consistent order** — if multiple transactions lock the same set of rows, always acquire locks in the same order (e.g., by primary key) to avoid deadlocks.
5. **Prefer optimistic concurrency for read-heavy workloads** — use version columns (`WHERE version = :expected`) instead of SELECT FOR UPDATE.
6. **Avoid user interaction inside transactions** — never wait for user input between BEGIN and COMMIT.
7. **Monitor lock waits** — use `pg_stat_activity` and `pg_locks` in PostgreSQL, or `SHOW ENGINE INNODB STATUS` in MySQL.

## Quick Example

```sql
SELECT ... FOR UPDATE;           -- exclusive row lock, block if locked
SELECT ... FOR UPDATE NOWAIT;    -- error immediately if locked
SELECT ... FOR UPDATE SKIP LOCKED; -- skip locked rows (job queue pattern)
SELECT ... FOR SHARE;            -- shared lock (allow other readers, block writers)
```

skilldb get sql-skills/Transactions IsolationFull skill: 226 lines

Paste into your CLAUDE.md or agent config

Transactions and Isolation — SQL

You are an expert in SQL transaction management, ACID guarantees, and isolation levels for building correct concurrent applications on relational databases.

Overview

A transaction is a logical unit of work that groups one or more SQL statements into an all-or-nothing operation. The ACID properties (Atomicity, Consistency, Isolation, Durability) define the guarantees a database provides. Isolation levels control how concurrent transactions interact, trading correctness for throughput. Understanding these tradeoffs is essential for avoiding data corruption, lost updates, and phantom reads.

Core Concepts

ACID Properties

Property	Guarantee
Atomicity	All statements in a transaction succeed or all are rolled back
Consistency	The database moves from one valid state to another (constraints enforced)
Isolation	Concurrent transactions do not observe each other's intermediate states
Durability	Once committed, data survives crashes (written to durable storage)

Isolation Levels (SQL Standard)

Level	Dirty Read	Non-Repeatable Read	Phantom Read	Serialization Anomaly
READ UNCOMMITTED	Possible	Possible	Possible	Possible
READ COMMITTED	No	Possible	Possible	Possible
REPEATABLE READ	No	No	Possible*	Possible
SERIALIZABLE	No	No	No	No

*PostgreSQL's REPEATABLE READ also prevents phantoms (it uses snapshot isolation under the hood).

Concurrency Phenomena

Dirty read — reading data written by an uncommitted transaction.
Non-repeatable read — re-reading a row yields different values because another transaction committed an update.
Phantom read — re-executing a query yields new rows because another transaction committed an insert.
Write skew — two transactions read overlapping data, make disjoint updates, and the combined result violates a constraint that neither transaction alone would break.
Lost update — two transactions read the same row, both compute a new value, and the second commit overwrites the first.

MVCC (Multi-Version Concurrency Control)

PostgreSQL and MySQL InnoDB use MVCC: each transaction sees a consistent snapshot of the database. Writers do not block readers, and readers do not block writers. Conflicts are detected at commit time (optimistic) or via explicit locks (pessimistic).

Implementation Patterns

Basic Transaction Control

BEGIN;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- Check business rule
DO $$
BEGIN
  IF (SELECT balance FROM accounts WHERE id = 1) < 0 THEN
    RAISE EXCEPTION 'Insufficient funds';
  END IF;
END $$;

COMMIT;

Savepoints for Partial Rollback

BEGIN;

INSERT INTO orders (customer_id, total) VALUES (1, 250.00);

SAVEPOINT before_items;

INSERT INTO order_items (order_id, product_id, qty) VALUES (currval('orders_id_seq'), 99, 1);
-- If this fails (e.g., FK violation), rollback just this part
ROLLBACK TO SAVEPOINT before_items;

-- Continue with alternative logic
INSERT INTO order_items (order_id, product_id, qty) VALUES (currval('orders_id_seq'), 100, 1);

COMMIT;

Setting Isolation Levels

-- Per-transaction (PostgreSQL)
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT ... ;
UPDATE ... ;
COMMIT;

-- Per-session (MySQL)
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
...
COMMIT;

Preventing Lost Updates with SELECT FOR UPDATE

BEGIN;

-- Lock the row so no other transaction can modify it
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;

-- Application computes new balance
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

COMMIT;

Variants:

SELECT ... FOR UPDATE;           -- exclusive row lock, block if locked
SELECT ... FOR UPDATE NOWAIT;    -- error immediately if locked
SELECT ... FOR UPDATE SKIP LOCKED; -- skip locked rows (job queue pattern)
SELECT ... FOR SHARE;            -- shared lock (allow other readers, block writers)

Advisory Locks for Application-Level Coordination

-- PostgreSQL: named lock using a bigint key
SELECT pg_advisory_lock(12345);

-- ... do work that requires exclusive access ...

SELECT pg_advisory_unlock(12345);

-- Non-blocking version
SELECT pg_try_advisory_lock(12345);  -- returns true/false

Serializable Isolation to Prevent Write Skew

-- Example: enforce "at least one doctor on-call" without explicit locks
BEGIN ISOLATION LEVEL SERIALIZABLE;

SELECT count(*) FROM doctors WHERE on_call = true AND shift_date = '2024-03-15';
-- Application checks count > 1 before allowing opt-out

UPDATE doctors SET on_call = false WHERE id = 42 AND shift_date = '2024-03-15';

COMMIT;
-- If another transaction concurrently does the same, one will get a serialization failure
-- Application must retry on error code 40001

Retry Logic for Serialization Failures

# Application-side retry (Python pseudocode)
MAX_RETRIES = 3

for attempt in range(MAX_RETRIES):
    try:
        with db.transaction(isolation='serializable'):
            do_work()
            break
    except SerializationFailure:
        if attempt == MAX_RETRIES - 1:
            raise
        time.sleep(0.01 * (2 ** attempt))  # exponential backoff

Deadlock-Safe Ordering

-- Always lock rows in a consistent order to prevent deadlocks
BEGIN;

-- Lock both accounts in ID order
SELECT * FROM accounts WHERE id IN (1, 2) ORDER BY id FOR UPDATE;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

COMMIT;

Core Philosophy

Transactions are the mechanism that turns a sequence of SQL statements into an atomic, all-or-nothing unit of work. Without transactions, a failure partway through a multi-statement operation leaves the database in an inconsistent state — money debited from one account but never credited to another, an order created without its line items. The ACID guarantees are not abstract theory; they are the concrete properties that make relational databases trustworthy for business-critical operations.

Isolation levels are a knob that trades correctness for throughput. READ COMMITTED is the default in PostgreSQL because it provides a good balance for most OLTP workloads: each statement sees only committed data, and the overhead is minimal. SERIALIZABLE provides the strongest guarantee — transactions behave as if they executed one at a time — but at the cost of serialization failures that the application must detect and retry. The right isolation level depends on the specific operation: most reads and simple writes are fine at READ COMMITTED, while operations that check a condition and act on it (like "ensure at least one doctor is on call") need SERIALIZABLE to prevent write skew.

The most dangerous concurrency bugs are invisible in development and testing because they require specific timing between concurrent transactions. Lost updates, write skew, and phantom reads only manifest under load. A balance check that works perfectly in a single-threaded test can lose money in production when two transfers execute simultaneously. The only reliable way to find these bugs is to reason carefully about which isolation level your operation requires, and to test under realistic concurrency — not just functional correctness.

Anti-Patterns

Holding transactions open during external calls — making an HTTP request, sending an email, or waiting for a queue response inside a transaction keeps locks held for the duration of the external call. This can last seconds or minutes, causing lock escalation and blocking other transactions.
Assuming READ COMMITTED prevents all anomalies — READ COMMITTED still allows non-repeatable reads and phantoms. A transaction that reads a value, makes a business decision, and then writes based on that decision may find the value changed between the read and write steps.
Ignoring serialization failures — transactions at REPEATABLE READ or SERIALIZABLE can fail with error code 40001. Applications that do not catch and retry these failures will surface database errors to users during normal concurrent operation.
Inconsistent lock ordering across transactions — transaction A locking row 1 then row 2, while transaction B locks row 2 then row 1, guarantees eventual deadlocks under load. Always acquire locks in a deterministic order (e.g., ascending primary key).
Using long-running transactions for batch processing — a transaction that updates millions of rows holds locks, generates massive WAL, and prevents autovacuum from reclaiming space. Break batch work into smaller committed chunks with explicit progress tracking.

Best Practices

Keep transactions short — long transactions hold locks, increase contention, and bloat the MVCC version store.
Use the lowest sufficient isolation level — READ COMMITTED is the default in PostgreSQL and suitable for most OLTP workloads. Use SERIALIZABLE only when write skew is a real risk.
Always handle serialization failures — transactions at REPEATABLE READ or SERIALIZABLE can fail with error 40001. Your application must retry.
Lock in a consistent order — if multiple transactions lock the same set of rows, always acquire locks in the same order (e.g., by primary key) to avoid deadlocks.
Prefer optimistic concurrency for read-heavy workloads — use version columns (WHERE version = :expected) instead of SELECT FOR UPDATE.
Avoid user interaction inside transactions — never wait for user input between BEGIN and COMMIT.
Monitor lock waits — use pg_stat_activity and pg_locks in PostgreSQL, or SHOW ENGINE INNODB STATUS in MySQL.

Common Pitfalls

Autocommit confusion — most drivers default to autocommit mode. Each statement is its own transaction unless you explicitly BEGIN.
Holding transactions open during external calls — an HTTP request or message queue call inside a transaction can make it last seconds or minutes, causing lock escalation.
Assuming READ COMMITTED prevents all anomalies — it still allows non-repeatable reads and phantoms. If your logic reads a value and then makes a decision based on it, the value may have changed by the time you write.
Deadlocks from inconsistent lock ordering — transaction A locks row 1 then row 2; transaction B locks row 2 then row 1. The database will detect and abort one, but this wastes work.
Ignoring implicit locks from foreign keys — an INSERT into a child table takes a shared lock on the referenced parent row. Heavy child inserts can contend on the parent.
Not testing under concurrency — isolation bugs only appear under concurrent load. Single-threaded tests will never reveal lost updates or write skew.

Install this skill directly: skilldb add sql-skills

Get CLI access →