Deep dive15 min read← Back to crisp

CAP theorem

CAP is a narrow impossibility result. PACELC is the design tool. This is the long version with proofs, real systems, and how to tune.

The actual theorem and its limits

Brewer conjectured CAP in 2000. Gilbert and Lynch proved it in 2002 for the asynchronous model with no clocks. The proof is short: assume two nodes N1 and N2, both holding value v=0. Partition them. Client writes v=1 to N1. Client reads from N2. N2 cannot know about the write, so it either returns 0 (not consistent) or refuses (not available).

That is the whole proof. It is a worst-case theorem about a specific failure mode. It does not say anything about normal operation, latency, throughput, or tunable consistency.

The definitions are also strict:

Consistency = linearizability. Reads see the most recent committed write in real time.
Availability = every request to a non-failing node returns within a finite time without error.
Partition tolerance = the system continues to operate when arbitrary messages are dropped between nodes.

Most systems violate at least one of these definitions in normal use. DynamoDB returns errors during throttling (not strictly A). Postgres returns stale data from replicas (not strictly C). The theorem still applies to the strict versions, but the labels are looser in practice.

Why partition tolerance is non-negotiable

You can build a non-partition-tolerant system on a single machine. The moment you have two nodes communicating over a network, partitions are possible. Coda Hale put it well: "you cannot choose CA because you cannot wave a wand and make the network reliable."

What you can do is reduce partition probability. Same rack, same AZ, redundant switches. AWS publishes that intra-AZ partitions are rare but not zero. Cross-region partitions happen monthly somewhere on the planet.

So in practice, P is a given. The choice is C vs A during the partition, and L vs C the rest of the time.

CAP forced choice during partition

PACELC: the design tool

Daniel Abadi's PACELC says: if Partitioned, choose A or C. Else, choose Latency or Consistency.

The "else" clause is what matters 99.9% of the time. Every replica adds latency to strongly consistent writes because you have to round-trip to a quorum. If you replicate across regions, that round-trip is 80-200ms. So even when nothing is broken, strong consistency costs latency.

System	Partition mode	Normal mode
DynamoDB	PA	EL
Cassandra	PA	EL
MongoDB (default)	PA	EC
Spanner	PC	EC
CockroachDB	PC	EC
HBase	PC	EC
Postgres (single primary)	CA-ish	EC

Spanner gets to PC/EC by using TrueTime, which bounds clock skew with GPS and atomic clocks. That lets it commit transactions globally with a few hundred ms of latency. Without TrueTime, you would need explicit consensus rounds across regions, which is slower.

Tunable consistency

Cassandra and DynamoDB let you tune per query. In Cassandra:

W + R > N gives strong consistency, where W is write replicas, R is read replicas, N is total replicas.
W=1, R=1, N=3: fast, eventually consistent.
W=2, R=2, N=3 (QUORUM): strong consistency, survives 1 node failure.
W=3, R=1: read fast, write slow, no write tolerance.

DynamoDB has ConsistentRead=true which uses the leader and pays ~2x latency. Default eventual reads hit any replica.

This is the real answer to "is it CP or AP?" - the question is wrong. The right question is "what consistency does this operation give me?"

Common pitfalls

The "2 of 3" misconception. Brewer himself wrote a follow-up in 2012 saying the 2-of-3 framing oversimplifies. You always need P. The trade is per-operation, not per-system.

Confusing eventual consistency with weak consistency. Eventual just means "if writes stop, replicas converge." That is a very weak guarantee. Causal consistency, read-your-writes, monotonic reads are all stronger and often available.

Assuming a single-master Postgres is CA. It is not. If the primary partitions from clients, it is unavailable. If a replica partitions, it serves stale reads. The label CA only works if you assume zero partitions, which is not a real assumption.

Real-world failure case: GitHub 2018

GitHub had a 43-second network partition between East Coast and West Coast Orchestrator MySQL. They were PC/EC by configuration. The failover system promoted a West Coast replica that was behind. When the partition healed, they had two primaries with divergent writes. Recovery took 24 hours because they had to manually reconcile.

This is the canonical case for "consistency vs availability matters." They chose C in design but their failover broke that guarantee under partition. Outcome: 24 hours of degraded service.

When to memorize what

For interviews, memorize:

Definition (Gilbert-Lynch strict version)
The 2-line proof
PACELC and how to apply it
One real example each for PA/EL, PC/EC
One real outage where CAP mattered (GitHub 2018 is the easiest)

If you can say all five in 90 seconds, you are above the bar for senior. If you can also tune Cassandra consistency levels by hand and explain why W+R>N works, you are above the bar for staff.

Learn more

Article
Designing Data-Intensive Applications, Chapter 9Martin Kleppmann
Paper
Brewer's original CAP conjectureEric Brewer
Paper
Gilbert and Lynch proofMIT
Paper
PACELC paperDaniel Abadi
Article
Jepsen analysesKyle Kingsbury

Deep dive15 min read← Back to crisp

CAP theorem

CAP is a narrow impossibility result. PACELC is the design tool. This is the long version with proofs, real systems, and how to tune.

The actual theorem and its limits

That is the whole proof. It is a worst-case theorem about a specific failure mode. It does not say anything about normal operation, latency, throughput, or tunable consistency.

The definitions are also strict:

Consistency = linearizability. Reads see the most recent committed write in real time.
Availability = every request to a non-failing node returns within a finite time without error.
Partition tolerance = the system continues to operate when arbitrary messages are dropped between nodes.

Why partition tolerance is non-negotiable

So in practice, P is a given. The choice is C vs A during the partition, and L vs C the rest of the time.

CAP forced choice during partition

PACELC: the design tool

Daniel Abadi's PACELC says: if Partitioned, choose A or C. Else, choose Latency or Consistency.

System	Partition mode	Normal mode
DynamoDB	PA	EL
Cassandra	PA	EL
MongoDB (default)	PA	EC
Spanner	PC	EC
CockroachDB	PC	EC
HBase	PC	EC
Postgres (single primary)	CA-ish	EC

Tunable consistency

Cassandra and DynamoDB let you tune per query. In Cassandra:

W + R > N gives strong consistency, where W is write replicas, R is read replicas, N is total replicas.
W=1, R=1, N=3: fast, eventually consistent.
W=2, R=2, N=3 (QUORUM): strong consistency, survives 1 node failure.
W=3, R=1: read fast, write slow, no write tolerance.

DynamoDB has ConsistentRead=true which uses the leader and pays ~2x latency. Default eventual reads hit any replica.

This is the real answer to "is it CP or AP?" - the question is wrong. The right question is "what consistency does this operation give me?"

Definition (Gilbert-Lynch strict version)
The 2-line proof
PACELC and how to apply it
One real example each for PA/EL, PC/EC
One real outage where CAP mattered (GitHub 2018 is the easiest)

If you can say all five in 90 seconds, you are above the bar for senior. If you can also tune Cassandra consistency levels by hand and explain why W+R>N works, you are above the bar for staff.

Learn more

Article
Designing Data-Intensive Applications, Chapter 9Martin Kleppmann
Paper
Brewer's original CAP conjectureEric Brewer
Paper
Gilbert and Lynch proofMIT
Paper
PACELC paperDaniel Abadi
Article
Jepsen analysesKyle Kingsbury

CAP theorem

The actual theorem and its limits

Why partition tolerance is non-negotiable

PACELC: the design tool

Tunable consistency

Common pitfalls

Real-world failure case: GitHub 2018

When to memorize what

What to read next

Learn more

CAP theorem

The actual theorem and its limits

Why partition tolerance is non-negotiable

PACELC: the design tool

Tunable consistency

Common pitfalls

Real-world failure case: GitHub 2018

When to memorize what

What to read next

Learn more