CAP theorem
CAP is a narrow impossibility result. PACELC is the design tool. This is the long version with proofs, real systems, and how to tune.
The actual theorem and its limits
Brewer conjectured CAP in 2000. Gilbert and Lynch proved it in 2002 for the asynchronous model with no clocks. The proof is short: assume two nodes N1 and N2, both holding value v=0. Partition them. Client writes v=1 to N1. Client reads from N2. N2 cannot know about the write, so it either returns 0 (not consistent) or refuses (not available).
That is the whole proof. It is a worst-case theorem about a specific failure mode. It does not say anything about normal operation, latency, throughput, or tunable consistency.
The definitions are also strict:
- Consistency = linearizability. Reads see the most recent committed write in real time.
- Availability = every request to a non-failing node returns within a finite time without error.
- Partition tolerance = the system continues to operate when arbitrary messages are dropped between nodes.
Most systems violate at least one of these definitions in normal use. DynamoDB returns errors during throttling (not strictly A). Postgres returns stale data from replicas (not strictly C). The theorem still applies to the strict versions, but the labels are looser in practice.
Why partition tolerance is non-negotiable
You can build a non-partition-tolerant system on a single machine. The moment you have two nodes communicating over a network, partitions are possible. Coda Hale put it well: "you cannot choose CA because you cannot wave a wand and make the network reliable."
What you can do is reduce partition probability. Same rack, same AZ, redundant switches. AWS publishes that intra-AZ partitions are rare but not zero. Cross-region partitions happen monthly somewhere on the planet.
So in practice, P is a given. The choice is C vs A during the partition, and L vs C the rest of the time.
PACELC: the design tool
Daniel Abadi's PACELC says: if Partitioned, choose A or C. Else, choose Latency or Consistency.
The "else" clause is what matters 99.9% of the time. Every replica adds latency to strongly consistent writes because you have to round-trip to a quorum. If you replicate across regions, that round-trip is 80-200ms. So even when nothing is broken, strong consistency costs latency.
| System | Partition mode | Normal mode |
|---|---|---|
| DynamoDB | PA | EL |
| Cassandra | PA | EL |
| MongoDB (default) | PA | EC |
| Spanner | PC | EC |
| CockroachDB | PC | EC |
| HBase | PC | EC |
| Postgres (single primary) | CA-ish | EC |
Spanner gets to PC/EC by using TrueTime, which bounds clock skew with GPS and atomic clocks. That lets it commit transactions globally with a few hundred ms of latency. Without TrueTime, you would need explicit consensus rounds across regions, which is slower.
Tunable consistency
Cassandra and DynamoDB let you tune per query. In Cassandra:
- W + R > N gives strong consistency, where W is write replicas, R is read replicas, N is total replicas.
- W=1, R=1, N=3: fast, eventually consistent.
- W=2, R=2, N=3 (QUORUM): strong consistency, survives 1 node failure.
- W=3, R=1: read fast, write slow, no write tolerance.
DynamoDB has ConsistentRead=true which uses the leader and pays ~2x latency. Default eventual reads hit any replica.
This is the real answer to "is it CP or AP?" - the question is wrong. The right question is "what consistency does this operation give me?"
Common pitfalls
The "2 of 3" misconception. Brewer himself wrote a follow-up in 2012 saying the 2-of-3 framing oversimplifies. You always need P. The trade is per-operation, not per-system.
Confusing eventual consistency with weak consistency. Eventual just means "if writes stop, replicas converge." That is a very weak guarantee. Causal consistency, read-your-writes, monotonic reads are all stronger and often available.
Assuming a single-master Postgres is CA. It is not. If the primary partitions from clients, it is unavailable. If a replica partitions, it serves stale reads. The label CA only works if you assume zero partitions, which is not a real assumption.
Real-world failure case: GitHub 2018
GitHub had a 43-second network partition between East Coast and West Coast Orchestrator MySQL. They were PC/EC by configuration. The failover system promoted a West Coast replica that was behind. When the partition healed, they had two primaries with divergent writes. Recovery took 24 hours because they had to manually reconcile.
This is the canonical case for "consistency vs availability matters." They chose C in design but their failover broke that guarantee under partition. Outcome: 24 hours of degraded service.
When to memorize what
For interviews, memorize:
- Definition (Gilbert-Lynch strict version)
- The 2-line proof
- PACELC and how to apply it
- One real example each for PA/EL, PC/EC
- One real outage where CAP mattered (GitHub 2018 is the easiest)
If you can say all five in 90 seconds, you are above the bar for senior. If you can also tune Cassandra consistency levels by hand and explain why W+R>N works, you are above the bar for staff.
What to read next
- DDIA chapter 9 (Consistency and Consensus). The best single source.
- Jepsen analyses for specific databases. Kyle Kingsbury breaks real systems and shows what they actually do under partition.
- Spanner paper for how TrueTime sidesteps the latency cost.
Learn more
- ArticleDesigning Data-Intensive Applications, Chapter 9Martin Kleppmann
- PaperBrewer's original CAP conjectureEric Brewer
- Paper
- PaperPACELC paperDaniel Abadi
- ArticleJepsen analysesKyle Kingsbury