CAP Theorem
Introduction to the CAP Theorem
When building a massive distributed database across multiple servers, engineers desperately want three things:
- They want every server to have the exact same data instantly.
- They want the database to always answer every query immediately.
- They want the database to survive if the network cables between the servers are cut.
In 2000, computer scientist Eric Brewer proposed the CAP Theorem (later mathematically proven). The theorem states a brutal, unbreakable law of physics and computer science:
In any distributed data store, you can only mathematically guarantee TWO out of the following THREE properties simultaneously:
- Consistency (C)
- Availability (A)
- Partition Tolerance (P)
The Three Properties Explained
1. Consistency (C)
Consistency means that every single read request receives the most recent write. If a user updates their password on Server A, and one millisecond later tries to log in using Server B, Server B must know the new password. If Server B hasn’t received the update yet, it must refuse to answer until it is perfectly synchronized.
2. Availability (A)
Availability means that every request receives a non-error response, regardless of the state of the system. The database must answer, even if the answer is slightly outdated.
3. Partition Tolerance (P)
A “Partition” is a network failure. Server A and Server B are both online, but the internet connection between them is severed. They cannot talk to each other. Partition Tolerance means the database as a whole continues to function despite this network failure.
The Brutal Choice: CP vs. AP
Because network failures (Partitions) are a physical certainty in the real world (routers crash, cables break), a distributed system must be Partition Tolerant (P).
Therefore, the CAP theorem dictates that when a network failure occurs, the database engineer must make a painful choice between Consistency (C) and Availability (A).
CP Databases (Consistency + Partition Tolerance)
- Example: MongoDB, HBase, traditional Relational Databases configured for strict clustering.
- The Scenario: The network breaks. Server A cannot talk to Server B.
- The Choice: The database chooses Consistency. Because Server B cannot verify if Server A just received new data, Server B simply shuts down and returns an Error to the user. It sacrifices Availability to guarantee that it never returns stale or incorrect data. This is critical for banking and financial applications.
AP Databases (Availability + Partition Tolerance)
- Example: Apache Cassandra, Amazon DynamoDB.
- The Scenario: The network breaks. Server A cannot talk to Server B.
- The Choice: The database chooses Availability. If a user asks Server B for their account balance, Server B happily returns the balance it has on file, even though it knows it might be outdated (stale). It sacrifices Consistency to guarantee the user’s web page doesn’t crash. This is critical for social media feeds and e-commerce shopping carts.
Conclusion
The CAP Theorem is the foundational mental model for data engineering architecture. It proves that there is no such thing as a “perfect” distributed database. Every database choice is a deliberate trade-off. Architects must understand the business requirements—whether the business can tolerate the website going down (CP), or whether it can tolerate displaying a slightly outdated number (AP)—before selecting the underlying storage technology.
Deepen Your Knowledge
Ready to take the next step in mastering the Data Lakehouse? Dive deeper with my authoritative guides and practical resources.
Explore Alex's Books