One of the major difficulties in distributed systems is how to synchronize the states between nodes, and CAP is a proven law in the field of distributed systems.

The meaning of each letter is as follows.

  • Consistency
  • Availability
  • Partition tolerance

Here we use some simple examples to illustrate.

cap

Suppose our system is composed of two services:G1 and G2.

G1 and G2 maintain the same record V0. G1 and G2 can communicate with each other, and an external client (Client) can invoke either service.

When a client initiates a read|write request to any of the services, then the requested server processes and responds to the result according to the client’s request.

For example, the client initiates a read operation to G1, and the client G1 responds to the request.

cap

Another example is that the client initiates a write operation to G1 to modify V0 to V1.

cap

Consistency

There is a difference between this consistency and the C in ACID in a transaction. C in a transaction means that the integrity of the database is not broken before the transaction starts and after the transaction ends. Integrity here includes some foreign key constraints, uniqueness of keys, etc.

While C in distributed transaction means that the read operation after the write operation must return the value, which is equivalent to all nodes accessing the same copy of the latest data.

Here is an example of non-consistency.

cap

The client successfully requests the G1 server to write V1. Due to network partitioning, the G1 data cannot be synchronized to G2, so the client reads data from G2 and still returns V0.

Let’s see an example of satisfying consistency.

cap

In this system, G1 receives a write V1 operation from the client, and G1 synchronizes the V1 data to G2 while modifying its own data. g1 responds to the client with the result only after receiving the response from G2. In this way, no matter which node the client reads data from afterwards, the V1 value will be available.

Availability

Availability refers to the fact that every request received by a non-faulty node in the system must be responded to.

In an available system, if a client sends a request to the server, then the server must respond to every request from the client and is not allowed to ignore client requests.

Partition Tolerance

What partitioning?

Network partitioning.

How do you understand network partitioning?

Suppose there are two servers A and B. They are communicating normally, but for some reason, the network link between them is broken, so they cannot communicate normally. At this point, AB, which was in the same network, has a network partition. Became A where the A network and B where the B network. This is network partitioning.

What does tolerance mean?

When the above network partitioning occurs for multiple servers of a service, the system is still able to provide the service normally.

Misconceptions about CAP

Many articles on the Internet state that.

A distributed system can only satisfy at most two of three things at the same time - Consistency, Availability, and Partition tolerance - which is known as the CAP law.

It seems that CAP is interpreted as a kind of two out of three law.

I came across an article CAP Twelve Years Later: How the "Rules" Have Changed with a full statement of CAP:

The CAP theorem asserts that any networked shared-data system can have only two of three desirable properties.

According to the formulation, this statement on the Internet is found to be somewhat misleading.

The CAP theorem asserts that P (network partitioning), and the choice of CA comes only after the occurrence of P.

When P occurs, and our system is directly out of service, then there is no choice of what CA.

Therefore the normal understanding of CAP should be: when P(network partition) occurs, if we want to continue to provide service, then C(strong consistency) and A(availability) can only be 2 choices 1.

The conflict between Consistency and Availability

Why CA can only choose one or the other when P happens?

It’s the same simple example as before.

Suppose that network partitioning occurs at G1 and G2 at this time.

cap

Next our client requests G1 to write V1 data. Due to partitioning, G1 cannot synchronize data to G2.

cap

cap

If we guarantee the availability of G2, then when the client requests G2 data, G2 can respond to V0 data normally and the data is inconsistent.

If we guarantee the consistency of G2, then when G1 write operation, it needs to lock G2 read and write operation, and G2 is not available at this time.

Therefore, there is a contradiction between Consistency and Availability.