Learning about Distributed Systems – Part 9: An Exploration of Data Consistency

Part 9 of this series introduces the replica mechanism for high availability and discusses data consistency.

By Qinxia

Disclaimer: This is a translated work of Qinxia's 漫谈分布式系统. All rights reserved to the original author.

Important Consistency Issues

In the previous article, we finally raised the second core problem of distributed systems: availability. We also saw that replication is the only way to achieve high availability.

As mentioned at the end of the previous article, replication provides high availability, but it can also have serious side effects.

  • For example, asynchronous replication to a slave replica may fail or be delayed by network jitter. A request that reads from that slave during this window will not see the latest data.
  • For another example, in a multi-master scenario, two masters may receive requests to modify the same data at the same time. Both writes are acknowledged to their clients as successful, but when the masters then replicate the writes to each other, the updates conflict.

Problems like these are collectively referred to as data consistency problems.

In the previous article, we focused on two aspects of replication: the master-slave topology and timeliness. Combining the two produces the following data consistency risks:

Data Synchronization | Master Mode   | Consistency Risk
sync                 | single master | None
sync                 | multi-master  | Yes
async                | single master | Yes
async                | multi-master  | Yes

Both the multi-master and the asynchronous modes bring data consistency risks. (Leaderless replication can be regarded as replication in which every node is a master.)

  • Asynchronous replication causes replication lag, which causes data synchronization to be delayed.
  • Multi-master mode brings concurrent writes, resulting in data conflicts.
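
As a rough illustration of the two risks above, here is a minimal in-memory sketch. The `Replica` and `AsyncMaster` classes and the explicit `replicate()` step are hypothetical names made up for this example; a real system would replicate over the network in the background.

```python
class Replica:
    """Hypothetical in-memory key-value replica."""

    def __init__(self):
        self.data = {}


class AsyncMaster(Replica):
    """Master that acknowledges writes before replicating them."""

    def __init__(self, follower):
        super().__init__()
        self.follower = follower
        self.pending = []  # acknowledged writes not yet sent to the follower

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))
        return "ok"  # the client is acknowledged before replication happens

    def replicate(self):
        # Stands in for the delayed, asynchronous replication step.
        for key, value in self.pending:
            self.follower.data[key] = value
        self.pending.clear()


# Risk 1: replication lag leads to a stale read on the follower.
follower = Replica()
master = AsyncMaster(follower)
master.write("ticket", "paid")
stale = follower.data.get("ticket")  # read arrives before replication
master.replicate()
fresh = follower.data.get("ticket")  # read arrives after replication

# Risk 2: two masters accept concurrent writes to the same key.
m1, m2 = Replica(), Replica()
m1.data["seat"] = "alice"
m2.data["seat"] = "bob"
conflict = m1.data["seat"] != m2.data["seat"]
```

In this sketch the first read on the follower misses the ticket, and the two masters end up holding different values for the same key, which they cannot reconcile without some conflict-resolution rule.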

Once data consistency is no longer guaranteed, many practical problems can occur both inside and outside the system. For example:

  1. A user pays for a concert ticket and then refreshes the page. Because the request is routed to a replica that has not been updated yet, the account shows no ticket.
  2. A user receives a push notification about a new message on their article, but after clicking through finds no new message, because the replica that was accessed has not been updated.
  3. A user asks a question, and another user answers it. On some replica, however, the answer may be replicated before the question, so a third user sees the answer appear before the question.

These practical problems make the system untrustworthy at the application layer. The cost of losing trust is very high.

Therefore, solving the consistency problem has become a major issue for distributed systems.

There are two main types of solutions:

  1. Prevention: avoid consistency problems in the first place and provide the strongest consistency guarantee.
  2. Treatment: tolerate inconsistency and provide a weaker consistency guarantee.

From the point of view of convergence, the first approach forces data to converge in real time, while the second allows data to diverge first and converge gradually.

From the perspective of message order, the strong consistency of the prevention approach ensures that, on any node, data generated earlier is never placed after data generated later because of problems such as replication lag. In other words, the first approach keeps the whole system linearizable, while the second does not. (The order of messages is very important and will be discussed in later articles.)

Preventive Solutions to Data Inconsistency

As the saying goes, nip it in the bud. Avoiding problems from the source is naturally the ideal goal.

In particular, data inconsistency is such a serious and hard-to-repair problem that it should be avoided by all means.

So let's look at the preventive solutions to data inconsistency.

Single-Master Synchronous Replication

The simplest solution is the single-leader plus synchronous replication mode mentioned earlier.

  • The single leader ensures that all writes are processed by a single node, avoiding write conflicts.
  • Synchronous replication ensures that all replicas are updated before a response is returned to the client, avoiding data loss caused by the failure of a single node.

This way, we can achieve the strong consistency we want, and the entire distributed system looks like a standalone system with no replicas. Users can get a consistent experience when accessing the system from anywhere at any time. Therefore, this consistency is also called single-copy consistency.
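
The happy path of this mode can be sketched as follows. This is only an illustrative in-memory model: `Replica`, `SyncMaster`, and `persist` are hypothetical names, and real replication would involve network calls and durable storage.

```python
class Replica:
    """Hypothetical replica that persists entries in an in-memory log."""

    def __init__(self):
        self.log = []

    def persist(self, entry):
        self.log.append(entry)
        return True  # ACK back to the master


class SyncMaster:
    """Single master that replies to the client only after every replica ACKs."""

    def __init__(self, replicas):
        self.log = []
        self.replicas = replicas

    def write(self, entry):
        self.log.append(entry)  # persist locally first
        acks = [replica.persist(entry) for replica in self.replicas]
        if not all(acks):
            raise RuntimeError("replication failed")
        return "ok"  # returned only after all replicas are updated


replicas = [Replica(), Replica()]
master = SyncMaster(replicas)
result = master.write("deduct 1 yuan")
logs_match = all(replica.log == master.log for replica in replicas)
```

When nothing fails, every replica holds exactly the same log as the master by the time the client sees `"ok"`, which is what makes the system look like a single copy.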

However, a closer look reveals that some corner cases remain.

Example A:

  1. After the master receives a request from the client, it persists it and then sends it to the slave.
  2. After the slave receives the forwarded request, it persists the request and then returns an ACK to the master.
  3. After the master receives the ACK from the slave, it crashes before returning the ACK to the client.

In this case, the client believes the request was not processed, yet both the master and the slave have persisted the data, so the client and the server disagree about the outcome.

Example B:

  1. Due to network jitter, the master is misjudged as being disconnected.
  2. The system performs failover, and the slave becomes a new master.
  3. The network is restored, and the original master goes back to normal.

At this point there are two masters (the so-called split-brain phenomenon), so even the single-master premise is broken.

The system has unexpectedly become multi-leader. Strong consistency is hard to guarantee even in a carefully designed multi-leader system, let alone in one that was created by accident.
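
Example B can be reproduced with a small sketch. The heartbeat-timeout rule in `failover_if_silent` is an assumption made for illustration (the article does not say how the failure is detected), and `Node` is a hypothetical class.

```python
class Node:
    """Hypothetical node that accepts writes only while it believes it is master."""

    def __init__(self, name, is_master=False):
        self.name = name
        self.is_master = is_master
        self.data = {}

    def write(self, key, value):
        if not self.is_master:
            raise RuntimeError(f"{self.name} is not a master")
        self.data[key] = value


def failover_if_silent(standby, last_heartbeat, now, timeout):
    """Promote the standby if the master's heartbeat is older than the timeout.

    Assumed detection rule. Note that the old master is never told it was
    replaced, because it is unreachable at this moment.
    """
    if now - last_heartbeat > timeout:
        standby.is_master = True
        return True
    return False


a = Node("a", is_master=True)
b = Node("b")

# Network jitter: no heartbeat from "a" for 5 seconds, timeout is 3 seconds.
promoted = failover_if_silent(b, last_heartbeat=0.0, now=5.0, timeout=3.0)

# The network recovers and both nodes still believe they are the master.
a.write("x", 1)
b.write("x", 2)
masters = [node.name for node in (a, b) if node.is_master]
```

After the failover both `a` and `b` accept writes, and the same key ends up with two different values.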

Example C:

  1. With three replicas, the master synchronizes an operation, such as deducting 1 yuan from an account, to the other two replicas.
  2. One of the replicas receives the operation, persists it locally, and sends an ACK to the master.
  3. The other replica also persists the operation, but its ACK to the master is lost due to network jitter.
  4. Since the master does not receive the ACK from the second replica, it decides that the operation has failed and resends the request.

As a result, the replicas are inconsistent: 1 yuan is deducted from the account on the first replica, while 2 yuan is deducted on the second replica.
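
The double deduction in Example C can be simulated in a few lines. The `Replica` class, the `drop_first_ack` flag, and the retry loop in `replicate_deduct` are all hypothetical; they only model a replica that persists an operation whose ACK is then lost once.

```python
class Replica:
    """Hypothetical replica holding an account balance."""

    def __init__(self, balance, drop_first_ack=False):
        self.balance = balance
        self.drop_first_ack = drop_first_ack

    def apply_deduct(self, amount):
        """Persist the deduction, then report whether the ACK reaches the master."""
        self.balance -= amount  # the operation is persisted either way
        if self.drop_first_ack:
            self.drop_first_ack = False
            return False  # persisted, but the ACK is lost on the network
        return True


def replicate_deduct(replicas, amount, max_retries=3):
    """Master-side loop: resend to a replica until an ACK arrives."""
    for replica in replicas:
        for _ in range(max_retries):
            if replica.apply_deduct(amount):
                break


r1 = Replica(balance=10)
r2 = Replica(balance=10, drop_first_ack=True)
replicate_deduct([r1, r2], amount=1)
```

After the run, `r1.balance` is 9 and `r2.balance` is 8: the retry was applied on top of an operation that had already succeeded.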

Therefore, single-master synchronous replication does not provide absolute strong consistency. It provides consistency on a best-effort basis, which holds only in the normal case.

(The corner cases above are also related to the exactly-once problem. Later articles in this series will be devoted to it, so it is not discussed here.)


Several of the corner cases above (such as split-brain) seem very special, but there may be a very common cause behind them.

What causes the node to be misjudged as dead, resulting in a split-brain?

  • Network Jitter
  • GC Pauses

Causes like these make nodes unable to communicate with each other, at least for a short time.

More formally, this is called a network partition: a cluster is divided into several partitions with no network connectivity between them.

This leads to the famous CAP theorem.

Consistency (C), availability (A), and partition tolerance (P): at most two of the three can be satisfied at the same time.

We have talked a lot about consistency and availability. We want availability, so we introduce the replica mechanism, which leads to a consistency crisis. Now network partitions turn out to be yet another problem we have to deal with.

However, the CAP theorem tells us we cannot have all three.

Let's work through it.

  • Assume that C and A are satisfied. If a network partition then occurs, single-master synchronous replication cannot complete, so the system cannot keep both C and A. In other words, P is not satisfied.
  • Assume that C and P are satisfied. If a network partition occurs, only one of the partitions can keep working in order to preserve data consistency, and the other partitions must suspend service. Those partitions are then completely unavailable, so A is not satisfied.
  • Assume that A and P are satisfied. If a network partition occurs and every partition keeps working, writes cannot be synchronized between partitions. After communication is restored, there may be data conflicts that cannot be resolved, so C is not satisfied.

In this analysis, the three cannot be satisfied at the same time.
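
The CP and AP cases can be contrasted with a toy model of a three-node cluster split into a two-node side and a one-node side. The majority rule used for the CP mode is an assumption chosen for this sketch (the analysis above only requires that a single partition keeps working), and `Side`, `Unavailable`, and `run` are hypothetical names.

```python
class Unavailable(Exception):
    """Raised when a side of the partition refuses to serve a request."""


class Side:
    """One side of a partitioned cluster, unable to reach the other side."""

    def __init__(self, nodes, cluster_size, mode):
        self.nodes = nodes
        self.cluster_size = cluster_size
        self.mode = mode  # "CP" or "AP"
        self.value = None

    def write(self, value):
        has_majority = self.nodes > self.cluster_size // 2
        if self.mode == "CP" and not has_majority:
            # Assumed rule: only the majority side may keep working.
            raise Unavailable("minority side refuses writes")
        self.value = value


def run(mode):
    """Send one write to each side and report (all_available, conflict)."""
    big, small = Side(2, 3, mode), Side(1, 3, mode)
    accepted = []
    for side, value in ((big, "v1"), (small, "v2")):
        try:
            side.write(value)
            accepted.append(value)
        except Unavailable:
            pass
    all_available = len(accepted) == 2
    conflict = len(set(accepted)) > 1
    return all_available, conflict


cp_result = run("CP")
ap_result = run("AP")
```

Under these assumptions, the CP run gives up availability on the minority side but accepts no conflicting writes, while the AP run stays available on both sides and ends up with two conflicting values.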

Note also that each case in the analysis above starts from the same condition, "if a network partition occurs", which shows that P is different from the other two.

C, A, and P are not at the same level. C and A are goals, while P is better seen as an unavoidable precondition, even though partition tolerance is nominally also a goal. Numerous production incidents have shown that network partitions can occur at any time and in any place.

Therefore, a system without P does not have real high availability.

As a result, when CAP is applied to the design of production-grade distributed systems, it is mostly about trading off C against A, with P taken as given.


We have introduced the replica mechanism for high availability, but the side effect of the replica mechanism is that it will cause data consistency problems.

  • Replication lag may cause data synchronization to be delayed.
  • Multi-master concurrent writes may cause data conflicts.
  • The problem of data consistency causes many practical problems at the application level, making the system untrustworthy to the outside world. Therefore, it must be solved.
  • The solution to the data consistency problem can be divided into two categories: prevention and treatment.
  • The most basic method of prevention is single-master synchronous replication, but it can only achieve consistency under the best-effort guarantee and cannot solve some corner cases.
  • Behind these corner cases lies a more fundamental puzzle, the CAP theorem.

Example C above can be interpreted differently: replicating data from the master to multiple slaves can be regarded as several independent events, each writing data to a different node. It is the partial success of these events that leads to data inconsistency.

If all of the events fail, we can simply retry. However, if some events succeed and some fail, the retry may cause data inconsistency.

Avoiding partial success across multiple events, in other words keeping them atomic (either all succeed or all fail), is the essence of the problem, and there is already a reliable solution for it: transactions.

Specifically, what we need are distributed transactions.

In the next article, let's learn about distributed transactions.

This is a carefully conceived series of 20-30 articles. I hope to give everyone a core grasp of the distributed system in a storytelling way. Stay tuned for the next one!
