By Chen Jianbin (Funkye), GitHub ID: a364176773
Seata is an open-source distributed transaction solution with more than 18,100 stars. The Seata community is highly active. It is committed to providing high-performance and easy-to-use distributed transaction services under the microservice architecture. This article will analyze the implementation principle of Seata-AT so users can have a deeper understanding of the AT mode.
Seata defines the framework of global transactions. A global transaction is the overall coordination of several branch transactions.
Seata processes a global transaction in two phases: an execution phase, in which the branch transactions run, and a completion phase, in which they are committed or rolled back.
A Seata transaction mode is the behavior pattern that branch transactions follow under Seata's global transaction framework. Strictly speaking, it should be called a branch transaction mode.
Transaction modes differ in how their branch transactions achieve the goals of the two phases of the global transaction. Each mode therefore answers two questions: how a branch executes in the first phase, and how it commits or rolls back in the second phase.
First, let's take a look at how TCC transactions are integrated into the Seata transaction framework:
It looks very similar to the Seata transaction framework diagram. The difference is that the RM is responsible for the try execution in the first phase and for the confirm and cancel operations in the second phase. In both cases, the TM begins the transaction. The RM executes the first-phase try method when the TM calls it. When the call chain completes, the TM informs the TC of the second-phase decision. The TC then drives the RM through the second phase by notifying it to execute the confirm or cancel operation.
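The division of labor described above can be sketched in a few lines. This is a minimal in-memory illustration, written in Python for brevity even though Seata itself is a Java project. Every name in it (`AccountRM`, `try_debit`, `run_global_transaction`) is hypothetical and is not part of the Seata API.

```python
class AccountRM:
    """Hypothetical resource manager exposing TCC-style operations."""

    def __init__(self, balance):
        self.balance = balance  # funds that are still available
        self.frozen = {}        # xid -> amount reserved during the try phase

    def try_debit(self, xid, amount):
        # Phase one: reserve the resource instead of consuming it for good.
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount
        self.frozen[xid] = amount

    def confirm(self, xid):
        # Phase two, commit path: the reservation becomes final.
        self.frozen.pop(xid, None)

    def cancel(self, xid):
        # Phase two, rollback path: return whatever was reserved.
        # Safe to call even if try never reserved anything for this xid.
        self.balance += self.frozen.pop(xid, 0)


def run_global_transaction(xid, steps):
    """Plays the TM and TC roles: call every try, then drive phase two.

    `steps` is a list of (rm, amount) pairs.
    """
    touched = []
    try:
        for rm, amount in steps:
            touched.append(rm)
            rm.try_debit(xid, amount)
    except ValueError:
        for rm in touched:
            rm.cancel(xid)
        return "rolled back"
    for rm in touched:
        rm.confirm(xid)
    return "committed"


a, b = AccountRM(100), AccountRM(50)
first = run_global_transaction("tx-1", [(a, 30), (b, 20)])
second = run_global_transaction("tx-2", [(a, 30), (b, 100)])
print(first, second, a.balance, b.balance)  # prints: committed rolled back 70 30
```

In the second call, the try on `b` fails, so the reservation already made on `a` is cancelled and both balances return to their values after the first transaction.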
As shown in the figure, XA mode means that Seata uses the XA interfaces under the hood to handle the first and second phases automatically. In the first phase, for example, the XA RM creates an XAConnection by proxying the user's data source. It then starts the XA transaction (XA start) and performs XA prepare. Every XA operation in this period is persisted, so it can be recovered even after a crash. In the second phase, the TC notifies the RM to commit or roll back the XA branch.
Example SQL statement: update product set name = 'GTS' where name = 'TXC'.
The first-phase execution is invisible to users, and the user-side SQL statements remain unchanged. So what happens in the first phase of AT mode? The rest of this section answers that question.
As the preceding example suggests, AT mode is an automatic compensation transaction. The following part describes what AT mode does.
Let's take a look at this figure:
Many people may have questions when they see the preceding figure. It is the schematic diagram of the non-intrusive AT mode. First, a user request enters through an interface and reaches the transaction initiator. For the business developer, that entry point is just an ordinary business interface: the SQL statements and the response returned to the client are the same as before. The difference is that the user's SQL statements are now proxied by Seata. Seata-AT intercepts and processes all of the user's SQL statements to ensure consistency.
How does Seata-AT achieve non-intrusive mode?
As shown in the figure, Seata automatically proxies the user's DataSource when the application starts. Anyone familiar with Java Database Connectivity (JDBC) will know the DataSource: it is the object that hands out database connections, so whoever controls it controls every connection. By wrapping it, Seata can do its work without the user noticing any intrusion.
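The proxying idea can be illustrated with a small wrapper. The sketch below uses Python's `sqlite3` module as a stand-in for JDBC. `ProxyDataSource` and `ProxyConnection` are hypothetical names, not Seata classes, and the hook only records the SQL text where a real proxy would do far more.

```python
import sqlite3


class ProxyConnection:
    """Wraps a real connection and sees every statement before it runs."""

    def __init__(self, target, on_sql):
        self._target = target
        self._on_sql = on_sql

    def execute(self, sql, params=()):
        self._on_sql(sql)  # interception point: parse, snapshot, lock, ...
        return self._target.execute(sql, params)

    def __getattr__(self, name):
        # Everything else (commit, rollback, close, ...) passes straight through.
        return getattr(self._target, name)


class ProxyDataSource:
    """Hands out proxied connections instead of raw ones."""

    def __init__(self, connect, on_sql):
        self._connect = connect
        self._on_sql = on_sql

    def get_connection(self):
        return ProxyConnection(self._connect(), self._on_sql)


seen = []
data_source = ProxyDataSource(lambda: sqlite3.connect(":memory:"), seen.append)

# Business code is written as usual; it never mentions the proxy.
conn = data_source.get_connection()
conn.execute("create table product (id integer primary key, name text)")
conn.execute("insert into product (name) values (?)", ("TXC",))
conn.commit()
rows = conn.execute("select name from product").fetchall()
```

The business code only ever asks the data source for a connection, so swapping in the proxy at startup changes nothing on the caller's side.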
After that, when a business request arrives and an SQL statement is executed, Seata parses the statement and extracts the table metadata. It generates a pre-image of the affected rows, runs the business SQL statement, and then saves a post-image of the same rows. The role of the post-image is described later in this article. Seata also generates a row lock for the affected rows and sends it to the Seata-Server (the TC side) during branch registration.
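The first-phase steps above can be sketched for the article's example statement. This is a simplified illustration using `sqlite3`. The two-table schema, the JSON layout of the undo record, the `product:<id>` lock key format, and the `register_branch` callback are all assumptions made for the example and do not reflect Seata's real schema or wire format.

```python
import json
import sqlite3


def at_phase_one(conn, xid, new_name, old_name, register_branch):
    """Runs `update product set name = ? where name = ?` with AT-style bookkeeping."""
    # 1. Pre-image: the rows the business SQL is about to change.
    before = conn.execute(
        "select id, name from product where name = ?", (old_name,)
    ).fetchall()
    ids = [row[0] for row in before]

    # 2. The unchanged business SQL.
    conn.execute(
        "update product set name = ? where name = ?", (new_name, old_name)
    )

    # 3. Post-image: the same rows, located by primary key.
    marks = ",".join("?" * len(ids))
    after = conn.execute(
        f"select id, name from product where id in ({marks})", ids
    ).fetchall()

    # 4. Undo record, written in the same local transaction as the update.
    conn.execute(
        "insert into undo_log (xid, rollback_info) values (?, ?)",
        (xid, json.dumps({"before": before, "after": after})),
    )

    # 5. Row locks go to the TC with branch registration, before the local commit.
    lock_keys = [f"product:{pk}" for pk in ids]
    try:
        register_branch(xid, lock_keys)
        conn.commit()
    except Exception:
        conn.rollback()
        raise


conn = sqlite3.connect(":memory:")
conn.execute("create table product (id integer primary key, name text)")
conn.execute("create table undo_log (xid text, rollback_info text)")
conn.execute("insert into product (id, name) values (1, 'TXC')")
conn.commit()

registered = []
at_phase_one(conn, "xid-1", "GTS", "TXC",
             lambda xid, keys: registered.append((xid, keys)))
```

Because the undo record and the business update share one local transaction, either both are persisted or neither is.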
So far, the first-phase operation on the client is complete, and it is invisible and non-intrusive to the user. Now there is a row lock. What is it for? It is how Seata-AT ensures write isolation between distributed transactions. The example from the official website illustrates this.
Two global transactions, tx1 and tx2, each update the m field of table a. The initial value of m is 1000.
tx1 starts first. It opens a local transaction, obtains the local lock, and performs the update m = 1000 - 100 = 900. Before committing the local transaction, tx1 first obtains the global lock on the record. It then commits the local transaction, which releases the local lock. tx2 starts next. It opens a local transaction, obtains the local lock, and performs the update m = 900 - 100 = 800. Before committing the local transaction, tx2 tries to obtain the global lock on the record. Because tx1 still holds that global lock, tx2 must keep retrying until tx1 commits globally.
In the second phase, tx1 commits globally and releases the global lock. tx2 then obtains the global lock and commits its local transaction.
If tx1 is instead rolled back in the second phase, it needs to reacquire the local lock on the data so that it can run the reverse compensating update and roll back the branch.
If tx2 is still holding the local lock while it waits for the global lock, the tx1 branch rollback fails at first. The branch rollback keeps retrying until tx2 times out waiting for the global lock. tx2 then gives up on the global lock and rolls back its local transaction, which releases the local lock. Finally, the tx1 branch rollback succeeds.
Because tx1 holds the global lock for the whole process until it finishes, no dirty write occurs.
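The tx1/tx2 interaction above can be mimicked with a toy lock table. This is a single-process sketch of the idea only. `GlobalLockTable`, `acquire_with_retry`, and the `a:1` row key are hypothetical, and a real TC additionally deals with persistence, timeouts, and concurrency.

```python
class GlobalLockTable:
    """Hypothetical TC-side lock table: row key -> xid of the holder."""

    def __init__(self):
        self._holders = {}

    def try_acquire(self, xid, row_keys):
        # All or nothing: fail if any row is held by another global transaction.
        if any(self._holders.get(key, xid) != xid for key in row_keys):
            return False
        for key in row_keys:
            self._holders[key] = xid
        return True

    def release_all(self, xid):
        # Called when the global transaction finishes (commit or rollback).
        held = [key for key, holder in self._holders.items() if holder == xid]
        for key in held:
            del self._holders[key]


def acquire_with_retry(locks, xid, row_keys, attempts):
    """Retries a bounded number of times, then gives up (the timeout case)."""
    for _ in range(attempts):
        if locks.try_acquire(xid, row_keys):
            return True
    return False


locks = GlobalLockTable()
row = ["a:1"]  # the row of table a that holds m

tx1_has_lock = locks.try_acquire("tx1", row)                     # tx1 goes first
tx2_while_tx1_open = acquire_with_retry(locks, "tx2", row, 3)    # blocked by tx1
locks.release_all("tx1")                                         # tx1 ends globally
tx2_after_tx1_done = acquire_with_retry(locks, "tx2", row, 3)    # now succeeds
```

While tx1 holds the lock, tx2 exhausts its retries, at which point it would roll back its local transaction. Once tx1 releases the lock, the same request succeeds.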
Next, we will continue to analyze the second phase.
As shown in the preceding figure, for a second-phase commit the TC only issues a notification. The notification tells the client to delete the undoLog recorded in the first phase, and the related transaction information, such as the row lock, is deleted as well. These deletions let transactions that were blocked by lock contention proceed.
A second-phase rollback requires more processing.
First, when the TC informs the client that the second phase is a rollback, the client looks up the undoLog of the corresponding transaction. It then compares the post-image with the current data. Seata-AT protects distributed transactions at the application level, so its row locks provide no isolation against changes made directly in the database. If the data was modified outside of a global transaction, the change is considered a dirty write. Seata cannot tell how the dirty write happened, so it can only log the problem and raise an exception notification, informing users that manual intervention is required. Dirty writes can be avoided by standardizing the entry points through which data is modified.
If no dirty write occurred, the case is relatively simple. A transaction must be atomic: either all of its operations take effect or none do. The pre-image records the data as it was before the transaction ran, so Seata restores the data from the pre-image, which achieves atomicity similar to that of a local transaction. Then the transaction-related information, such as the undoLog and the row lock, is deleted. The second-phase rollback is over.
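Both second-phase paths can be sketched against a simplified undo record. As before, this is an illustration using `sqlite3` under assumed names: the `undo_log` layout, the JSON format, and `DirtyWriteError` are hypothetical. Releasing the row locks happens on the TC side and is left out.

```python
import json
import sqlite3


class DirtyWriteError(Exception):
    """The current data no longer matches the post-image."""


def phase_two_commit(conn, xid):
    # The data is already committed locally; only the bookkeeping goes away.
    conn.execute("delete from undo_log where xid = ?", (xid,))
    conn.commit()


def phase_two_rollback(conn, xid):
    record = conn.execute(
        "select rollback_info from undo_log where xid = ?", (xid,)
    ).fetchone()
    if record is None:
        return  # nothing was recorded for this transaction
    info = json.loads(record[0])
    after = {pk: name for pk, name in info["after"]}

    # Check every row against the post-image before changing anything.
    for pk, _ in info["before"]:
        current = conn.execute(
            "select name from product where id = ?", (pk,)
        ).fetchone()
        if current is None or current[0] != after[pk]:
            raise DirtyWriteError(f"product {pk} was changed outside the transaction")

    # No dirty write: restore the pre-image and drop the undo record.
    for pk, name in info["before"]:
        conn.execute("update product set name = ? where id = ?", (name, pk))
    conn.execute("delete from undo_log where xid = ?", (xid,))
    conn.commit()


def demo_db(current_name):
    """State after phase one of the article's example, with a given current value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table product (id integer primary key, name text)")
    conn.execute("create table undo_log (xid text, rollback_info text)")
    conn.execute("insert into product values (1, ?)", (current_name,))
    images = {"before": [[1, "TXC"]], "after": [[1, "GTS"]]}
    conn.execute("insert into undo_log values (?, ?)", ("xid-1", json.dumps(images)))
    conn.commit()
    return conn


clean = demo_db("GTS")
phase_two_rollback(clean, "xid-1")  # restores name to 'TXC'
```

If the row had been changed to anything other than `'GTS'` in the meantime, `phase_two_rollback` would raise `DirtyWriteError` and leave both the data and the undo record untouched for manual handling.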
Having covered the principles of the first and second phases of AT mode, how does AT mode work within the Seata distributed transaction framework?
Compared with the other transaction modes, the Seata-AT transaction framework requires one additional table, the undoLog table. This is where it is more intrusive than the other modes. For business code, however, it is almost non-intrusive, which is why AT mode is so widely used in Seata.
To be clear from the start, no distributed transaction solution so far meets the requirements of every scenario.
The AT, TCC, and Saga modes were proposed because the XA specification cannot meet the requirements of some scenarios.
AT, TCC, and XA differ in three respects: data locking, protocol blocking, and performance.
AT mode uses the global lock to guarantee basic write isolation, which in effect locks the data. The locks are managed centrally on the TC side, so unlocking is efficient and causes no blocking.
TCC mode holds no lock of this kind. It relies on the exclusive locks of local transactions to reserve resources, and performs the corresponding operation once the outcome of the global transaction is decided.
In XA mode, the data stays locked, and reads and writes are constrained by the isolation level, until the entire transaction ends.
In XA mode, a branch transaction blocks once it is prepared. (In older database versions, prepare is sent only after XA end, which is where the notion of three phases comes from.) The branch must wait, blocked, until it receives XA commit or XA rollback.
AT mode supports degradation. Because the locks are stored on the TC side, Seata can be degraded if it has bugs or other problems, without affecting subsequent calls in the transaction chain.
TCC mode does not have this blocking problem.
Performance loss comes mainly from two sources. First, transaction processing and coordination increase the response time (RT) of a single transaction. Second, lock conflicts between concurrent transactions reduce throughput. The latter is caused mainly by the protocol blocking and data locking described above.
In XA mode, the first phase does not commit. In high-concurrency scenarios, the locks are held on multiple resource sides, such as databases, which aggravates the performance loss.
In AT mode, the lock granularity is as fine as a single row, which requires a primary key. All transaction locks are stored on the TC side, which makes unlocking quick and efficient.
TCC mode has the best performance. It costs only a small amount of RPC overhead plus two local transactions. However, TCC only fits scenarios where resources can be reserved, and it is more intrusive to the business: the developer must split each interface into three methods, one try, one confirm, and one cancel. The last two are used in the second phase.
If you are not familiar with the locks and protocol blocking of the XA and AT mode, you can take a look at the following figure:
Can you tell which one is XA? It is the lower figure. Because its lock granularity is coarser and its locks are held longer, XA's concurrency performance is much worse than that of the AT transaction model. That is why XA mode is not very popular nowadays.
The lack of a console is a long-standing pain point for Seata users. Without a visual interface, users are unsure how reliable Seata is. Worse, without a console, manual intervention in Seata's distributed transactions is also limited. A console will therefore be added in 1.5.0, and you are welcome to join us in building it!
The reason for the Raft integration may not be clear to most users. First, note that the transaction information on the TC side is kept in external storage, such as databases, Redis, and MongoDB (the latter still at the PR stage). As a result, the Seata-Server cluster becomes unavailable if the external storage goes down. Even if the Server is deployed as a cluster of ten or more nodes, all of them become unavailable, which is unacceptable.
Raft is therefore introduced to keep the transaction information consistent across the Seata-Server nodes. This way, even if a node goes down, the transaction information remains accurate, which preserves the consistency of distributed transactions. The implementation of Seata-Server with Raft will be discussed in subsequent articles.
UndoLog compression is a major performance optimization for AT mode in 1.5.0. When the first phase operates on a large amount of data, Seata may also insert a large amount of undoLog information, which can make the insert into the database slow. Compressing the undoLog keeps that insertion from becoming a large overhead when the branch data volume of an AT transaction is large.
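The effect of compressing a large undo record can be illustrated with Python's standard `zlib` module. The threshold value, the record layout, and the function names below are assumptions made for the example. The article does not say which algorithm or threshold Seata uses.

```python
import json
import zlib

# Hypothetical cut-off: compressing small records is not worth the CPU time.
COMPRESS_THRESHOLD_BYTES = 64 * 1024


def encode_undo_log(images, threshold=COMPRESS_THRESHOLD_BYTES):
    """Serializes the images and compresses them only when they are large."""
    raw = json.dumps(images).encode("utf-8")
    if len(raw) < threshold:
        return {"compressor": "none", "payload": raw}
    return {"compressor": "zlib", "payload": zlib.compress(raw)}


def decode_undo_log(record):
    """Reverses encode_undo_log, using the stored marker to pick the path."""
    payload = record["payload"]
    if record["compressor"] == "zlib":
        payload = zlib.decompress(payload)
    return json.loads(payload.decode("utf-8"))


small = {"before": [[1, "TXC"]], "after": [[1, "GTS"]]}
large = {
    "before": [[i, "TXC"] for i in range(10000)],
    "after": [[i, "GTS"] for i in range(10000)],
}

small_record = encode_undo_log(small)  # stored as-is
large_record = encode_undo_log(large)  # stored compressed
```

Storing the compressor name alongside the payload lets the rollback path decode records correctly whether or not they were compressed.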
AT mode is designed to act as a proxy for resource operations. It records the original and changed states of the data and uses locks to ensure data isolation. When an exception occurs in the call chain, all branch data is restored, which achieves the atomicity of distributed transactions.
The core value of Seata is to build a standardized platform that solves distributed transaction problems comprehensively.
Based on Seata, an upper-layer application architecture can flexibly choose the distributed transaction solution that fits the requirements of its real-world scenarios. You are welcome to contribute to the project and help build a standardized distributed transaction platform.