Use Transactional Messages to Understand RocketMQ Service Messages

This article introduces the scenarios, basic principles, implementation details, and practical use of RocketMQ transactional messages.

By Hebo

Distributed systems share a common problem. While executing a piece of core business logic, a service often has to call multiple downstream services as well. The core business and all of the downstream services must succeed or fail together; otherwise, some succeed while others fail and the data becomes inconsistent. In short, transactions in a message queue mainly solve the data consistency problem between message producers and consumers. This article introduces the scenarios, basic principles, implementation details, and practical use of RocketMQ transactional messages to help you understand and use them.

Scenario: Why Are Transactional Messages Needed?

When a user places an order in an e-commerce scenario, downstream systems are triggered to make changes accordingly. For example, the logistics system must initiate a shipment, the credit system must update the user's credit points, and the shopping cart system must clear the user's shopping cart. The processing branches include:

The ordering system changes the order status from unpaid to paid.
The logistics system adds a to-be-shipped record and creates an order logistics record.
The credit system updates the credit points of the user.
The shopping cart system clears the shopping cart and updates the user's shopping cart records.

A distributed system call is characterized by one piece of core business logic that has to call multiple downstream services for processing. Therefore, the biggest challenge for distributed transactions is keeping the execution results of the core business and the downstream businesses consistent.

Traditional XA Solution: Poor Performance

The typical way to ensure consistent results among the branches is a distributed transaction system based on the XA protocol. The four call branches are encapsulated into one large transaction that contains four independent transaction branches. The XA-based solution guarantees the correctness of the business processing results. However, its biggest disadvantage is that resources are locked over a wide range, so concurrency is low in a multi-branch environment. As the number of downstream branches increases, system performance degrades further.

A Normal Message-Based Solution: Poor Result Consistency

The preceding XA-based solution can be simplified. The order system change is performed as a local transaction, and the changes in the remaining systems are triggered downstream by normal messages. The transaction branches are thus reduced to normal messages plus an order table transaction, which makes full use of the asynchronous nature of messages to shorten the call chain and improve concurrency.

However, this solution is prone to inconsistent results between the core transaction and the transaction branches. For example:

The message is sent, but the order is not executed. As a result, the whole transaction needs to be rolled back.
The order is executed, but the message is not sent. As such, the message has to be resent for consumption.
Timeout errors cannot be reliably detected, which makes it difficult to determine whether the order needs to be rolled back or an order change needs to be committed.
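The first two failure modes above can be reproduced with a toy model. This is only a sketch: the function place_order and its flags are hypothetical, and no real database or message queue is involved. It runs "update the order" and "send the message" as two independent steps and reports whether the two sides ended up consistent.

```python
class FlakyStep(Exception):
    """Raised by the toy order database or message queue to simulate a failure."""


def place_order(send_first, fail_db=False, fail_mq=False):
    """Run 'update order' and 'send message' as two independent steps.

    Returns (order_paid, message_sent) so the caller can see whether the
    two sides ended up consistent.
    """
    order_paid = False
    message_sent = False

    def update_order():
        nonlocal order_paid
        if fail_db:
            raise FlakyStep("order update failed")
        order_paid = True

    def send_message():
        nonlocal message_sent
        if fail_mq:
            raise FlakyStep("message send failed")
        message_sent = True

    steps = [send_message, update_order] if send_first else [update_order, send_message]
    try:
        for step in steps:
            step()
    except FlakyStep:
        pass  # a normal message that was already sent cannot be recalled
    return order_paid, message_sent


# Message sent, order not executed.
print(place_order(send_first=True, fail_db=True))    # (False, True)
# Order executed, message not sent.
print(place_order(send_first=False, fail_mq=True))   # (True, False)
```

Whichever step runs first, a failure in the second step leaves the two sides out of sync, because nothing ties the message to the outcome of the order update.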

RocketMQ-Based Distributed Transactional Messages: Supporting Eventual Consistency

In the preceding normal message solution, normal messages and order transactions cannot be kept consistent because a normal message cannot be committed, rolled back, or coordinated the way a stand-alone database transaction can.

The distributed transactional message feature of Message Queue for Apache RocketMQ adds two-phase commit on top of normal messages. The two-phase commit is bound to the local transaction, so the message and the local transaction reach the same global commit result.

The solution of transactional messages in Message Queue for Apache RocketMQ provides the advantages of high performance, scalability, and simple business development.

Basic Principles

Concepts

Transactional Messages: RocketMQ provides a distributed transaction feature similar to X/Open XA. RocketMQ transactional messages can be used to achieve eventual consistency in distributed transactions.
Half-Transactional Messages: A message that temporarily cannot be delivered. The producer has sent the message to the RocketMQ server, but the server has not yet received the producer's second confirmation of the message. At this point, the message is marked as temporarily undeliverable. A message in this state is a half-transactional message (also called a Half message).
Message Lookback: The second confirmation of a transactional message may be lost because of a network disconnection or a restart of the producer application. When the RocketMQ Broker scans and finds a message that has stayed in the half-transactional state for a long time, it asks the message producer for the eventual status (Commit or Rollback) of the message. This query process is called message lookback (also called a transaction check).

Lifecycle of a Transactional Message

Initialization: The half-transactional message has been created and initialized by the producer and is ready to be sent to the server.
Transaction to be Committed: The half-transactional message has been sent to the server. Unlike a normal message, it is not written directly to the general storage that consumers read from. Instead, it is stored separately in the transaction storage system and is committed only after the second-phase result of the local transaction is returned. The message is invisible to downstream consumers during this period.
Message Rollback: In the second phase, if the transaction execution result is a rollback, the server rolls back the half-transactional message, and the transactional message process is terminated.
Transactions to be Consumed: In the second phase, if the transaction execution result is a commit, the server restores the half-transactional message to the general storage system. The message then becomes visible to downstream consumers and waits to be obtained and consumed.
Consuming: The consumer obtains the message and processes it based on the consumer's local business logic. During this period, the server waits for the consumer to finish and submit the consumption result. If no response is received from the consumer after a certain period, RocketMQ retries the message. Please see Message Retry for more information.
Consumption Commit: The consumer completes the consumption process and submits the consumption result to the server. The server marks the current message as processed, whether consumption succeeded or failed. By default, RocketMQ retains all messages: the message data is not deleted immediately but is only logically marked as consumed. Therefore, the consumer can backtrack and re-consume the message until it is deleted because the retention period expires or storage space runs short.
Message Deletion: When the storage duration of a message expires or the storage space is insufficient, RocketMQ deletes the earliest message data from the physical file based on the rolling mechanism.
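The lifecycle above can be read as a small state machine. The following sketch is one possible encoding: the state names follow the list, while the event names (send_half, commit, and so on) are hypothetical labels chosen for illustration and are not RocketMQ identifiers. Modeling a retry as a return to the to-be-consumed state is also an assumption.

```python
from enum import Enum


class State(Enum):
    INITIALIZED = "initialization"
    TO_BE_COMMITTED = "transaction to be committed"
    ROLLED_BACK = "message rollback"
    TO_BE_CONSUMED = "transactions to be consumed"
    CONSUMING = "consuming"
    CONSUMPTION_COMMITTED = "consumption commit"
    DELETED = "message deletion"


# Allowed transitions, keyed by (current state, event).
TRANSITIONS = {
    (State.INITIALIZED, "send_half"): State.TO_BE_COMMITTED,
    (State.TO_BE_COMMITTED, "commit"): State.TO_BE_CONSUMED,
    (State.TO_BE_COMMITTED, "rollback"): State.ROLLED_BACK,
    (State.TO_BE_CONSUMED, "deliver"): State.CONSUMING,
    (State.CONSUMING, "ack"): State.CONSUMPTION_COMMITTED,
    (State.CONSUMING, "timeout"): State.TO_BE_CONSUMED,          # retried later
    (State.CONSUMPTION_COMMITTED, "backtrack"): State.TO_BE_CONSUMED,
    (State.CONSUMPTION_COMMITTED, "expire"): State.DELETED,
}


def advance(state, event):
    """Return the next state, or raise ValueError for an invalid transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.name}") from None


def run(events):
    """Apply a sequence of events starting from INITIALIZED."""
    state = State.INITIALIZED
    for event in events:
        state = advance(state, event)
    return state


print(run(["send_half", "commit", "deliver", "ack", "expire"]).name)  # DELETED
print(run(["send_half", "rollback"]).name)                            # ROLLED_BACK
```

Note that ROLLED_BACK has no outgoing transitions: a rolled-back half message can never reach a consumer.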

Basic Process of Transactional Messages

The interaction process of transactional messages is shown in the following figure:

1. The producer sends the message to the RocketMQ server.

2. After the RocketMQ server persists the message, it returns ACK to the producer to confirm that the message has been sent. At this time, the message is marked as temporarily undeliverable. The message in this state is half-transactional.

3. The producer executes the local transaction.

4. The producer submits the second confirmation (Commit or Rollback) to the server based on the local transaction result. The server processes the confirmation as follows:

The second confirmation result is Commit: The server marks the half-transactional message as deliverable and delivers it to the consumer.
The second confirmation result is Rollback: The server rolls back the transaction and does not deliver the half-transactional message to the consumer.

5. If the network is disconnected or the producer application is restarted, so that the Broker does not receive a second confirmation (or the confirmed status of the half message is Unknown), the Broker waits for a period and then sends a request to one of the producers in the producer cluster to query the status of the half message.

6. After the producer receives the request, the producer checks the eventual status of the local transaction that corresponds to the message.

7. The producer submits the second confirmation based on the eventual status of the checked local transaction. The server still processes the half-transactional message as per step 4.
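The seven steps above can be sketched as a toy in-memory model. ToyBroker, send_in_transaction, and the callbacks are all hypothetical names; nothing here uses the RocketMQ client or reflects how the Broker is actually built. The sketch only shows how a half message stays invisible until a second confirmation, or a later check, resolves it.

```python
from enum import Enum


class Status(Enum):
    COMMIT = "commit"
    ROLLBACK = "rollback"
    UNKNOWN = "unknown"


class ToyBroker:
    def __init__(self):
        self.half = {}          # msg_id -> body; stored but not deliverable
        self.deliverable = []   # bodies visible to consumers
        self._next_id = 0

    def send_half(self, body):
        # Steps 1-2: store the half message and acknowledge it.
        self._next_id += 1
        self.half[self._next_id] = body
        return self._next_id

    def end_transaction(self, msg_id, status):
        # Step 4: apply the second confirmation.
        if status is Status.UNKNOWN or msg_id not in self.half:
            return  # leave the half message for a later check
        body = self.half.pop(msg_id)
        if status is Status.COMMIT:
            self.deliverable.append(body)

    def check_pending(self, checker):
        # Steps 5-7: ask the producer about every unresolved half message.
        for msg_id, body in list(self.half.items()):
            self.end_transaction(msg_id, checker(body))


def send_in_transaction(broker, body, local_transaction):
    msg_id = broker.send_half(body)
    try:
        status = local_transaction(body)   # step 3
    except Exception:
        status = Status.ROLLBACK
    broker.end_transaction(msg_id, status)
    return msg_id


broker = ToyBroker()
paid_orders = set()


def pay(order_id):
    paid_orders.add(order_id)
    return Status.COMMIT


def pay_but_lose_confirmation(order_id):
    paid_orders.add(order_id)   # the local transaction succeeds...
    return Status.UNKNOWN       # ...but no definite confirmation reaches the broker


def check(order_id):
    return Status.COMMIT if order_id in paid_orders else Status.ROLLBACK


send_in_transaction(broker, "order-1", pay)
send_in_transaction(broker, "order-2", pay_but_lose_confirmation)
print(broker.deliverable)   # ['order-1']
broker.check_pending(check)
print(broker.deliverable)   # ['order-1', 'order-2']
```

The check callback consults only the recorded outcome of the local transaction (here, the paid_orders set), which is what lets it answer correctly after the original confirmation was lost.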

Implementation Details: How to Implement RocketMQ Transactional Messages

Following the basic process of sending transactional messages, the implementation is divided into three main processes: receiving and processing Half messages, handling Commit or Rollback commands, and checking transactional messages.

Processing Half Messages

After the sender sends a Half message to the Broker in the first phase, the Broker processes the Half message. Please see the following figure for the Broker process:

The Broker changes the message topic to RMQ_SYS_TRANS_HALF_TOPIC and then writes the message, with the rest of its content unchanged, to the Half queue. For the implementation, see the processing logic in SendMessageProcessor.
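A minimal sketch of this topic rewrite follows. Messages are modeled as plain dictionaries, and the function names are hypothetical. Because the topic is restored on commit, the original topic has to be kept somewhere; saving it (and the queue ID) in message properties, the property names used, and redirecting to queue 0 are all assumptions of this sketch.

```python
HALF_TOPIC = "RMQ_SYS_TRANS_HALF_TOPIC"


def to_half_message(message):
    """Return a copy of `message` redirected to the half topic.

    The original topic and queue ID are saved as properties so they can be
    restored on commit. The property names are illustrative.
    """
    properties = dict(message.get("properties", {}))
    properties["REAL_TOPIC"] = message["topic"]
    properties["REAL_QID"] = str(message["queue_id"])
    return {**message, "topic": HALF_TOPIC, "queue_id": 0, "properties": properties}


def restore_message(half_message):
    """Undo to_half_message: put the saved topic and queue ID back."""
    properties = dict(half_message["properties"])
    topic = properties.pop("REAL_TOPIC")
    queue_id = int(properties.pop("REAL_QID"))
    return {**half_message, "topic": topic, "queue_id": queue_id, "properties": properties}


original = {
    "topic": "ORDER_PAID",
    "queue_id": 3,
    "body": b"order-1",
    "properties": {"KEYS": "order-1"},
}
half = to_half_message(original)
print(half["topic"])                       # RMQ_SYS_TRANS_HALF_TOPIC
print(restore_message(half) == original)   # True
```

Since consumers subscribe to the user topic and never to the half topic, redirecting the message is enough to keep it invisible until commit.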

Commit or Rollback Command Handling

After the sender completes the local transaction, it sends a Commit or Rollback command to the Broker. Since the transaction is now complete, the Broker needs to delete the original Half message. Because RocketMQ storage is append-only, the Broker cannot delete the message in place. Instead, it writes an OP message that marks the Half message as deleted. Please see the following figure for the Broker process:

Commit: The Broker writes an OP message. The body of the OP message holds the queueOffset of the Half message being committed, which marks that Half message as deleted. At the same time, the Broker reads the original Half message, restores its topic, and writes it to the CommitLog again so that consumers can pull and consume it.
Rollback: The Broker also writes an OP message, in the same way as for Commit. However, the Half message is not read and restored afterward, so consumers never consume it.

The specific implementation is in EndTransactionProcessor.
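The role of OP messages on append-only storage can be sketched as follows. ToyTransactionStore and its fields are hypothetical, and each queue is modeled as a Python list that is only ever appended to; this illustrates the marking idea, not the Broker's actual data layout.

```python
class ToyTransactionStore:
    def __init__(self):
        self.half_queue = []   # append-only: Half messages are never edited or removed
        self.op_queue = []     # append-only: each entry is the queueOffset of a finished Half message
        self.commit_log = []   # messages visible to consumers under their real topic

    def put_half(self, real_topic, body):
        self.half_queue.append({"real_topic": real_topic, "body": body})
        return len(self.half_queue) - 1   # queueOffset of the Half message

    def end_transaction(self, queue_offset, commit):
        # Commit and Rollback both write an OP message marking the Half message as deleted.
        self.op_queue.append(queue_offset)
        if commit:
            # Only Commit reads the Half message back and rewrites it under its real topic.
            half = self.half_queue[queue_offset]
            self.commit_log.append({"topic": half["real_topic"], "body": half["body"]})

    def pending_offsets(self):
        # Half messages with no matching OP message are still undecided.
        done = set(self.op_queue)
        return [offset for offset in range(len(self.half_queue)) if offset not in done]


store = ToyTransactionStore()
a = store.put_half("ORDER_PAID", "order-1")
b = store.put_half("ORDER_PAID", "order-2")
c = store.put_half("ORDER_PAID", "order-3")
store.end_transaction(a, commit=True)
store.end_transaction(b, commit=False)
print(store.commit_log)          # [{'topic': 'ORDER_PAID', 'body': 'order-1'}]
print(store.pending_offsets())   # [2]
```

All three Half messages remain in half_queue; "deleted" only means that a matching offset exists in op_queue.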

Transactional Message Check

If the sender sends the UNKNOWN command, or if the Broker or the sender is restarted or redeployed, the OP message that marks the deletion in the second process may be missing. Therefore, a transactional message check process is added. It runs periodically on an asynchronous thread (transactionCheckInterval is 30s by default) and checks the status of the Half messages that have no OP message. Please see the following figure for more information:

The transactional message check process scans the current OP message queue and reads the queueOffset of each Half message that has been marked for deletion. If a Half message has no corresponding OP message and has timed out (transactionTimeOut is six seconds by default), the Half message is read and rewritten to the Half queue, and a check command is sent to the original sender to query the transaction status. If the Half message has not timed out yet, the process waits and then reads the OP message queue again to pick up new OP messages.

If the bornTime of a Half message is older than the maximum retention time (transactionCheckMaxTimeInMs is 12 hours by default), the message is automatically skipped and no longer checked. This prevents a faulty sender from leaving the status of a transaction undetermined for a long time.

For the implementation, see the TransactionalMessageServiceImpl#check method.
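The decision made for each Half message in one check round can be sketched as below. The function scan and the action labels are hypothetical. The two constants take the default values quoted above, expressed in seconds. Measuring the timeout from the born time and using strict comparisons at the boundaries are assumptions of this sketch, not a statement about the real check method.

```python
TRANSACTION_TIMEOUT_S = 6          # transactionTimeOut default
CHECK_MAX_AGE_S = 12 * 60 * 60     # transactionCheckMaxTimeInMs default, in seconds


def scan(half_messages, op_offsets, now):
    """Classify each Half message for one check round.

    half_messages: list of dicts with 'offset' and 'born_time' (in seconds).
    op_offsets: set of offsets already marked deleted by OP messages.
    Returns a dict that maps each offset to one of:
      'done'  - an OP message exists, nothing to do
      'skip'  - older than the maximum retention time, no longer checked
      'check' - timed out without an OP message, ask the sender
      'wait'  - not timed out yet, look again in a later round
    """
    actions = {}
    for msg in half_messages:
        age = now - msg["born_time"]
        if msg["offset"] in op_offsets:
            actions[msg["offset"]] = "done"
        elif age > CHECK_MAX_AGE_S:
            actions[msg["offset"]] = "skip"
        elif age > TRANSACTION_TIMEOUT_S:
            actions[msg["offset"]] = "check"
        else:
            actions[msg["offset"]] = "wait"
    return actions


now = 1_000_000
half_messages = [
    {"offset": 0, "born_time": now - 1},               # just sent
    {"offset": 1, "born_time": now - 10},              # timed out, no OP message
    {"offset": 2, "born_time": now - 13 * 60 * 60},    # older than 12 hours
    {"offset": 3, "born_time": now - 10},              # already committed or rolled back
]
print(scan(half_messages, op_offsets={3}, now=now))
# {0: 'wait', 1: 'check', 2: 'skip', 3: 'done'}
```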

Application: Using Transactional Messages

After understanding the principle of RocketMQ transactional messages, let's take a look at how to use transactions. First, we need to create a topic of the transactional message type, which can be created using the console or CLI commands.

Sending transactional messages is different from sending normal messages in the following aspects:

Before sending a transactional message, you must begin a transaction and associate it with the execution of the local transaction.
When creating a producer, you must set the transaction checker and bind the list of topics of messages to be sent. These actions enable the built-in transaction checker of the client to restore topics in the event of exceptions.

After the transactional message is committed, the message is a normal message delivered to the user topic. For consumers, it is no different from the consumption of normal messages.
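On the producer side, the two pieces of business code are the local transaction and the transaction checker. The sketch below shows one way to write them against an order table, using SQLite from the Python standard library. It does not use the RocketMQ client: the function names, the table layout, and the string constants are hypothetical. The point it illustrates is that the checker decides from durable state (the order row), not from anything held in memory.

```python
import sqlite3

COMMIT, ROLLBACK, UNKNOWN = "COMMIT", "ROLLBACK", "UNKNOWN"


def open_orders_db():
    """Create an in-memory order table with one unpaid order."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
    db.execute("INSERT INTO orders VALUES ('order-1', 'unpaid')")
    db.commit()
    return db


def execute_local_transaction(db, order_id):
    """Runs after the half message is sent; the result is the second confirmation."""
    try:
        with db:  # commits on success, rolls back if an exception is raised
            updated = db.execute(
                "UPDATE orders SET status = 'paid' WHERE id = ? AND status = 'unpaid'",
                (order_id,),
            ).rowcount
        return COMMIT if updated == 1 else ROLLBACK
    except sqlite3.Error:
        return UNKNOWN  # outcome unclear; leave the decision to a later check


def check_local_transaction(db, order_id):
    """Runs when the server asks about a half message.

    Assumes the local transaction has already finished by the time the check
    runs; a long-running transaction might justify returning UNKNOWN instead.
    """
    row = db.execute("SELECT status FROM orders WHERE id = ?", (order_id,)).fetchone()
    if row is not None and row[0] == "paid":
        return COMMIT
    return ROLLBACK


db = open_orders_db()
print(execute_local_transaction(db, "order-1"))   # COMMIT
print(check_local_transaction(db, "order-1"))     # COMMIT
print(check_local_transaction(db, "order-404"))   # ROLLBACK
```

Because the checker reads the same table that the local transaction writes, any producer instance in the group can answer a check, including one that was restarted after the half message was sent.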

Note:

Avoid timeouts caused by many in-doubt transactions: RocketMQ initiates a transaction check when the transaction commit phase is abnormal, which ensures transactional consistency. However, producers should avoid returning unknown results from local transactions as far as possible. A large number of transaction checks degrades system performance and easily delays transaction processing.
The Group ID of transactional messages cannot be shared with other types of messages: Unlike other types of messages, transactional messages have a lookback mechanism. During lookback, the server queries the producer client based on the Group ID.
Transaction Timeout Mechanism: After a half-transactional message is sent by the producer to the server, if the server fails to confirm the commit or rollback status within a specified period, the message is rolled back by default.

Today, I introduced the transactional messages of RocketMQ to offer a deeper understanding of its principles and applications. I hope the transactional messages of RocketMQ can help you solve your business problems effectively.
