If a consumer encounters an exception, ApsaraMQ for RocketMQ redelivers the message for fault recovery based on the consumption retry policy. This topic describes the scenarios, mechanisms, version compatibility, and usage recommendations for consumption retries.
Scenarios
The consumption retry feature in ApsaraMQ for RocketMQ primarily resolves consumption integrity issues caused by failures in business processing logic. It is a fallback policy for your business and should not be used for business flow control.
Recommended scenarios for message retry
Business processing fails, and the failure is related to the content of the current message. For example, the transaction status for the message may not be available yet, but the operation is expected to succeed after a short period.
The cause of the consumption failure is not persistent. This means the failure is a rare event, not a regular occurrence, and subsequent messages are likely to be consumed successfully. In this case, you can retry the current message to avoid blocking the process.
Not recommended scenarios for message retry
Do not use consumption failures to implement conditional branching in your processing logic. A branch that the logic expects to take frequently is a normal code path, not a failure.
Using consumption failures to throttle the processing rate is not recommended. The purpose of throttling is to temporarily stack excess messages in the queue to smooth out traffic peaks, not to send messages into the retry flow.
Purpose
A typical problem in asynchronous decoupling with middleware is ensuring the integrity of the entire call chain if a downstream service fails to process a message event. As a financial-grade and reliable business message middleware, ApsaraMQ for RocketMQ is designed to support reliable transmission. It uses a complete acknowledgment and retry mechanism to ensure that every message is processed as expected.
Understanding the message acknowledgment mechanism and consumption retry policy of ApsaraMQ for RocketMQ helps you analyze the following issues:
How to ensure complete message processing for your business: Understand the consumption retry policy to ensure the integrity of each message when you design and implement consumer logic. This practice prevents messages from being ignored when exceptions occur, which can lead to inconsistent business states.
How to recover the status of in-process messages during a system exception: Understand how the status of messages that are being processed is recovered, and whether inconsistencies can occur, when a system exception such as a crash happens.
Consumption retry policy
A consumption retry policy specifies the retry interval and the maximum number of retries after a consumer fails to consume a message.
Trigger conditions for message retry
Consumption fails. This includes the consumer returning a failed status identifier or throwing an unexpected exception.
Message processing times out. This includes queuing timeouts in a PushConsumer.
Main behaviors of message retry
Retry process state machine: Controls the status and change logic of a message in the retry flow.
Retry interval: The time between a consumption failure or timeout and when the message can be consumed again.
Maximum retries: The maximum number of times a message can be retried.
Differences in message retry policies
The internal mechanisms and configuration methods for consumption retry policies vary based on the consumer type. The following table describes the differences.
Consumer type | Retry process state machine | Retry interval | Maximum retries |
PushConsumer | Ready, Inflight, WaitingRetry, Commit, DLQ, Discard | Controlled by metadata when the consumer group is created. Normal (non-ordered) messages use a stepped interval. Ordered messages use a fixed interval. | Set in the console or by calling an OpenAPI operation. For more information, see Modify maximum retries. |
SimpleConsumer | Ready, Inflight, Commit, DLQ, Discard | Modify the invisibility duration when receiving messages using an API. | Set in the console or by calling an OpenAPI operation. For more information, see Modify maximum retries. |
For specific retry policies, see PushConsumer retry policy and SimpleConsumer retry policy.
PushConsumer retry policy
Retry state machine
When a PushConsumer consumes a message, the main states of the message are as follows:
Ready
The message is ready on the ApsaraMQ for RocketMQ server and can be consumed by a consumer.
Inflight: A status that indicates processing is in progress.
The message has been retrieved by the consumer client and is being processed, but a consumption result has not yet been returned.
WaitingRetry: A state unique to PushConsumer.
When message processing fails or times out, the consumption retry logic is triggered. If the current number of retries has not reached the maximum, the message enters the WaitingRetry state. After the retry interval elapses, the message becomes Ready again and can be re-consumed. The retry interval grows over successive retries so that a message that keeps failing is not retried too frequently.
Commit
The consumption success state. The message's state machine ends when the consumer returns a success response.
DLQ: The dead-letter state.
This is the final fallback mechanism for the consumption logic. If the feature to save dead-letter messages is enabled, a failed message is delivered to a dead-letter topic after the maximum number of retries is exceeded. You can consume messages from the dead-letter topic to perform business recovery. For more information, see Dead-letter messages.
Discard
If the feature to save dead-letter messages is not enabled, a failed message is discarded after the maximum number of retries is exceeded.
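The transitions described above can be sketched as a small decision function. This is a minimal illustration and not SDK code: the class name `PushRetryStateMachine`, the `Outcome` enum, and the `afterAttempt` method are hypothetical names introduced here, and the sketch assumes a timeout is handled the same way as a failure.

```java
/** Hypothetical model of the PushConsumer message states; illustration only, not SDK code. */
final class PushRetryStateMachine {
    /** States named in the list above. READY becomes INFLIGHT when a client pulls the message. */
    enum State { READY, INFLIGHT, WAITING_RETRY, COMMIT, DLQ, DISCARD }

    /** Result of one delivery attempt. A timeout is treated like a failure (assumption). */
    enum Outcome { SUCCESS, FAILURE_OR_TIMEOUT }

    /**
     * Next state of an INFLIGHT message once the attempt finishes.
     * retriesSoFar counts the retries already performed (0 for the original delivery).
     */
    static State afterAttempt(Outcome outcome, int retriesSoFar, int maxRetries, boolean dlqEnabled) {
        if (outcome == Outcome.SUCCESS) {
            return State.COMMIT;
        }
        if (retriesSoFar < maxRetries) {
            // The message returns to READY after the retry interval elapses.
            return State.WAITING_RETRY;
        }
        // Retries are exhausted: keep the message as a dead letter or drop it.
        return dlqEnabled ? State.DLQ : State.DISCARD;
    }

    public static void main(String[] args) {
        int maxRetries = 3;
        for (int retries = 0; retries <= maxRetries; retries++) {
            System.out.println("failed with " + retries + " retries so far -> "
                + afterAttempt(Outcome.FAILURE_OR_TIMEOUT, retries, maxRetries, true));
        }
    }
}
```

With `maxRetries = 3`, the first three failures lead to WAITING_RETRY and the fourth failure ends in DLQ.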

For example, the consumption retry flow for a message is shown in the figure above. Assume the message is in the Ready state for 5 s, and the processing time is 6 s.
Each time a message is retried, its status changes from Ready to Inflight and then to WaitingRetry. The message retry interval is the time between a consumption failure or timeout and when the message can be consumed again. The actual interval between two consumptions of a message also includes the processing time and the duration in the Ready state. For example:
The message enters the Ready state at 0 s for its first consumption.
Due to the consumer's processing speed, the client starts to pull and consume the message at 5 s. After 6 s of processing, at 11 s, message processing fails and the client returns a consumption failure.
At this point, consumption cannot be retried immediately. The system must wait for the retry interval to pass before the message can be consumed again.
At 21 s, after the 10-second interval for the first retry has elapsed, the message becomes Ready again.
The client starts to re-consume the message after another 5 s, at 26 s.
Therefore, the actual interval between two consumptions of the message is: processing time + retry interval + duration in Ready state = 6 s + 10 s + 5 s = 21 s.
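The arithmetic of this example can be checked with a short sketch. The class `RetryTimeline` and its method are hypothetical names used only for illustration, and the 10-second retry interval is taken from the first step of the stepped retry intervals for normal messages.

```java
import java.time.Duration;

/** Hypothetical helper that reproduces the timeline of the example; not SDK code. */
final class RetryTimeline {
    /** Time between the starts of two consecutive consumption attempts. */
    static Duration betweenAttempts(Duration processing, Duration retryInterval, Duration readyWait) {
        return processing.plus(retryInterval).plus(readyWait);
    }

    public static void main(String[] args) {
        Duration readyWait = Duration.ofSeconds(5);      // time spent in the Ready state
        Duration processing = Duration.ofSeconds(6);     // time spent in the Inflight state
        Duration retryInterval = Duration.ofSeconds(10); // first retry of a normal message

        long firstStart = readyWait.getSeconds();                  // 5 s
        long failedAt = firstStart + processing.getSeconds();      // 11 s
        long readyAgain = failedAt + retryInterval.getSeconds();   // 21 s
        long secondStart = readyAgain + readyWait.getSeconds();    // 26 s

        System.out.println("Ready again at " + readyAgain + " s");
        System.out.println("Second attempt starts at " + secondStart + " s");
        System.out.println("Interval between attempts: "
            + betweenAttempts(processing, retryInterval, readyWait).getSeconds() + " s");
    }
}
```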
Retry interval
Normal messages (non-ordered messages): The retry interval is a stepped time. The specific times are as follows:
Retry number | Retry interval | Retry number | Retry interval |
1 | 10 seconds | 9 | 7 minutes |
2 | 30 seconds | 10 | 8 minutes |
3 | 1 minute | 11 | 9 minutes |
4 | 2 minutes | 12 | 10 minutes |
5 | 3 minutes | 13 | 20 minutes |
6 | 4 minutes | 14 | 30 minutes |
7 | 5 minutes | 15 | 1 hour |
8 | 6 minutes | 16 | 2 hours |
Note: If the number of retries exceeds 16, the interval for all subsequent retries is 2 hours.
Ordered messages: The retry interval is a fixed time. For specific values, see Parameter limits.
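For normal messages, the stepped intervals can be modeled as a lookup table. The following is a minimal sketch for estimating how long a message may wait in the retry flow. The class `SteppedBackoff` and its methods are hypothetical and are not part of the SDK, and ordered messages are not covered because their fixed interval is documented elsewhere.

```java
import java.time.Duration;

/** Hypothetical lookup of the stepped retry intervals for normal messages; not SDK code. */
final class SteppedBackoff {
    // Intervals for retries 1 to 16, copied from the table above.
    private static final Duration[] STEPS = {
        Duration.ofSeconds(10), Duration.ofSeconds(30),
        Duration.ofMinutes(1), Duration.ofMinutes(2), Duration.ofMinutes(3), Duration.ofMinutes(4),
        Duration.ofMinutes(5), Duration.ofMinutes(6), Duration.ofMinutes(7), Duration.ofMinutes(8),
        Duration.ofMinutes(9), Duration.ofMinutes(10), Duration.ofMinutes(20), Duration.ofMinutes(30),
        Duration.ofHours(1), Duration.ofHours(2)
    };

    /** Interval before the given retry (1-based). Retries beyond 16 reuse the 2-hour step. */
    static Duration intervalFor(int retryNumber) {
        if (retryNumber < 1) {
            throw new IllegalArgumentException("retryNumber must be >= 1");
        }
        return STEPS[Math.min(retryNumber, STEPS.length) - 1];
    }

    /** Sum of all retry intervals for the given maximum number of retries. */
    static Duration totalWait(int maxRetries) {
        Duration total = Duration.ZERO;
        for (int n = 1; n <= maxRetries; n++) {
            total = total.plus(intervalFor(n));
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Retry 1 waits " + intervalFor(1).getSeconds() + " s");
        System.out.println("Retry 20 waits " + intervalFor(20).toHours() + " h");
        System.out.println("16 retries wait " + totalWait(16).getSeconds() + " s in total");
    }
}
```

Under these assumptions, the default of 16 retries adds up to 17,140 seconds of waiting (4 hours, 45 minutes, and 40 seconds), not counting processing time or time spent in the Ready state.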
Maximum retries
Default value: 16.
Maximum value: 1000.
The maximum number of retries for a PushConsumer is controlled by the metadata of the consumer group. To change this value, see Modify maximum retries.
For example, if the maximum number of retries is 3, the message can be delivered a maximum of 4 times: 1 original delivery and 3 retries.
Usage example
To trigger a message retry for a PushConsumer, you can simply return a consumption failure status code. The SDK also catches unexpected exceptions.
import org.apache.rocketmq.client.apis.consumer.ConsumeResult;
import org.apache.rocketmq.client.apis.consumer.MessageListener;
import org.apache.rocketmq.client.apis.message.MessageView;

// Usage example: Use a PushConsumer to consume normal messages. If consumption fails, return an error to trigger a retry.
MessageListener messageListener = new MessageListener() {
    @Override
    public ConsumeResult consume(MessageView messageView) {
        System.out.println(messageView);
        // Return ConsumeResult.FAILURE to automatically retry until the maximum number of retries is reached.
        return ConsumeResult.FAILURE;
    }
};
View consumption retry logs
Retries for ordered consumption by a PushConsumer occur on the consumer client. The server cannot obtain detailed logs for consumption retries. If the delivery result for an ordered message in the message trace is 'failed', you can check the consumer client logs for information such as the maximum number of retries and the consumer client.
For information about the consumer client log path, see Log configuration.
You can search for the following keywords in the client logs to quickly locate content related to consumption failures:
Message listener raised an exception while consuming messages
Failed to consume fifo message finally, run out of attempt times
SimpleConsumer retry policy
Retry state machine
When a SimpleConsumer consumes a message, the main states of the message are as follows:
Ready
The message is ready on the ApsaraMQ for RocketMQ server and can be consumed by a consumer.
Inflight: A status that indicates processing is in progress.
The message has been retrieved by the consumer client and is being processed, but a consumption result has not yet been returned.
Commit
The consumption success state. The message's state machine ends when the consumer returns a success response.
DLQ: The dead-letter state.
This is the final fallback mechanism for the consumption logic. If the feature to save dead-letter messages is enabled, a failed message is delivered to a dead-letter topic after the maximum number of retries is exceeded. You can consume messages from the dead-letter topic to perform business recovery. For more information, see Dead-letter messages.
Discard
If the feature to save dead-letter messages is not enabled, a failed message is discarded after the maximum number of retries is exceeded.
Unlike the retry policy for a PushConsumer, the retry interval for a SimpleConsumer is pre-allocated. Each time a message is received, the consumer sets an invisibility duration parameter, InvisibleDuration, when calling the API. This parameter specifies the maximum processing time for the message. If consumption fails and triggers a retry, you do not need to set the next retry interval because the value of the invisibility duration parameter is reused.

The pre-allocated invisibility duration may differ significantly from the actual message processing time in your business. You can use an API operation to modify the invisibility duration.
For example, you can preset the message processing time to a maximum of 20 ms. However, in your actual business, the message cannot be processed within 20 ms. You can change the message invisibility duration to extend the processing time and prevent the message from triggering the retry mechanism.
To change the message invisibility duration, the following conditions must be met:
Message processing has not timed out.
The consumption status has not been committed.
As shown in the following figure, a change to the message invisibility duration takes effect immediately. The message invisibility duration is recalculated from the moment the API is called.
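The two preconditions and the recalculation rule can be sketched with a small model. This is an illustration only: the class `InvisibilityWindow` and its methods are hypothetical, the clock is passed in explicitly to keep the sketch deterministic, and the real change is made by the server through the client API.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical model of one message's invisibility window; not SDK code. */
final class InvisibilityWindow {
    private Instant visibleAt;   // moment at which the message becomes visible again
    private boolean committed;   // true once the consumption result has been committed

    InvisibilityWindow(Instant receivedAt, Duration invisibleDuration) {
        this.visibleAt = receivedAt.plus(invisibleDuration);
    }

    /** Applies a new invisibility duration if allowed; returns whether it was applied. */
    boolean change(Instant now, Duration newDuration) {
        boolean timedOut = !now.isBefore(visibleAt);
        if (timedOut || committed) {
            return false; // both preconditions must hold
        }
        // The new duration is counted from the moment of the call, not from the receive time.
        visibleAt = now.plus(newDuration);
        return true;
    }

    void commit() {
        committed = true;
    }

    Instant visibleAt() {
        return visibleAt;
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2024-01-01T00:00:00Z");
        InvisibilityWindow window = new InvisibilityWindow(t0, Duration.ofSeconds(30));
        // 20 s after receiving, extend by 60 s: the message becomes visible at t0 + 80 s.
        System.out.println(window.change(t0.plusSeconds(20), Duration.ofSeconds(60)));
        System.out.println(window.visibleAt());
    }
}
```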

Message retry interval
Retry interval = Invisibility duration - Actual message processing time
The consumption retry interval for a SimpleConsumer is controlled by the message invisibility duration. For example, if the invisibility duration is 30 ms and the actual message processing takes 10 ms before a failure response is returned, the next retry occurs after 20 ms. The retry interval is 20 ms. If the message is not processed and no result is returned after 30 ms, the message times out and is retried immediately. In this case, the retry interval is 0 ms.
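The formula and both cases of the example can be reproduced with a few lines. The class `SimpleConsumerRetryInterval` is a hypothetical helper for illustration. It assumes that a message whose processing exceeds the invisibility duration is treated as timed out, so the interval is clamped to zero.

```java
import java.time.Duration;

/** Hypothetical helper for the SimpleConsumer retry interval formula; not SDK code. */
final class SimpleConsumerRetryInterval {
    /** Retry interval = invisibility duration - actual processing time, never below zero. */
    static Duration retryInterval(Duration invisibleDuration, Duration processingTime) {
        Duration remaining = invisibleDuration.minus(processingTime);
        // A timeout means the message is retried immediately.
        return remaining.isNegative() ? Duration.ZERO : remaining;
    }

    public static void main(String[] args) {
        Duration invisible = Duration.ofMillis(30);
        // Failure reported after 10 ms: the message waits another 20 ms.
        System.out.println(retryInterval(invisible, Duration.ofMillis(10)).toMillis() + " ms");
        // No result within 30 ms: the message times out and is retried at once.
        System.out.println(retryInterval(invisible, Duration.ofMillis(30)).toMillis() + " ms");
    }
}
```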
Maximum retries
Default value: 16.
Maximum value: 1000.
The maximum number of retries for a SimpleConsumer is controlled by the metadata created for the consumer group. To change this value, see Modify maximum retries.
For example, if the maximum number of retries is 3, the message can be delivered a maximum of 4 times: 1 original delivery and 3 retries.
Usage example
To trigger a message retry for a SimpleConsumer, you can simply wait for a timeout.
// Usage example: Use a SimpleConsumer to consume normal messages. If you want to retry, just wait for a timeout. The server will automatically retry.
// This snippet assumes that simpleConsumer is an already built SimpleConsumer instance.
try {
    // Receive up to 10 messages with an invisibility duration of 30 seconds.
    List<MessageView> messageViewList = simpleConsumer.receive(10, Duration.ofSeconds(30));
    messageViewList.forEach(messageView -> {
        System.out.println(messageView);
        // If processing fails and you want the server to retry, just ignore the message (do not acknowledge it). You can try to receive it again after it becomes visible.
    });
} catch (ClientException e) {
    // If the pull fails due to reasons such as system throttling, you need to initiate the request to receive the message again.
    e.printStackTrace();
}
Modify maximum retries
You can modify the maximum number of consumption retries for PushConsumer and SimpleConsumer in the following ways.
1. If your client uses the Remoting protocol, the actual maximum number of retries is subject to the settings on the client, and this configuration does not take effect. If the client uses the gRPC protocol, the maximum number of retries follows this configuration.
2. The consumption retry policies, such as stepped backoff and fixed interval, are effective only for clients that use the gRPC protocol. They are not valid for clients that use the Remoting protocol.
Modify using an OpenAPI operation: UpdateConsumerGroup
Modify using the console:
Follow these steps:
On the Instances page, click the name of the target instance.
In the navigation pane on the left, click Groups. On the Groups page, click Create Group.

Usage recommendations
Retry reasonably and avoid triggering consumption retries for throttling purposes
As mentioned in Scenarios, message retry is suitable for scenarios where business processing fails and the current consumption failure is a rare event. It is not suitable for scenarios with persistent failures, such as consumption throttling.
Incorrect example:
If the current consumption speed is too high and triggers throttling, you can return a consumption failure and wait for the next re-consumption.
Correct example:
If the current consumption speed is too high and triggers throttling, you can delay receiving messages and consume them later.
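One way to follow the correct example is to gate the receive loop on the consumer side, so that excess messages stay in the queue instead of entering the retry flow. The sketch below uses a simple fixed-window limiter. The class `ReceiveThrottle`, its method, and the one-second window are assumptions made for illustration and are not part of the SDK.

```java
/** Hypothetical fixed-window limiter that delays receiving instead of failing consumption. */
final class ReceiveThrottle {
    private static final long WINDOW_MILLIS = 1000;

    private final int permitsPerWindow;
    private long windowStartMillis;
    private int usedInWindow;

    ReceiveThrottle(int permitsPerWindow, long nowMillis) {
        this.permitsPerWindow = permitsPerWindow;
        this.windowStartMillis = nowMillis;
    }

    /**
     * Returns how long to wait before the next receive call.
     * 0 means a permit was taken and the consumer may receive now.
     */
    long delayBeforeNextReceive(long nowMillis) {
        if (nowMillis - windowStartMillis >= WINDOW_MILLIS) {
            // A new window starts: reset the permit counter.
            windowStartMillis = nowMillis;
            usedInWindow = 0;
        }
        if (usedInWindow < permitsPerWindow) {
            usedInWindow++;
            return 0;
        }
        // Budget exhausted: wait until the current window ends.
        return windowStartMillis + WINDOW_MILLIS - nowMillis;
    }

    public static void main(String[] args) {
        ReceiveThrottle throttle = new ReceiveThrottle(2, 0L);
        long[] callTimes = {0L, 100L, 200L, 1000L};
        for (long now : callTimes) {
            System.out.println("t=" + now + " ms, wait " + throttle.delayBeforeNextReceive(now) + " ms");
        }
    }
}
```

A consumer loop would sleep for the returned delay before it calls receive again, and would return a failure only when business processing actually fails.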
FAQ about message retry
How do I set the message consumption timeout?
The consumption timeout is set on the consumer client. The specific parameter settings are as follows:
gRPC protocol
SimpleConsumer: The maximum timeout can be set to 12 hours. The minimum is 10 seconds.
Sample code:
private long minInvisiableTimeMillsForRecv = Duration.ofSeconds(10).toMillis();
private long maxInvisiableTimeMills = Duration.ofHours(12).toMillis();
PushConsumer: The default is 230 minutes and cannot be changed.
Remoting protocol
consumer.setConsumeTimeout(15); // Unit: minutes. Minimum value: 1. Maximum value: 180.