ApsaraMQ for RocketMQ: Message sending retry and throttling

Last Updated: Mar 11, 2026

When a producer sends a message to the ApsaraMQ for RocketMQ broker, the request can fail due to network issues, broker restarts, or capacity limits. The client SDK handles these failures through two built-in mechanisms:

  • Sending retry re-sends failed messages automatically until they succeed or the retry limit is reached.

  • Throttling protects the broker from overload by rejecting requests when capacity is insufficient.

Both mechanisms work together: when throttling triggers a rejection, the retry mechanism uses exponential backoff to re-send the message without overwhelming the broker further.

Sending retry

Retry process

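The client SDK includes built-in retry logic. When a send request fails, the SDK re-sends the message automatically, so in most cases no application-level retry code is needed. The exceptions are SDK versions that do not retry on throttling errors, which are listed under Throttling.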

Set the maximum number of retries when you initialize the producer. If a request fails, the SDK retries until the message is delivered or the retry limit is reached. After the final retry fails, the SDK returns an error to your application.

The retry behavior differs by sending mode:

| Sending mode | Thread behavior | On final failure |
| --- | --- | --- |
| Synchronous | Calling thread blocks for the entire retry sequence | SDK throws an exception |
| Asynchronous | Calling thread is not blocked | SDK delivers a failure callback event |
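
The difference between the two modes can be illustrated with a minimal sketch. This is not the SDK's implementation or API: `SendError`, `send_once`, `send_sync`, and `send_async` are hypothetical names, and `max_attempts` (the total number of attempts, including the first) is an assumption made to keep the loop unambiguous. The sketch only models the behavior described in the table.

```python
import threading
from typing import Callable


class SendError(Exception):
    """Hypothetical error raised when a single send attempt fails."""


def _send_with_retries(send_once: Callable[[bytes], str], body: bytes, max_attempts: int) -> str:
    """Call send_once up to max_attempts times; re-raise the last error if all fail."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return send_once(body)
        except SendError:
            if attempt == max_attempts:
                raise  # final failure: surface the error to the caller
    raise AssertionError("unreachable")


def send_sync(send_once: Callable[[bytes], str], body: bytes, max_attempts: int) -> str:
    """Synchronous mode: the caller blocks and gets an exception on final failure."""
    return _send_with_retries(send_once, body, max_attempts)


def send_async(
    send_once: Callable[[bytes], str],
    body: bytes,
    max_attempts: int,
    on_success: Callable[[str], None],
    on_failure: Callable[[Exception], None],
) -> threading.Thread:
    """Asynchronous mode: the caller returns at once; the outcome arrives via callback."""

    def worker() -> None:
        try:
            result = _send_with_retries(send_once, body, max_attempts)
        except SendError as exc:
            on_failure(exc)
        else:
            on_success(result)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

In the synchronous variant, the caller wraps `send_sync` in `try`/`except`. In the asynchronous variant, the caller handles the failure inside `on_failure`.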

Retry triggers

Retries are triggered by two categories of failure:

Client-side failures

  • A network exception causes a connection failure or request timeout.

  • The broker is restarting or being undeployed, causing connection failures.

  • The broker is running slowly, causing request timeouts.

Broker-side errors

  • System logic error: An internal processing error on the broker.

  • System throttling error: The broker rejects the request because it has exceeded capacity. See Throttling.

Note

Transactional messages only support transparent retries. The SDK does not retry transactional messages on network exceptions or timeouts.

Retry interval

The retry interval depends on the error type:

| Error type | Retry interval |
| --- | --- |
| All errors except throttling | Immediate (no delay) |
| System throttling error | Exponential backoff with jitter |

For throttling errors, the SDK uses exponential backoff with the following parameters:

| Parameter | Description | Default |
| --- | --- | --- |
| INITIAL_BACKOFF | Delay before the first retry | 1 second |
| MULTIPLIER | Factor by which the delay increases after each retry | 1.6 |
| JITTER | Randomization factor applied to each delay | 0.2 |
| MAX_BACKOFF | Maximum delay between retries | 120 seconds |
| MIN_CONNECT_TIMEOUT | Minimum connection timeout | 20 seconds |

The backoff algorithm works as follows:

ConnectWithBackoff()
  current_backoff = INITIAL_BACKOFF
  current_deadline = now() + INITIAL_BACKOFF
  while (TryConnect(Max(current_deadline, now() + MIN_CONNECT_TIMEOUT)) != SUCCESS)
    SleepUntil(current_deadline)
    current_backoff = Min(current_backoff * MULTIPLIER, MAX_BACKOFF)
    current_deadline = now() + current_backoff +
      UniformRandom(-JITTER * current_backoff, JITTER * current_backoff)

For the full specification, see gRPC connection backoff.
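
To see what delays the pseudocode produces, the following Python sketch computes only the sleep intervals, using the default parameters from the table. It is an illustration, not SDK code: `backoff_delays` and its `rng` parameter are hypothetical names, and the sketch leaves out the connection attempt and MIN_CONNECT_TIMEOUT.

```python
import random
from typing import Iterator, Optional

INITIAL_BACKOFF = 1.0  # seconds
MULTIPLIER = 1.6
JITTER = 0.2
MAX_BACKOFF = 120.0  # seconds


def backoff_delays(retries: int, rng: Optional[random.Random] = None) -> Iterator[float]:
    """Yield the delay before each retry, following the pseudocode above."""
    if retries <= 0:
        return
    if rng is None:
        rng = random.Random()
    backoff = INITIAL_BACKOFF
    # The first deadline is now() + INITIAL_BACKOFF, with no jitter applied.
    yield backoff
    for _ in range(retries - 1):
        backoff = min(backoff * MULTIPLIER, MAX_BACKOFF)
        # Jitter is added after the cap, so a delay can exceed MAX_BACKOFF
        # by up to JITTER * MAX_BACKOFF.
        yield backoff + rng.uniform(-JITTER * backoff, JITTER * backoff)
```

Passing a seeded `random.Random` makes the sequence reproducible, which is useful when testing code that depends on these delays.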

Understand the total retry time budget

The SDK exposes only one retry control: the maximum number of retries. In synchronous mode, the calling thread blocks for the entire retry sequence, so the total blocking time depends on the relationship between your per-request timeout and the maximum retry count:

Total blocking time (worst case) = max_retries x per_request_timeout + sum_of_backoff_delays

For non-throttling errors (immediate retry), the backoff delay is zero:

Total blocking time = max_retries x per_request_timeout

For throttling errors, backoff delays accumulate exponentially. For example, with the default parameters and 5 retries:

| Retry | Backoff delay (approximate) | Cumulative delay |
| --- | --- | --- |
| 1 | 1 s | 1 s |
| 2 | 1.6 s | 2.6 s |
| 3 | 2.56 s | 5.16 s |
| 4 | 4.1 s | 9.26 s |
| 5 | 6.55 s | 15.81 s |

Evaluate your per-request timeout and maximum retries together to avoid blocking the calling thread for too long in synchronous mode.
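
The budget can be estimated with a few lines of Python. This is a minimal sketch that applies the formulas above with the default backoff parameters and ignores jitter. The function names are illustrative, and the 3-second timeout in the usage note is an assumed value, not an SDK default.

```python
from typing import List


def backoff_schedule(
    retries: int,
    initial: float = 1.0,
    multiplier: float = 1.6,
    max_backoff: float = 120.0,
) -> List[float]:
    """Nominal (jitter-free) delay before each retry, in seconds."""
    delays = []
    delay = initial
    for _ in range(retries):
        delays.append(delay)
        delay = min(delay * multiplier, max_backoff)
    return delays


def worst_case_blocking_seconds(
    max_retries: int, per_request_timeout: float, throttled: bool
) -> float:
    """max_retries x per_request_timeout, plus backoff delays for throttling errors."""
    total = max_retries * per_request_timeout
    if throttled:
        total += sum(backoff_schedule(max_retries))
    return total
```

`backoff_schedule(5)` reproduces the delays in the table. With an assumed per-request timeout of 3 seconds, `worst_case_blocking_seconds(5, 3.0, throttled=True)` comes to roughly 30.8 seconds, of which about 15.8 seconds is backoff.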

Handle failed messages after exhausted retries

Built-in retries do not guarantee delivery. If all retries fail, the SDK returns an error. Catch this error in your application and implement a fallback strategy:

  • Write the failed message to a local log or dead-letter store for later reprocessing.

  • Alert your monitoring system so you can investigate the root cause.
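
A fallback of this kind might look like the following sketch. It assumes a hypothetical `send(topic, body)` callable that wraps your producer and raises an exception once the SDK has exhausted its retries. The JSON Lines file stands in for whatever local log or dead-letter store you use, and the `logging` call stands in for a real alert.

```python
import base64
import json
import logging
import time
from pathlib import Path
from typing import Callable

logger = logging.getLogger("producer.fallback")


def send_or_store(
    send: Callable[[str, bytes], None], topic: str, body: bytes, store_path: Path
) -> bool:
    """Send a message; on final failure, append it to a local file for reprocessing.

    `send` is a hypothetical wrapper around the producer, not an SDK function.
    Returns True if the message was sent, False if it was stored instead.
    """
    try:
        send(topic, body)
        return True
    except Exception as exc:  # the SDK has already exhausted its retries
        record = {
            "topic": topic,
            # Base64 keeps arbitrary binary payloads intact in a text file.
            "body_b64": base64.b64encode(body).decode("ascii"),
            "error": repr(exc),
            "failed_at": time.time(),
        }
        with store_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
        # An ERROR-level log line is the hook for the monitoring alert.
        logger.error("send failed, stored for reprocessing: topic=%s error=%r", topic, exc)
        return False
```

A separate job can later read the file line by line, decode `body_b64`, and re-send each message.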

Handle duplicate messages from retries

When a send request times out, the SDK cannot determine whether the broker already received and stored the message. A retry may produce a duplicate on the broker. This is a fundamental trade-off in at-least-once delivery systems.

To handle duplicates, design your consumers for idempotent processing:

  • Assign each message a unique business key (such as an order ID or transaction ID).

  • Before processing, check whether the key has already been processed.

  • Use database constraints or deduplication caches to enforce uniqueness.
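
The three steps above can be sketched with a database uniqueness constraint. The example uses SQLite from the Python standard library purely for illustration. The table name, `open_dedup_store`, and `process_once` are assumptions, and a production consumer would use its own business database.

```python
import sqlite3
from typing import Callable


def open_dedup_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create a table whose primary key enforces one row per business key."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS processed (business_key TEXT PRIMARY KEY)")
    conn.commit()
    return conn


def process_once(
    conn: sqlite3.Connection, business_key: str, handler: Callable[[], None]
) -> bool:
    """Run handler at most once per business key.

    Returns True if the handler ran, False if the key was already processed.
    """
    with conn:  # commits on success, rolls back if the handler raises
        try:
            conn.execute(
                "INSERT INTO processed (business_key) VALUES (?)", (business_key,)
            )
        except sqlite3.IntegrityError:
            return False  # duplicate delivery: skip processing
        handler()
    return True
```

Because the key insert and the handler share one transaction, a handler failure rolls the key back and a redelivered message is processed again rather than silently skipped. This only makes the whole operation atomic if the handler's side effects go to the same database.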

Throttling

Throttling is a normal operational mechanism in cloud messaging systems. When system capacity is insufficient or usage exceeds a predefined threshold, the ApsaraMQ for RocketMQ broker immediately rejects the request and returns a system throttling error. The SDK's built-in retry logic then handles the rejected request using exponential backoff.

Throttling triggers

Throttling is triggered in the following scenarios:

  • Storage pressure surge: By default, a consumer group starts consuming from the maximum offset of a queue. If a scenario such as a business rollout requires the group to start consuming from an earlier point in time instead, storage pressure on the queue spikes and throttling is triggered. For more information, see Consumer progress management.

  • Message accumulation: When consumers cannot keep up with the rate of incoming messages, unconsumed messages accumulate in the queue. If the accumulation exceeds the threshold, the broker triggers throttling to reduce pressure on the downstream system.

Error codes and retry behavior by client type

When throttling is triggered, the error code and retry behavior depend on your client protocol.

gRPC clients

| Item | Value |
| --- | --- |
| Error code | 530 |
| Error message keyword | TOO_MANY_REQUESTS |
| Retry behavior | Automatic retry with exponential backoff |

Remoting clients

| Item | Value |
| --- | --- |
| Error code | 215 |
| Error message keyword | messages flow control |

Retry behavior for Remoting clients varies by SDK version:

| SDK | Retry behavior on throttling |
| --- | --- |
| ApsaraMQ for RocketMQ TCP client SDK for Java < 1.9.0.Final | No retry |
| ApsaraMQ for RocketMQ TCP client SDK for Java >= 1.9.0.Final | Automatic retry with exponential backoff |
| Open-source Apache RocketMQ SDK (producer) | No retry |
| Open-source Apache RocketMQ SDK (consumer) | Automatic retry with exponential backoff |

If your SDK version does not retry automatically on throttling errors, implement retry logic with exponential backoff in your application code.
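
One possible shape for such application-level retry logic is sketched below. `BrokerError` is a hypothetical wrapper for whatever exception your SDK raises, and the backoff parameters reuse the defaults listed under Retry interval as an assumption. Unlike the backoff pseudocode, this sketch also applies jitter to the first delay.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class BrokerError(Exception):
    """Hypothetical wrapper carrying the error code and message from the SDK."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def is_throttled(err: BrokerError) -> bool:
    """Match the Remoting throttling error code or its message keyword."""
    return err.code == 215 or "messages flow control" in err.message


def send_with_backoff(
    send: Callable[[], T],
    max_retries: int = 5,
    initial: float = 1.0,
    multiplier: float = 1.6,
    jitter: float = 0.2,
    max_backoff: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `send` on throttling errors only, sleeping with exponential backoff."""
    if max_retries < 0:
        raise ValueError("max_retries must not be negative")
    backoff = initial
    for attempt in range(max_retries + 1):
        try:
            return send()
        except BrokerError as err:
            # Other errors and the final failure go straight to the caller.
            if not is_throttled(err) or attempt == max_retries:
                raise
        sleep(backoff + random.uniform(-jitter * backoff, jitter * backoff))
        backoff = min(backoff * multiplier, max_backoff)
    raise AssertionError("unreachable")
```

The `sleep` parameter exists so that tests can inject a stub instead of waiting in real time.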

Note

For supported client versions, see SDK compatibility.

Prevent and handle throttling

Monitor capacity before traffic spikes

Use the ApsaraMQ for RocketMQ observability features to monitor system usage and capacity. Before business rollouts or anticipated traffic spikes:

  • Verify that your instance has sufficient resources for expected traffic.

  • Check consumer group lag to identify accumulation risks.

  • Scale your instance or optimize consumer throughput if needed.

Handle unexpected throttling at runtime

If throttling occurs unexpectedly and the SDK's built-in retries cannot recover:

  • Route requests to a fallback system until the throttling condition clears.

  • Log throttling events (look for error code 530 / TOO_MANY_REQUESTS for gRPC or 215 / messages flow control for Remoting) to help diagnose the root cause.
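
Throttling events can be singled out in logs with a small classifier. The sketch below assumes that your error handler can extract a numeric code and a message string from the SDK error. `throttling_protocol` and `log_if_throttled` are illustrative names, and the log format is only a suggestion.

```python
import logging
from typing import Optional

logger = logging.getLogger("producer.throttling")

# Error code and message keyword per client protocol, as documented above.
THROTTLING_SIGNATURES = {
    "gRPC": (530, "TOO_MANY_REQUESTS"),
    "Remoting": (215, "messages flow control"),
}


def throttling_protocol(code: int, message: str) -> Optional[str]:
    """Return the protocol whose throttling signature matches, or None."""
    for protocol, (error_code, keyword) in THROTTLING_SIGNATURES.items():
        if code == error_code or keyword in message:
            return protocol
    return None


def log_if_throttled(code: int, message: str, topic: str) -> bool:
    """Log a warning for throttling errors; return True if one was logged."""
    protocol = throttling_protocol(code, message)
    if protocol is None:
        return False
    logger.warning(
        "throttled: topic=%s protocol=%s code=%s message=%s",
        topic, protocol, code, message,
    )
    return True
```

Counting these warnings per topic over time helps distinguish a one-off spike from sustained accumulation.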