When a Message Integration task fails to deliver a message, the retry policy controls how and when delivery is reattempted. After all retries are exhausted, the fault tolerance policy takes over. Depending on the configuration, the message is routed to a dead-letter queue or discarded, or the task is paused.
How retry, fault tolerance, and dead-letter queues interact
When a message fails to deliver:
The retry policy reattempts delivery at the configured intervals until the retry limit is reached.
If all retries are exhausted, the fault tolerance policy takes effect:
Fault tolerance allowed + dead-letter queue enabled -- The message is routed to the dead-letter queue.
Fault tolerance allowed + dead-letter queue disabled -- The message is discarded.
Fault tolerance prohibited -- The task pauses and its status changes to Ready.
If a retry cannot run due to invalid resource configurations, the task status changes to Start Failed.
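The decision made after retries are exhausted can be sketched as a small function. This is an illustrative model only: the names `Outcome` and `outcome_after_retries` are hypothetical and are not part of any Message Integration API.

```python
from enum import Enum


class Outcome(Enum):
    """Possible results once all retries are exhausted (hypothetical names)."""
    DEAD_LETTER_QUEUE = "message routed to the dead-letter queue"
    DISCARDED = "message discarded"
    TASK_PAUSED = "task paused, status changes to Ready"


def outcome_after_retries(fault_tolerance_allowed: bool, dlq_enabled: bool) -> Outcome:
    """Map the two configuration switches to the behavior described above."""
    if not fault_tolerance_allowed:
        # Fault tolerance prohibited: the task pauses regardless of the DLQ setting.
        return Outcome.TASK_PAUSED
    if dlq_enabled:
        return Outcome.DEAD_LETTER_QUEUE
    return Outcome.DISCARDED
```

For example, `outcome_after_retries(True, False)` returns `Outcome.DISCARDED`, which matches the "fault tolerance allowed + dead-letter queue disabled" case.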
Retry policies
A retry policy controls how failed messages are retried within a Message Integration task. Message Integration supports two retry policies: backoff retry and exponential decay retry.
Backoff retry (default)
Retries a failed message up to 3 times. Each retry waits a random interval between 10 and 20 seconds.
Use backoff retry for transient failures that resolve within seconds, such as brief network interruptions or temporary service unavailability.
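A minimal client-side sketch of the same behavior, assuming a caller-supplied `send` function that returns `True` on success. The function name `deliver_with_backoff` and its parameters are hypothetical; the service applies this policy internally, so the code only illustrates the timing.

```python
import random
import time
from typing import Callable

MAX_RETRIES = 3          # retry limit stated for backoff retry
MIN_WAIT_S = 10.0        # lower bound of the random interval
MAX_WAIT_S = 20.0        # upper bound of the random interval


def deliver_with_backoff(
    send: Callable[[], bool],
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try once, then retry up to MAX_RETRIES times with a random 10-20 s wait."""
    if send():
        return True
    for _ in range(MAX_RETRIES):
        # Each retry waits an independently drawn random interval.
        sleep(random.uniform(MIN_WAIT_S, MAX_WAIT_S))
        if send():
            return True
    # All retries exhausted; the fault tolerance policy would apply next.
    return False
```

The `sleep` parameter is injectable so the schedule can be exercised without actually waiting. In the worst case the three retries add between 30 and 60 seconds of waiting in total.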
Exponential decay retry
Retries a failed message up to 176 times over one day. The interval doubles with each attempt, from 1 second up to 512 seconds:
1s, 2s, 4s, 8s, 16s, 32s, 64s, 128s, 256s, 512s
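After the interval reaches 512 seconds, every remaining retry also uses the 512-second interval. In total, 167 of the 176 retries wait 512 seconds, which puts the full schedule at 86,015 seconds, just under one day.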
Use exponential decay retry when failures may persist for minutes to hours, such as downstream service outages or rate-limiting.
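The schedule can be reproduced with a few lines of code. This sketch assumes that the wait before retry n is min(2^(n-1), 512) seconds, which is one reading of the description above; the function name is hypothetical.

```python
MAX_RETRIES = 176       # retry limit stated for exponential decay retry
MAX_INTERVAL_S = 512    # the interval stops doubling at this value


def exponential_decay_intervals(max_retries: int = MAX_RETRIES,
                                cap_s: int = MAX_INTERVAL_S) -> list:
    """Return the wait (in seconds) before each retry, doubling up to the cap."""
    return [min(2 ** attempt, cap_s) for attempt in range(max_retries)]


intervals = exponential_decay_intervals()
print(intervals[:10])          # [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
print(intervals.count(512))    # 167
print(sum(intervals))          # 86015
```

Under this assumption, the waits add up to 86,015 seconds, slightly less than the 86,400 seconds in one day, which is consistent with the stated one-day retry window.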
Compare retry policies
| | Backoff retry | Exponential decay retry |
|---|---|---|
| Max retries | 3 | 176 |
| Retry window | -- | 1 day |
| Interval | Random, 10--20 s | Doubles from 1 s to 512 s |
| Best for | Transient failures that resolve within seconds | Failures that may persist for minutes to hours |
Fault tolerance policies
A fault tolerance policy determines how the task responds after all retries are exhausted. Message Integration supports two fault tolerance policies.
| Policy | Behavior after retries are exhausted | Task continues? |
|---|---|---|
| Fault tolerance allowed | The failed message is delivered to the dead-letter queue (if configured) or discarded. | Yes |
| Fault tolerance prohibited | The task pauses and stops processing messages. Its status changes to Ready. | No |
Dead-letter queues
A dead-letter queue preserves messages that still cannot be processed or sent after all retries are exhausted. The dead-letter queue stores the raw message data.
Dead-letter queues are disabled by default. Each dead-letter queue is scoped to a single Message Integration task.
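The scoping and default described here can be modeled with a small in-memory stand-in. `DeadLetterStore` and its methods are hypothetical and exist only for illustration; a real dead-letter queue is one of the supported queue services, not a Python object.

```python
from typing import Dict, List


class DeadLetterStore:
    """Hypothetical in-memory model: one dead-letter queue per task ID."""

    def __init__(self) -> None:
        # No task has a queue until one is explicitly enabled (disabled by default).
        self._queues: Dict[str, List[bytes]] = {}

    def enable(self, task_id: str) -> None:
        """Enable a dead-letter queue for a single task."""
        self._queues.setdefault(task_id, [])

    def route(self, task_id: str, raw_message: bytes) -> bool:
        """Store the raw message if the task has a queue; report whether it was kept."""
        queue = self._queues.get(task_id)
        if queue is None:
            return False  # queue disabled for this task: nothing is stored
        queue.append(raw_message)  # raw data is stored unchanged
        return True

    def messages(self, task_id: str) -> List[bytes]:
        """Return a copy of the messages held for one task."""
        return list(self._queues.get(task_id, []))
```

Because each queue is keyed by task ID, a message routed for one task never appears in another task's queue.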
Supported queue types
The following services can serve as dead-letter queues:
| Service |
|---|
| ApsaraMQ for RocketMQ |
| Simple Message Queue (formerly MNS) |
| ApsaraMQ for Kafka |
| Event buses in EventBridge |