Simple Message Queue (formerly MNS): Throttling policy

Last Updated: Mar 11, 2026

Simple Message Queue (formerly MNS) elastically scales throughput based on real-time cluster resources. When requests exceed the cluster's current capacity, throttling temporarily limits traffic to maintain stability.

How throttling works

The throttling threshold is not a fixed cap. As traffic approaches the threshold, the server automatically scales resources and raises the limit. In most cases, this elastic adjustment absorbs traffic spikes without affecting your workload.

If a sudden burst exceeds what elastic scaling can absorb, the server activates a backpressure mechanism. It holds excess requests for approximately 500 ms before returning an error. After automatic scale-out completes, the system resumes normal processing at a higher threshold.

Default throughput

Each Alibaba Cloud account has a default guaranteed throughput of 20,000 transactions per second (TPS) per region. This figure is a baseline, not a cap: when cluster resources are available, the system elastically supports throughput above it.

To raise the default threshold, submit a ticket.

How TPS is calculated

Each API operation counts as one request, with one exception: batch operations are counted by the number of messages, not the number of API calls.

| Operation | TPS formula | Example |
| --- | --- | --- |
| Single-message operations | 1 request = 1 TPS | 100 requests/sec = 100 TPS |
| BatchSendMessage | Requests/sec × messages per request | 100 requests/sec × 10 messages = 1,000 TPS |
| BatchReceiveMessage | Requests/sec × messages per request | 100 requests/sec × 10 messages = 1,000 TPS |
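
The counting rule above can be sketched as a small helper for estimating how much of the threshold a workload consumes. This is an illustrative calculation only; the function name `effective_tps` and its parameters are assumptions made for this example and are not part of any SDK.

```python
def effective_tps(requests_per_sec: float,
                  messages_per_request: int = 1,
                  batch: bool = False) -> float:
    """Estimate the TPS that counts toward the throttling threshold.

    Single-message operations count one TPS per request. Batch operations
    (BatchSendMessage, BatchReceiveMessage) count one TPS per message.
    """
    if requests_per_sec < 0 or messages_per_request < 1:
        raise ValueError("rates must be non-negative and batches non-empty")
    if batch:
        return requests_per_sec * messages_per_request
    return requests_per_sec


print(effective_tps(100))                  # 100
print(effective_tps(100, 10, batch=True))  # 1000
```

Summing the result over every operation type an account issues in a region gives a rough figure to compare against the default guaranteed throughput.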

Throttling for abnormal queue consumption

In a standard queue consumption pattern, clients delete messages after processing them. If a client repeatedly receives messages without sending delete requests, the system flags this as abnormal behavior. It then reduces the rate at which that client can receive messages.
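
A minimal sketch of the standard receive, process, delete pattern follows. The `QueueClient` interface, its method names, and the `(receipt_handle, body)` return shape are hypothetical stand-ins chosen for illustration; they are not the actual SDK API, so map them to the calls your client library provides.

```python
from typing import Callable, Optional, Protocol, Tuple


class QueueClient(Protocol):
    """Hypothetical client interface (an assumption, not the real SDK)."""

    def receive_message(self) -> Optional[Tuple[str, str]]:
        """Return (receipt_handle, body), or None if the queue is empty."""
        ...

    def delete_message(self, receipt_handle: str) -> None:
        ...


def consume_once(client: QueueClient, handler: Callable[[str], None]) -> bool:
    """Receive one message, process it, then delete it.

    Returns True if a message was handled, False if none was available.
    """
    received = client.receive_message()
    if received is None:
        return False
    receipt_handle, body = received
    handler(body)  # if this raises, the delete below is skipped
    client.delete_message(receipt_handle)  # delete only after success
    return True
```

If the handler fails persistently, messages are received but never deleted, which is the pattern this section describes as abnormal. Fixing the handler or routing failing messages elsewhere keeps the delete step in the loop.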

Throttling activates when any one of the following conditions is met:

| Condition | Threshold |
| --- | --- |
| Duration | Abnormal behavior persists for more than 30 minutes |
| Message count | The total number of received-but-not-deleted messages reaches 5,000 |
| Rate | The instantaneous receive-without-delete rate exceeds 1,000 TPS |
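
To make the "any one of" rule concrete, the sketch below evaluates the three conditions against client-side counters. It assumes you track these numbers yourself; the `ConsumptionStats` type and `throttling_reasons` function are hypothetical names for this example, and the server's own accounting may differ from what a client can measure.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the table above.
MAX_ABNORMAL_SECONDS = 30 * 60   # duration: more than 30 minutes
MAX_UNDELETED_MESSAGES = 5_000   # message count: reaches 5,000
MAX_UNDELETED_RATE_TPS = 1_000   # rate: exceeds 1,000 TPS


@dataclass
class ConsumptionStats:
    """Hypothetical client-side counters (an assumption for this sketch)."""
    abnormal_seconds: float    # how long receive-without-delete has persisted
    undeleted_messages: int    # messages received but not yet deleted
    undeleted_rate_tps: float  # current receive-without-delete rate


def throttling_reasons(stats: ConsumptionStats) -> List[str]:
    """Return the conditions that are met; any non-empty result means risk."""
    reasons = []
    if stats.abnormal_seconds > MAX_ABNORMAL_SECONDS:
        reasons.append("duration")
    if stats.undeleted_messages >= MAX_UNDELETED_MESSAGES:
        reasons.append("message count")
    if stats.undeleted_rate_tps > MAX_UNDELETED_RATE_TPS:
        reasons.append("rate")
    return reasons
```

A check like this could feed an alert so that a consumer that has stopped deleting messages is noticed before its receive rate is reduced.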

Error response

When throttling is triggered, the server returns the following error:

| HTTP status code | Error code | Error message |
| --- | --- | --- |
| 429 | TooManyRequests | The request is denied by cluster flow limiter for too many requests. |

Handle this error by waiting briefly and retrying the request. Throttling is temporary: the system automatically scales out to restore capacity.

Mitigate throttling

  • Plan ahead for traffic spikes: If you expect a large increase in traffic, submit a ticket in advance so that the team can pre-allocate resources and prevent throttling during peak periods.

  • Set up monitoring and alerts: Use the monitoring tools for Simple Message Queue to track real-time traffic and throttling status. Early detection helps you respond before throttling affects your workload.

  • Implement retry with exponential backoff: When a 429 error occurs, wait before retrying and increase the wait after each failed attempt. This gives the system time to scale out and prevents retry storms from worsening throttling.
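
The retry advice above might look like the following sketch. `ThrottledError` is a hypothetical exception standing in for however your client surfaces the 429 TooManyRequests response, and the delay values are illustrative defaults rather than recommended settings. The randomized wait ("full jitter") is an addition not described in this document; it is a common way to keep many clients from retrying in lockstep.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ThrottledError(Exception):
    """Hypothetical: raised when the server answers 429 TooManyRequests."""


def call_with_backoff(operation: Callable[[], T],
                      max_retries: int = 5,
                      base_delay: float = 0.5,
                      max_delay: float = 8.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation(), retrying on ThrottledError with exponential backoff.

    The wait ceiling doubles after each failed attempt (base_delay, then
    2x, 4x, ...) up to max_delay. The actual wait is drawn uniformly
    between 0 and that ceiling.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except ThrottledError:
            if attempt >= max_retries:
                raise  # give up and surface the throttling error
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
            attempt += 1
```

Wrapping a send or receive call, for example `call_with_backoff(lambda: send(msg))` with your own `send` function, keeps retry handling in one place. The `sleep` parameter exists so that tests can replace the real delay.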

FAQ

Is 20,000 TPS a hard limit?

No. 20,000 TPS is the default guaranteed baseline. The system elastically supports higher throughput when cluster resources are available.

Why does throttling occur at some times but not others at the same TPS?

Throttling depends on real-time cluster conditions, not just your request rate. Three factors determine whether throttling occurs:

  • Cluster load: When the cluster has spare capacity, it supports TPS well above the baseline.

  • Traffic pattern: A sudden spike is more likely to trigger throttling than a gradual ramp-up, because scale-out takes time.

  • Scale-out timing: Automatic scaling is not instant. Brief throttling can occur during the scale-out window.

I received a 429 error. What should I do?

Wait and retry with exponential backoff. The backpressure mechanism holds excess requests for about 500 ms before returning the error, so the system is already working to restore capacity.