ApsaraMQ for RocketMQ 5.x Serverless instances automatically scale compute, storage, and network resources based on real-time traffic. You pay only for what you consume -- message volume, topics, network traffic, and storage -- with fees settled hourly. You do not need to plan capacity, provision resources, or manage infrastructure.
Benefits
| Benefit | Description |
|---|---|
| Automatic scaling | Resources adjust dynamically based on real-time traffic. No need to estimate or preconfigure instance specifications. |
| Zero Operations and Maintenance (O&M) overhead | Instances are fully managed. Focus on business logic instead of infrastructure stability and scaling. |
| Open source compatibility | Compatible with open source versions. Instances are application-centric, so you can focus on core business code. |
| Pay-as-you-go billing | Billing is based on actual usage of resources, such as message volume, topic resources, network traffic, and storage. Fees are settled hourly. |
How elasticity works
Serverless instances handle traffic changes through two scaling mechanisms, distinguished by whether client message requests are affected during scaling:

- Lossless elasticity: Scaling occurs transparently without affecting message requests. Each instance has a lossless elasticity throttling threshold that defines the traffic level below which scaling is seamless.
- Adaptive elasticity: When traffic exceeds the lossless elasticity throttling threshold, adaptive elasticity rules take effect. The instance temporarily throttles traffic during a scale-out. After the scale-out completes, the throttling threshold increases accordingly.
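The interplay of the two mechanisms can be sketched as a small decision function. This is an illustrative model only; the function name, parameters, and step value are assumptions, not part of the ApsaraMQ API:

```python
def handle_traffic(current_tps: float, threshold_tps: float, step_tps: float):
    """Model one traffic check: return (accepted_tps, new_threshold_tps).

    Illustrative sketch of lossless vs. adaptive elasticity; names and the
    exact step value are assumptions, not the service's actual algorithm.
    """
    if current_tps <= threshold_tps:
        # Lossless elasticity: scaling is transparent, nothing is throttled.
        return current_tps, threshold_tps
    # Adaptive elasticity: traffic above the threshold is throttled while
    # the instance scales out; the threshold rises once scale-out completes.
    accepted = threshold_tps
    new_threshold = threshold_tps + step_tps
    return accepted, new_threshold

accepted, threshold = handle_traffic(60_000, 50_000, 25_000)
# accepted is capped at 50,000 TPS; the threshold rises to 75,000 after scale-out
```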
Scale-out and scale-in behavior
| Parameter | Value |
|---|---|
| Step size (cumulative usage capacity mode) | ~25,000 TPS |
| Step size (reserved + elastic capacity mode) | ~Size of the reserved specification |
| Scale-out duration | Several minutes. Larger reserved specifications take longer. |
| Traffic check window | ~10 minutes |
| Scale-in behavior | Reduces capacity by one step size per operation |
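Putting the parameters above together, a single scale-in pass might look like the following sketch. The trigger condition (peak traffic in the check window fitting within one step less capacity) and all names are illustrative assumptions, not the service's documented algorithm:

```python
STEP_TPS = 25_000      # approximate step size in cumulative usage capacity mode
CHECK_WINDOW_MIN = 10  # approximate traffic check window, in minutes

def scale_in_once(capacity_tps: float, window_peak_tps: float) -> float:
    """Reduce capacity by at most one step per operation.

    Assumed trigger: the peak traffic observed over the check window must
    still fit after removing one step; otherwise capacity is unchanged.
    """
    if window_peak_tps <= capacity_tps - STEP_TPS:
        return capacity_tps - STEP_TPS
    return capacity_tps

scale_in_once(100_000, 40_000)  # one step removed: 75,000 TPS
scale_in_once(100_000, 90_000)  # no headroom: capacity stays at 100,000 TPS
```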
Capacity modes
Serverless instances support three configurations.
| Item | Shared: Cumulative usage | Shared: Reserved + elastic | Dedicated: Reserved + elastic |
|---|---|---|---|
| Deployment mode | Physically shared; logically single-tenant | Physically shared; logically single-tenant | Physically dedicated; exclusive physical nodes |
| Capacity mode | No reserved capacity. Billed by cumulative message count (sent and received). | Reserved capacity. Billed by reserved capacity + elastic TPS. | Reserved capacity. Billed by reserved capacity + elastic TPS. |
| Lossless elasticity | Supported. Threshold: 50,000 TPS | Supported. Threshold: reserved specification x 3 | Supported. Threshold: reserved specification x 1.5 |
| Adaptive elasticity | Supported | Supported | Not supported |
| Maximum throttling threshold | 300,000 TPS | min(300,000 TPS, reserved specification x 10) | Reserved specification x 1.5 |
How the lossless elasticity throttling threshold is calculated
The lossless elasticity throttling threshold equals the reserved specification plus the lossless elasticity capability:
Lossless elasticity throttling threshold = Reserved specification + Lossless elasticity capability
| Configuration | Reserved specification | Lossless elasticity capability | Threshold |
|---|---|---|---|
| Shared: Cumulative usage | 0 | 50,000 | 50,000 |
| Shared: Reserved + elastic | 1x | 2x reserved specification | Reserved specification x 3 |
| Dedicated: Reserved + elastic | 1x | 0.5x reserved specification | Reserved specification x 1.5 |
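The table above can be expressed as a small helper that applies the threshold formula per configuration. The mode identifiers are hypothetical labels for illustration, not values used by the service:

```python
def lossless_threshold(mode: str, reserved_tps: float = 0.0) -> float:
    """Lossless elasticity throttling threshold =
    reserved specification + lossless elasticity capability.

    Mode names are illustrative; the multipliers follow the table above.
    """
    if mode == "shared_cumulative":
        return 0 + 50_000                         # fixed 50,000 TPS capability
    if mode == "shared_reserved_elastic":
        return reserved_tps + 2 * reserved_tps    # = reserved specification x 3
    if mode == "dedicated_reserved_elastic":
        return reserved_tps + 0.5 * reserved_tps  # = reserved specification x 1.5
    raise ValueError(f"unknown mode: {mode}")

lossless_threshold("shared_reserved_elastic", 10_000)  # 30,000 TPS
```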
Throttling threshold after upgrades and downgrades
After you upgrade or downgrade an instance by changing its reserved specification, the new throttling threshold is:
New throttling threshold = MAX(current throttling threshold, lossless elasticity throttling threshold after the change)
The instance retains the higher of the two values. A downgrade does not reduce the throttling threshold below its current level.
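This MAX rule is straightforward to verify numerically. The sketch below assumes a shared reserved + elastic instance (threshold multiplier of 3); the function name and parameters are illustrative:

```python
def threshold_after_change(current_threshold_tps: float,
                           new_reserved_tps: float,
                           multiplier: float = 3.0) -> float:
    """New threshold = MAX(current threshold, threshold after the change).

    multiplier is 3 for shared reserved + elastic instances and 1.5 for
    dedicated instances, per the capacity mode table.
    """
    return max(current_threshold_tps, new_reserved_tps * multiplier)

threshold_after_change(30_000, 5_000)   # downgrade: threshold stays at 30,000
threshold_after_change(30_000, 20_000)  # upgrade: threshold rises to 60,000
```

Because the function takes the maximum, a downgrade never lowers the threshold below its current level, matching the rule stated above.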
Architecture
ApsaraMQ for RocketMQ 5.x Serverless instances run on a container-based architecture with multitenancy resource fencing. Each instance is isolated so that traffic or resource consumption in one instance does not affect others.
All technical components -- computing, storage, and network -- are deployed in containers, which enables:
- Rapid resource allocation in response to traffic changes
- Seamless elastic scaling without manual intervention
- Flexible distribution of underlying cloud resources across tenants
Supported regions
Serverless instances are available in the following regions:
- China (Hangzhou)
- China (Shanghai)
- China (Beijing)
- China (Zhangjiakou)
- China (Shenzhen)
- China (Chengdu)
- Singapore
- Germany (Frankfurt)
- US (Virginia)
Support for additional regions will be available soon.
Billing
For billing rules, see Billing of Serverless instances.