ApsaraMQ for RocketMQ provides built-in dashboards powered by Managed Service for Prometheus and Grafana. These dashboards give you real-time visibility into message throughput, consumer lag, processing latency, and client-side performance, so you can detect anomalies, troubleshoot delivery issues, and plan capacity from a single view.
Dashboard metrics are classified as basic metrics in ARMS Prometheus Service and are free of charge.
Use cases
Detect consumption anomalies: Messages are piling up and consumers are falling behind. Set alerts on consumer lag and ready message queue time to catch problems early.
Trace message delivery: An order status is not updating. Check whether the producer sent the message and whether the consumer processed it by following the message through production and consumption metrics.
Analyze traffic trends: Review message volume, throughput, and size distribution over time to forecast capacity needs and plan for growth.
Map application dependencies: Examine the upstream and downstream topology of producers and consumers to inform architecture changes.
Prerequisites
Before you begin, make sure that you have:
Created the required service-linked role. For details, see Service-linked roles.
| Property | Value |
|---|---|
| Role name | AliyunServiceRoleForOns |
| Policy name | AliyunServiceRolePolicyForOns |
| Permissions | Grants ApsaraMQ for RocketMQ access to CloudMonitor and ARMS for monitoring, alerting, and dashboard features |
How message accumulation works
The following diagram shows the lifecycle of messages in a topic queue.

ApsaraMQ for RocketMQ tracks both the count and the age of messages at each processing stage. These metrics reveal whether consumers are keeping up with incoming messages.
Message count metrics
| Metric | Definition | Formula |
|---|---|---|
| Inflight messages | Messages currently being processed by a consumer that has not yet returned a consumption result | Offset of latest pulled message - Offset of latest submitted message |
| Ready messages | Messages stored on the server that are visible to consumers and available for consumption | Maximum message offset - Offset of latest pulled message |
| Consumer lag | Total unprocessed messages (inflight + ready) | Inflight message count + Ready message count |
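The formulas in the table above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the `QueueOffsets` class, its field names, and the sample offsets are hypothetical and are not part of any ApsaraMQ for RocketMQ SDK.

```python
from dataclasses import dataclass


@dataclass
class QueueOffsets:
    """Hypothetical snapshot of one queue's offsets for one consumer group."""
    max_offset: int        # maximum message offset stored on the server
    latest_pulled: int     # offset of the latest message pulled by consumers
    latest_submitted: int  # offset of the latest message with a submitted result


def inflight_messages(o: QueueOffsets) -> int:
    # Pulled by a consumer, but no consumption result returned yet.
    return o.latest_pulled - o.latest_submitted


def ready_messages(o: QueueOffsets) -> int:
    # Visible on the server, but not yet pulled.
    return o.max_offset - o.latest_pulled


def consumer_lag(o: QueueOffsets) -> int:
    # Total unprocessed messages: inflight plus ready.
    return inflight_messages(o) + ready_messages(o)


offsets = QueueOffsets(max_offset=1000, latest_pulled=900, latest_submitted=850)
print(inflight_messages(offsets), ready_messages(offsets), consumer_lag(offsets))
```

With these sample offsets, 50 messages are inflight, 100 are ready, and the consumer lag is 150.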
Message duration metrics
| Metric | Definition | Formula |
|---|---|---|
| Ready time | The point at which a message becomes available for consumption. For normal and ordered messages, this is the storage time. For scheduled and delayed messages, this is when the delay expires. For transactional messages, this is when the transaction commits. | N/A |
| Ready message queue time | How long the oldest ready message has been waiting. Indicates how promptly consumers pull messages. | Current time - Ready time of the oldest ready message |
| Consumer lag time | How long the oldest message awaiting a consumption result has been ready. Indicates how promptly consumers process messages. | Current time - Ready time of the oldest message awaiting a response |
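The two duration formulas differ only in which set of messages supplies the oldest ready time. The sketch below makes that explicit. The function names and millisecond timestamps are hypothetical, and returning 0 for an empty set is an assumption made here for illustration, not documented server behavior.

```python
def oldest_wait_ms(now_ms: int, ready_times_ms: list[int]) -> int:
    """Current time minus the ready time of the oldest message in the set."""
    if not ready_times_ms:
        return 0  # assumption: no pending messages means no waiting time
    return now_ms - min(ready_times_ms)


def ready_message_queue_time(now_ms: int, ready_not_pulled_ms: list[int]) -> int:
    # Oldest message that is ready but not yet pulled.
    return oldest_wait_ms(now_ms, ready_not_pulled_ms)


def consumer_lag_time(now_ms: int, awaiting_result_ms: list[int]) -> int:
    # Oldest message that is still awaiting a consumption result.
    return oldest_wait_ms(now_ms, awaiting_result_ms)


now = 10_000
print(ready_message_queue_time(now, [7_000, 9_000]))  # 3000
print(consumer_lag_time(now, [4_000, 6_000]))         # 6000
```

A high ready message queue time points at slow pulling, while a high consumer lag time with a low queue time points at slow processing after the pull.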
PushConsumer local processing
PushConsumer uses a Reactor thread model. The SDK runs a built-in long-polling thread that pulls messages from the server and stores them in a local buffer queue. A pool of consumption threads then picks messages from this buffer and invokes the message listener.

For more information, see PushConsumer.
Three client-side metrics track the state of this local buffer:
Cached message count: Total messages currently in the local buffer queue.
Cached message size: Total size (in bytes) of all messages in the local buffer queue.
Await time: How long each message sits in the local buffer queue before consumption begins.
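As a rough mental model of how these three metrics relate to the local buffer, consider the simplified, single-threaded sketch below. The `LocalBuffer` class is hypothetical and does not reflect the SDK's actual implementation or threading. It only shows what each metric measures.

```python
from collections import deque


class LocalBuffer:
    """Toy model of a PushConsumer local buffer queue (not the real SDK)."""

    def __init__(self) -> None:
        self._queue: deque[tuple[bytes, int]] = deque()
        self.cached_bytes = 0  # cached message size

    @property
    def cached_messages(self) -> int:
        # Cached message count.
        return len(self._queue)

    def put(self, body: bytes, arrival_ms: int) -> None:
        # Called by the pulling side when a message arrives from the server.
        self._queue.append((body, arrival_ms))
        self.cached_bytes += len(body)

    def take(self, start_ms: int) -> tuple[bytes, int]:
        # Called by a consumption thread; returns the body and its await time.
        body, arrival_ms = self._queue.popleft()
        self.cached_bytes -= len(body)
        return body, start_ms - arrival_ms


buf = LocalBuffer()
buf.put(b"order-1", arrival_ms=0)
buf.put(b"order-22", arrival_ms=5)
print(buf.cached_messages, buf.cached_bytes)  # 2 15
body, await_ms = buf.take(start_ms=30)
print(body, await_ms)                         # b'order-1' 30
```

In this model, a growing cached message count together with rising await times suggests that the consumption threads, not the network pull, are the bottleneck.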
View the dashboard
The ApsaraMQ for RocketMQ console provides dashboard access from four entry points, each scoped to a different level of detail:
| Entry point | What it shows |
|---|---|
| Dashboard page | Metrics for all topics and consumer groups in the instance |
| Instance Details page | Producer overview, billing metrics, and throttling metrics |
| Topic Details page | Production metrics and producer client metrics for a specific topic |
| Group Details page | Consumer lag and consumer client metrics for a specific group |
To open a dashboard:
Log on to the ApsaraMQ for RocketMQ console. In the left-side navigation pane, click Instances.
In the top navigation bar, select a region, such as China (Hangzhou). On the Instances page, click the target instance name.
Open the dashboard from any of the following entry points:
Instance-level dashboard: On the Instance Details page, click the Dashboard tab.
Dedicated dashboard page: In the left-side navigation pane, click Dashboard.
Topic-level dashboard: In the left-side navigation pane, click Topics. Click a topic name, then click the Dashboard tab on the Topic Details page.
Group-level dashboard: In the left-side navigation pane, click Groups. Click a group name, then click the Dashboard tab on the Group Details page.
Metric reference
TPS, API call counts, and message volume metrics are calculated based on a standard 4 KB normal message. Larger messages or messages that use advanced features (ordered, transactional, delayed) have multipliers applied. For calculation rules, see Calculation specifications.
Metric types and labels
Metric types
| Type | Behavior | Example |
|---|---|---|
| Counter | Monotonically increasing cumulative value | Total messages produced |
| Gauge | Point-in-time value that can increase or decrease | Current TPS |
| Histogram | Distribution of values across predefined buckets | Message size distribution |
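Because a Counter only ever accumulates, dashboards usually display its rate of change rather than its raw value. A minimal sketch of that conversion follows. The function name and sample values are hypothetical, and the sketch deliberately rejects a decreasing counter instead of handling resets.

```python
def per_second_rate(earlier: tuple[float, float], later: tuple[float, float]) -> float:
    """Average per-second increase between two (timestamp_s, counter_value) samples."""
    (t0, v0), (t1, v1) = earlier, later
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    if v1 < v0:
        raise ValueError("counter decreased; a reset is not handled in this sketch")
    return (v1 - v0) / (t1 - t0)


# Two samples of a cumulative counter taken 60 seconds apart.
print(per_second_rate((0, 600), (60, 1200)))  # 10.0
```

A Gauge needs no such conversion, since each sample is already a point-in-time value.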
Common labels
| Label | Description |
|---|---|
| instance_id | ApsaraMQ for RocketMQ instance ID |
| topic | Topic name |
| consumer_group | Consumer group name |
| message_type | normal, fifo (ordered), transaction, or delay (delayed/scheduled) |
| fifo_enable | true if the server delivers messages in order; false for concurrent delivery |
| uid | Alibaba Cloud account ID |
| client_id | ApsaraMQ for RocketMQ client ID |
| invocation_status | success or failure |
Server-side metrics
| Type | Metric name | Unit | Description | Labels |
|---|---|---|---|---|
| Gauge | rocketmq_instance_requests_max | count/s | Peak TPS for all messages sent and received by the instance. Excludes throttled requests. Maximum of 60 per-second samples over a 1-minute window. | uid, instance_id |
| Gauge | rocketmq_instance_requests_in_max | count/s | Peak send TPS for the instance. Excludes throttled requests. Maximum of 60 per-second samples over 1 minute. | uid, instance_id |
| Gauge | rocketmq_instance_requests_out_max | count/s | Peak consumption TPS for the instance. Excludes throttled requests. Maximum of 60 per-second samples over 1 minute. | uid, instance_id |
| Gauge | rocketmq_topic_requests_max | count/s | Peak send TPS for a specific topic. Excludes throttled requests. Maximum of 60 per-second samples over 1 minute. | uid, instance_id, topic |
| Gauge | rocketmq_group_requests_max | count/s | Peak consumption TPS for a consumer group. Excludes throttled requests. Maximum of 60 per-second samples over 1 minute. | uid, instance_id, consumer_group |
| Gauge | rocketmq_instance_requests_in_threshold | count/s | Throttling threshold for message sends on the instance. | uid, instance_id |
| Gauge | rocketmq_instance_requests_out_threshold | count/s | Throttling threshold for message consumption on the instance. | uid, instance_id |
| Gauge | rocketmq_throttled_requests_in | count | Number of throttled send requests. | uid, instance_id, topic, message_type |
| Gauge | rocketmq_throttled_requests_out | count | Number of throttled consumption requests. | uid, instance_id, topic, fifo_enable, consumer_group |
| Gauge | rocketmq_instance_elastic_requests_max | count/s | Peak elastic TPS for the instance. | uid, instance_id |
| Counter | rocketmq_requests_in_total | count | Total send API calls. | uid, instance_id, topic, message_type |
| Counter | rocketmq_requests_out_total | count | Total consumption API calls. | uid, instance_id, topic, consumer_group, fifo_enable |
| Counter | rocketmq_messages_in_total | message | Total messages sent by producers. | uid, instance_id, topic, message_type |
| Counter | rocketmq_messages_out_total | message | Total messages delivered to consumers. Includes messages being processed, successfully processed, and failed. | uid, instance_id, topic, consumer_group, fifo_enable |
| Counter | rocketmq_throughput_in_total | byte | Total inbound message throughput from producers. | uid, instance_id, topic, message_type |
| Counter | rocketmq_throughput_out_total | byte | Total outbound message throughput to consumers. Includes messages being processed, successfully processed, and failed. | uid, instance_id, topic, consumer_group, fifo_enable |
| Counter | rocketmq_internet_throughput_out_total | byte | Downstream Internet traffic for message sends and receives. | uid, instance_id, topic, message_type |
| Histogram | rocketmq_message_size | byte | Size distribution of successfully sent messages. Buckets: le_1_kb (<=1 KB), le_4_kb (<=4 KB), le_512_kb (<=512 KB), le_1_mb (<=1 MB), le_2_mb (<=2 MB), le_4_mb (<=4 MB), le_overflow (>4 MB). | uid, instance_id, topic, message_type |
| Gauge | rocketmq_consumer_ready_messages | message | Ready messages waiting on the server for consumers to pull. | uid, instance_id, topic, consumer_group |
| Gauge | rocketmq_consumer_inflight_messages | message | Messages currently being processed by consumers (no result returned yet). | uid, instance_id, topic, consumer_group |
| Gauge | rocketmq_consumer_queueing_latency | ms | How long the oldest ready message has been waiting. Indicates pull promptness. | uid, instance_id, topic, consumer_group |
| Gauge | rocketmq_consumer_lag_latency | ms | How long the oldest unacknowledged message has been ready. Indicates processing promptness. | uid, instance_id, topic, consumer_group |
| Counter | rocketmq_send_to_dlq_messages | message | Dead-letter messages per minute. A message becomes a dead letter after exceeding the maximum redelivery attempts. Based on the group's dead-letter policy, these messages are either saved to a designated topic or discarded. | uid, instance_id, topic, consumer_group |
| Gauge | rocketmq_storage_size | byte | Total storage used by the instance, including all files. | uid, instance_id |
Producer metrics
| Type | Metric name | Unit | Description | Labels |
|---|---|---|---|---|
| Histogram | rocketmq_send_cost_time | ms | Latency distribution for successful send API calls. Buckets: le_1_ms, le_5_ms, le_10_ms, le_20_ms, le_50_ms, le_200_ms, le_500_ms, le_overflow. | uid, instance_id, topic, client_id, invocation_status |
Consumer metrics
| Type | Metric name | Unit | Description | Labels |
|---|---|---|---|---|
| Histogram | rocketmq_process_time | ms | PushConsumer message processing time distribution (includes both successful and failed messages). rocketmq_process_time = process end time - process start time. Buckets: le_1_ms, le_5_ms, le_10_ms, le_100_ms, le_10000_ms, le_60000_ms, le_overflow. | uid, instance_id, consumer_group, topic, client_id, invocation_status |
| Gauge | rocketmq_consumer_cached_messages | message | Messages in the PushConsumer local buffer queue. | uid, instance_id, consumer_group, topic, client_id |
| Gauge | rocketmq_consumer_cached_bytes | byte | Total size of messages in the PushConsumer local buffer queue. | uid, instance_id, consumer_group, topic, client_id |
| Histogram | rocketmq_await_time | ms | Time messages wait in the PushConsumer local buffer queue before processing begins. rocketmq_await_time = process start time - arrival time. Buckets: le_1_ms, le_5_ms, le_20_ms, le_100_ms, le_1000_ms, le_5000_ms, le_10000_ms, le_overflow. | uid, instance_id, consumer_group, topic, client_id |
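To read the histogram bucket names, it can help to see which bucket a single observation would fall into. The sketch below uses the rocketmq_await_time boundaries listed above and assumes each observation is attributed to the smallest bucket whose upper bound it does not exceed. Whether the exported series are cumulative across buckets is not stated here, so treat the helper (a hypothetical function, not part of any SDK) as a reading aid only.

```python
from bisect import bisect_left

# Upper bounds in milliseconds, taken from the rocketmq_await_time bucket list.
AWAIT_TIME_BOUNDS_MS = [1, 5, 20, 100, 1000, 5000, 10000]


def bucket_label(value_ms: float, bounds_ms: list[int] = AWAIT_TIME_BOUNDS_MS) -> str:
    """Return the label of the smallest bucket with upper bound >= value_ms."""
    index = bisect_left(bounds_ms, value_ms)
    if index == len(bounds_ms):
        return "le_overflow"  # larger than every listed bound
    return f"le_{bounds_ms[index]}_ms"


for sample in (0.4, 20, 250, 60_000):
    print(sample, bucket_label(sample))
```

The same helper works for the other histograms if you pass their boundaries as `bounds_ms`.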
Billing
Dashboard metrics are classified as basic metrics in ARMS Prometheus Service. Basic metrics are free. No additional charges apply for using the dashboard feature.
For more information, see Basic metrics and Pay-as-you-go.
FAQ
How do I access dashboard metric data programmatically?
All ApsaraMQ for RocketMQ metrics are stored in your Managed Service for Prometheus instance. To get the data access endpoint:
Log on to the ARMS console. In the left-side navigation pane, click Integration Center.
Search for RocketMQ and select Alibaba Cloud RocketMQ (5.0) Service. Complete the integration by following Integrate monitoring data of an Alibaba Cloud service.
After provisioning completes, click Provisioning in the left-side navigation pane.
Click the Cloud Service Region Environment tab, then click the target environment name.
On the Component Management tab, in the Basic Information area, click the cloud service region next to Prometheus Instance.
On the Settings tab, find the available data access methods (HTTP API endpoint, Remote Write, etc.).
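Once you have the HTTP API address, a metric can be read with a standard Prometheus instant query. The sketch below assumes that the address accepts the standard Prometheus `/api/v1/query` path and needs no extra authentication. Check both against your instance settings. The base URL and instance ID are placeholders, not real values.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(http_api_address: str, promql: str) -> str:
    """Build an instant-query URL for a Prometheus-compatible HTTP API."""
    base = http_api_address.rstrip("/")
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"


def query_instant(http_api_address: str, promql: str) -> list:
    """Run the query and return the result list from the standard response."""
    with urlopen(build_query_url(http_api_address, promql), timeout=10) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]


# Placeholder values: replace with the address copied from the Settings tab
# and with your own instance ID.
HTTP_API_ADDRESS = "https://example.invalid/prometheus"
PROMQL = 'rocketmq_consumer_ready_messages{instance_id="your-instance-id"}'

print(build_query_url(HTTP_API_ADDRESS, PROMQL))
# To run the query against a real endpoint:
# for series in query_instant(HTTP_API_ADDRESS, PROMQL):
#     print(series["metric"], series["value"])
```

The network call is left commented out so that the example runs without a reachable endpoint.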
How do I integrate metrics into a self-managed Grafana?
After retrieving the HTTP API address from the steps above, add it as a Prometheus data source in your Grafana instance. For step-by-step instructions, see Integrate Prometheus data into Grafana using an HTTP API address.
What does TPS Max represent?
TPS Max is the peak transactions-per-second value within a 1-minute window. The system samples TPS once per second (60 samples per minute) and reports the highest value.
Example: An instance sends 60 normal messages (4 KB each) in 1 minute.
If all 60 messages are sent in the first second, TPS per second = 60, 0, 0, ..., 0. TPS Max = 60.
If 40 messages are sent in the first second and 20 in the second, TPS per second = 40, 20, 0, ..., 0. TPS Max = 40.
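The two cases above can be reproduced with a short sketch. The `tps_max` function is a hypothetical helper that mirrors the description (60 per-second samples, report the highest). It is not the server's actual implementation.

```python
def tps_max(per_second_counts: list[int]) -> int:
    """Return the highest per-second value in a 1-minute window of 60 samples."""
    if len(per_second_counts) != 60:
        raise ValueError("expected exactly 60 per-second samples")
    return max(per_second_counts)


# Case 1: all 60 messages are sent in the first second.
burst = [60] + [0] * 59
# Case 2: 40 messages in the first second, 20 in the second.
spread = [40, 20] + [0] * 58

print(tps_max(burst))   # 60
print(tps_max(spread))  # 40
```

Both windows carry the same total of 60 messages, yet their TPS Max values differ, which is why TPS Max reflects burstiness rather than average load.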