ApsaraMQ for RocketMQ: Dashboard

Last Updated: Mar 11, 2026

ApsaraMQ for RocketMQ provides built-in dashboards powered by Managed Service for Prometheus and Grafana. These dashboards give you real-time visibility into message throughput, consumer lag, processing latency, and client-side performance, so you can detect anomalies, troubleshoot delivery issues, and plan capacity from a single view.

Dashboard metrics are classified as basic metrics in ARMS Prometheus Service and are free of charge.

Use cases

  • Detect consumption anomalies: Messages are piling up and consumers are falling behind. Set alerts on consumer lag and ready message queue time to catch problems early.

  • Trace message delivery: An order status is not updating. Check whether the producer sent the message and whether the consumer processed it by following the message through production and consumption metrics.

  • Analyze traffic trends: Review message volume, throughput, and size distribution over time to forecast capacity needs and plan for growth.

  • Map application dependencies: Examine the upstream and downstream topology of producers and consumers to inform architecture changes.

Prerequisites

Before you begin, make sure that you have:

  • Activated ARMS Prometheus Service

  • Created the required service-linked role. For details, see Service-linked roles.

    | Property | Value |
    | --- | --- |
    | Role name | AliyunServiceRoleForOns |
    | Policy name | AliyunServiceRolePolicyForOns |
    | Permissions | Grants ApsaraMQ for RocketMQ access to CloudMonitor and ARMS for monitoring, alerting, and dashboard features |

How message accumulation works

The following diagram shows the lifecycle of messages in a topic queue.

[Figure: Queue message status]

ApsaraMQ for RocketMQ tracks both the count and the age of messages at each processing stage. These metrics reveal whether consumers are keeping up with incoming messages.

Message count metrics

| Metric | Definition | Formula |
| --- | --- | --- |
| Inflight messages | Messages currently being processed by a consumer that has not yet returned a consumption result | Offset of latest pulled message - Offset of latest submitted message |
| Ready messages | Messages stored on the server that are visible to consumers and available for consumption | Maximum message offset - Offset of latest pulled message |
| Consumer lag | Total unprocessed messages (inflight + ready) | Inflight message count + Ready message count |
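The formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the `QueueOffsets` container and its field names are hypothetical and are not part of any ApsaraMQ for RocketMQ SDK.

```python
from dataclasses import dataclass


@dataclass
class QueueOffsets:
    """Offsets for one queue as seen by one consumer group (hypothetical container)."""
    max_offset: int        # maximum message offset stored on the server
    pulled_offset: int     # offset of the latest pulled message
    committed_offset: int  # offset of the latest submitted message


def inflight_messages(o: QueueOffsets) -> int:
    # Pulled by a consumer, but no consumption result submitted yet.
    return o.pulled_offset - o.committed_offset


def ready_messages(o: QueueOffsets) -> int:
    # Stored on the server and visible, but not yet pulled.
    return o.max_offset - o.pulled_offset


def consumer_lag(o: QueueOffsets) -> int:
    # Inflight + ready, which simplifies to max_offset - committed_offset.
    return inflight_messages(o) + ready_messages(o)


offsets = QueueOffsets(max_offset=1_000, pulled_offset=940, committed_offset=900)
print(inflight_messages(offsets), ready_messages(offsets), consumer_lag(offsets))
# prints: 40 60 100
```

A growing ready count suggests consumers are not pulling fast enough, while a growing inflight count suggests the listener itself is slow.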

Message duration metrics

| Metric | Definition | Formula |
| --- | --- | --- |
| Ready time | The point at which a message becomes available for consumption. For normal and ordered messages, this is the storage time. For scheduled and delayed messages, this is when the delay expires. For transactional messages, this is when the transaction commits. | N/A |
| Ready message queue time | How long the oldest ready message has been waiting. Indicates how promptly consumers pull messages. | Current time - Ready time of the oldest ready message |
| Consumer lag time | How long the oldest message awaiting a consumption result has been ready. Indicates how promptly consumers process messages. | Current time - Ready time of the oldest message awaiting a response |
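Both duration metrics reduce to "current time minus the earliest ready time in a set of messages". The sketch below shows that calculation under the assumption that you already hold the ready times (in milliseconds) of the relevant messages. The function name and the sample timestamps are illustrative only.

```python
from typing import Optional, Sequence


def oldest_age_ms(now_ms: int, ready_times_ms: Sequence[int]) -> Optional[int]:
    """Age of the oldest message in the set, or None if the set is empty."""
    if not ready_times_ms:
        return None
    return now_ms - min(ready_times_ms)


now_ms = 1_700_000_060_000

# Ready times of messages that are ready but not yet pulled.
ready_not_pulled = [1_700_000_050_000, 1_700_000_055_000]
# Ready times of messages still awaiting a consumption result.
awaiting_result = [1_700_000_030_000, 1_700_000_045_000]

print(oldest_age_ms(now_ms, ready_not_pulled))  # ready message queue time: 10000
print(oldest_age_ms(now_ms, awaiting_result))   # consumer lag time: 30000
```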

PushConsumer local processing

PushConsumer uses a Reactor thread model. The SDK runs a built-in long-polling thread that pulls messages from the server and stores them in a local buffer queue. A pool of consumption threads then picks messages from this buffer and invokes the message listener.

[Figure: PushConsumer consumption flow]

For more information, see PushConsumer.

Three client-side metrics track the state of this local buffer:

  • Cached message count: Total messages currently in the local buffer queue.

  • Cached message size: Total size (in bytes) of all messages in the local buffer queue.

  • Await time: How long each message sits in the local buffer queue before consumption begins.
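To make the three buffer metrics concrete, here is a toy model of a local buffer queue. It is not the SDK's implementation: the `LocalBuffer` class is hypothetical, timestamps are passed in explicitly to keep the example deterministic, and message size is assumed to be the body length in bytes.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Tuple


@dataclass
class BufferedMessage:
    body: bytes
    arrival_ms: int  # when the message entered the local buffer


class LocalBuffer:
    """Toy stand-in for the PushConsumer local buffer queue."""

    def __init__(self) -> None:
        self._queue: Deque[BufferedMessage] = deque()
        self._bytes = 0

    def put(self, body: bytes, now_ms: int) -> None:
        # Called by the polling side when a pulled message arrives.
        self._queue.append(BufferedMessage(body, now_ms))
        self._bytes += len(body)

    def take(self, now_ms: int) -> Tuple[bytes, int]:
        # Called by a consumption thread; returns the body and its await time.
        msg = self._queue.popleft()
        self._bytes -= len(msg.body)
        return msg.body, now_ms - msg.arrival_ms

    @property
    def cached_message_count(self) -> int:
        return len(self._queue)

    @property
    def cached_message_size(self) -> int:
        return self._bytes


buf = LocalBuffer()
buf.put(b"order-1", now_ms=1_000)
buf.put(b"order-22", now_ms=1_005)
print(buf.cached_message_count, buf.cached_message_size)  # 2 15

body, await_ms = buf.take(now_ms=1_050)
print(await_ms, buf.cached_message_count, buf.cached_message_size)  # 50 1 8
```

In this model, a rising cached count together with rising await times points at the consumption threads, not the server, as the bottleneck.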

View the dashboard

The ApsaraMQ for RocketMQ console provides dashboard access from four entry points, each scoped to a different level of detail:

| Entry point | What it shows |
| --- | --- |
| Dashboard page | Metrics for all topics and consumer groups in the instance |
| Instance Details page | Producer overview, billing metrics, and throttling metrics |
| Topic Details page | Production metrics and producer client metrics for a specific topic |
| Group Details page | Consumer lag and consumer client metrics for a specific group |

To open a dashboard:

  1. Log on to the ApsaraMQ for RocketMQ console. In the left-side navigation pane, click Instances.

  2. In the top navigation bar, select a region, such as China (Hangzhou). On the Instances page, click the target instance name.

  3. Open the dashboard from any of the following entry points:

    • Instance-level dashboard: On the Instance Details page, click the Dashboard tab.

    • Dedicated dashboard page: In the left-side navigation pane, click Dashboard.

    • Topic-level dashboard: In the left-side navigation pane, click Topics. Click a topic name, then click the Dashboard tab on the Topic Details page.

    • Group-level dashboard: In the left-side navigation pane, click Groups. Click a group name, then click the Dashboard tab on the Group Details page.

Metric reference

Important

TPS, API call counts, and message volume metrics are calculated based on a standard 4 KB normal message. Larger messages or messages that use advanced features (ordered, transactional, delayed) have multipliers applied. For calculation rules, see Calculation specifications.

Metric types and labels

Metric types

| Type | Behavior | Example |
| --- | --- | --- |
| Counter | Monotonically increasing cumulative value | Total messages produced |
| Gauge | Point-in-time value that can increase or decrease | Current TPS |
| Histogram | Distribution of values across predefined buckets | Message size distribution |
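Because a counter only ever increases, its raw value is rarely useful on its own. What you usually want is how fast it grows. The following sketch derives an average per-second rate from two or more counter samples. The sample values are made up, and the function deliberately ignores counter resets, which a real Prometheus `rate()` would handle.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def average_rate(samples: List[Sample]) -> float:
    """Average per-second increase between the first and last sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("timestamps must increase")
    return (v1 - v0) / (t1 - t0)


# Hypothetical scrapes of a counter such as total messages produced.
scrapes = [(0.0, 1200.0), (60.0, 1500.0), (120.0, 1920.0)]
print(average_rate(scrapes))  # 6.0 messages per second over the 2-minute span
```

A gauge, by contrast, can be read directly: each sample is already the value at that point in time.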

Common labels

| Label | Description |
| --- | --- |
| instance_id | ApsaraMQ for RocketMQ instance ID |
| topic | Topic name |
| consumer_group | Consumer group name |
| message_type | normal, fifo (ordered), transaction, or delay (delayed/scheduled) |
| fifo_enable | true if the server delivers messages in order; false for concurrent delivery |
| uid | Alibaba Cloud account ID |
| client_id | ApsaraMQ for RocketMQ client ID |
| invocation_status | success or failure |

Server-side metrics

| Type | Metric name | Unit | Description | Labels |
| --- | --- | --- | --- | --- |
| Gauge | rocketmq_instance_requests_max | count/s | Peak TPS for all messages sent and received by the instance. Excludes throttled requests. Maximum of 60 per-second samples over a 1-minute window. | uid, instance_id |
| Gauge | rocketmq_instance_requests_in_max | count/s | Peak send TPS for the instance. Excludes throttled requests. Maximum of 60 per-second samples over 1 minute. | uid, instance_id |
| Gauge | rocketmq_instance_requests_out_max | count/s | Peak consumption TPS for the instance. Excludes throttled requests. Maximum of 60 per-second samples over 1 minute. | uid, instance_id |
| Gauge | rocketmq_topic_requests_max | count/s | Peak send TPS for a specific topic. Excludes throttled requests. Maximum of 60 per-second samples over 1 minute. | uid, instance_id, topic |
| Gauge | rocketmq_group_requests_max | count/s | Peak consumption TPS for a consumer group. Excludes throttled requests. Maximum of 60 per-second samples over 1 minute. | uid, instance_id, consumer_group |
| Gauge | rocketmq_instance_requests_in_threshold | count/s | Throttling threshold for message sends on the instance. | uid, instance_id |
| Gauge | rocketmq_instance_requests_out_threshold | count/s | Throttling threshold for message consumption on the instance. | uid, instance_id |
| Gauge | rocketmq_throttled_requests_in | count | Number of throttled send requests. | uid, instance_id, topic, message_type |
| Gauge | rocketmq_throttled_requests_out | count | Number of throttled consumption requests. | uid, instance_id, topic, fifo_enable, consumer_group |
| Gauge | rocketmq_instance_elastic_requests_max | count/s | Peak elastic TPS for the instance. | uid, instance_id |
| Counter | rocketmq_requests_in_total | count | Total send API calls. | uid, instance_id, topic, message_type |
| Counter | rocketmq_requests_out_total | count | Total consumption API calls. | uid, instance_id, topic, consumer_group, fifo_enable |
| Counter | rocketmq_messages_in_total | message | Total messages sent by producers. | uid, instance_id, topic, message_type |
| Counter | rocketmq_messages_out_total | message | Total messages delivered to consumers. Includes messages being processed, successfully processed, and failed. | uid, instance_id, topic, consumer_group, fifo_enable |
| Counter | rocketmq_throughput_in_total | byte | Total inbound message throughput from producers. | uid, instance_id, topic, message_type |
| Counter | rocketmq_throughput_out_total | byte | Total outbound message throughput to consumers. Includes messages being processed, successfully processed, and failed. | uid, instance_id, topic, consumer_group, fifo_enable |
| Counter | rocketmq_internet_throughput_out_total | byte | Downstream Internet traffic for message sends and receives. | uid, instance_id, topic, message_type |
| Histogram | rocketmq_message_size | byte | Size distribution of successfully sent messages. Buckets: le_1_kb (<=1 KB), le_4_kb (<=4 KB), le_512_kb (<=512 KB), le_1_mb (<=1 MB), le_2_mb (<=2 MB), le_4_mb (<=4 MB), le_overflow (>4 MB). | uid, instance_id, topic, message_type |
| Gauge | rocketmq_consumer_ready_messages | message | Ready messages waiting on the server for consumers to pull. | uid, instance_id, topic, consumer_group |
| Gauge | rocketmq_consumer_inflight_messages | message | Messages currently being processed by consumers (no result returned yet). | uid, instance_id, topic, consumer_group |
| Gauge | rocketmq_consumer_queueing_latency | ms | How long the oldest ready message has been waiting. Indicates pull promptness. | uid, instance_id, topic, consumer_group |
| Gauge | rocketmq_consumer_lag_latency | ms | How long the oldest unacknowledged message has been ready. Indicates processing promptness. | uid, instance_id, topic, consumer_group |
| Counter | rocketmq_send_to_dlq_messages | message | Dead-letter messages per minute. A message becomes a dead letter after exceeding the maximum redelivery attempts. Based on the group's dead-letter policy, these messages are either saved to a designated topic or discarded. | uid, instance_id, topic, consumer_group |
| Gauge | rocketmq_storage_size | byte | Total storage used by the instance, including all files. | uid, instance_id |
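Since these metrics live in a Prometheus-compatible store, they can be combined with PromQL. The sketch below assembles two query strings from the metric and label names in the table: one for consumer lag (ready plus inflight) and one for the producer send rate. The helper functions, the instance ID, and the group and topic names are all hypothetical, and label values are not escaped, so the helpers are only suitable for values without quotes or backslashes.

```python
def selector(metric: str, **labels: str) -> str:
    """Render metric{label="value",...} with labels in sorted order."""
    if not labels:
        return metric
    inner = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return f"{metric}{{{inner}}}"


def consumer_lag_expr(instance_id: str, consumer_group: str) -> str:
    # Consumer lag = ready messages + inflight messages, summed across topics.
    labels = {"instance_id": instance_id, "consumer_group": consumer_group}
    ready = selector("rocketmq_consumer_ready_messages", **labels)
    inflight = selector("rocketmq_consumer_inflight_messages", **labels)
    return f"sum({ready}) + sum({inflight})"


def send_rate_expr(instance_id: str, topic: str, window: str = "5m") -> str:
    # Per-second send rate derived from the cumulative counter.
    counter = selector("rocketmq_messages_in_total", instance_id=instance_id, topic=topic)
    return f"sum(rate({counter}[{window}]))"


print(consumer_lag_expr("rmq-cn-example", "GID_orders"))
print(send_rate_expr("rmq-cn-example", "orders"))
```

The resulting strings can be pasted into a Grafana panel or used as the expression of an alert rule.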

Producer metrics

| Type | Metric name | Unit | Description | Labels |
| --- | --- | --- | --- | --- |
| Histogram | rocketmq_send_cost_time | ms | Latency distribution for successful send API calls. Buckets: le_1_ms, le_5_ms, le_10_ms, le_20_ms, le_50_ms, le_200_ms, le_500_ms, le_overflow. | uid, instance_id, topic, client_id, invocation_status |

Consumer metrics

| Type | Metric name | Unit | Description | Labels |
| --- | --- | --- | --- | --- |
| Histogram | rocketmq_process_time | ms | PushConsumer message processing time distribution (includes both successful and failed messages). rocketmq_process_time = process end time - process start time. Buckets: le_1_ms, le_5_ms, le_10_ms, le_100_ms, le_10000_ms, le_60000_ms, le_overflow. | uid, instance_id, consumer_group, topic, client_id, invocation_status |
| Gauge | rocketmq_consumer_cached_messages | message | Messages in the PushConsumer local buffer queue. | uid, instance_id, consumer_group, topic, client_id |
| Gauge | rocketmq_consumer_cached_bytes | byte | Total size of messages in the PushConsumer local buffer queue. | uid, instance_id, consumer_group, topic, client_id |
| Histogram | rocketmq_await_time | ms | Time messages wait in the PushConsumer local buffer queue before processing begins. rocketmq_await_time = process start time - arrival time. Buckets: le_1_ms, le_5_ms, le_20_ms, le_100_ms, le_1000_ms, le_5000_ms, le_10000_ms, le_overflow. | uid, instance_id, consumer_group, topic, client_id |

Billing

Dashboard metrics are classified as basic metrics in ARMS Prometheus Service. Basic metrics are free. No additional charges apply for using the dashboard feature.

For more information, see Basic metrics and Pay-as-you-go.

FAQ

How do I access dashboard metric data programmatically?

All ApsaraMQ for RocketMQ metrics are stored in your Managed Service for Prometheus instance. To get the data access endpoint:

  1. Log on to the ARMS console. In the left-side navigation pane, click Integration Center.

  2. Search for RocketMQ and select Alibaba Cloud RocketMQ (5.0) Service. Complete the integration by following Integrate monitoring data of an Alibaba Cloud service.

  3. After provisioning completes, click Provisioning in the left-side navigation pane.

  4. Click the Cloud Service Region Environment tab, then click the target environment name.

  5. On the Component Management tab, in the Basic Information area, click the cloud service region next to Prometheus Instance.

  6. On the Settings tab, find the available data access methods (HTTP API endpoint, Remote Write, etc.).
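Once you have the HTTP API endpoint, you can query it like any Prometheus-compatible server. The sketch below assumes that the endpoint accepts the standard Prometheus instant-query path `/api/v1/query` appended to the address and that no extra authentication is needed; if your instance requires a token, add the corresponding request header. The base URL is a placeholder, and the response shown is a hand-written sample in the standard Prometheus response format, not real data.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict


def build_query_url(base_url: str, promql: str) -> str:
    """URL for the standard Prometheus instant-query endpoint."""
    params = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def parse_vector(payload: dict) -> Dict[str, float]:
    """Map consumer_group -> value from an instant-vector response."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    values: Dict[str, float] = {}
    for series in payload["data"]["result"]:
        group = series["metric"].get("consumer_group", "")
        values[group] = float(series["value"][1])  # value is [timestamp, "number"]
    return values


def query(base_url: str, promql: str, timeout: float = 10.0) -> dict:
    # Performs the HTTP request; not called here so the example runs offline.
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return json.load(resp)


# Hand-written sample response for illustration only.
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"consumer_group": "GID_orders"}, "value": [1700000000, "42"]},
        ],
    },
}
print(parse_vector(sample))  # {'GID_orders': 42.0}
```

In practice you would call `query(<your HTTP API address>, "rocketmq_consumer_ready_messages")` and pass the result to `parse_vector`.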

How do I integrate metrics into a self-managed Grafana?

After retrieving the HTTP API address from the steps above, add it as a Prometheus data source in your Grafana instance. For step-by-step instructions, see Integrate Prometheus data into Grafana using an HTTP API address.

What does TPS Max represent?

TPS Max is the peak transactions-per-second value within a 1-minute window. The system samples TPS once per second (60 samples per minute) and reports the highest value.

Example: An instance sends 60 normal messages (4 KB each) in 1 minute.

  • If all 60 messages are sent in the first second, the per-second TPS samples are 60, 0, 0, ..., 0, so TPS Max = 60.

  • If 40 messages are sent in the first second and 20 in the next second, the per-second TPS samples are 40, 20, 0, ..., 0, so TPS Max = 40.
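The two scenarios above can be reproduced with a short calculation. This sketch simply takes the maximum of 60 per-second samples. The function name is illustrative, and the samples are the ones from the example.

```python
from typing import Sequence


def tps_max(per_second_counts: Sequence[int]) -> int:
    """Highest per-second value within a 1-minute window of 60 samples."""
    if len(per_second_counts) != 60:
        raise ValueError("expected 60 per-second samples")
    return max(per_second_counts)


burst = [60] + [0] * 59       # all 60 messages in the first second
spread = [40, 20] + [0] * 58  # 40 in the first second, 20 in the next

print(tps_max(burst), tps_max(spread))      # 60 40
print(sum(burst) / 60, sum(spread) / 60)    # 1.0 1.0 (average TPS is identical)
```

Both windows carry the same 60 messages and the same average of 1 message per second, which is why TPS Max, not the average, is the value to compare against a throttling threshold.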