ApsaraMQ for RocketMQ: Dashboard

Last Updated: Mar 11, 2026

Distributed messaging systems generate metrics across producers, consumers, and broker instances that are difficult to correlate manually. ApsaraMQ for RocketMQ provides a built-in dashboard, powered by Alibaba Cloud Managed Service for Prometheus and Grafana, that aggregates these metrics in a single view. Use it to detect consumption delays, track message accumulation trends, verify message delivery, and estimate billing -- without switching between tools.

Use cases

  • Consumption anomaly detection: Get alerts when consumers fall behind or fail, and identify the affected groups and topics.

  • Message delivery verification: When specific online orders show an abnormal status, confirm whether the related messages were sent as expected.

  • Traffic trend analysis: Track production and consumption rates over time to forecast capacity and plan for traffic peaks.

  • Dependency topology review: Map upstream and downstream application dependencies to guide architecture upgrades.

Prerequisites

Before you begin, make sure that you have:

  • Managed Service for Prometheus activated

  • The AliyunServiceRoleForOns service-linked role created (policy name: AliyunServiceRolePolicyForOns), which grants ApsaraMQ for RocketMQ access to Cloud Monitor and Application Real-Time Monitoring Service (ARMS) for monitoring, alerting, and dashboard features. For details, see Service-linked roles.

Billing

Dashboard metrics are basic metrics in ARMS Managed Service for Prometheus. Basic metrics are free, so the dashboard incurs no additional cost.

For more information, see Metrics and Pay-as-you-go.

How message accumulation works

Understanding how ApsaraMQ for RocketMQ tracks message processing state is essential for interpreting dashboard metrics -- particularly consumer lag, which is the primary indicator of consumption health.

Message processing states in a queue

The following diagram shows each message's state in a queue. The broker calculates message counts and processing durations at each stage. These metrics show how quickly consumers pull and acknowledge messages, and whether accumulation is building up.

Message count metrics

| Metric | Description | Formula |
| --- | --- | --- |
| Inflight messages | Messages that a consumer has pulled but not yet acknowledged. | Latest pulled offset - Latest acknowledged offset |
| Ready messages | Messages stored on the broker that are visible to consumers and available for consumption. | Maximum offset - Latest pulled offset |
| Consumer lag | Total unprocessed messages, including both inflight and ready messages. A rising consumer lag indicates consumers are falling behind producers. | Inflight messages + Ready messages |
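The three formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the QueueOffsets class, its field names, and the sample offsets are hypothetical and are not part of any ApsaraMQ for RocketMQ SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueOffsets:
    """Hypothetical snapshot of one queue's offsets for one consumer group."""
    max_offset: int            # maximum offset stored on the broker
    latest_pulled_offset: int  # latest offset the consumer has pulled
    latest_acked_offset: int   # latest offset the consumer has acknowledged


def inflight_messages(o: QueueOffsets) -> int:
    # Pulled but not yet acknowledged.
    return o.latest_pulled_offset - o.latest_acked_offset


def ready_messages(o: QueueOffsets) -> int:
    # Stored on the broker and not yet pulled.
    return o.max_offset - o.latest_pulled_offset


def consumer_lag(o: QueueOffsets) -> int:
    # Everything that is not yet fully processed.
    return inflight_messages(o) + ready_messages(o)


snapshot = QueueOffsets(max_offset=1000, latest_pulled_offset=940, latest_acked_offset=900)
print(inflight_messages(snapshot), ready_messages(snapshot), consumer_lag(snapshot))
# prints: 40 60 100
```

Note that consumer lag reduces to maximum offset minus latest acknowledged offset, which is why it keeps growing when consumers pull messages but fail to acknowledge them.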

Timing metrics

| Metric | Description | Formula |
| --- | --- | --- |
| Ready time | When a message becomes available for consumption. Varies by message type (see below). | N/A |
| Ready message queue time | How long the earliest ready message has been waiting. Indicates how quickly consumers pull messages. | Current time - Ready time of the earliest ready message |
| Consumer lag time | How long the earliest unacknowledged message has been waiting. Indicates overall processing speed. | Current time - Ready time of the earliest unacknowledged message |

Ready time by message type:

  • Normal or ordered message: The time the broker stores the message.

  • Scheduled message: The scheduled delivery time. For a delayed message, the time the delay period elapses.

  • Transactional message: The time the transaction is committed.
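As a rough sketch of how the ready time rules and the two timing formulas fit together, the following Python example picks the ready time by message type and then subtracts it from the current time. The function names, parameter names, and message type strings are hypothetical and chosen for illustration; the broker's actual implementation is not described in this document.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def ready_time(message_type: str,
               stored_at: datetime,
               deliver_at: Optional[datetime] = None,
               committed_at: Optional[datetime] = None) -> datetime:
    """Return the ready time for a message, following the rules listed above."""
    if message_type in ("normal", "ordered"):
        return stored_at  # time the broker stores the message
    if message_type in ("scheduled", "delayed"):
        if deliver_at is None:
            raise ValueError("deliver_at is required for scheduled or delayed messages")
        return deliver_at  # scheduled delivery time, or end of the delay period
    if message_type == "transactional":
        if committed_at is None:
            raise ValueError("committed_at is required for transactional messages")
        return committed_at  # time the transaction is committed
    raise ValueError(f"unknown message type: {message_type}")


def waiting_time_ms(now: datetime, earliest_ready_time: datetime) -> int:
    """Current time minus the ready time of the earliest message, in milliseconds."""
    return (now - earliest_ready_time) // timedelta(milliseconds=1)


stored = datetime(2026, 1, 1, tzinfo=timezone.utc)
now = stored + timedelta(seconds=12)

normal_ready = ready_time("normal", stored_at=stored)
delayed_ready = ready_time("delayed", stored_at=stored,
                           deliver_at=stored + timedelta(seconds=10))

print(waiting_time_ms(now, normal_ready))   # prints: 12000
print(waiting_time_ms(now, delayed_ready))  # prints: 2000
```

The same waiting_time_ms helper covers both timing metrics. Passing the earliest ready message gives the ready message queue time, and passing the earliest unacknowledged message gives the consumer lag time.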

View the dashboard

  1. Log on to the ApsaraMQ for RocketMQ console. In the left-side navigation pane, click Instances.

  2. In the top navigation bar, select a region, such as China (Hangzhou). On the Instances page, click the name of the instance that you want to manage.

  3. Open the dashboard using any of the following methods:

    • On the Instance Details page, click the Dashboard tab.

    • In the left-side navigation pane, click Dashboard.

    • In the left-side navigation pane, click Topics. Click a topic name to open the Topic Details page, then click the Dashboard tab.

    • In the left-side navigation pane, click Groups. Click a group name to open the Group Details page, then click the Dashboard tab.

Dashboard metrics

Metrics are organized into four categories: producer, consumer, instance top 20, and billing.

Important

All metrics are collected at 1-minute intervals. Data is available for the last 15 days, with a maximum query range of 24 hours.
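If you need to review more than 24 hours of history, one approach is to split the period into consecutive windows of at most 24 hours and query them one by one. The sketch below shows only the window arithmetic. The function name and constants are hypothetical, and it assumes the 15-day retention and 24-hour range limits are applied exactly as stated above.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Tuple

MAX_QUERY_RANGE = timedelta(hours=24)  # maximum range of a single query
RETENTION = timedelta(days=15)         # how far back data is available


def query_windows(start: datetime, end: datetime,
                  now: datetime) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (start, end) windows of at most 24 hours each."""
    cursor = max(start, now - RETENTION)  # clamp to the retention period
    end = min(end, now)                   # no data exists in the future
    while cursor < end:
        window_end = min(cursor + MAX_QUERY_RANGE, end)
        yield cursor, window_end
        cursor = window_end


now = datetime(2026, 3, 11, tzinfo=timezone.utc)
for window_start, window_end in query_windows(now - timedelta(hours=36), now, now):
    print(window_start.isoformat(), "->", window_end.isoformat())
# prints:
# 2026-03-09T12:00:00+00:00 -> 2026-03-10T12:00:00+00:00
# 2026-03-10T12:00:00+00:00 -> 2026-03-11T00:00:00+00:00
```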

Producer metrics

| Metric | Description | Unit |
| --- | --- | --- |
| Message production rate | Production rate and API call rate for a topic. | messages/s, calls/s |
| Peak message production rate | Maximum production rate. | messages/s |
| Total messages produced | Total messages produced in the instance. | messages |
| Message production call success rate | Percentage of successful send calls for a topic. | % |
| Message production call latency | Send latency for a topic. | ms |

Consumer metrics

| Metric | Description | Unit |
| --- | --- | --- |
| Average consumption success rate | Consumption success rate across all messages in the instance. | % |
| Accumulated messages (Ready + Inflight) | Total accumulated messages in the instance, combining ready and inflight messages. | messages |
| Inflight messages | Messages being processed by a consumer that have not been acknowledged. | messages |
| Ready messages | Messages available on the broker for consumption. Reflects the scale of unprocessed messages. | messages |
| Ready message queue time | Time since the earliest ready message became available. The overview shows the instance average; specific charts show per-group-topic values. Monitor closely for latency-sensitive workloads. | ms |
| Message consumption rate | Rate at which a group consumes messages. | messages/s |
| Peak message consumption rate | Maximum consumption rate. | messages/s |
| Total messages consumed | Total consumed messages in the instance. | messages |
| Consumption accumulation | Accumulated messages for a group, including ready and inflight messages. | messages |
| Message processing latency | Time from when a group starts consuming a message to completion. | ms |
| Consumer local wait time | Time a message waits on the consumer client before processing begins. | ms |
| Consumption success rate | Success rate of message consumption. | % |
| Consumer client access protocol ratio | Distribution of consumed messages by protocol type (TCP vs. HTTP). | -- |

Instance top 20 overview

| Metric | Unit |
| --- | --- |
| Top 20 topics by message production rate | messages/s |
| Top 20 groups by message consumption rate | messages/s |
| Top 20 groups by ready messages | messages |
| Top 20 groups by ready message queue time | ms |
| Top 20 groups by accumulated messages (Ready + Inflight) | messages |
| Top 20 groups by inflight messages | messages |
| Top 20 groups by consumption processing latency | ms |
| Top 20 groups by consumer local wait time | ms |
| Top 20 topics by message production call failure rate | % |
| Top 20 groups by message consumption failure rate | % |

Billing metrics

Use billing metrics to estimate cost-related items for your instance, such as peak TPS and API call volume.

Note

Billing metric values include multipliers for large messages and advanced features:

  • Large message multiplier: The billing unit is 4 KB. A 16 KB message counts as 16 / 4 = 4 API calls.

  • Advanced feature multiplier: Ordered, scheduled, delayed, and transactional messages count as 5x the API calls of normal messages.
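The two multipliers can be sketched as a small Python function. Several points here are assumptions rather than documented behavior: that sizes which are not an exact multiple of 4 KB round up to the next unit, that 1 KB means 1,024 bytes, and that the size and advanced-feature multipliers are multiplied together. The function and constant names are hypothetical. Check your bill or the pricing documentation before relying on these numbers.

```python
BILLING_UNIT_BYTES = 4 * 1024  # assumption: 1 KB = 1,024 bytes
ADVANCED_TYPES = {"ordered", "scheduled", "delayed", "transactional"}
ADVANCED_MULTIPLIER = 5


def billable_api_calls(size_bytes: int, message_type: str = "normal") -> int:
    """Estimate the API calls counted for one message."""
    # Assumption: partial units round up, and every message counts at least once.
    size_multiplier = max(1, (size_bytes + BILLING_UNIT_BYTES - 1) // BILLING_UNIT_BYTES)
    feature_multiplier = ADVANCED_MULTIPLIER if message_type in ADVANCED_TYPES else 1
    # Assumption: the two multipliers combine by multiplication.
    return size_multiplier * feature_multiplier


print(billable_api_calls(16 * 1024))                  # prints: 4
print(billable_api_calls(4 * 1024, "transactional"))  # prints: 5
print(billable_api_calls(16 * 1024, "ordered"))       # prints: 20
```

The first two results follow directly from the multipliers described above. The third depends on the multiplication assumption.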

| Metric | Description | Unit |
| --- | --- | --- |
| Peak production TPS | Maximum production TPS. Use to estimate the peak TPS specification for billing. | calls/s |
| Peak consumption TPS | Maximum consumption TPS. Use to estimate the peak TPS specification for billing. | calls/s |
| Peak TPS | Maximum combined production and consumption TPS. Use to estimate the peak TPS specification for billing. | calls/s |
| Total API calls | Total API calls. Use to estimate API call volume for billing. | calls |
| Average message size | Average size of all produced messages. | bytes |
| Production and consumption TPS | Combined production and consumption TPS. | calls/s |
| Daily API calls | Daily total of production and consumption API calls. | calls |

Prometheus metric reference

Important

TPS, message count, and total message calculations use a 4 KB normal message as the base unit. Size and advanced-feature multipliers apply on top of this base.

All metrics use the Gauge type -- an instantaneous measurement that can increase or decrease.

Common labels

| Label | Description |
| --- | --- |
| instance_id | ApsaraMQ for RocketMQ instance ID |
| topic | Topic name |
| message_type | Message type: normal, fifo, transaction, or delay |
| uid | Alibaba Cloud account ID |
| protocol_type | Protocol: tcp or http |
| client_group / group_id | Consumer group identifier |

Server-side metrics

| Metric name | Unit | Description | Labels |
| --- | --- | --- | --- |
| rocketmq_instance_requests_threshold | count/s | Instance throttling threshold. | uid, instance_id |
| rocketmq_instance_requests_max | count/s | Maximum TPS per minute, excluding throttled requests. Calculated as the highest of 60 per-second samples within one minute. | uid, instance_id |

Producer metrics

| Metric name | Unit | Description | Labels |
| --- | --- | --- | --- |
| rocketmq_producer_requests | count | Billable API calls for sending messages. | uid, instance_id, topic, message_type |
| rocketmq_producer_messages | messages | Number of sent messages. | uid, instance_id, topic, message_type |
| rocketmq_producer_message_size_bytes | bytes | Total size of sent messages. | uid, instance_id, topic, message_type |
| rocketmq_producer_send_success_rate | % | Send success rate. | uid, instance_id, topic |
| rocketmq_producer_failure_api_calls | count | Failed send API calls. | uid, instance_id, topic |
| rocketmq_producer_send_rt_milliseconds_avg | ms | Average send latency. | uid, instance_id, topic |
| rocketmq_producer_send_rt_milliseconds_min | ms | Minimum send latency. | uid, instance_id, topic |
| rocketmq_producer_send_rt_milliseconds_max | ms | Maximum send latency. | uid, instance_id, topic |
| rocketmq_producer_send_rt_milliseconds_p95 | ms | P95 send latency. | uid, instance_id, topic |
| rocketmq_producer_send_rt_milliseconds_p99 | ms | P99 send latency. | uid, instance_id, topic |

Consumer metrics

| Metric name | Unit | Description | Labels |
| --- | --- | --- | --- |
| rocketmq_consumer_requests | count | API calls for consuming messages. | uid, instance_id, topic, client_group, protocol_type |
| rocketmq_consumer_send_back_requests | count | API calls to return messages that failed consumption. | uid, instance_id, topic, group_id |
| rocketmq_consumer_send_back_messages | messages | Messages returned by consumers after failed consumption. | uid, instance_id, topic, group_id |
| rocketmq_consumer_messages | messages | Number of consumed messages. | uid, instance_id, topic, client_group, protocol_type |
| rocketmq_consumer_message_size_bytes | bytes | Size of consumed messages, accumulated over one minute. | uid, instance_id, topic, client_group, protocol_type |
| rocketmq_consumer_ready_and_inflight_messages | messages | Consumer lag: ready messages plus inflight messages. | uid, instance_id, topic, group_id |
| rocketmq_consumer_ready_messages | messages | Ready messages. Calculated as maxOffset - lastPullOffset. | uid, instance_id, topic, group_id |
| rocketmq_consumer_inflight_messages | messages | Inflight messages. Calculated as lastPullOffset - committedOffset. | uid, instance_id, topic, group_id |
| rocketmq_consumer_queue_time_milliseconds | ms | Message queue time. | uid, instance_id, topic, group_id |
| rocketmq_consumer_message_await_time_milliseconds_avg | ms | Average consumer local wait time. | uid, instance_id, topic, group_id |
| rocketmq_consumer_message_await_time_milliseconds_min | ms | Minimum consumer local wait time. | uid, instance_id, topic, group_id |
| rocketmq_consumer_message_await_time_milliseconds_max | ms | Maximum consumer local wait time. | uid, instance_id, topic, group_id |
| rocketmq_consumer_message_await_time_milliseconds_p95 | ms | P95 consumer local wait time. | uid, instance_id, topic, group_id |
| rocketmq_consumer_message_await_time_milliseconds_p99 | ms | P99 consumer local wait time. | uid, instance_id, topic, group_id |
| rocketmq_consumer_message_process_time_milliseconds_avg | ms | Average message processing latency. | uid, instance_id, topic, group_id |
| rocketmq_consumer_message_process_time_milliseconds_min | ms | Minimum message processing latency. | uid, instance_id, topic, group_id |
| rocketmq_consumer_message_process_time_milliseconds_max | ms | Maximum message processing latency. | uid, instance_id, topic, group_id |
| rocketmq_consumer_message_process_time_milliseconds_p95 | ms | P95 message processing latency. | uid, instance_id, topic, group_id |
| rocketmq_consumer_message_process_time_milliseconds_p99 | ms | P99 message processing latency. | uid, instance_id, topic, group_id |
| rocketmq_consumer_consume_success_rate | % | Consumption success rate. | uid, instance_id, topic, group_id |
| rocketmq_consumer_failure_api_calls | count | Failed consumption API calls. | uid, instance_id, topic, group_id |
| rocketmq_consumer_to_dlq_messages | messages | Messages sent to the dead-letter queue (DLQ). | uid, instance_id, topic, group_id |

FAQ

How do I get raw metric data from the dashboard?

Dashboard metrics are stored in ARMS Managed Service for Prometheus. To access the raw data:

  1. Log on to the ARMS console with your Alibaba Cloud account.

  2. In the left-side navigation pane, click Integration Center.

  3. Search for RocketMQ and select Alibaba Cloud RocketMQ (4.0) Service. For setup details, see Integrate monitoring data of an Alibaba Cloud service.

  4. After integration succeeds, click Provisioning in the left-side navigation pane.

  5. In the Cloud Service Area Environment list, click the target environment name.

  6. On the Component Management tab, in the Basic Information section, click the region link for the Prometheus Instance.

  7. On the Settings tab, find the available data access methods.
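If the access method you choose is a Prometheus-compatible HTTP API URL, a query for consumer lag might look like the following sketch. It assumes the URL accepts the standard Prometheus instant-query path /api/v1/query. The base URL, instance ID, and sample response are placeholders, the helper names are hypothetical, and your endpoint may also require authentication that is not shown here.

```python
import json
import urllib.parse
import urllib.request


def consumer_lag_query(instance_id: str) -> str:
    """PromQL selector for consumer lag (ready plus inflight) of one instance."""
    return f'rocketmq_consumer_ready_and_inflight_messages{{instance_id="{instance_id}"}}'


def build_instant_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the standard /api/v1/query endpoint."""
    params = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def parse_vector(payload: dict) -> dict:
    """Map (topic, group_id) to the sample value of an instant-query response."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    values = {}
    for sample in payload["data"]["result"]:
        labels = sample["metric"]
        _timestamp, value = sample["value"]  # the value arrives as a string
        values[(labels.get("topic"), labels.get("group_id"))] = float(value)
    return values


def fetch_consumer_lag(base_url: str, instance_id: str) -> dict:
    """Run the query over HTTP. Not called below because it needs a real endpoint."""
    url = build_instant_query_url(base_url, consumer_lag_query(instance_id))
    with urllib.request.urlopen(url, timeout=10) as response:
        return parse_vector(json.load(response))


# Placeholder response in the standard Prometheus instant-query format.
SAMPLE_RESPONSE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"topic": "orders", "group_id": "order-workers"},
                "value": [1773187200, "128"],
            }
        ],
    },
}

print(parse_vector(SAMPLE_RESPONSE))
# prints: {('orders', 'order-workers'): 128.0}
```

The URL building and response parsing are kept separate from the network call so that each part can be checked without access to a live Prometheus instance.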

How do I integrate dashboard metrics into a self-managed Grafana system?

Complete the integration steps described in How do I get raw metric data from the dashboard? to connect ApsaraMQ for RocketMQ monitoring data to Managed Service for Prometheus. Then retrieve the HTTP API URL from the environment settings and use it to connect your self-managed Grafana instance. For details, see Use an HTTP API URL to connect a Prometheus instance to a self-managed Grafana system.

How are average TPS and max TPS calculated?

  • Average TPS = Total requests in one minute / 60 seconds

  • Max TPS = The highest value among 60 per-second TPS samples taken within one minute

Example: An instance produces 60 normal 4 KB messages in one minute.

  • Average TPS = 60 calls / 60 s = 1 call/s

  • If all 60 messages are sent in the first second, the per-second TPS values are 60, 0, 0, ..., 0. Max TPS = 60 calls/s.

  • If 40 messages are sent in the first second and 20 in the second, the values are 40, 20, 0, ..., 0. Max TPS = 40 calls/s.
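The two definitions and the worked example can be reproduced with a short Python sketch. The function names are hypothetical, and the input is assumed to be a list of 60 per-second call counts for one minute.

```python
from typing import Sequence

SAMPLES_PER_MINUTE = 60


def _check(per_second_calls: Sequence[int]) -> None:
    if len(per_second_calls) != SAMPLES_PER_MINUTE:
        raise ValueError("expected exactly 60 per-second samples")


def average_tps(per_second_calls: Sequence[int]) -> float:
    """Total requests in one minute divided by 60 seconds."""
    _check(per_second_calls)
    return sum(per_second_calls) / SAMPLES_PER_MINUTE


def max_tps(per_second_calls: Sequence[int]) -> int:
    """Highest of the 60 per-second samples within one minute."""
    _check(per_second_calls)
    return max(per_second_calls)


burst = [60] + [0] * 59       # all 60 messages sent in the first second
split = [40, 20] + [0] * 58   # 40 in the first second, 20 in the second

print(average_tps(burst), max_tps(burst))  # prints: 1.0 60
print(average_tps(split), max_tps(split))  # prints: 1.0 40
```

Both traffic patterns have the same average TPS, but their max TPS differs, so bursty senders reach the peak TPS specification sooner than evenly paced ones.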