ApsaraMQ for RocketMQ: Dashboard

Last Updated: Sep 25, 2025

ApsaraMQ for RocketMQ provides a dashboard for real-time data statistics that uses the metric storage and display capabilities of Alibaba Cloud ARMS Managed Service for Prometheus and Grafana. This feature helps you centrally collect and observe metrics from multiple dimensions to quickly understand the operational status of your business. This topic describes the scenarios, billing, metrics, and usage of the dashboard.

Scenarios

  • Scenario 1: You need to receive alerts and locate issues in a timely manner when exceptions occur during online message consumption.

  • Scenario 2: You need to check whether messages are sent as expected in the messaging system when the status of specific online orders is abnormal.

  • Scenario 3: You need to analyze message traffic trends, traffic distribution characteristics, or message volume to understand business trends and make business plans.

  • Scenario 4: You need to view and analyze the upstream and downstream dependency topologies of applications to upgrade, optimize, or transform the architecture.

Prerequisites

  • Activate Managed Service for Prometheus.

  • Create a service-linked role.

    • Role name: AliyunServiceRoleForOns

    • Policy name: AliyunServiceRolePolicyForOns

    • Permissions: Allows ApsaraMQ for RocketMQ to use this role to access other Alibaba Cloud services, such as CloudMonitor and ARMS, to implement features for monitoring, alerting, and dashboards.

    • For more information, see Service-linked Role.

Billing

The dashboard metrics for ApsaraMQ for RocketMQ are basic metrics in ARMS Managed Service for Prometheus. Basic metrics are free of charge. Therefore, the dashboard feature is also free.

For more information, see Metrics and Pay-as-you-go.

Concepts

Before you view dashboard metrics, you need to understand the following concepts related to message accumulation.

The following figure shows the status of each message in a queue of a specific topic.

Figure: Queue message status

In the preceding figure, ApsaraMQ for RocketMQ calculates the number of messages and the processing duration at different processing stages. The metrics that are used in this process reflect the processing rate and message accumulation in the queue. By monitoring the metrics, you can determine whether exceptions occur during consumption. The following table describes the details of the metrics and the formulas that are used to calculate the metrics.

| Category | Metric | Description | Calculation formula |
| --- | --- | --- | --- |
| Message quantity | Inflight messages | The messages that a consumer client is processing and for which the client has not returned the consumption results. | Number of inflight messages = Offset of the latest pulled message - Offset of the latest acknowledged message |
| Message quantity | Ready messages | The messages that are visible to consumers and are ready for consumption on the ApsaraMQ for RocketMQ broker. | Number of ready messages = Maximum offset - Offset of the latest pulled message |
| Message quantity | Consumer lag | The messages that are being processed and ready to be processed. | Consumer lag = Number of inflight messages + Number of ready messages |
| Duration | Ready time | For a normal message or an ordered message, the ready time is the time when the message is stored in the broker. For a scheduled message, the ready time is the time that is scheduled for the broker to deliver the message. For a delayed message, the ready time is the time when the specified delay period elapses. For a transactional message, the ready time is the time when a transaction is committed. | N/A |
| Duration | Ready message queue time | The interval between the current point in time and the ready time of the earliest ready message. This metric indicates how soon a consumer pulls messages. | Ready message queue time = Current time - Ready time of the earliest ready message |
| Duration | Consumer lag time | The interval between the ready time of the earliest unacknowledged message and the current time. This metric indicates how soon a consumer processes messages. | Consumer lag time = Current time - Ready time of the earliest unacknowledged message |
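The message quantity and duration formulas in the preceding table can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the class name, field names, and sample offsets are hypothetical and are not part of any ApsaraMQ for RocketMQ SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueOffsets:
    """Offsets of one queue as seen by one consumer group (hypothetical names)."""
    max_offset: int          # maximum offset in the queue
    last_pulled_offset: int  # offset of the latest pulled message
    last_acked_offset: int   # offset of the latest acknowledged message


def inflight_messages(o: QueueOffsets) -> int:
    # Pulled by the consumer, but no consumption result returned yet.
    return o.last_pulled_offset - o.last_acked_offset


def ready_messages(o: QueueOffsets) -> int:
    # Visible on the broker, but not yet pulled.
    return o.max_offset - o.last_pulled_offset


def consumer_lag(o: QueueOffsets) -> int:
    # Inflight messages plus ready messages.
    return inflight_messages(o) + ready_messages(o)


def elapsed_ms(now_ms: int, ready_time_ms: int) -> int:
    # Shared by "ready message queue time" (pass the ready time of the
    # earliest ready message) and "consumer lag time" (pass the ready time
    # of the earliest unacknowledged message).
    return now_ms - ready_time_ms


offsets = QueueOffsets(max_offset=100, last_pulled_offset=80, last_acked_offset=70)
print(inflight_messages(offsets))  # 10
print(ready_messages(offsets))     # 20
print(consumer_lag(offsets))       # 30
```

Because the earliest unacknowledged message became ready no later than the earliest ready message, the consumer lag time is never smaller than the ready message queue time whenever both are defined.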

Metric details

The ApsaraMQ for RocketMQ dashboard provides the following metrics:

  • Producer: View metrics for a topic, such as the number of messages sent, send success rate, and send latency.

  • Consumer: View metrics related to a group's subscription to a specific topic, such as consumption volume, consumption success rate, and message accumulation.

  • Instance Top 20 overview: View the top 20 topics or groups for specific metric values within an instance.

  • Billing metrics: View metrics for an instance, such as message TPS, API calls, and average message size. These metrics can be used as a reference for estimating billing items.

Important

The collection period for all metrics is 1 minute. ApsaraMQ for RocketMQ supports queries for data from the last 15 days. The maximum time range for a single query is 24 hours.
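If you need to review a period longer than 24 hours, you can plan the work as a series of consecutive queries. The following Python sketch splits a time range into windows of at most 24 hours and clamps the range to the last 15 days. The function name and sample timestamps are hypothetical, and the sketch assumes that the limits apply exactly as stated above.

```python
from datetime import datetime, timedelta, timezone

MAX_QUERY_RANGE = timedelta(hours=24)  # maximum time range for a single query
RETENTION = timedelta(days=15)         # only the last 15 days can be queried


def split_query_windows(start, end, now):
    """Split [start, end) into consecutive windows of at most 24 hours.

    The start is clamped to the 15-day retention boundary and the end to now.
    Returns an empty list if nothing in the range can be queried.
    """
    start = max(start, now - RETENTION)
    end = min(end, now)
    windows = []
    cursor = start
    while cursor < end:
        window_end = min(cursor + MAX_QUERY_RANGE, end)
        windows.append((cursor, window_end))
        cursor = window_end
    return windows


now = datetime(2025, 9, 25, 12, 0, tzinfo=timezone.utc)
windows = split_query_windows(now - timedelta(hours=60), now, now)
for window_start, window_end in windows:
    print(window_start.isoformat(), "->", window_end.isoformat())
```

A 60-hour range yields three windows: two full 24-hour windows and one 12-hour remainder.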

Producer

| Metric | Description |
| --- | --- |
| Message Production Rate | The message production rate and the API call rate for message production for a topic. Units: messages/second (message rate) and calls/second (API call rate). |
| Peak Message Production Rate | The maximum message production rate. Unit: messages/second. |
| Total Messages Produced | The total number of messages produced in a specific instance. Unit: messages. |
| Message Production Call Success Rate | The success rate of message production for a topic. |
| Message Production Call Latency | The latency of message production for a topic. Unit: ms. |

Consumer

| Metric | Description |
| --- | --- |
| Average Consumption Success Rate | The consumption success rate for all messages in a specific instance. |
| Accumulated Messages (Ready + Inflight) | The total number of accumulated messages in a specific instance, including ready and inflight messages. Unit: messages. |
| Inflight Messages | The number of messages that are being processed by a consumer client but for which a success response has not been returned. Unit: messages. |
| Ready Messages | The number of messages that are ready on the ApsaraMQ for RocketMQ server and can be consumed. This metric reflects the scale of messages that have not yet been processed by consumers. Unit: messages. |
| Ready Message Queue Time | The time difference between the current time and the ready time of the earliest ready message. This metric reflects the latency of unprocessed messages and is a critical measure for time-sensitive services. The metric value in the overview represents the average ready message queue time for the instance. The metric value in a specific chart represents the ready message queue time for a specific group subscribing to a specific topic. Unit: ms. |
| Message Consumption Rate | The rate at which a group consumes messages. Unit: messages/second. |
| Peak Message Consumption Rate | The maximum message consumption rate. Unit: messages/second. |
| Total Messages Consumed | The total number of messages consumed in a specific instance. Unit: messages. |
| Consumption Accumulation | The number of accumulated messages for a group, including ready and inflight messages. Unit: messages. |
| Message Processing Latency | The time it takes for a group to process a message, from the start of consumption to completion. Unit: ms. |
| Consumer Local Wait Time | The time it takes for a message to be processed after it arrives at the consumer client. Unit: ms. |
| Consumption Success Rate | The success rate of message consumption. |
| Consumer Client Access Protocol Ratio | The ratio of consumed messages by protocol type. |

Instance Top 20 overview

| Metric | Description |
| --- | --- |
| Top 20 Topics by Message Production Rate | The top 20 topics with the highest message production rate. Unit: messages/second. |
| Top 20 GroupIDs by Message Consumption Rate | The top 20 groups with the highest message consumption rate. Unit: messages/second. |
| Top 20 GroupIDs by Number of Ready Messages | The top 20 groups with the most ready messages. Unit: messages. |
| Top 20 GroupIDs by Ready Message Queue Time | The top 20 groups with the longest ready message queue time. Unit: ms. |
| Top 20 GroupIDs by Number of Accumulated Messages (Ready + Inflight) | The top 20 groups with the most accumulated messages. Unit: messages. |
| Top 20 GroupIDs by Number of Inflight Messages | The top 20 groups with the most inflight messages. Unit: messages. |
| Top 20 GroupIDs by Consumption Processing Latency | The top 20 groups with the longest consumption processing latency. Unit: ms. |
| Top 20 GroupIDs by Consumer Local Wait Time | The top 20 groups with the longest consumer local wait time. Unit: ms. |
| Top 20 Topics by Message Production Call Failure Rate | The top 20 topics with the highest failure rate for message production. |
| Top 20 GroupIDs by Message Consumption Failure Rate | The top 20 groups with the highest failure rate for message consumption. |

Billing metrics

Note

The values of the following billing metrics include multipliers for large messages and advanced features.

  • Large message multiplier: The unit of measurement is 4 KB. For example, if you send a 16 KB message, the number of API calls is calculated as 16 KB / 4 KB = 4.

  • Advanced feature multiplier: The number of API calls for messages with advanced features, such as ordered, scheduled, delayed, and transactional messages, is five times the number of API calls for normal messages.
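The two multipliers can be illustrated with a short Python sketch. It reproduces the 16 KB example above, but it rests on two assumptions that this topic does not confirm: a message smaller than 4 KB counts as one unit with partial units rounded up, and the two multipliers combine by multiplication. The function name is hypothetical, and the message type values are taken from the message_type label described later in this topic.

```python
import math

BASE_UNIT_BYTES = 4 * 1024   # large message multiplier: measured in 4 KB units
ADVANCED_MULTIPLIER = 5      # ordered, scheduled, delayed, and transactional messages
ADVANCED_TYPES = {"fifo", "transaction", "delay"}


def billable_api_calls(size_bytes: int, message_type: str = "normal") -> int:
    """Estimate the API calls counted for sending one message.

    Assumptions (not confirmed by the documentation): partial 4 KB units are
    rounded up, and the size and feature multipliers are multiplied together.
    """
    size_units = max(1, math.ceil(size_bytes / BASE_UNIT_BYTES))
    feature_multiplier = ADVANCED_MULTIPLIER if message_type in ADVANCED_TYPES else 1
    return size_units * feature_multiplier


print(billable_api_calls(16 * 1024))           # 4, as in the 16 KB example
print(billable_api_calls(4 * 1024, "fifo"))    # 5
print(billable_api_calls(16 * 1024, "delay"))  # 20 under the stated assumptions
```

Treat the output as a rough estimate only and refer to the billing documentation for the authoritative rules.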

| Metric | Description |
| --- | --- |
| Peak Production TPS | The maximum message production TPS. This metric can be used as a reference for estimating the peak TPS specification in the instance's billing items. Unit: calls/second. |
| Peak Consumption TPS | The maximum message consumption TPS. This metric can be used as a reference for estimating the peak TPS specification in the instance's billing items. Unit: calls/second. |
| Peak TPS | The maximum value of the sum of message production TPS and message consumption TPS. This metric can be used as a reference for estimating the peak TPS specification in the instance's billing items. Unit: calls/second. |
| Total API Calls | The total number of API calls. This metric can be used as a reference for estimating the number of API calls in the instance's billing items. Unit: calls. |
| Average Message Size | The average size of all produced messages. Unit: bytes. |
| Production And Consumption TPS | The sum of message production TPS and message consumption TPS. Unit: calls/second. |
| Daily API Calls | The daily total number of API calls for message production and consumption. Unit: calls. |

Metrics Details

Important

When calculating metrics related to message TPS, the number of messages sent and received, or the total number of messages, the base unit is a 4 KB normal message. Multipliers for message size and advanced message types are applied to this base unit.

The following table describes the fields in the metrics.

Field

Value

Metric type

Gauge: A metric that can increase or decrease. Its value represents an instantaneous measurement of the statistical object. For example, the TPS of API calls.

Label

  • instance_id: ApsaraMQ for RocketMQ instance ID.

  • topic: ApsaraMQ for RocketMQ topic.

  • message_type: Message type. normal indicates a normal message. fifo indicates an ordered message. transaction indicates a transactional message. delay indicates a scheduled or delayed message.

  • uid: Your Alibaba Cloud account ID.

  • protocol_type: Protocol type. tcp indicates the TCP protocol. http indicates the HTTP protocol.

Server-side metrics

| Metric type | Metric name | Unit | Description | Label |
| --- | --- | --- | --- | --- |
| Gauge | rocketmq_instance_requests_threshold | count/s | Instance throttling threshold. | uid, instance_id |
| Gauge | rocketmq_instance_requests_max | count/s | The maximum TPS of an instance per minute. Requests that are throttled are not included. Rule: The maximum value among the 60 TPS samples taken within 1 minute. | uid, instance_id |

Producer metrics

| Metric type | Metric name | Unit | Description | Label |
| --- | --- | --- | --- | --- |
| Gauge | rocketmq_producer_requests (commercialCount, billable requests) | count | Number of API calls related to sending messages. | uid, instance_id, topic, message_type (normal, fifo, transaction, or delay) |
| Gauge | rocketmq_producer_messages | message | Number of sent messages. | uid, instance_id, topic, message_type (normal, fifo, transaction, or delay) |
| Gauge | rocketmq_producer_message_size_bytes | byte | Total size of sent messages. | uid, instance_id, topic, message_type (normal, fifo, transaction, or delay) |
| Gauge | rocketmq_producer_send_success_rate | % | Send success rate. | uid, instance_id, topic |
| Gauge | rocketmq_producer_failure_api_calls | count | Number of failed API calls for sending messages. | uid, instance_id, topic |
| Gauge | rocketmq_producer_send_rt_milliseconds_avg | ms | Average latency of sending messages. | uid, instance_id, topic |
| Gauge | rocketmq_producer_send_rt_milliseconds_min | ms | Minimum latency of sending messages. | uid, instance_id, topic |
| Gauge | rocketmq_producer_send_rt_milliseconds_max | ms | Maximum latency of sending messages. | uid, instance_id, topic |
| Gauge | rocketmq_producer_send_rt_milliseconds_p95 | ms | P95 latency of sending messages. | uid, instance_id, topic |
| Gauge | rocketmq_producer_send_rt_milliseconds_p99 | ms | P99 latency of sending messages. | uid, instance_id, topic |

Consumer metrics

| Metric type | Metric name | Unit | Description | Label |
| --- | --- | --- | --- | --- |
| Gauge | rocketmq_consumer_requests | count | Number of API calls related to consuming messages. | uid, instance_id, topic, client_group, protocol_type (tcp or http) |
| Gauge | rocketmq_consumer_send_back_requests | count | Number of API calls to send back messages that failed to be consumed. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_send_back_messages | message | Messages that failed to be consumed and were sent back by consumers. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_messages | message | Number of consumed messages. | uid, instance_id, topic, client_group, protocol_type (tcp or http) |
| Gauge | rocketmq_consumer_message_size_bytes | byte | Size of consumed messages (accumulated over one minute). | uid, instance_id, topic, client_group, protocol_type (tcp or http) |
| Gauge | rocketmq_consumer_ready_and_inflight_messages | message | Message consumption lag (includes ready and inflight messages). | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_ready_messages | message | Number of ready messages. Actual accumulation: maxOffset - lastPullOffset | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_inflight_messages | message | Number of inflight messages. Rule: lastPullOffset - committedOffset | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_queue_time_milliseconds | ms | Message queue time. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_message_await_time_milliseconds_avg | ms | Average time that a message waits for processing resources on the consumer client. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_message_await_time_milliseconds_min | ms | Minimum time that a message waits for processing resources on the consumer client. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_message_await_time_milliseconds_max | ms | Maximum time that a message waits for processing resources on the consumer client. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_message_await_time_milliseconds_p95 | ms | P95 time that a message waits for processing resources on the consumer client. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_message_await_time_milliseconds_p99 | ms | P99 time that a message waits for processing resources on the consumer client. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_message_process_time_milliseconds_avg | ms | Average message processing latency for a consumer. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_message_process_time_milliseconds_min | ms | Minimum message processing latency for a consumer. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_message_process_time_milliseconds_max | ms | Maximum message processing latency for a consumer. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_message_process_time_milliseconds_p95 | ms | P95 message processing latency for a consumer. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_message_process_time_milliseconds_p99 | ms | P99 message processing latency for a consumer. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_consume_success_rate | % | Message consumption success rate. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_failure_api_calls | count | Number of failed API calls for consumption. | uid, instance_id, topic, group_id |
| Gauge | rocketmq_consumer_to_dlq_messages | message | Number of messages sent to the dead-letter queue (DLQ). | uid, instance_id, topic, group_id |

View the dashboard

  1. Log on to the ApsaraMQ for RocketMQ console. In the left-side navigation pane, click Instances.

  2. In the top navigation bar, select a region, such as China (Hangzhou). On the Instances page, click the name of the instance that you want to manage.

  3. Use one of the following methods to view the dashboard:

    • On the Instance Details page, click the Dashboard tab.

    • In the left-side navigation pane of the Instance Details page, click Dashboard.

    • In the left-side navigation pane of the Instance Details page, click Topics. On the page that appears, click the name of the topic that you want to manage. On the Topic Details page, click the Dashboard tab.

    • In the left-side navigation pane of the Instance Details page, click Groups. On the page that appears, click the name of the group that you want to manage. On the Group Details page, click the Dashboard tab.

Dashboard FAQ

How do I obtain dashboard metric data?

  1. Log on to the ARMS console with your Alibaba Cloud account.

  2. In the navigation pane on the left, click Integration Center.

  3. On the Integration Center page, enter RocketMQ in the search box and click the search icon.

  4. In the search results, select the Alibaba Cloud service that you want to integrate, such as Alibaba Cloud RocketMQ (4.0) Service. For more information, see Step 1: Integrate monitoring data of an Alibaba Cloud service.

  5. After the integration is successful, click Provisioning in the navigation pane on the left.

  6. In the Cloud Service Area Environment list, click the name of the target environment to go to its details page.

  7. On the Component Management tab, in the Basic Information section, click the region of the Prometheus Instance.

  8. On the Settings tab, you can find different data access methods.

How do I integrate metric data provided by the dashboard of ApsaraMQ for RocketMQ into a self-managed Grafana system?

All metric data on the dashboard of ApsaraMQ for RocketMQ is stored in Alibaba Cloud Managed Service for Prometheus. Follow the procedure in the "How do I obtain dashboard metric data?" section to integrate the monitoring data of ApsaraMQ for RocketMQ into Managed Service for Prometheus and obtain the environment name and HTTP API URL. Then, use the HTTP API URL to integrate the metric data into your self-managed Grafana system. For more information, see Use an HTTP API URL to connect a Prometheus instance to a self-managed Grafana system.
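Before you configure Grafana, it can help to see what a query against the HTTP API URL looks like. The following Python sketch builds a PromQL selector from a metric name and labels listed in this topic, then appends the standard Prometheus instant-query path (/api/v1/query). The base URL, instance ID, and topic name are placeholders, the helper functions are hypothetical, and the sketch assumes that the HTTP API URL accepts the standard Prometheus HTTP API paths. It only builds the URL and sends no request, because authentication requirements depend on your Prometheus instance settings.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_selector(metric, labels):
    """Build a PromQL selector such as metric{a="1",b="2"}."""
    if not labels:
        return metric
    matchers = ",".join(f'{name}="{value}"' for name, value in sorted(labels.items()))
    return f"{metric}{{{matchers}}}"


def build_instant_query_url(http_api_url, promql):
    """Append the standard Prometheus instant-query path to the base URL."""
    base = http_api_url.rstrip("/")
    query_string = urlencode({"query": promql})
    return f"{base}/api/v1/query?{query_string}"


# Placeholder values: replace them with the HTTP API URL from the Settings
# tab and with your own instance ID and topic name.
HTTP_API_URL = "https://prometheus.example.invalid/placeholder"
promql = build_selector(
    "rocketmq_consumer_ready_messages",
    {"instance_id": "rmq-cn-example", "topic": "orders"},
)
url = build_instant_query_url(HTTP_API_URL, promql)
print(url)

# Decode the query parameter again to confirm that the selector survived encoding.
print(parse_qs(urlsplit(url).query)["query"][0])
```

In Grafana, you normally enter only the base HTTP API URL in the Prometheus data source settings, and Grafana appends the query paths itself.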

How do I understand the average TPS and max TPS of an instance?

  • Average TPS = Total requests in 1 minute / 60 seconds

  • Max TPS: Within a 1-minute statistical period, the TPS value is sampled once per second. The max TPS is the highest of these 60 sampled values.

For example:

Assume that an instance produces 60 messages in 1 minute. All messages are normal messages and each is 4 KB in size. The production rate of the instance is 60 messages per minute.

Average instance TPS = 60 calls / 60 seconds = 1 call per second

The max instance TPS is calculated as follows:

  • If the 60 messages are sent in the first second, the TPS values for each second of that minute are 60, 0, 0, ..., 0.

    Max instance TPS = 60 calls per second.

  • If 40 messages are sent in the first second and 20 messages are sent in the next second, the TPS values for each second of that minute are 40, 20, 0, 0, ..., 0.

    Max instance TPS = 40 calls per second.
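The two cases above can be reproduced with a short Python sketch. It is an illustration only: it takes a hypothetical list of 60 per-second request counts for one minute and applies the two definitions directly.

```python
def average_tps(per_second_counts):
    """Total requests in the period divided by the number of seconds."""
    return sum(per_second_counts) / len(per_second_counts)


def max_tps(per_second_counts):
    """Highest of the per-second samples taken within the period."""
    return max(per_second_counts)


# Case 1: all 60 messages are sent in the first second.
burst = [60] + [0] * 59
# Case 2: 40 messages in the first second, 20 in the next second.
split = [40, 20] + [0] * 58

print(average_tps(burst), max_tps(burst))  # 1.0 60
print(average_tps(split), max_tps(split))  # 1.0 40
```

Both cases have the same average TPS, but the max TPS differs because it reflects only the busiest second.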