Message Queue for Apache RocketMQ provides the dashboard feature based on Prometheus Service and Grafana Service of Application Real-Time Monitoring Service (ARMS). Prometheus Service collects and stores metrics, and Grafana Service displays them. The dashboard feature lets you monitor and collect metric data across multiple dimensions in one place, which helps you track the status of your business. This topic describes the billing, scenarios, and available metrics of the dashboard feature, and how to use the feature.
Prerequisites
- ARMS Prometheus Service is activated. For more information, see Activate and upgrade ARMS.
- The service-linked role is created.
- Role name: AliyunServiceRoleForOns.
- Role policy name: AliyunServiceRolePolicyForOns.
- Permission description: allows Message Queue for Apache RocketMQ to assume this role to access CloudMonitor and ARMS to implement the monitoring, alerting, and dashboard features.
- Reference: Service-linked roles.
Billing
Dashboard metrics that are used in Message Queue for Apache RocketMQ are basic metrics that are used in Alibaba Cloud ARMS Prometheus Service. You are not charged for basic metrics. Therefore, you are not charged for using the dashboard feature.
For more information, see Basic metrics and Pay-as-you-go.
Scenarios
Concepts
This section introduces concepts related to metrics that are used in message accumulation. These concepts can help you understand the dashboard metrics that are used in Message Queue for Apache RocketMQ.
- Inflight message: A message that is being processed on the consumer client and for which no success response is returned.
- Ready message: A message that is ready on the Message Queue for Apache RocketMQ broker and can be consumed by consumers.
The Ready messages metric reflects the number of messages that have not been processed by consumers.
- Ready time of a ready message
- For a normal message, the ready time equals the point in time when the normal message is stored.
- For a scheduled message, the ready time equals the point in time that is scheduled for the broker to deliver the message. For a delayed message, the ready time equals the point in time when the specified delay period elapses.
- For a transactional message, the ready time equals the point in time when the transaction is committed.
- Ready message queuing duration: the offset between the current point in time and the point in time when the earliest message is ready.
The Ready message queue time metric reflects the delay period for ready messages before they are processed. This metric is important for time-sensitive workloads.
For example, in the preceding figure, the ready time of the first ready message M1 is 12:00:00, and the ready time of the last ready message M2 is 12:00:30. If the current point in time is 12:00:50, the ready message queuing duration can be calculated based on the following formula: Current point in time (12:00:50) - Ready time of M1 (12:00:00) = 50 seconds.
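The calculation in the example can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how the broker computes the metric. The function name `ready_queue_time` and the calendar date are assumptions chosen for the example; only the three times of day come from the text above.

```python
from datetime import datetime, timedelta


def ready_queue_time(ready_times, now):
    """Return how long the earliest ready message has been waiting.

    ready_times: ready times of all messages that are currently ready.
    now: the current point in time.
    """
    if not ready_times:
        # No ready messages, so nothing is queuing.
        return timedelta(0)
    return now - min(ready_times)


# The date is arbitrary; only the times of day match the example.
m1 = datetime(2024, 1, 1, 12, 0, 0)    # ready time of M1
m2 = datetime(2024, 1, 1, 12, 0, 30)   # ready time of M2
now = datetime(2024, 1, 1, 12, 0, 50)  # current point in time

queue_time = ready_queue_time([m1, m2], now)
print(queue_time.total_seconds())  # 50.0
```

Only the earliest ready message (M1) determines the result. The ready time of M2 does not affect the value.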
Metrics
- Producer: displays metrics that collect message production statistics for a specific topic or all topics, such as the number of sent messages, the success rate of message sending, and the sending duration.
- Consumer: displays metrics that collect message consumption statistics for a specific group or all groups, such as the number of consumed messages from a specific topic, the success rate of message consumption, and message accumulation.
- Instance top 20 info: displays the top 20 values of some metrics for a specified instance and the topic or group to which each value corresponds.
- Billing metrics overview: displays metrics that collect billing statistics for a specified instance, such as the message sending TPS, the message consumption TPS, the number of API calls, and the average message size. These metrics can be used to estimate the billable items of the instance.
Producer
Metric | Description |
---|---|
Send message rate | The rate at which messages are sent to a specified topic or all topics and the rate at which API operations are called to send messages. Unit: |
Max send message rate | The maximum rate at which messages are sent. Unit: messages per second. |
Total sent messages | The total number of messages produced in a specified instance. Unit: messages. |
Send API call success rate | The percentage of successful API calls that are made to send messages to a specified topic or all topics. |
Send RT | The amount of time that is used to send a message to a topic. Unit: milliseconds. |
Consumer
Metric | Description |
---|---|
Avg consumption success rate | The percentage of all messages that are successfully consumed in a specified instance. |
Consumer lag | The total number of accumulated messages in a specified instance, including ready messages and inflight messages. Unit: messages. |
Inflight messages | The number of messages that are being processed on the consumer clients and for which no success response is returned. Unit: messages. |
Ready messages | The number of messages that are ready on the Message Queue for Apache RocketMQ broker and can be consumed by consumers. This metric reflects the number of messages that have not been processed by consumers. Unit: messages. |
Ready message queue time | The offset between the current point in time and the point in time when the earliest message is ready. This metric reflects the delay period for ready messages before they are processed. This metric is important for time-sensitive workloads. The metric value in the overview information indicates the average queuing duration of ready messages in a specified instance. The metric value in a specific chart indicates the queuing duration of ready messages in a specified topic to which a specified group subscribes. Unit: milliseconds. |
Receive message rate | The rate at which a specified group or all groups consume messages. Unit: messages per second. |
Max receive message rate | The maximum rate at which a specified group or all groups consume messages. Unit: messages per second. |
Total received messages | The total number of consumed messages in a specified instance. Unit: messages. |
Consumer lag | The number of accumulated messages for a specified group or all groups, including ready messages and inflight messages. Unit: messages. |
Message processing time | The amount of time that is used to consume a message in a specified group or all groups. Unit: milliseconds. |
Wait to process time | The amount of time before a message starts to be consumed by a group after the consumer client receives the message. Unit: milliseconds. |
Consumption success rate | The percentage of messages that are successfully consumed. |
Consumption messages each protocol | The proportion of consumed messages of each client protocol. |
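
The relationship between Consumer lag, Ready messages, and Inflight messages can be sketched as follows. The group names and counts are hypothetical, and summing the per-group values to obtain an instance-level total is an assumption made for illustration. The table states only that consumer lag includes ready messages and inflight messages.

```python
# Hypothetical per-group counts; the group names and numbers are made up.
groups = {
    "GID_order": {"ready": 120, "inflight": 30},
    "GID_payment": {"ready": 0, "inflight": 5},
}


def consumer_lag(counts):
    """Consumer lag = ready messages + inflight messages."""
    return counts["ready"] + counts["inflight"]


lag_per_group = {name: consumer_lag(c) for name, c in groups.items()}

# Assumption: the instance-level lag is the sum over all groups.
instance_lag = sum(lag_per_group.values())

print(lag_per_group)  # {'GID_order': 150, 'GID_payment': 5}
print(instance_lag)   # 155
```

A group such as the hypothetical `GID_payment` can have a non-zero lag even when no messages are ready, because messages that are still being processed count toward the lag.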
Instance top 20 info
Metric | Description |
---|---|
Send message rate per Topic | The top 20 topics with the highest message sending rate. Unit: messages per second. |
Receive message rate per GroupID | The top 20 groups with the highest message consumption rate. Unit: messages per second. |
Ready messages per GroupID | The top 20 groups with the largest number of ready messages. Unit: messages. |
Ready message queue time per GroupID | The top 20 groups with the longest queuing duration of ready messages. Unit: milliseconds. |
Consumer lag per GroupID | The top 20 groups with the largest number of accumulated messages. Unit: messages. |
Inflight messages per GroupID | The top 20 groups with the largest number of messages that are being processed. Unit: messages. |
Message processing time per GroupID | The top 20 groups with the longest message consumption duration. Unit: milliseconds. |
Message wait time per GroupID | The top 20 groups with the longest wait duration before a message is consumed. Unit: milliseconds. |
Send API call failure rate per Topic | The top 20 topics with the highest failure rate for API calls that are made to send messages. |
Consumption failure rate per GroupID | The top 20 groups with the highest failure rate for message consumption. |
Billing metrics overview
- Large message multiple: The message body in each API request has a size limit of 4 KB. If you need to send a message that is larger than 4 KB, you must use multiple API requests to send the message. For example, if you need to send a message of 16 KB, the number of API calls is calculated by using the following formula: Message size (16 KB)/4 KB = 4 calls.
- Featured message multiple: The number of API calls to send and subscribe to featured messages is counted as five times the number for normal messages. Featured messages include ordered messages, scheduled messages, delayed messages, and transactional messages.
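
The two multiples can be combined in a rough estimate of billed API calls. The sketch below is an approximation under stated assumptions, not the billing implementation: it assumes that message sizes that are not exact multiples of 4 KB are rounded up, and that the large message multiple and the featured message multiple are multiplied together. The function and constant names are hypothetical.

```python
import math

API_BODY_LIMIT_KB = 4   # size limit of the message body per API request
FEATURED_MULTIPLE = 5   # multiple applied to featured messages


def billed_api_calls(message_size_kb, featured=False):
    """Estimate the number of API calls counted for one message."""
    # Assumption: partial 4 KB chunks are rounded up to a full call.
    large_multiple = max(1, math.ceil(message_size_kb / API_BODY_LIMIT_KB))
    # Assumption: the two multiples are multiplied together.
    featured_multiple = FEATURED_MULTIPLE if featured else 1
    return large_multiple * featured_multiple


print(billed_api_calls(16))                 # 4
print(billed_api_calls(16, featured=True))  # 20
print(billed_api_calls(1))                  # 1
```

The first call reproduces the 16 KB example above: 16 KB / 4 KB = 4 calls.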
Metric | Description |
---|---|
Max send TPS | The maximum TPS for sending messages to a specified topic or all topics. This metric can be used to estimate the maximum TPS specification in the billable items for an instance. Unit: TPS. |
Max receive TPS | The maximum TPS for message consumption of a specified group or all groups. This metric can be used to estimate the maximum TPS specification in the billable items for an instance. Unit: TPS. |
Max TPS | The maximum sum of the message sending TPS and the message consumption TPS. This metric can be used to estimate the maximum TPS specification in the billable items for an instance. Unit: TPS. |
Total API calls | The total number of API calls. This metric can be used to estimate the number of API calls in the billable items for an instance. Unit: calls. |
Average message size | The average size of all messages that are produced. Unit: bytes. |
Send and receive TPS | The sum of the message sending TPS and the message consumption TPS. Unit: TPS. |
Total API calls per day | The sum of the number of API calls that are made to send messages and the number of API calls that are made to subscribe to messages on a daily basis. Unit: calls. |
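
Max TPS is described as the maximum of the summed sending and consumption TPS, which is generally not the same as adding Max send TPS and Max receive TPS. The following sketch illustrates the difference. The sample values are made up, and the sketch assumes that the two series are sampled at the same points in time.

```python
# Hypothetical TPS samples taken at the same three points in time.
send_tps = [100, 400, 250]
receive_tps = [300, 150, 500]

# Send and receive TPS: the sum at each point in time.
send_and_receive_tps = [s + r for s, r in zip(send_tps, receive_tps)]

max_send_tps = max(send_tps)          # Max send TPS
max_receive_tps = max(receive_tps)    # Max receive TPS
max_tps = max(send_and_receive_tps)   # Max TPS: maximum of the sums

print(send_and_receive_tps)            # [400, 550, 750]
print(max_tps)                         # 750
print(max_send_tps + max_receive_tps)  # 900
```

In this sample, the peaks of sending and consumption occur at different times, so Max TPS (750) is lower than the sum of the two individual maximums (900).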
View the dashboard of an instance