ApsaraMQ for RocketMQ integrates with CloudMonitor to collect metrics, visualize performance data, and trigger real-time alerts at no additional cost. Use these capabilities to track instance health, detect consumer lag, and identify throttling.
Prerequisites
Before you begin, ensure that you have:
The service-linked role AliyunServiceRoleForOns with the AliyunServiceRolePolicyForOns policy attached. This role grants ApsaraMQ for RocketMQ access to CloudMonitor and Application Real-Time Monitoring Service (ARMS) for monitoring, alerting, and dashboard features. For details, see Service-linked roles.
Metrics reference
ApsaraMQ for RocketMQ reports metrics in three categories: instance, producer, and consumer. The consumer category also covers consumer lag, ready message, and dead-letter message metrics.
For metrics specific to message accumulation (consumer lag, consumer lag time, ready messages, and ready message queuing time), see Scenarios.
Instance metrics
| Metric | Metric name | Aggregation | Unit |
|---|---|---|---|
| TPS of an instance | InstanceApiCallTps | Sum | count/s |
| Storage capacity of an ApsaraMQ for RocketMQ 5.0 instance | InstanceStorageSize | Sum | byte |
| Internet outbound bandwidth of an ApsaraMQ for RocketMQ 5.0 instance | InstanceInternetFlowoutBandwidth | Max | byte/s |
| Peak TPS for sending messages on an ApsaraMQ for RocketMQ 5.0 instance | InstanceSendApiCallTps | Max | count/s |
| Peak TPS for receiving messages on an ApsaraMQ for RocketMQ 5.0 instance | InstanceReceiveApiCallTps | Max | count/s |
The following metrics apply to Internet-facing instances and support Average, Minimum, and Maximum aggregations:
| Metric | Metric name | Unit |
|---|---|---|
| Active connections per second | InstanceActiveConnection | count/s |
| Inbound bits per second | InstanceTrafficRX | bit/s |
| Outbound bits per second | InstanceTrafficTX | bit/s |
| Dropped outbound bits per second | InstanceDropTrafficTX | bit/s |
| Dropped inbound bits per second | InstanceDropTrafficRX | bit/s |
| Outbound bandwidth utilization | InstanceTrafficTXUtilization | % |
| Inbound bandwidth utilization | InstanceTrafficRXUtilization | % |
Producer metrics
| Metric | Metric name | Aggregation | Unit |
|---|---|---|---|
| Messages sent per instance per minute | SendMessageCountPerInstance | Sum | count/min |
| Messages sent per topic per minute | SendMessageCountPerTopic | Sum | count/min |
| Throttled send requests per instance per minute | ThrottledSendRequestsPerInstance | Sum | count/min |
| Throttled send requests per topic per minute | ThrottledSendRequestsPerTopic | Sum | count/min |
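As a rough illustration of how the producer metrics can be read together, the following sketch scans per-minute samples and reports the minutes in which send requests were throttled. The `MinuteSample` type, its field names, and the sample values are hypothetical; they only stand in for data points of `SendMessageCountPerInstance` and `ThrottledSendRequestsPerInstance` that you have already exported.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MinuteSample:
    """One per-minute data point (hypothetical structure, not an SDK type)."""
    minute: str      # label for the minute, e.g. "10:01"
    sent: int        # value of SendMessageCountPerInstance for that minute
    throttled: int   # value of ThrottledSendRequestsPerInstance for that minute


def throttled_minutes(samples: List[MinuteSample]) -> List[str]:
    """Return the labels of all minutes with at least one throttled send request."""
    return [s.minute for s in samples if s.throttled > 0]


def total_throttled(samples: List[MinuteSample]) -> int:
    """Sum throttled send requests over the whole window."""
    return sum(s.throttled for s in samples)


# Made-up sample data for illustration only.
samples = [
    MinuteSample("10:00", sent=1200, throttled=0),
    MinuteSample("10:01", sent=4800, throttled=35),
    MinuteSample("10:02", sent=5100, throttled=60),
    MinuteSample("10:03", sent=900, throttled=0),
]

print(throttled_minutes(samples))
print(total_throttled(samples))
```

Because both metrics use the Sum aggregation over one minute, a nonzero throttled count in a minute with a high send count suggests that the send rate reached the instance limit during that minute.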
Consumer metrics
Consumer metrics help you track message consumption rates, detect consumer lag, and identify throttling at the instance, consumer group, and topic levels.
| Metric | Metric name | Aggregation | Unit |
|---|---|---|---|
| Messages received per instance per minute | ReceiveMessageCountPerInstance | Sum | count/min |
| Messages received per consumer group per minute | ReceiveMessageCountPerGid | Sum | count/min |
| Messages received from a topic in a consumer group per minute | ReceiveMessageCountPerGidTopic | Sum | count/min |
| Throttled receive requests per instance per minute | ThrottledReceiveRequestsPerInstance | Sum | count/min |
| Throttled receive requests per consumer group per minute | ThrottledReceiveRequestsPerGid | Sum | count/min |
| Throttled receive requests from a topic in a consumer group per minute | ThrottledReceiveRequestsPerGidTopic | Sum | count/min |
Consumer lag and ready message metrics
Consumer lag metrics indicate how far behind consumers are in processing messages. Monitor these metrics to identify slow or stuck consumers and take corrective action such as scaling consumer instances or investigating processing bottlenecks.
| Metric | Metric name | Aggregation | Unit |
|---|---|---|---|
| Consumer lag in a consumer group | ConsumerLag | Sum | count |
| Consumer lag from a topic in a consumer group | ConsumerLagPerGidTopic | Sum | count |
| Consumer lag time in a consumer group | ConsumerLagLatencyPerGid | Max | ms |
| Consumer lag time from a topic in a consumer group | ConsumerLagLatencyPerGidTopic | Max | ms |
| Ready message queuing time in a consumer group | (GroupId)ReadyMessageQueueTime | Max | ms |
| Ready message queuing time from a topic in a consumer group | ReadyMessageQueueTimePerGidTopic | Max | ms |
| Ready messages in a consumer group | ReadyMessages | Sum | count |
| Ready messages from a topic in a consumer group | ReadyMessagesPerGidTopic | Sum | count |
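The sketch below shows one possible way to interpret a consumer lag count together with a consumer lag time. It converts the millisecond value of `ConsumerLagLatencyPerGid` to seconds and compares both values against limits. The function names and the default thresholds are assumptions chosen for illustration, not recommended values.

```python
from typing import List


def lag_seconds(latency_ms: float) -> float:
    """Convert a lag time reported in milliseconds to seconds."""
    return latency_ms / 1000.0


def lag_findings(
    lag_count: int,
    lag_latency_ms: float,
    max_count: int = 10_000,       # hypothetical limit for ConsumerLag
    max_seconds: float = 300.0,    # hypothetical limit for lag time
) -> List[str]:
    """Return a list of findings; an empty list means both values are within limits."""
    findings = []
    if lag_count > max_count:
        findings.append(f"lag count {lag_count} exceeds {max_count}")
    if lag_seconds(lag_latency_ms) > max_seconds:
        findings.append(
            f"lag time {lag_seconds(lag_latency_ms):.0f}s exceeds {max_seconds:.0f}s"
        )
    return findings


# Made-up values: 25,000 messages behind, oldest unprocessed message 12 minutes old.
print(lag_findings(lag_count=25_000, lag_latency_ms=720_000))
# A healthy group returns no findings.
print(lag_findings(lag_count=40, lag_latency_ms=1_500))
```

Checking both values is useful because a large count with a short lag time usually points to a traffic burst, whereas a long lag time even with a small count can indicate a stuck consumer.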
Dead-letter message metrics
| Metric | Metric name | Aggregation | Unit |
|---|---|---|---|
| New dead-letter messages in a consumer group per minute | SendDLQMessageCountPerGid | Sum | count/min |
| New dead-letter messages from a topic in a consumer group per minute | SendDLQMessageCountPerGidTopic | Sum | count/min |
View metrics
Log on to the ApsaraMQ for RocketMQ console. In the left-side navigation pane, click Instances.
In the top navigation bar, select a region, such as China (Hangzhou). On the Instances page, click the name of the instance that you want to manage.
In the left-side navigation pane, click Monitoring and Alerts.
On the Monitoring and Alerts page, select a resource type from the Group Name drop-down list, then select a query time range. The page automatically displays metric charts for the selected resource type.
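If you prefer to read the same metrics programmatically, CloudMonitor exposes a `DescribeMetricList` operation that takes a namespace, a metric name, a period, a time range, and dimensions. The following sketch only builds the request parameters and does not call any SDK. The `build_metric_query` helper is hypothetical, and the namespace value `acs_rocketmq`, the `instanceId` dimension key, and the instance ID are assumptions that you should confirm against the CloudMonitor metric reference before use.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Dict


def build_metric_query(
    namespace: str,
    metric_name: str,
    dimensions: Dict[str, str],
    end: datetime,
    minutes: int = 60,
    period_seconds: int = 60,
) -> Dict[str, str]:
    """Build a parameter dict for a DescribeMetricList-style request.

    Times are encoded as Unix timestamps in milliseconds, and the
    dimensions are serialized as a JSON array of objects.
    """
    start = end - timedelta(minutes=minutes)
    return {
        "Namespace": namespace,
        "MetricName": metric_name,
        "Period": str(period_seconds),
        "StartTime": str(int(start.timestamp() * 1000)),
        "EndTime": str(int(end.timestamp() * 1000)),
        "Dimensions": json.dumps([dimensions]),
    }


# Assumed values for illustration; verify the namespace and dimension keys.
query = build_metric_query(
    namespace="acs_rocketmq",                     # assumption
    metric_name="ConsumerLag",                    # metric name from the tables above
    dimensions={"instanceId": "rmq-cn-example"},  # hypothetical key and ID
    end=datetime(2024, 1, 1, 1, 0, tzinfo=timezone.utc),
)
print(json.dumps(query, indent=2))
```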
Create an alert rule
Log on to the ApsaraMQ for RocketMQ console. In the left-side navigation pane, click Instances. On the Instances page, click the name of the instance that you want to manage.
In the left-side navigation pane, click Monitoring and Alerts.
In the upper-right corner of the Monitoring and Alerts page, click Create Alert Rule. The Create Alert Rule panel opens in the CloudMonitor console.
Configure the alert rule and notification settings, then click OK. For parameter descriptions, see Create an alert rule.
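Alert rules typically combine a threshold with a number of consecutive periods that must breach it before a notification is sent. The following sketch illustrates that evaluation logic on a list of metric values so that you can reason about a rule before you configure it. The function name and the sample values are hypothetical, and the sketch is not the CloudMonitor implementation.

```python
from typing import Iterable


def breaches_consecutively(
    values: Iterable[float], threshold: float, periods: int
) -> bool:
    """Return True if `periods` consecutive values are >= `threshold`."""
    streak = 0
    for value in values:
        # Extend the streak on a breach; reset it on any value below the threshold.
        streak = streak + 1 if value >= threshold else 0
        if streak >= periods:
            return True
    return False


# Made-up ConsumerLag values, one per evaluation period.
lag_values = [5, 120, 130, 90, 150]

# Two consecutive periods at or above 100 occur (120, 130).
print(breaches_consecutively(lag_values, threshold=100, periods=2))
# Three consecutive periods at or above 100 never occur.
print(breaches_consecutively(lag_values, threshold=100, periods=3))
```

Requiring more consecutive periods reduces notifications from short spikes, at the cost of detecting a sustained problem later.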