This topic describes how to use CloudMonitor or the built-in monitoring service of Message Queue for Apache RocketMQ to collect monitoring data, display the data visually, and implement real-time monitoring and alerting. These services help you understand the consumption status and handle consumption exceptions promptly. You can use either service as needed. We recommend that you use CloudMonitor.

CloudMonitor (recommended)

The monitoring and alerting function of CloudMonitor reminds you to handle problems or upgrade specifications promptly. Based on the alert rules that you configure, the CloudMonitor backend determines whether the resource usage exceeds the configured threshold. If the resource usage exceeds the threshold, CloudMonitor notifies the corresponding contacts by SMS, email, TradeManager, or DingTalk Chatbot.

Notice
  • Currently, CloudMonitor is available only in the China (Hangzhou) region.
  • To receive SMS notifications, log on to the CloudMonitor console. On the Overview page, click Purchase SMS to purchase the SMS service.

Procedure

  1. Grant access to cloud resources.

    When you use the monitoring and alerting function for the first time, Message Queue for Apache RocketMQ requires your Alibaba Cloud account to grant access to your cloud resources, including CloudMonitor.

    Note After your Alibaba Cloud account grants the access, Message Queue for Apache RocketMQ can also access the cloud resources of Resource Access Management (RAM) users under your Alibaba Cloud account. For more information, see Grant permissions to RAM users.
    1. Log on to the Message Queue for Apache RocketMQ console.
    2. In the top navigation bar, select the region, such as China (Hangzhou).
    3. On the Instances page, find the target instance and click Details in the Actions column.
    4. In the left-side navigation pane, click Monitoring and Alerts (Recommended).
    5. On the Cloud Resource Access Authorization page, click Confirm Authorization Policy.
  2. View the monitoring report.
    1. In the top navigation bar, select the region China (Hangzhou).
    2. On the Instances page, find the target instance and click Details in the Actions column.
    3. In the left-side navigation pane, click Monitoring and Alerts (Recommended), and select the resource whose monitoring data you want to view.
      • To view the alert information of an instance, choose Instance > Monitoring Report.

        Message Retention Period

        This metric indicates the maximum retention period of messages in the current cluster. It is available only for Platinum Edition instances. To ensure the continuous availability of Message Queue for Apache RocketMQ, when the occupied disk space reaches the disk capacity specification of your Platinum Edition instance, Message Queue for Apache RocketMQ deletes messages in first-in-first-out (FIFO) order, starting with the messages that were stored earliest.

        You can use this metric to assess the capacity of your Platinum Edition instance cluster and take the results into account when you upgrade or downgrade the capacity of the instance.

        The horizontal axis represents the time point, and the vertical axis represents the message retention period. For example, if the time point on the horizontal axis is 21:00 and the corresponding value on the vertical axis is 10, the message retention period of the Platinum Edition instance is 10 hours at 21:00. To retain messages for a longer period, expand the capacity of the disk.

      • To view the alert information under a topic, choose Topic > Monitoring Report.
      • To view the alert information under a group ID, choose Group > Monitoring Report.

        Number of Accumulated Messages

        This metric indicates the number of accumulated messages under the group ID. For more information, see Terms.

      You can view the data of the last 1 hour, 3 hours, 6 hours, 12 hours, 1 day, 3 days, 7 days, or 14 days, or click the rightmost time picker to customize a time range.

      If you want to customize a time range, you can view data of up to the last 31 days. Data generated prior to the last 31 days is not retained. That is, if the end time in the time picker is the current system time, the earliest start time can be 31 days prior to the current date. If the end time is not the current system time, you can view data of up to 7 consecutive days within the last 31 days.

      Note The data aggregation cycle of the metric is 1 minute.
  3. Set an alert rule.
    1. Find the target resource and click Set Alert in the Actions column.
    2. On the Create Alarm Rule page, set the alert rule and notification method. For more information, see Alarm service.
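
The custom time range rule described in the View the monitoring report step can be sketched as a small check. This is a minimal sketch, not the console's actual implementation: the function name is_valid_range is hypothetical, and treating an end time within one minute of now as "the current system time" is an assumption based on the 1-minute aggregation cycle.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=31)           # data older than this is not retained
MAX_HISTORICAL_SPAN = timedelta(days=7)  # limit when the end time is not "now"


def is_valid_range(start, end, now, tolerance=timedelta(minutes=1)):
    """Return True if [start, end] satisfies the documented time range rule.

    `tolerance` is an assumption: an end time within this margin of `now`
    is treated as the current system time.
    """
    if start >= end or end > now:
        return False
    if start < now - RETENTION:
        # Nothing older than 31 days is available.
        return False
    if now - end <= tolerance:
        # End time is the current time: the full 31 days can be queried.
        return True
    # Historical window: at most 7 consecutive days.
    return end - start <= MAX_HISTORICAL_SPAN
```

For example, a range that ends now and starts 31 days ago passes the check, while a range that ends three days ago may span at most 7 days.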

Built-in monitoring service of Message Queue for Apache RocketMQ

The built-in monitoring service of Message Queue for Apache RocketMQ can monitor the consumption status of messages under a specific topic that a group ID has subscribed to and receive SMS alerts, helping you learn about message accumulation in real time. SMS alerts sent by the built-in monitoring service are free of charge.

Prerequisites

  • Make sure that the region where the instance is located supports the built-in monitoring and alerting function. Currently, the following regions support this function:
    • Internet
    • China (Hangzhou)
    • China (Shanghai)
    • China (Qingdao)
    • China (Beijing)
    • China (Zhangjiakou-Beijing Winter Olympic)
    • China (Hohhot)
    • China (Shenzhen)
    • China (Hong Kong)
    • China North 2 Ali Gov
    • Singapore
    • Japan (Tokyo)
  • Make sure that the group ID you want to monitor has subscribed to the corresponding topic. For more information, see the topics about how to subscribe to messages.

Procedure

Create a metric as follows:

  1. Log on to the Message Queue for Apache RocketMQ console. In the top navigation bar, select the region, such as China (Hangzhou).
  2. On the Instances page, find the target instance and click Details in the Actions column.
  3. In the left-side navigation pane, choose Monitoring and Alerts > New Monitoring Item.
  4. In the New Monitoring Item dialog box, set parameters as needed and click OK. The new metric appears on the Monitoring and Alerts page.

    The parameters for creating a metric are described as follows:

    • Group ID: indicates the group ID to be monitored.
    • Topic: indicates the topic that the group ID has subscribed to.
    • Accumulation Threshold: indicates the alert threshold for consumption accumulation. The valid threshold ranges from 1 to 100,000,000. If messages accumulate when the monitored group ID consumes messages under the specified topic, and the number of accumulated messages exceeds the threshold, Message Queue for Apache RocketMQ sends a notification to the alert contact by SMS.
    • Consumption Delay Threshold: indicates the alert threshold for consumption latency. Consumption latency is the interval between the latest time when the group ID consumes the messages of the specified topic and the time when the last message of the topic is sent to the group ID. The minimum threshold is 1 minute.
    • Alert Time: indicates the time range for alerts, which is accurate to the minute. The maximum range is 00:00-23:59. You receive SMS alerts only within the specified time range.
    • Alert Frequency: indicates the alert frequency, which can be Every 5 Minutes, Every 15 Minutes, or Every 30 Minutes.
    • Alert Recipient: indicates the alert contact, which can be either the nickname or phone number of the contact. The maximum length of the nickname is 100 characters.
    Note If you have canceled a group ID's subscription to a topic, delete the corresponding metrics.
  5. (Optional) On the Monitoring and Alerts page, you can edit, disable, or delete the metrics that you have created. You can also click Enable in the Actions column to enable a disabled metric again.
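
The parameter constraints listed in step 4 can be checked before you fill in the dialog box. The following is a minimal sketch, not an API of Message Queue for Apache RocketMQ: the MonitoringItem class, its field names, and the validate and should_alert functions are all hypothetical. It also assumes that exceeding either threshold triggers an alert and that the alert time range does not wrap past midnight.

```python
from dataclasses import dataclass
from datetime import time

ALLOWED_FREQUENCIES = (5, 15, 30)  # minutes: Every 5, 15, or 30 Minutes


@dataclass
class MonitoringItem:
    """Hypothetical container for the fields of the New Monitoring Item dialog box."""
    group_id: str
    topic: str
    accumulation_threshold: int   # number of accumulated messages
    delay_threshold_minutes: int  # consumption delay threshold
    alert_start: time             # start of the alert time range
    alert_end: time               # end of the alert time range
    frequency_minutes: int
    recipient: str                # nickname or phone number


def validate(item):
    """Return a list of constraint violations; an empty list means the item is valid."""
    errors = []
    if not 1 <= item.accumulation_threshold <= 100_000_000:
        errors.append("accumulation threshold must be between 1 and 100,000,000")
    if item.delay_threshold_minutes < 1:
        errors.append("consumption delay threshold must be at least 1 minute")
    if item.alert_start > item.alert_end:
        # Assumption: the range stays within one day (00:00-23:59).
        errors.append("alert time range must not end before it starts")
    if item.frequency_minutes not in ALLOWED_FREQUENCIES:
        errors.append("alert frequency must be 5, 15, or 30 minutes")
    if len(item.recipient) > 100:
        errors.append("recipient nickname must not exceed 100 characters")
    return errors


def should_alert(item, accumulated, delay_minutes, now):
    """Decide whether an SMS alert would be due at time-of-day `now`.

    Assumption: exceeding either threshold is enough to trigger an alert.
    """
    in_window = item.alert_start <= now <= item.alert_end
    exceeded = (accumulated > item.accumulation_threshold
                or delay_minutes > item.delay_threshold_minutes)
    return in_window and exceeded
```

For example, an item with an accumulation threshold of 1,000 and an alert time range of 09:00-18:00 would be due for an alert at 10:00 with 1,001 accumulated messages, but not at 20:00.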