Manage alarm rules

Last Updated: Feb 07, 2018

The alarm service monitors metric data and sends alarm notifications, so you can obtain up-to-date metric data and troubleshoot cloud product abnormalities in a timely manner.

Parameter description

  • Product: host monitoring, ApsaraDB for RDS, OSS, and so on.
  • Resource range: indicates the range in which an alarm rule takes effect. Three range types are provided: all resources, application group, and instance.
    • “All resources”: The alarm rule takes effect for all instances of a product under your account. For example, if you set an alarm rule for MongoDB CPU usage greater than 80% and select the all resources range, the alarm rule is triggered when the CPU usage of any MongoDB instance under your account is greater than 80%.
    • “Application group”: The alarm rule takes effect for all instances in an application group. For example, if you set an alarm rule for host CPU usage greater than 80% and select the application group range, the alarm rule is triggered when the CPU usage of any host in the specified application group is greater than 80%.
    • “Instance”: The alarm rule takes effect only for a specific instance. For example, if you set an alarm rule for host CPU usage greater than 80% and select the instance range, the alarm rule is triggered when the CPU usage of the specified instance is greater than 80%.
  • Rule name: name of an alarm rule.
  • Rule description: subject of an alarm rule, which describes the conditions that metric data must meet to trigger the alarm rule. For example, if you set the rule description to “1-minute average CPU usage >=90%”, the alarm service checks every minute whether the average value of the metric data collected during the past 1 minute is greater than or equal to 90%.

    Alarm rule example: In host monitoring, a single server metric reports one data point every 15 seconds, so 20 data points are reported in 5 minutes.

    1. 5-minute average CPU usage > 90% indicates that the average value of the 20 CPU usage data points in 5 minutes is greater than 90%.
    2. 5-minute CPU usage always > 90% indicates that the values of the 20 CPU usage data points in 5 minutes are all greater than 90%.
    3. 5-minute CPU usage once > 90% indicates that the value of at least one of the 20 CPU usage data points in 5 minutes is greater than 90%.
    4. Total 5-minute Internet outbound traffic > 50 MB indicates that the sum of the values of the 20 Internet outbound traffic data points in 5 minutes is greater than 50 MB.
  • Alarm after the threshold value is exceeded multiple times consecutively: An alarm notification is sent only after the metric data has met the alarm rule in the specified number of consecutive checks.
  • Effective time: time when an alarm rule takes effect. The alarm service checks metric data and determines whether to generate an alarm only when the alarm rule is effective.
  • Notification object: a group of contacts who receive alarm notifications.
  • Notification method: method by which alarm notifications are sent. Two methods are available: email+TradeManager and mobile phone+email+TradeManager.
  • Email remarks: supplementary information customized for an alarm email. The remarks are sent together with the alarm notification email.
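
The four rule description types (average, always, once, total) and the consecutive-count setting can be sketched in a few lines of Python. This is only an illustration of the semantics described above, not Cloud Monitor code: the names rule_is_met, should_alarm, and AGGREGATIONS are hypothetical, and the sketch assumes each check receives the full list of data points for one window.

```python
from statistics import mean

# Hypothetical helpers that mirror the rule semantics described above.
# They are not part of any Cloud Monitor API.
AGGREGATIONS = {
    "average": lambda points, threshold: mean(points) > threshold,
    "always": lambda points, threshold: all(p > threshold for p in points),
    "once": lambda points, threshold: any(p > threshold for p in points),
    "total": lambda points, threshold: sum(points) > threshold,
}


def rule_is_met(points, method, threshold):
    """Return True if one window of data points satisfies the rule."""
    return AGGREGATIONS[method](points, threshold)


def should_alarm(windows, method, threshold, consecutive):
    """Return True once the rule is met in `consecutive` windows in a row."""
    streak = 0
    for points in windows:
        # A window that does not meet the rule resets the streak.
        streak = streak + 1 if rule_is_met(points, method, threshold) else 0
        if streak >= consecutive:
            return True
    return False


# 20 CPU usage samples in percent: one every 15 seconds for 5 minutes.
window = [95] * 19 + [85]
print(rule_is_met(window, "average", 90))  # average is 94.5
print(rule_is_met(window, "always", 90))   # one sample is only 85
print(rule_is_met(window, "once", 90))
```

With these sample values, the average and once rules are met, while the always rule is not because a single data point falls below the threshold. Calling should_alarm with a consecutive count of 3 reports an alarm only when three windows in a row meet the rule.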

Note

  • Each account can create up to 7,000 alarm rules.

Alarm rule management

Cloud Monitor provides three alarm rule management portals: the application group page, the monitoring list page for each type of metric item, and the alarm rule list page of the alarm service.

  • Manage alarm rules on the application group page.
  • Manage alarm rules in host monitoring.
  • Manage alarm rules in cloud service monitoring.
  • Use alarm rules in site monitoring.
  • Manage alarm rules in custom monitoring.