Manage alert policies

Last Updated: Jun 30, 2017

The alert service sends alerts based on monitoring data, so that you receive up-to-date metric data and can troubleshoot abnormalities in your cloud products promptly.

Parameter description

  • Product: Host monitoring, RDS, OSS, and so on.

  • Resource range: Indicates the range in which an alert policy takes effect. Three range types are provided: All resources, Group, and Instance.

    • All resources: The alert policy takes effect for all instances of a product under a username. For example, if you set an alert policy for MongoDB CPU usage greater than 80% and select the All resources range, the alert is triggered when the CPU usage of any MongoDB instance under your username is greater than 80%.

    • Group: The alert policy takes effect for all instances in a group. For example, if you set an alert policy for host CPU usage greater than 80% and select the group range, the alert is triggered when the CPU usage of a host in the specified group is greater than 80%.

    • Instance: The alert policy takes effect only for a specific instance. For example, if you set an alert policy for host CPU usage greater than 80% and select the instance range, the alert is triggered when the CPU usage of the specified instance is greater than 80%.

  • Policy Name: Name of an alert policy.

  • Policy Description: Subject of an alert policy, which describes the conditions that metric data must meet to trigger the alert.

    For example, if you set the policy description to “1-minute average CPU usage >=90%”, the alert service checks every minute whether the average value of the metric data collected in that 1-minute period is greater than or equal to 90%.

    In host monitoring, a single server metric reports one data point every 15 seconds, so 20 data points are reported in 5 minutes.

    • “5-minute average CPU usage > 90%” indicates that the average value of the 20 data points about CPU usage in 5 minutes is greater than 90%.

    • “5-minute CPU usage always > 90%” indicates that the values of the 20 data points about CPU usage in 5 minutes are all greater than 90%.

    • “5-minute CPU usage once > 90%” indicates that the value of at least one of the 20 data points about CPU usage in 5 minutes is greater than 90%.

    • “Total 5-minute Internet outbound traffic > 50 MB” indicates that the sum of the values of the 20 data points about Internet outbound traffic in 5 minutes is greater than 50 MB.

  • Alert after the threshold value is exceeded multiple times consecutively: An alert notification is sent only after the metric data has been detected to meet the alert policy condition the specified number of times in a row.

  • Effective Time: Time when an alert policy takes effect. The alert service checks metric data and determines whether to generate an alert only when the alert policy is effective.

  • Notification Object: A group of contacts who receive alert notifications.

  • Notification Method: Method by which alert notifications are sent. Two methods are available: email + TradeManager, and mobile phone + email + TradeManager.

  • Email Remarks: Supplementary information customized for an alert email. The remarks are sent together with the alert notification email.
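
The Policy Description conditions and the consecutive-times setting described above can be illustrated with a short Python sketch. This is only an illustration of the logic as described on this page, not CloudMonitor code or its API: the function names condition_met and should_alert, the method labels, and the sample values are all hypothetical. It assumes one window holds the 20 data points reported in 5 minutes and that every comparison is "greater than".

```python
from statistics import mean

# Hypothetical helper names for illustration; not part of any CloudMonitor API.


def condition_met(points, method, threshold):
    """Check one statistical window of data points against a threshold (>)."""
    if method == "average":   # "5-minute average CPU usage > 90%"
        return mean(points) > threshold
    if method == "always":    # "5-minute CPU usage always > 90%"
        return all(p > threshold for p in points)
    if method == "once":      # "5-minute CPU usage once > 90%"
        return any(p > threshold for p in points)
    if method == "total":     # "Total 5-minute Internet outbound traffic > 50 MB"
        return sum(points) > threshold
    raise ValueError(f"unknown method: {method}")


def should_alert(windows, method, threshold, consecutive):
    """Return True once `consecutive` windows in a row meet the condition."""
    streak = 0
    for points in windows:
        if condition_met(points, method, threshold):
            streak += 1
        else:
            streak = 0  # a window that does not meet the condition resets the count
        if streak >= consecutive:
            return True
    return False


# One 5-minute window: 20 data points, one every 15 seconds (sample values).
cpu = [85.0] * 19 + [95.0]
traffic_mb = [3.0] * 20

print(condition_met(cpu, "average", 90))        # average is 85.5
print(condition_met(cpu, "always", 90))
print(condition_met(cpu, "once", 90))
print(condition_met(traffic_mb, "total", 50))   # sum is 60.0
print(should_alert([cpu, cpu, cpu], "once", 90, consecutive=3))
```

With these sample values, the single 95% data point satisfies the "once" condition but not the "average" or "always" conditions, which shows why the same data can trigger one policy and not another.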

Alert policy management

CloudMonitor provides three portals for managing alert policies: the Group page, the monitoring list page of each metric, and the Alert Policy List page of the alert service.
