
CloudOps Orchestration Service: Create an alert O&M task

Last Updated: Dec 27, 2023

CloudOps Orchestration Service (OOS) supports O&M tasks that are triggered by threshold alerts on the metrics of cloud services. An alert O&M task executes the specified template when the specified metric of a monitored cloud service reaches the threshold. The task keeps running and listens for the specified alert until you cancel it. For example, you can configure an alert O&M task to automatically clear the log directory when disk usage exceeds 80%.

For more information about the supported metrics, see Major metrics of Alibaba Cloud services.

To create an alert O&M task, perform the following steps:

  1. Configure an alert rule.

  2. Select the template to be executed.

  3. Configure the parameters for executing the template.

Configure an alert rule

| Parameter | Required | Description |
| --- | --- | --- |
| Product type | Yes | The service to be monitored. Select a service from the drop-down list. |
| Rule description | Yes | The rule for triggering the alert based on the threshold. |
| Trigger silence cycle | No | The period during which the alert is triggered only once even if the metric value consecutively exceeds the threshold several times. Default value: 24 hours. |
| Effective From | No | The time period during which the alert rule is effective. By default, the alert rule takes effect all day. |
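The effect of the Trigger silence cycle parameter can be pictured with a minimal sketch. The `SilenceWindow` class below is a hypothetical helper, not part of OOS or CloudMonitor, and it assumes that the silence period starts when an alert fires and that an alert may fire again exactly when the period has elapsed.

```python
from datetime import datetime, timedelta


class SilenceWindow:
    """Hypothetical helper: lets one alert through per silence period."""

    def __init__(self, period: timedelta = timedelta(hours=24)):
        self.period = period
        self.last_fired = None  # time of the most recent alert that was let through

    def should_fire(self, now: datetime) -> bool:
        # Fire if nothing has fired yet, or if the silence period has elapsed
        # (assumption: the boundary itself counts as elapsed).
        if self.last_fired is None or now - self.last_fired >= self.period:
            self.last_fired = now
            return True
        return False


window = SilenceWindow()
start = datetime(2023, 12, 27, 8, 0)
# The metric is assumed to exceed the threshold at each of these check times.
checks = [
    start,
    start + timedelta(hours=1),
    start + timedelta(hours=23),
    start + timedelta(hours=24),
]
fired = [window.should_fire(t) for t in checks]
print(fired)  # [True, False, False, True]
```

With the default 24-hour period, only the first breach and the breach 24 hours later would trigger the task in this model.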

A threshold-triggered alert rule contains the following information:

  • Metric name

  • Aggregation period of monitoring data

  • Number of aggregation periods

  • Statistics collection method

  • Comparison operator

  • Threshold
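How the six items above combine into one decision can be sketched as follows. This is an illustration under stated assumptions, not the service's implementation: the `ThresholdRule` class and its field names are hypothetical, samples are assumed to be already grouped by aggregation period, and the rule is assumed to trigger only when the condition holds for the required number of consecutive periods.

```python
import operator
from dataclasses import dataclass
from statistics import mean

# Hypothetical lookup tables; these are not OOS or CloudMonitor identifiers.
OPERATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}
STATISTICS = {"Average": mean, "Maximum": max, "Minimum": min}


@dataclass
class ThresholdRule:
    metric_name: str     # metric name
    period_seconds: int  # aggregation period of monitoring data
    periods: int         # number of aggregation periods
    statistic: str       # statistics collection method
    comparison: str      # comparison operator
    threshold: float     # threshold

    def is_triggered(self, samples_per_period: list[list[float]]) -> bool:
        """Return True if each of the latest `periods` periods breaches the threshold."""
        if len(samples_per_period) < self.periods:
            return False  # not enough history to decide
        aggregate = STATISTICS[self.statistic]
        compare = OPERATORS[self.comparison]
        recent = samples_per_period[-self.periods:]
        return all(compare(aggregate(samples), self.threshold) for samples in recent)


rule = ThresholdRule(
    metric_name="Host.cpu.total",
    period_seconds=60,
    periods=3,
    statistic="Average",
    comparison=">",
    threshold=50.0,
)

# Each inner list holds the raw samples collected in one aggregation period.
breaching = [[50.2, 50.6], [50.1, 50.9], [51.0, 50.4]]
recovered = [[50.2, 50.6], [50.1, 50.9], [49.0, 49.5]]
print(rule.is_triggered(breaching))  # True
print(rule.is_triggered(recovered))  # False
```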


Select the template to be executed

Select the template to be executed when the alert is generated.

Configure the parameters for executing the template

You can set Template Parameters to Extract Value from Message Body or Fixed Value. If you select Fixed Value, the template is executed with the parameter values that you specify. If you select Extract Value from Message Body, you can use JSONPath-style expressions to extract values from the alert message body.

To extract a value from the alert message body, use an expression in the $.Parameter name format. For example, the following alert message is generated for the Host.cpu.total metric of an Elastic Compute Service (ECS) instance:

{
    "Average": 50.15,
    "Maximum": 50.75,
    "Minimum": 49.75,
    "curLevel": "INFO",
    "instanceId": "i-bp1gn7od******qh5r12",
    "ruleName": "alarmtrigger-130920******0047-exec-de81413d******71b537",
    "timestamp": 1575970560000,
    "userId": "130920******0047"
}

To obtain the ID of the instance for which the alert is triggered, use the following expression: $.instanceId.
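To make the lookup concrete, the following sketch resolves a `$.name` expression against the sample message above. The `extract` function is a hypothetical illustration that handles only top-level fields. It does not describe how OOS evaluates expressions internally.

```python
import json

# The sample alert message body from this topic.
ALERT_MESSAGE = """
{
    "Average": 50.15,
    "Maximum": 50.75,
    "Minimum": 49.75,
    "curLevel": "INFO",
    "instanceId": "i-bp1gn7od******qh5r12",
    "ruleName": "alarmtrigger-130920******0047-exec-de81413d******71b537",
    "timestamp": 1575970560000,
    "userId": "130920******0047"
}
"""


def extract(expression: str, message: dict):
    """Resolve a top-level "$.name" expression (hypothetical helper)."""
    if not expression.startswith("$."):
        raise ValueError(f"expected an expression in the $.name format: {expression!r}")
    field = expression[2:]
    if field not in message:
        raise KeyError(f"the message body has no field named {field!r}")
    return message[field]


message = json.loads(ALERT_MESSAGE)
instance_id = extract("$.instanceId", message)
print(instance_id)  # i-bp1gn7od******qh5r12
```

A template parameter bound to `$.instanceId` would therefore receive the ID of the instance that triggered the alert.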

The following table describes the parameters that can be extracted from alert message bodies.

| Expression | Description | Example |
| --- | --- | --- |
| $.timestamp | The timestamp when the alert was triggered. Unit: milliseconds. | 1575970560000 |
| $.curLevel | The level of the alert. OK indicates that the alert has been cleared. | INFO |
| $.userId | The ID of the Alibaba Cloud account. | 130920******0047 |
| $.dimensionFieldName | The dimension of the metric. Replace dimensionFieldName in the expression with the parameter name of the metric dimension. For example, the CPU utilization of ECS instances is monitored based on the instance ID, so you can use the $.instanceId expression to extract the instance ID from alert message bodies. For more information, see the dimensions for metrics in Major metrics of Alibaba Cloud services. | N/A |
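Extracted values usually need light interpretation before a template can act on them. The sketch below converts the millisecond value of `$.timestamp` to a UTC time and checks whether `$.curLevel` reports a cleared alert. The function names are hypothetical, and the timestamp is assumed to count milliseconds since the UNIX epoch.

```python
from datetime import datetime, timezone


def alert_time(timestamp_ms: int) -> datetime:
    """Convert a millisecond timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


def is_cleared(cur_level: str) -> bool:
    """Return True if the level is OK, which indicates a cleared alert."""
    return cur_level == "OK"


triggered_at = alert_time(1575970560000)
print(triggered_at.isoformat())  # 2019-12-10T09:36:00+00:00
print(is_cleared("INFO"))        # False
```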

The following figure shows an example of extracting values from an alert message body.

[Figure: extracting values from an alert message body]

You can also set fixed parameter values for executing the template. The method is similar to that for regular templates.