Operation Orchestration Service (OOS) supports O&M tasks triggered by threshold alerts in CloudMonitor. In an alert-triggered O&M task, you can specify a template to be executed when the specified metric of the monitored cloud service resources reaches the threshold. An alert-triggered O&M task keeps running to listen to the specified alert until you cancel the task. For example, you can configure an alert-triggered O&M task to automatically clear the log directory when the disk usage exceeds 80%.
For more information about the supported metrics, see Major metrics of Alibaba Cloud services.
To create an alert-triggered O&M task, follow these steps:
Set a trigger rule
Set a trigger rule to monitor the alert that you are interested in. The following table describes the parameters of a trigger rule.
|Parameter|Required|Description|
|---|---|---|
|Product|Yes|The service to monitor. Select a service from the drop-down list.|
|Rule Description|Yes|The rule for triggering the threshold alert.|
|Trigger Mute Period|No|The period during which only one alert is sent even if the metric value consecutively exceeds the alert rule threshold several times. Default value: 24 hours.|
|Effective From|No|The time period during which the alert rule is effective. By default, the alert rule takes effect all day.|
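The effect of Trigger Mute Period can be illustrated with a small simulation. This is a minimal sketch, not CloudMonitor code: the function name `alerts_sent` is hypothetical, and it assumes that the mute period is counted from the last alert that was actually sent, which this document does not specify.

```python
from datetime import datetime, timedelta


def alerts_sent(breach_times, mute_period=timedelta(hours=24)):
    """Return the breach times that would produce an alert.

    Assumption: a breach is suppressed if it occurs less than
    `mute_period` after the last alert that was sent.
    """
    sent = []
    last_sent = None
    for moment in sorted(breach_times):
        if last_sent is None or moment - last_sent >= mute_period:
            sent.append(moment)
            last_sent = moment
    return sent


# Threshold breaches at hours 0, 1, 5, 23, 25, and 30 after the start time.
start = datetime(2024, 1, 1, 0, 0)
breaches = [start + timedelta(hours=h) for h in (0, 1, 5, 23, 25, 30)]

for moment in alerts_sent(breaches):
    print(moment.isoformat())
```

With the default 24-hour mute period, only the breaches at hour 0 and hour 25 would send an alert in this model; the others fall inside a mute period.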
The rule description contains the following information:
- Metric name
- Aggregation period of monitoring data
- Number of aggregation periods
- Statistical method
- Comparison operator
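To see how these pieces could fit together, the following sketch models a rule description and checks it against sample data. It is an illustration only: the class `RuleDescription`, its field names, the metric name `disk_usage_percent`, and the explicit `threshold` field are assumptions made for the example, not the CloudMonitor data model or API.

```python
import operator
from dataclasses import dataclass
from statistics import mean

# Assumed names for statistical methods and comparison operators.
STATISTICS = {"Average": mean, "Maximum": max, "Minimum": min}
OPERATORS = {">": operator.gt, ">=": operator.ge,
             "<": operator.lt, "<=": operator.le}


@dataclass
class RuleDescription:
    metric_name: str          # metric name
    period_seconds: int       # aggregation period of monitoring data
    consecutive_periods: int  # number of aggregation periods
    statistic: str            # statistical method
    comparison: str           # comparison operator
    threshold: float          # assumed: the value the statistic is compared to

    def is_triggered(self, periods):
        """periods: one list of raw samples per aggregation period, oldest first."""
        if len(periods) < self.consecutive_periods:
            return False
        aggregate = STATISTICS[self.statistic]
        compare = OPERATORS[self.comparison]
        recent = periods[-self.consecutive_periods:]
        # Every one of the most recent periods must breach the threshold.
        return all(compare(aggregate(samples), self.threshold)
                   for samples in recent)


rule = RuleDescription("disk_usage_percent", 60, 3, "Average", ">", 80.0)
breaching = [[70, 75], [82, 84], [85, 90], [88, 91]]
recovering = [[82, 84], [70, 75], [85, 90]]
print(rule.is_triggered(breaching))
print(rule.is_triggered(recovering))
```

In this model, the rule fires for `breaching` because the averages of the last three periods all exceed 80, and it does not fire for `recovering` because the middle period averages 72.5.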
Select a template to be executed
Set parameters for executing the template
In this step, select Extract Value from Message Body or Fixed Value. If you select Fixed Value, the template is executed based on the parameter values that you set. If you select Extract Value from Message Body, you can use jQuery expressions to extract values from alert message bodies for executing the template.
To extract values from alert message bodies, use jQuery expressions in the $.Parameter name format. For example, an alert message of the Host.cpu.total metric of Elastic Compute Service (ECS) is as follows:
To obtain the ID of the instance that triggers the alert, use the following expression:
The following table describes the parameters that can be extracted from all alert message bodies.
|Expression|Description|Example|
|---|---|---|
|$.timestamp|The timestamp when the alert was triggered. Unit: milliseconds.|1575970560000|
|$.curLevel|The level of the alert. OK: indicates that the alert has been cleared.|INFO|
|$.userId|The ID of the Alibaba Cloud account.|130920**0047|
|$.dimensionFieldName|The dimension of the metric. Replace dimensionFieldName in the expression with the parameter name of the metric dimension. For example, the CPU usages of ECS instances are monitored based on the instance ID. You can use the| |
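The extraction described above can be approximated locally to check an expression before you use it in a task. The following is a minimal sketch that assumes the message body is a JSON object and that a `$.name` expression selects a key by dotted path. The helper `extract_value` is hypothetical and is not part of OOS, and the sample body is invented from the fields in the table, with `instanceId` as an assumed dimension field name.

```python
import json


def extract_value(message_body: str, expression: str):
    """Resolve a '$.name' or '$.outer.inner' expression against a JSON body."""
    if not expression.startswith("$."):
        raise ValueError(f"expression must start with '$.': {expression!r}")
    value = json.loads(message_body)
    # Walk the dotted path one key at a time; a missing key raises KeyError.
    for key in expression[2:].split("."):
        value = value[key]
    return value


# Hypothetical message body; "instanceId" is an assumed dimension field name.
sample_body = json.dumps({
    "timestamp": 1575970560000,
    "curLevel": "INFO",
    "userId": "130920**0047",
    "instanceId": "i-example0001",
})

# Map template parameter names (also hypothetical) to extracted values.
template_parameters = {
    "instanceId": extract_value(sample_body, "$.instanceId"),
    "level": extract_value(sample_body, "$.curLevel"),
}
print(template_parameters)
```

Raising an error for a missing key, rather than returning a default, is a deliberate choice in this sketch: a template that runs with a silently empty instance ID is harder to diagnose than one that fails early.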
The following figure shows an example of extracting values from an alert message body.
You can also set fixed parameter values for executing the template. The method is similar to that for common templates.