Automated O&M

The Operation Center service of DataWorks provides the automated O&M feature for node instances that are running on exclusive resource groups. You can customize O&M rules for the node instances based on your business requirements. This topic describes how to manage automated O&M rules.

Background information

The automated O&M feature enables DataWorks to automatically perform O&M operations on node instances that are running on exclusive resource groups. DataWorks performs these operations based on the O&M rules that you create and the monitoring rules that are associated with them. You can customize metrics, create an O&M rule for the node instances based on your business requirements, and then associate the O&M rule with an existing or new monitoring rule. If the conditions specified in the O&M rule are met, the system automatically performs the O&M operation defined in the rule.

Limits

Only workspace administrators can create, modify, or delete automated O&M rules.
You can create automated O&M rules only for node instances that are running on exclusive resource groups.
You can view only the execution records that were generated for automated O&M rules within the last 30 days.
DataWorks supports only the automated O&M operation that stops running node instances when the resource usage of the exclusive resource groups used by the node instances is excessively high.
You can associate one automated O&M rule with only one monitoring rule. However, you can associate multiple automated O&M rules with the same monitoring rule.
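
The association limit above can be pictured with a small in-memory model. This is only an illustrative sketch: `RuleRegistry`, its methods, and the rule names are hypothetical and are not part of any DataWorks API. It shows that each automated O&M rule maps to exactly one monitoring rule, while one monitoring rule may be shared by several O&M rules.

```python
class RuleRegistry:
    """Hypothetical in-memory model of O&M-rule-to-monitoring-rule associations."""

    def __init__(self):
        # Maps each automated O&M rule name to its single monitoring rule.
        self._monitor_of = {}

    def associate(self, om_rule, monitoring_rule):
        # An O&M rule may be associated with only one monitoring rule.
        existing = self._monitor_of.get(om_rule)
        if existing is not None and existing != monitoring_rule:
            raise ValueError(
                f"{om_rule!r} is already associated with {existing!r}"
            )
        self._monitor_of[om_rule] = monitoring_rule

    def om_rules_for(self, monitoring_rule):
        # Several O&M rules may share the same monitoring rule.
        return sorted(
            om for om, mon in self._monitor_of.items() if mon == monitoring_rule
        )


registry = RuleRegistry()
registry.associate("stop-backfill", "rg-usage-high")
registry.associate("stop-test-runs", "rg-usage-high")
print(registry.om_rules_for("rg-usage-high"))
# prints ['stop-backfill', 'stop-test-runs']
```

Associating `stop-backfill` with a second, different monitoring rule raises `ValueError` in this model, which mirrors the one-rule-to-one-monitoring-rule limit.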

Go to the Automatic page

Log on to the DataWorks console. In the left-side navigation pane, choose Data Modeling and Development > Operation Center. On the page that appears, select the desired workspace from the drop-down list and click Go to Operation Center.
In the left-side navigation pane, choose Alarm > Automatic.

Manage automated O&M rules

The Automatic page displays all the automated O&M rules that are created and the execution records of the rules. You can perform the following operations on this page.

Note

Only workspace administrators can create or modify automated O&M rules. For more information about how to obtain the permissions of the workspace administrator role, see Add workspace members and assign roles to them.

Create an automated O&M rule

On the Rules tab, click Create Rule in the upper-right corner. In the Create Rule dialog box, configure the parameters. The following table describes the parameters.

Section	Parameter	Description
Basic information	Name	The name of the automated O&M rule.
	Associated Monitoring Rule	The monitoring rule with which you want to associate the automated O&M rule. For information about how to create a monitoring rule, see Create a custom alert rule. Note: You can associate only monitoring rules for exclusive resource groups for scheduling with automated O&M rules.
	O&M Operation	The automated O&M operation that you want to perform if the automated O&M rule is triggered. Note: Only the Terminate Running Instance operation is supported.
Filter conditions	Resource Group	The name of the resource group that you specified when you created the monitoring rule. Note: Only the names of exclusive resource groups for Data Integration and exclusive resource groups for scheduling are displayed.
	Workspace	The name of the workspace to which the automated O&M rule is applied.
	Instance Type	The type of the node instance to which the automated O&M rule is applied. Valid values: Auto Triggered Node Instance, Data Backfill Instance, Test Instance, and Manually Triggered Workflow.
	Scheduling Cycle	The scheduling cycle of the node instance. Valid values: Minutes, Hour, Days, Week, and Month.
	Priority	The priority of the automated O&M rule. A larger value indicates a higher priority. Valid values: 1, 3, 5, 7, and 8.
	Status	The status of the node instance. Valid values: Waiting for Resources and Running.
Whitelist	Whitelist	The list of node instances for which the automated O&M rule does not take effect. To add a node instance to the list, select its name from the drop-down list below the node instance list and click Add. Note: Automated O&M rules do not take effect for the node instances that are contained in the node instance list.
Constraints on Rule	Effective Period	The time range within which the automated O&M rule is effective.
	Maximum Effective Times	The maximum number of times that the automated O&M rule can be triggered.
	Minimum Effective Interval	The minimum interval at which the automated O&M rule can be triggered.
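
To make the interplay of the Filter conditions and Whitelist sections concrete, the following sketch models a rule as a plain Python data structure. It is not a DataWorks API: the `NodeInstance` and `AutomatedOMRule` classes, their field names, and the sample values are hypothetical. It also assumes that an instance must satisfy every filter condition for the rule to apply and that each filter accepts several values, which the table does not state explicitly.

```python
from dataclasses import dataclass, field


@dataclass
class NodeInstance:
    # Hypothetical representation of a node instance; field names are illustrative.
    name: str
    resource_group: str
    workspace: str
    instance_type: str
    scheduling_cycle: str
    priority: int
    status: str


@dataclass
class AutomatedOMRule:
    # Fields mirror the Filter conditions and Whitelist sections of the table.
    name: str
    resource_group: str
    workspaces: set
    instance_types: set
    scheduling_cycles: set
    priorities: set
    statuses: set
    whitelist: set = field(default_factory=set)

    def applies_to(self, inst):
        # Whitelisted node instances are never affected by the rule.
        if inst.name in self.whitelist:
            return False
        # Assumption: all filter conditions must match (logical AND).
        return (
            inst.resource_group == self.resource_group
            and inst.workspace in self.workspaces
            and inst.instance_type in self.instance_types
            and inst.scheduling_cycle in self.scheduling_cycles
            and inst.priority in self.priorities
            and inst.status in self.statuses
        )


rule = AutomatedOMRule(
    name="stop-low-priority",
    resource_group="exclusive_rg_1",
    workspaces={"ws_demo"},
    instance_types={"Data Backfill Instance", "Test Instance"},
    scheduling_cycles={"Days"},
    priorities={1, 3},
    statuses={"Running"},
    whitelist={"critical_report"},
)

instances = [
    NodeInstance("adhoc_backfill", "exclusive_rg_1", "ws_demo",
                 "Data Backfill Instance", "Days", 1, "Running"),
    NodeInstance("critical_report", "exclusive_rg_1", "ws_demo",
                 "Data Backfill Instance", "Days", 1, "Running"),
    NodeInstance("daily_etl", "exclusive_rg_1", "ws_demo",
                 "Auto Triggered Node Instance", "Days", 8, "Running"),
]

targets = [inst.name for inst in instances if rule.applies_to(inst)]
print(targets)  # prints ['adhoc_backfill']
```

In this example, `critical_report` matches every filter but is skipped because it is whitelisted, and `daily_etl` is skipped because its instance type and priority fall outside the filters.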

Search for an automated O&M rule

In the search box on the Rules tab, enter the name of an automated O&M rule to search for the rule.

View, modify, and delete an automated O&M rule

- To view the information about an automated O&M rule, find the rule in the rule list on the Rules tab and click View in the Actions column.
- To modify an automated O&M rule, find the rule in the rule list on the Rules tab and click View in the Actions column. In the View Rule dialog box, click Modify in the lower-right corner.
- To delete an automated O&M rule, find the rule in the rule list on the Rules tab and click Delete in the Actions column. In the Delete Rule message, click OK.

View the execution records of an automated O&M rule

The Execution Records tab displays the execution information about automated O&M rules, including the time when each rule was executed, the rule owner, and the number of node instances to which the rule was applied. To view the detailed execution information about a rule, click View Details in the Actions column of the rule. In the Record Details dialog box, you can view a line chart of the resource usage of the resource group within the last 24 hours and the number of node instances that have been waiting for resources since the rule took effect. The dialog box contains the following sections:

- Instances Waiting for Resources/Resource Usage: This section provides a chart that displays the number of node instances that are waiting for resources and the resource usage of the resource group. Move the pointer over a point in the chart to view both values at that point in time.
- Terminated Node Instances: This section lists all node instances that were terminated.

Summary

After you create an automated O&M rule, the system automatically monitors the resource usage of the resource group defined in the automated O&M rule. After the automated O&M rule is triggered, the system performs the O&M operation on the node instances that are running on the resource group. For more information about the O&M of resources in resource groups, see Resource O&M.
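
As a rough illustration of how the Constraints on Rule parameters could limit when a rule fires, the sketch below checks a trigger attempt against an effective period, a maximum number of triggers, and a minimum interval. The `may_trigger` function is hypothetical and is not a DataWorks API. The exact semantics are assumptions: the effective period is treated as a daily time-of-day window, and the maximum count is treated as a cap on the total number of triggers in the supplied history.

```python
from datetime import datetime, time, timedelta


def may_trigger(now, past_triggers, start, end, max_times, min_interval):
    """Return True if a rule is allowed to fire at `now`.

    Hypothetical helper: the parameters mirror Effective Period,
    Maximum Effective Times, and Minimum Effective Interval.
    """
    # Effective Period: assumed here to be a daily time-of-day window.
    if not (start <= now.time() <= end):
        return False
    # Maximum Effective Times: assumed to cap the number of past triggers.
    if len(past_triggers) >= max_times:
        return False
    # Minimum Effective Interval: require a gap since the latest trigger.
    if past_triggers and now - max(past_triggers) < min_interval:
        return False
    return True


history = [datetime(2024, 1, 1, 9, 0)]
constraints = dict(
    start=time(8, 0),
    end=time(20, 0),
    max_times=3,
    min_interval=timedelta(minutes=30),
)

too_soon = may_trigger(datetime(2024, 1, 1, 9, 10), history, **constraints)
allowed = may_trigger(datetime(2024, 1, 1, 9, 45), history, **constraints)
off_hours = may_trigger(datetime(2024, 1, 1, 21, 0), history, **constraints)
print(too_soon, allowed, off_hours)  # prints False True False
```

The first attempt is rejected because only 10 minutes have passed since the previous trigger, the second is allowed, and the third falls outside the assumed effective period.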