Data Quality provides dozens of built-in table-level and field-level monitoring rule templates. This topic describes how to configure monitoring rules based on a monitoring rule template.
Background information
Built-in monitoring rule templates are classified into table-level monitoring rule templates and field-level monitoring rule templates. You can use a built-in monitoring rule template to quickly configure monitoring rules for multiple tables or fields at a time in Data Quality.
Limits
Data Quality allows you to configure monitoring rules for data in E-MapReduce (EMR), Hologres, AnalyticDB for PostgreSQL, and MaxCompute data sources based on monitoring rule templates.
Go to the Rule Configuration-Configure by Template page
- Log on to the DataWorks console.
- In the left-side navigation pane, click Workspaces.
- In the top navigation bar, select the region where your workspace resides. On the Workspaces page, find your workspace and click DataStudio in the Actions column.
- On the DataStudio page, click the icon in the upper-left corner and go to the Data Quality page.
- In the left-side navigation pane of the Data Quality page, go to the Rule Configuration-Configure by Template page.
Data Quality provides built-in table-level and field-level monitoring rule templates. On the Rule Configuration-Configure by Template page, find the template that you want to use and click Configure monitoring rules in the Actions column to configure monitoring rules for multiple tables or fields at a time based on the template.
Configure monitoring rules
- On the Rule Configuration-Configure by Template page, find the template that you want to use and click Configure monitoring rules in the Actions column to go to the Batch new monitoring rules wizard.
- Configure attributes for the monitoring rules.
- Click Next to go to the Generate rules step. In the Generate rules step, add tables or fields to which you want to apply the monitoring rules based on the table-level or field-level monitoring rule template that you use. If you add partitioned tables, you must configure partition filter expressions for the tables. The partition filter expressions are used to determine the sampling scope of the data that you want to monitor. By default, if you add non-partitioned tables, NOTAPARTITIONTABLE is displayed in the Partition expression columns that correspond to the tables.
- Click Generate rules to go to the Rule validation step. You can click Custom columns in the Rule validation section to select the columns that you want to display in the monitoring rule list. In the Rule validation section, you can perform the following operations:
- Test the validity of the monitoring rules.
After the monitoring rules are configured, select one or more monitoring rules that you want to test and click Trial run below the monitoring rule list. In the Trial run dialog box, select a data timestamp from the Data Timestamp drop-down list. The data timestamp is used to simulate the time when the monitoring rules are triggered. Then, click Calculate actual partition. The system calculates the partition values for the tables to which the monitoring rules are applied, based on the data timestamp that you select and the partition filter expressions that you configure. Then, click Trial run. The system checks the data in those partitions against the monitoring rules.
After you test a monitoring rule, you can click Test run record in the Actions column that corresponds to the monitoring rule to view details about the test and perform the required operations.
Note: If an error occurs during a test on a monitoring rule, the table or the table partition may not exist, or the table data may not meet the requirements of the monitoring rule.
- Associate the monitoring rules with auto triggered nodes to trigger the monitoring rules.
You can click Recommended Association scheduling or Manual Association scheduling to associate the monitoring rules with auto triggered nodes that generate the table data. The auto triggered nodes generate the table data after the auto triggered node instances, data backfill instances, or test instances generated for the auto triggered nodes are successfully run. When the auto triggered nodes start to run, the monitoring rules are triggered. You can configure the Rule Type parameter to control whether to block the descendant nodes of the auto triggered nodes. This helps reduce the impact of dirty data records.
- Recommended Association scheduling: The system associates the selected monitoring rules with auto triggered nodes based on the lineage of the auto triggered nodes that generate the table data.
- Manual Association scheduling: You can manually associate the selected monitoring rules with specific auto triggered nodes.
Important: A monitoring rule can be triggered only if it is associated with auto triggered nodes.
- Delete monitoring rules: You can delete one or more monitoring rules.
- View the details of a monitoring rule: You can find the monitoring rule whose details you want to view and click Rule details in the Actions column. You can also modify, enable, disable, or delete the monitoring rule, specify the strength of the monitoring rule, or view the logs of the monitoring rule.
- After the test on the monitoring rules is successful and the monitoring rules are associated with auto triggered nodes, click Save. Check whether the configuration is complete. If the configuration is complete, click OK.
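The partition calculation performed in the Trial run dialog box can be pictured with a small sketch. The placeholder syntax `$[yyyymmdd-N]`, the function name `resolve_partition`, and the use of a single base date are assumptions made for illustration only. This topic does not specify the expression syntax that Data Quality accepts or how the selected data timestamp maps to the base date. Only the NOTAPARTITIONTABLE marker comes from the steps above.

```python
import re
from datetime import date, timedelta
from typing import Optional

# Marker shown in the Partition expression column for non-partitioned tables.
NOT_PARTITIONED = "NOTAPARTITIONTABLE"

# Hypothetical placeholder form: $[yyyymmdd], $[yyyymmdd-N], or $[yyyymmdd+N].
_PLACEHOLDER = re.compile(r"\$\[yyyymmdd(?:([+-])(\d+))?\]")


def resolve_partition(expression: str, base_date: date) -> Optional[str]:
    """Replace date placeholders in a partition filter expression.

    Returns None for non-partitioned tables, because there is no
    partition to resolve and the whole table is in scope.
    """
    if expression == NOT_PARTITIONED:
        return None

    def _substitute(match: "re.Match[str]") -> str:
        sign, days = match.group(1), match.group(2)
        offset = int(days) if days else 0
        if sign == "-":
            offset = -offset
        # Shift the base date by the offset and format it as yyyymmdd.
        return (base_date + timedelta(days=offset)).strftime("%Y%m%d")

    return _PLACEHOLDER.sub(_substitute, expression)


if __name__ == "__main__":
    print(resolve_partition("dt=$[yyyymmdd-1]", date(2024, 3, 1)))
    print(resolve_partition(NOT_PARTITIONED, date(2024, 3, 1)))
```

In this sketch, a table with the expression `dt=$[yyyymmdd-1]` and a base date of March 1, 2024 resolves to the partition `dt=20240229`, which is the partition a trial run would then check. If the resolved partition does not exist in the table, the trial run is expected to fail, as the note in the test step describes.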
What to do next
- After the monitoring rules are configured based on a template, you can view the details about the monitoring rules and subscribe to the monitoring rules when you configure monitoring rules by table. Alert messages that are generated after the monitoring rules are triggered can be sent to the related alert contacts by using DingTalk chatbots, text messages, or emails. For more information about how to configure monitoring rules by table, see Configure monitoring rules by table.
- If you want to prevent data that does not meet the requirements of a monitoring rule from blocking the running of the associated auto triggered node on the specified data timestamp, you can configure a noise reduction rule for the monitoring rule to denoise the data. For more information, see Manage noise reduction rules.