Configure monitoring rules for multiple tables by template - DataWorks

Data Quality provides dozens of built-in table-level and field-level monitoring rule templates. This topic describes how to configure monitoring rules based on a monitoring rule template.

Background information

Built-in monitoring rule templates are classified into table-level monitoring rule templates and field-level monitoring rule templates. You can use a built-in monitoring rule template to quickly configure monitoring rules for multiple tables or fields at a time in Data Quality. You can also configure monitoring rules by table. For more information, see Configure monitoring rules for a single table.

Limits

Data Quality allows you to configure monitoring rules for data in E-MapReduce (EMR), Hologres, AnalyticDB for PostgreSQL, and MaxCompute data sources based on monitoring rule templates.

Go to the Rule Configuration-Configure by Template page

Log on to the DataWorks console. In the left-side navigation pane, choose Data Modeling and Development > Data Quality. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Quality.
In the left-side navigation pane of the Data Quality page, choose Rule Management > Configure Rule (by Template) to go to the Rule Configuration-Configure by Template page.
Data Quality provides built-in table-level and field-level monitoring rule templates. You can find the template that you want to use on the Rule Configuration-Configure by Template page and click Configure Monitoring Rules in the Actions column to configure monitoring rules for multiple tables or fields at a time based on the template.

Configure monitoring rules

On the Rule Configuration-Configure by Template page, find the template that you want to use and click Configure Monitoring Rules in the Actions column to go to the Batch new monitoring rules wizard.

Configure attributes for the monitoring rules.

Parameter	Description
Engine/Data Source	The type of the compute engine or data source of tables or fields for which you want to configure monitoring rules. Note Data Quality allows you to configure monitoring rules for data in E-MapReduce (EMR), Hologres, AnalyticDB for PostgreSQL, and MaxCompute data sources based on monitoring rule templates.
Rule Source	The value of this parameter is fixed as Built-in Template.
Template	The name of the built-in monitoring rule template. For more information, see Built-in monitoring rule templates. Note You can configure field-level monitoring rules of the following types only for numeric fields: average value, sum of values, minimum value, and maximum value.
Rule Name	The naming format of the monitoring rules. The names of the monitoring rules are automatically generated. You can change the suffix in the naming format based on your business requirements.
Description	The description of the monitoring rules.

Configure advanced attributes for the monitoring rules.

Parameter	Description
Rule Type	The strength of the monitoring rules. Valid values: Strong and Soft. Strong: If the critical threshold is exceeded, critical alerts are reported and descendant nodes are blocked. If the warning threshold is exceeded, warning alerts are reported but descendant nodes are not blocked. Soft: If the critical threshold is exceeded, critical alerts are reported but descendant nodes are not blocked. If the warning threshold is exceeded, warning alerts are not reported and descendant nodes are not blocked.
Comparison Method	If you configure monitoring rules of the numeric type, the valid values of the Comparison Method parameter are Greater Than, Greater Than or Equal To, Equal To, Unequal To, Less Than, and Less Than or Equal To. If you configure monitoring rules of the fluctuation type, the valid values of the Comparison Method parameter are Absolute Value, Raise, and Drop.
Expected Value	This parameter is required only if you configure monitoring rules of the numeric type. When the monitoring rules are triggered, the system compares data profiling results with the expected value that you specify. If the data profiling results deviate from the expected value, an alert or blocking is triggered.
Thresholds	If you configure monitoring rules of the fluctuation type, you must configure Warning Threshold and Error Threshold. The system compares the fluctuation rate of data profiling results with that of data sampling results of a specific time range. You can compare the increase, the decrease, or the absolute value of the fluctuation. For example, suppose that you set Rule Type to Strong, Warning Threshold to 5%, and Error Threshold to 10%. If the fluctuation rate is greater than 5% but less than or equal to 10%, a warning alert is reported, and descendant nodes are not blocked. If the fluctuation rate is greater than 10%, a critical alert is reported, and descendant nodes are blocked.
Start-Stop Status	You can turn on or off the switch to enable or disable the monitoring rules to control whether to apply the monitoring rules in the production environment. Important If you disable the monitoring rules, you cannot test the monitoring rules, and the monitoring rules cannot be triggered by auto triggered nodes that are associated with them.
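
The interaction between Rule Type and the two thresholds described in the preceding table can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the implementation used by Data Quality. The names `Outcome` and `evaluate_fluctuation` are hypothetical, and the sketch assumes that the fluctuation rate and the thresholds are passed as fractions (0.05 for 5%).

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    alert: str     # "none", "warning", or "critical"
    blocked: bool  # whether descendant nodes are blocked


def evaluate_fluctuation(rate: float, warning: float, critical: float,
                         rule_type: str) -> Outcome:
    """Map a fluctuation rate to an alert level and a blocking decision.

    Hypothetical helper: rate, warning, and critical are fractions,
    and rule_type is "Strong" or "Soft".
    """
    if rule_type not in ("Strong", "Soft"):
        raise ValueError(f"unknown rule type: {rule_type!r}")
    strong = rule_type == "Strong"
    if rate > critical:
        # Critical alerts are reported for both rule types,
        # but only strong rules block descendant nodes.
        return Outcome("critical", blocked=strong)
    if rate > warning:
        # Strong rules report a warning alert; soft rules report nothing.
        return Outcome("warning" if strong else "none", blocked=False)
    return Outcome("none", blocked=False)
```

With Warning Threshold set to 5% and Error Threshold set to 10%, a rate of 7% for a strong rule yields a warning without blocking, and a rate of 12% yields a critical alert with blocking. The same 12% rate for a soft rule yields a critical alert without blocking.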

Click Next to go to the Generate Monitoring Rule step.
In the Generate Monitoring Rule step, add tables or fields to which you want to apply the monitoring rules based on the table-level or field-level monitoring rule template that you use. If you add partitioned tables, you must configure partition filter expressions for the tables. The partition filter expressions are used to determine the sampling scope of the data that you want to monitor. By default, if you add non-partitioned tables, NOTAPARTITIONTABLE is displayed in the Partition expression columns that correspond to the tables.
1. Add tables or fields.
  - In the Select Table section, click Add Tables. In the Add Table dialog box, configure Engine/Database Instance. All tables that belong to the selected compute engine instance or database are displayed. You can also configure Table Name to search for the desired table. Then, you can select the desired tables and click Create to add the tables to the Select Table section.
  - In the Select Field section, click Add Fields. In the Add Field dialog box, configure Engine/Database Instance. The Table to Be Added section displays all tables that belong to the selected compute engine instance or database. Then, select the tables that contain the fields to which you want to apply the monitoring rules. The Select Field section displays all fields in the selected tables. You can filter the fields by field name or field description. Select the fields to which you want to apply the monitoring rules and click Create. The selected fields are displayed in the Select Field section.
2. Configure partition filter expressions.
  In the Select Table section, find the table for which you want to configure a partition filter expression and click the icon. In the Set Partition Filter Expression for Multiple Auto Triggered Nodes dialog box, select a partition filter expression from the Partition Filter Expression drop-down list and click OK. Data Quality matches the partitions in which data generated by the auto triggered node is stored every day based on the partition filter expression. If you want to configure partition filter expressions for multiple tables at a time, you can select the tables and click Set Partition Filter Expression.
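To illustrate how a partition filter expression relates to a concrete partition, the sample webhook message later in this topic pairs the expression `ds=$[yyyymmdd]` with the partition `ds=20230625`. The following Python sketch reproduces only that substitution. The function `expand_partition` is hypothetical, it handles only the literal `$[yyyymmdd]` placeholder, and this topic does not specify which date Data Quality uses as the base date or which other placeholder forms are supported, so the base date is left as a parameter.

```python
from datetime import date


def expand_partition(expression: str, base_date: date) -> str:
    """Replace the literal $[yyyymmdd] placeholder with a formatted date.

    Hypothetical helper for illustration; other placeholder forms
    are intentionally not handled.
    """
    return expression.replace("$[yyyymmdd]", base_date.strftime("%Y%m%d"))


# The base date is chosen by the caller in this sketch.
partition = expand_partition("ds=$[yyyymmdd]", date(2023, 6, 25))
```

An expression without the placeholder, such as `NOTAPARTITIONTABLE`, is returned unchanged.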
Click Generate Monitoring Rule to go to the Verify Monitoring Rule step.
You can click Custom Columns in the Verify Monitoring Rule section to select the columns that you want to display in the monitoring rule list. In the Verify Monitoring Rule section, you can perform the following operations:
- Test the validity of the monitoring rules.
  After the monitoring rules are configured, you can select one or more monitoring rules that you want to test and click Test Run below the monitoring rule list. In the Test Run dialog box, select a data timestamp from the Data Timestamp drop-down list. The data timestamp is used to simulate the time when the monitoring rules are triggered. Then, click Calculate actual partition. The system calculates values for the partitions in the tables to which the monitoring rules are applied based on the data timestamp you select and the partition filter expressions you configure. Then, click Test Run. The system checks data in the partitions in the tables based on the monitoring rules.
  After you test a monitoring rule, you can click Test Run Records in the Actions column that corresponds to the monitoring rule to view details about the test and perform the required operations.
  Note
  If an error occurs during the test of a monitoring rule, the table or the table partition may not exist, or the table data may not meet the requirements of the monitoring rule.
- Specify an alert recipient.
  You can specify an alert recipient by clicking Subscriptions in the Actions column of the desired monitoring rule. In the Manage Subscriptions dialog box, you can specify an alert notification method and an alert recipient. The following notification methods are supported: Email, Email and SMS, DingTalk Chatbot, DingTalk Chatbot @ALL, Lark Group Chatbot, Enterprise WeChat Chatbot, and Custom Webhook. Alert notifications are sent to the alert recipient by using the specified notification method.
  Note
  The Custom Webhook notification method is supported only in DataWorks Enterprise Edition. For information about the message format of an alert notification sent by using a custom webhook, see the "Appendix: Message format of alert notifications sent by using a custom webhook URL" section in this topic.
- Associate the monitoring rules with auto triggered nodes to trigger the monitoring rules.
  You can click Associate Rule with Recommended Auto Triggered Nodes or Manually Associate Rule with Auto Triggered Nodes to associate the monitoring rules with auto triggered nodes that generate the table data. The auto triggered nodes generate the table data after the auto triggered node instances, data backfill instances, or test instances generated for the auto triggered nodes are successfully run. When the auto triggered nodes start to run, the monitoring rules are triggered. You can configure the Rule Type parameter to control whether to block the descendant nodes of the auto triggered nodes. This helps reduce the impact of dirty data records.
  - Associate Rule with Recommended Auto Triggered Nodes: The system associates the selected monitoring rules with auto triggered nodes based on the lineage of the auto triggered nodes that generate the table data.
  - Manually Associate Rule with Auto Triggered Nodes: You can manually associate the selected monitoring rules with specific auto triggered nodes.
  Important
  A monitoring rule can be triggered only if it is associated with related auto triggered nodes.
- Delete monitoring rules: You can delete one or more monitoring rules.
- View the details of a monitoring rule: You can find the monitoring rule whose details you want to view and click View Rule Details in the Actions column. You can also modify, enable, disable, or delete the monitoring rule, specify strength for the monitoring rule, or view the logs of the monitoring rule.
After the test on the monitoring rules is successful and the monitoring rules are associated with auto triggered nodes, click Save. Check whether the configuration is complete. If the configuration is complete, click OK.

Appendix: Message format of alert notifications sent by using a custom webhook URL

This section describes the message format of an alert notification sent by using a custom webhook URL and the related parameters.

Sample message

{
  "detailUrl": "https://dqc-cn-zhangjiakou.data.aliyun.com/?defaultProjectId=3058#/jobDetail?envType=ODPS&projectName=yongxunQA_zhangbei_standard&tableName=sx_up_001&entityId=10878&taskId=16876941111958fa4ce0e0b5746379cd9bc67999d05f8&bizDate=1687536000000&executeTime=1687694111000",
  "datasourceName": "emr_test_01",
  "engineTypeName": "EMR",
  "projectName": "Project name",
  "dqcEntityQuality": {
    "entityName": "tb_auto_test",
    "actualExpression": "ds=20230625",
    "strongRuleAlarmNum": 1,
    "weakRuleAlarmNum": 0
  },
  "ruleChecks": [
    {
      "blockType": 0,
      "warningThreshold": 0.1,
      "property": "id",
      "tableName": "tb_auto_test",
      "comment": "Test a monitoring rule",
      "checkResultStatus": 2,
      "templateName": "Compare the Number of Unique Field Values Against Expectation",
      "checkerName": "fulx",
      "ruleId": 123421,
      "fixedCheck": false,
      "op": "",
      "upperValue": 22200,
      "actualExpression": "ds=20230625",
      "externalId": "123112232",
      "timeCost": "10",
      "trend": "up",
      "externalType": "CWF2",
      "bizDate": 1600704000000,
      "checkResult": 2,
      "matchExpression": "ds=$[yyyymmdd]",
      "checkerType": 0,
      "projectName": "auto_test",
      "beginTime": 1600704000000,
      "dateType": "YMD",
      "criticalThreshold": "0.6",
      "isPrediction": false,
      "ruleName": "Rule name",
      "checkerId": 7,
      "discreteCheck": true,
      "endTime": 1600704000000,
      "MethodName": "max",
      "lowerValue": 2344,
      "entityId": 12142421,
      "whereCondition": "type!='type2'",
      "expectValue": 90,
      "templateId": 5,
      "taskId": "16008552981681a0d6",
      "id": 234241453,
      "open": true,
      "referenceValue": [
        {
          "discreteProperty": "type1",
          "value": 20,
          "bizDate": "1600704000000",
          "singleCheckResult": 2,
          "threshold": 0.2
        }
      ],
      "sampleValue": [
        {
          "discreteProperty": "type2",
          "bizDate": "1600704000000",
          "value": 23
        }
      ]
    }
  ]
}
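
A webhook receiver typically needs only a few of these fields. The following Python sketch shows one way to reduce a message of the preceding shape to a short summary. The function `summarize_alert` and the keys of the returned dictionary are hypothetical. The sketch uses the key names exactly as they appear in the sample message, treats `blockType` 1 as a strong rule as described in the parameter table, and passes `checkResultStatus` through without interpreting it because this topic does not list the meaning of each status code.

```python
import json


def summarize_alert(payload: str) -> dict:
    """Extract a compact summary from a custom webhook alert message.

    Hypothetical helper: missing keys yield None or 0 instead of errors.
    """
    message = json.loads(payload)
    entity = message.get("dqcEntityQuality", {})
    rules = []
    for check in message.get("ruleChecks", []):
        rules.append({
            "rule": check.get("ruleName"),
            "table": check.get("tableName"),
            "field": check.get("property"),
            "partition": check.get("actualExpression"),
            "strong": check.get("blockType") == 1,  # 1: strong, 0: weak
            "status": check.get("checkResultStatus"),  # not interpreted here
        })
    return {
        "table": entity.get("entityName"),
        "strong_alarms": entity.get("strongRuleAlarmNum", 0),
        "weak_alarms": entity.get("weakRuleAlarmNum", 0),
        "detail_url": message.get("detailUrl"),
        "rules": rules,
    }
```

Applied to the sample message, the summary reports the table `tb_auto_test` with one strong rule alarm and a single rule entry for the field `id`.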

Parameters

Parameter	Type	Sample value	Description
ProjectName	String	autotest	The name of the compute engine instance or data source for which data quality is monitored.
actualExpression	String	ds=20200925	The partition in the monitored data source table.
RuleChecks	Array of RuleChecks		The monitoring results returned.
BlockType	Integer	1	The strength type of the monitoring rule. The strength type of a monitoring rule indicates the importance of the rule. Valid values: 1: indicates that the monitoring rule is a strong rule. 0: indicates that the monitoring rule is a weak rule. You can specify whether a monitoring rule is a strong rule based on your business requirements. If a monitoring rule is a strong rule and the critical threshold is exceeded, a critical alert is reported and nodes that are associated with the rule are blocked from running.
WarningThreshold	Float	0.1	The threshold for a warning alert. The threshold indicates the deviation of the monitoring result from the expected value. You can customize this threshold based on your business requirements.
Property	String	type	The field of the rule attribute. This field is the column name of the data source table that is monitored.
TableName	String	dual	The name of the table that is monitored.
Comment	String	Test a monitoring rule	The description of the monitoring rule.
CheckResultStatus	Integer	2	The status of the check result.
TemplateName	String	Compare the Number of Unique Field Values Against Expectation	The name of the monitoring template.
CheckerName	String	fulx	The name of the checker.
RuleId	Long	123421	The ID of the monitoring rule.
FixedCheck	Boolean	false	Indicates whether the monitoring is performed based on a fixed value. Valid values: true: indicates that the monitoring is performed based on a fixed value. false: indicates that the monitoring is performed based on a non-fixed value.
Op	String	>	The comparison operator of the monitoring rule.
UpperValue	Float	22200	The upper limit of the predicted result. The value of this parameter is automatically generated based on the threshold that you specified.
ActualExpression	String	ds=20200925	The partition in the monitored data source table.
ExternalId	String	123112232	The ID of the auto triggered node.
TimeCost	String	10	The time that was taken to run the monitoring task.
Trend	String	up	The trend of the monitoring results.
ExternalType	String	CWF2	The type of the scheduling system. Only CWF scheduling systems are supported.
BizDate	Long	1600704000000	The data timestamp. If the monitored business entity is offline data, the value is usually one day before the monitoring is performed.
CheckResult	Integer	2	The monitoring results.
MatchExpression	String	ds=$[yyyymmdd]	The partition filter expression.
CheckerType	Integer	0	The type of the checker.
ProjectName	String	autotest	The name of the compute engine instance or data source for which data quality is monitored.
BeginTime	Long	1600704000000	The time when the monitoring started.
DateType	String	YMD	The type of the scheduling cycle. In most cases, the value of this parameter is YMD. This value indicates year, month, and day.
CriticalThreshold	Float	0.6	The threshold for a critical alert. The threshold indicates the deviation of the monitoring result from the expected value. You can customize this threshold based on your business requirements. If a monitoring rule is a strong rule and the critical threshold is exceeded, a critical alert is reported and nodes that are associated with the rule are blocked from running.
IsPrediction	Boolean	false	Indicates whether the monitoring result is the same as the predicted result. Valid values: true: indicates that the monitoring result is the same as the predicted result. false: indicates that the monitoring result is different from the predicted result.
RuleName	String	Rule name	The name of the monitoring rule.
CheckerId	Integer	7	The checker ID.
DiscreteCheck	Boolean	true	Indicates whether the monitoring is discrete monitoring. Valid values: true: indicates that the monitoring is discrete monitoring. false: indicates that the monitoring is not discrete monitoring.
EndTime	Long	1600704000000	The time when the monitoring ended.
MethodName	String	max	The method used to collect sample data, such as avg, count, sum, min, max, count_distinct, user_defined, table_count, table_size, table_dt_load_count, table_dt_refuseload_count, null_value, null_value/table_count, (table_count-count_distinct)/table_count, or table_count-count_distinct.
LowerValue	Float	2344	The lower limit of the predicted result. The value of this parameter is automatically generated based on the threshold that you specified.
EntityId	Long	14534343	The ID of the partition filter expression.
WhereCondition	String	type!='type2'	The filter condition of the monitoring task.
ExpectValue	Float	90	The expected value.
TemplateId	Integer	5	The ID of the monitoring template.
TaskId	String	16008552981681a0d6****	The ID of the monitoring task.
Id	Long	2231123	The ID of the primary key.
ReferenceValue	Array of ReferenceValue		The historical sample values.
DiscreteProperty	String	type1	The values of the sample field that are grouped by using the GROUP BY clause. For example, the values of the Gender field are grouped by using the GROUP BY clause. In this case, the values of DiscreteProperty are Male, Female, and null.
Value	Float	20	The sample value.
BizDate	String	1600704000000	The data timestamp. If the monitored business entity is offline data, the value is usually one day before the monitoring is performed.
SingleCheckResult	Integer	2	The monitoring result of the single check.
Threshold	Float	0.2	The threshold.
SampleValue	Array of SampleValue		The sample values.
DiscreteProperty	String	type2	The values of the sample field that are grouped by using the GROUP BY clause. For example, the values of the Gender field are grouped by using the GROUP BY clause. In this case, the values of DiscreteProperty are Male, Female, and null.
BizDate	String	1600704000000	The data timestamp. If the monitored business entity is offline data, the value is usually one day before the monitoring is performed.
Value	Float	23	The sample value.
Open	Boolean	true	Indicates whether the monitoring rule is enabled.
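
The sample message does not always match the types in the preceding table: `bizDate` is a number inside `ruleChecks` but a string inside `referenceValue` and `sampleValue`, and `criticalThreshold` appears as the string "0.6" although it is listed as Float. The following Python sketch normalizes one `ruleChecks` entry before further processing. The function names are hypothetical, the keys are written as they appear in the sample message, and the sketch assumes that the 13-digit time values are epoch timestamps in milliseconds, which this topic does not state explicitly.

```python
from datetime import datetime, timezone

TIME_KEYS = ("bizDate", "beginTime", "endTime")
FLOAT_KEYS = ("warningThreshold", "criticalThreshold")
NESTED_KEYS = ("referenceValue", "sampleValue")


def to_utc_datetime(epoch_ms) -> datetime:
    """Convert epoch milliseconds (int or numeric string) to a UTC datetime."""
    return datetime.fromtimestamp(int(epoch_ms) / 1000, tz=timezone.utc)


def normalize_check(check: dict) -> dict:
    """Return a copy of one ruleChecks entry with consistent value types."""
    result = dict(check)
    for key in TIME_KEYS:
        if key in result:
            result[key] = to_utc_datetime(result[key])
    for key in FLOAT_KEYS:
        if key in result:
            result[key] = float(result[key])  # accepts 0.1 and "0.6" alike
    for key in NESTED_KEYS:
        if key in result:
            result[key] = [
                {**item, "bizDate": to_utc_datetime(item["bizDate"])}
                if "bizDate" in item else dict(item)
                for item in result[key]
            ]
    return result
```

Under the millisecond assumption, the sample value 1600704000000 corresponds to 2020-09-21 16:00:00 UTC, which is midnight on 2020-09-22 in UTC+8.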

What to do next

After the monitoring rules are configured based on a template, you can view the details about the monitoring rules and subscribe to the monitoring rules when you configure monitoring rules by table. Alert messages that are generated after the monitoring rules are triggered can be sent to the related alert contacts by using email, email and SMS, DingTalk chatbot, DingTalk chatbot @ALL, Lark group chatbot, Enterprise WeChat chatbot, and custom webhook.
If you want to prevent data that does not meet the requirements of a monitoring rule from blocking the running of the associated auto triggered node on the specified data timestamp, you can configure a noise reduction rule for the monitoring rule to denoise the data. For more information, see Manage noise reduction rules.