
DataWorks:Configure rules by template in batches

Last Updated:Mar 27, 2026

Data Quality provides preset table-level and field-level monitoring rule templates. Instead of building rules from scratch, select a template and apply it to multiple tables or fields at once. This is useful when you need consistent quality checks across many tables in a large data warehouse.

Supported data sources

Templates support monitoring rules for the following data sources: MaxCompute, E-MapReduce (EMR), Hologres, CDH Hive, AnalyticDB for PostgreSQL, AnalyticDB for MySQL, StarRocks, MySQL, Lindorm, SQL Server, and Data Lake Formation (DLF).

Before you begin: Choose your path

Use this table to determine your configuration approach before you start.

| Decision | Option A | Option B |
| --- | --- | --- |
| What do you want to check? | An entire table (row counts, null rates, duplicates) | A specific field (value distribution, uniqueness, format) |
| Template type to select | Table-Level | Field-Level |
| Do you know your expected data range? | Yes — set thresholds manually | No — let the system learn from historical data |
| Comparison method to use | Manual Settings | Intelligent Dynamic Threshold |

Note: Intelligent Dynamic Threshold is available only for rules that are based on custom SQL statements, custom ranges, or dynamic thresholds.

How it works

Configuring monitoring rules from a template involves three stages:

  1. Select a template and configure rule parameters — Choose a Table-Level or Field-Level built-in template and set the comparison method and thresholds that define what "good data" looks like.

  2. Add tables or fields in batches — Select all the tables or fields you want the rule to cover.

  3. Associate the rule with a quality monitor — A quality monitor defines the data scope (for example, a specific partition of a partitioned table) and links the rule to a scheduling node that triggers checks automatically.
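The relationships among the three stages can be sketched as a simple data model. This is a conceptual illustration only; the class and field names below are invented for clarity and do not correspond to a real DataWorks API.

```python
# Conceptual sketch of the three-stage flow. All names here are illustrative,
# not part of any DataWorks API.
from dataclasses import dataclass, field

@dataclass
class MonitoringRule:
    template: str                # stage 1: a built-in table- or field-level template
    comparison_method: str       # stage 1: "manual" or "intelligent_dynamic"
    thresholds: dict             # stage 1: normal/orange/red thresholds
    targets: list = field(default_factory=list)  # stage 2: tables or fields added in batches

@dataclass
class QualityMonitor:
    data_range: str              # stage 3: data scope, e.g. a partition expression
    trigger: str                 # stage 3: "node_scheduling" or "manual"
    rules: list = field(default_factory=list)

# Stage 1: pick a template and set thresholds.
rule = MonitoringRule("table_row_count", "manual", {"red": 0})
# Stage 2: add multiple tables in one batch.
rule.targets += ["dwd_orders", "dwd_users"]
# Stage 3: associate the rule with a monitor that defines scope and trigger.
monitor = QualityMonitor("ds=$[yyyymmdd]", "node_scheduling")
monitor.rules.append(rule)
```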

Step 1: Open the Configure by Template page

  1. Log on to the DataWorks console. In the top navigation bar, select the target region. In the left-side navigation pane, choose Data Governance > Data Quality. Select the target workspace from the drop-down list and click Go to Data Quality.

  2. In the left-side navigation pane, choose Configure Rules > Configure by Template. Data Quality displays the built-in Table-Level and Field-Level rule templates. Click Configure Monitoring Rules for the template you want to use.


Step 2: Configure rule properties

On the Batch Add Monitoring Rules page, configure the following settings.

Basic attributes

| Parameter | Description |
| --- | --- |
| Data Source Type | The data source type of the tables this rule will cover. See Supported data sources for the full list. |
| Rule Source | Displays Built-in Template and the selected template name. Read-only. For the full list of available templates, see View built-in rule templates. |
| Template | The selected template. |
| Rule Name | Auto-generated. Adjust the name suffix as needed. |

Advanced properties

Choose a comparison method

| Method | When to use | How thresholds work |
| --- | --- | --- |
| Manual Settings | You know the expected data range or fluctuation limits. | For numeric results, compare against a fixed expected value using a comparison operator; set a Normal Threshold (expected range) and a Red Threshold (critical breach). For fluctuation-based checks, use Absolute Value, Raise, or Drop as the comparison method and configure the Normal Threshold. |
| Intelligent Dynamic Threshold | You are unsure what threshold values to set. | The system determines a reasonable threshold automatically based on intelligent algorithms; no manual threshold values are required. Alerts trigger or tasks are blocked as soon as abnormal data is detected. Configure the Orange Threshold to define the range of anomalous but non-critical results. |

Note: Intelligent Dynamic Threshold is available only for rules that are based on custom SQL statements, custom ranges, or dynamic thresholds.

Configure the degree of importance

| Value | Effect when the red threshold is exceeded |
| --- | --- |
| Strong rules | The associated scheduling node is blocked by default, which prevents downstream tasks from running. |
| Weak rules | The associated scheduling node is not blocked. An alert is sent, but downstream tasks continue to run. |

Configure monitoring thresholds

  • If Comparison Method is set to Manual Settings:

    • Normal Threshold: The data quality check result meets the specified condition, meaning the data output is as expected.

    • Red Threshold: The data quality check result meets the specified condition, meaning the data output is not as expected.

    • For numeric results, compare against a fixed value using: Greater Than, Greater Than Or Equal To, Equal To, Not Equal To, Less Than, or Less Than Or Equal To.

    • For fluctuation-based checks, use Absolute Value, Raise, or Drop as the comparison method.

  • If Comparison Method is set to Intelligent Dynamic Threshold:

    • Orange Threshold: The data quality check result meets the specified condition, meaning the data is abnormal but your business is not affected.
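The Manual Settings checks described above can be sketched in code. This is a hypothetical illustration of the two check styles (fixed-value comparison and fluctuation comparison); the function names and the exact threshold semantics are assumptions, not the DataWorks implementation.

```python
# Hypothetical sketch of Manual Settings threshold checks.
# Names and semantics are illustrative, not the DataWorks implementation.
import operator

OPS = {
    "Greater Than": operator.gt,
    "Greater Than Or Equal To": operator.ge,
    "Equal To": operator.eq,
    "Not Equal To": operator.ne,
    "Less Than": operator.lt,
    "Less Than Or Equal To": operator.le,
}

def check_numeric(result, op_name, expected):
    """Fixed-value check: True means the check passes."""
    return OPS[op_name](result, expected)

def check_fluctuation(current, baseline, method, normal_limit):
    """Fluctuation check against a prior run.
    method: "Absolute Value" (either direction), "Raise" (increases only),
    or "Drop" (decreases only). normal_limit is a ratio, e.g. 0.1 for 10%."""
    change = (current - baseline) / baseline
    if method == "Absolute Value":
        return abs(change) <= normal_limit
    if method == "Raise":
        return change <= normal_limit       # only an increase can breach
    if method == "Drop":
        return -change <= normal_limit      # only a decrease can breach
    raise ValueError(f"unknown method: {method}")
```

For example, with a 10% Raise limit, a row count that grows from 100 to 120 fails the check, while the same growth passes under a Drop limit.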

Status: Set to Enable to allow the rule to run in production.

Important

Disabling a rule prevents it from running in both test and production environments: a disabled rule cannot run in test mode or be triggered by scheduling nodes.

Click Next to proceed to the Generate Monitoring Rule page.

Step 3: Add tables or fields

Based on the template type you selected, add the tables or fields you want to check.

Add tables (Table-Level Rule Template)

  1. Click Add Table. On the Batch Create page, select the tables you want to add.

    The list shows all tables matching the Data Source Type configured in the previous step. Filter by Table Name as needed.
  2. Click Confirm to add the selected tables to the Tables for Which You Want to Configure Rules list.

Add fields (Field-Level Rule Template)

  1. Click Add Fields. In the Select a field dialog box, select the table that contains the target field.

    The Tables to Be Selected area lists tables based on the Data Source Type configured in the previous step.
  2. After selecting a table, the Select Fields section shows all fields in that table. Filter by Field Name or Field Description as needed.


  3. Select the field and click Create. The field appears in the Fields for Which You Want to Configure Rules list.

Step 4: Associate a quality monitor

A quality monitor defines the data scope (the Data Range, such as a specific partition) and the trigger settings for the quality check. Associate each table or field with a quality monitor.

Batch configuration

  1. Select one or more tables or fields, then click Configure Monitor.


  2. Choose one of the following batch actions:

    • Automatically Associate: Associates the selected tables or fields with existing quality monitors automatically.

    • Disassociate: Removes the quality monitor association for the selected tables or fields.

    • Batch Add: Creates new quality monitor settings for the selected tables. Configure the following:

      | Configuration item | Description |
      | --- | --- |
      | Data Range | The partition or data scope to check. For non-partitioned tables, use a WHERE clause to limit the scope, or leave this blank to check all data. For partitioned tables, use the format Partition key=Partition value. The partition value can be a constant or a built-in partition filter expression. |
      | Trigger Method | Triggered by Node Scheduling in Production Environment: checks run automatically after the associated scheduling node completes in Operation Center. Dry-run nodes do not trigger checks. Triggered Manually: checks run only when triggered manually. |
      | Associated Scheduling Node | Available when Trigger Method is set to Triggered by Node Scheduling in Production Environment. Select the scheduling nodes to associate with the monitor. |
      | Running Resources | The compute resources used to run the rule. Defaults to the data source of the monitored table in the current workspace. If you select a different data source, make sure it can access the monitored table. |
    Important

    For non-MaxCompute tables with Triggered By Node Scheduling In Production Environment selected, do not associate scheduling nodes that run on the shared resource group for scheduling. Doing so may cause errors when the monitor runs.
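As an illustration of the Data Range formats described above, the values below show the general shape of constant partition values, built-in partition filter expressions, and WHERE clauses. The specific expressions are examples only; the full built-in expression syntax is defined by DataWorks.

```text
# Partitioned table: Partition key=Partition value
ds=20250101              # constant partition value
ds=$[yyyymmdd]           # built-in partition filter expression, resolved at run time

# Non-partitioned table: optional WHERE clause to limit the scope
gmt_modified >= '2025-01-01'
```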

Single-table configuration

  1. In the Quality Monitoring column next to the target table or field, select an existing quality monitor or click New Quality Monitoring to create one.


  2. If no monitor exists, click Create Monitor and configure the following:

    Basic configurations

    | Parameter | Description |
    | --- | --- |
    | Monitor Name | The name of the monitor. |
    | Quality Monitoring Owner | The owner of the monitor. When you configure alert subscriptions, you can set the owner as the recipient for the Email, Email and SMS, or Telephone channels. |
    | Monitored Object | The table to check. Defaults to the current table. |
    | Data Range | The partition or data scope to check. For non-partitioned tables, all data is checked by default. For partitioned tables, use the format Partition key=Partition value. The partition value can be a constant or a built-in partition filter expression. |
    | Monitoring Rule | The monitoring rules to associate with this monitor. If no rules exist yet, skip this field and add rules after you create the monitor. |

    Note

    For rules based on custom templates or custom SQL statements, Data Range has no effect. The partition is determined by the SQL statement in the rule.

    Running settings

    | Parameter | Description |
    | --- | --- |
    | Trigger Method | Same as batch configuration. |
    | Associated Scheduling Node | Same as batch configuration. |
    | Running Resources | Same as batch configuration. |

    Handling policies

    | Parameter | Description |
    | --- | --- |
    | Quality Issue Handling Policies | Blocks: if a data quality issue is detected, the scheduling node that generates the table is set to Failed, which blocks downstream nodes. Default: Strong rules · Red anomaly. Alert: sends alert notifications when an issue is detected. Defaults: Strong rules · Red anomaly, Strong rules · Orange exception, Strong rules · Check Failed, Weak rules · Red anomaly, Weak rules · Orange exception, Weak rules · Check Failed. |
    | Alert Method Configuration | Notification channels: Email, Email and SMS, DingTalk Chatbot, DingTalk Chatbot @ALL, Lark Group Chatbot, Enterprise WeChat Robot, Custom Webhook, or Telephone. For DingTalk, Lark, or WeChat chatbots, add the bot and copy the webhook URL to the Recipient field. Custom Webhook is available in DataWorks Enterprise Edition only; see Appendix: Webhook message format for the message format. When you use Email, Email and SMS, or Telephone, set the recipient to Data Quality Monitoring Owner (the owner set in Basic configurations), Shift Schedule (the on-duty person in the shift schedule), or Scheduling Task Owner (the owner of the associated scheduling node). |
  3. After creating the monitor, return to the batch monitoring rules page and click Refresh. Then select the monitor you just created in the Quality Monitoring column.


Step 5: Test and finalize

Click Generate Monitoring Rule to open the Verify Monitoring Rule page. From this page:

  • Test Run: Select one or more rules and click Test Run. In the dialog box, select a Data Timestamp (the simulated trigger time). The system calculates the partition values based on the specified time and Data Range, then checks whether the data meets the configured rule. After the test run completes, click Running Records in the Actions column to view results.


  • Subscriptions: Configure alert recipients and notification channels. See the alert method options in Step 4.

  • Associated Scheduling: Link rules to scheduling nodes. Select Use Recommended Running Mode to let the system automatically associate rules with the recommended nodes based on data lineage, or select Manually Specify Running Mode to associate rules with specific nodes yourself.

    Important

    A rule must be associated with a scheduling node to be triggered automatically.

  • View Rule Details: Click View Rule Details in the Actions column to open a rule's detail page, where you can modify, start, stop, or delete the rule and view logs.

  • Delete: Remove one or more selected rules.

After the test run succeeds and scheduling is associated, click Complete Check.
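The partition resolution that Test Run performs (combining the Data Timestamp with the Data Range expression) can be sketched as follows. This is an illustrative simplification that handles only a $[yyyymmdd]-style token with an optional negative day offset; the real built-in expression syntax is richer.

```python
# Illustrative sketch: resolving a partition filter expression from a
# data timestamp. Only a simple $[yyyymmdd] token with an optional day
# offset is handled; the actual DataWorks expression syntax is richer.
import re
from datetime import datetime, timedelta

def resolve_partition(expression, data_timestamp):
    """Replace $[yyyymmdd] or $[yyyymmdd-N] with the formatted date."""
    def repl(match):
        offset = int(match.group(1) or 0)
        return (data_timestamp + timedelta(days=offset)).strftime("%Y%m%d")
    return re.sub(r"\$\[yyyymmdd(-\d+)?\]", repl, expression)

# Checking yesterday's partition for a daily task:
resolve_partition("ds=$[yyyymmdd-1]", datetime(2023, 6, 25))  # → "ds=20230624"
```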

What's next

After the monitor runs, choose Quality O&M in the left-side navigation pane and click Monitor > Running Records to view the quality check status and complete rule check history for the monitored tables.

Appendix: Webhook message format

This section describes the message format and parameters for alert notifications sent using a Custom Webhook.

Sample message

{
  "detailUrl": "https://dqc-cn-zhangjiakou.data.aliyun.com/?defaultProjectId=3058#/jobDetail?envType=ODPS&projectName=yongxunQA_zhangbei_standard&tableName=sx_up_001&entityId=10878&taskId=16876941111958fa4ce0e0b5746379cd9bc67999d05f8&bizDate=1687536000000&executeTime=1687694111000",
  "datasourceName": "emr_test_01",
  "engineTypeName": "EMR",
  "projectName": "Project name",
  "dqcEntityQuality": {
    "entityName": "tb_auto_test",
    "actualExpression": "ds=20230625",
    "strongRuleAlarmNum": 1,
    "weakRuleAlarmNum": 0
  },
  "ruleChecks": [
    {
      "blockType": 0,
      "warningThreshold": 0.1,
      "property": "id",
      "tableName": "tb_auto_test",
      "comment": "Test a monitoring rule",
      "checkResultStatus": 2,
      "templateName": "Compare the Number of Unique Field Values Against Expectation",
      "checkerName": "fulx",
      "ruleId": 123421,
      "fixedCheck": false,
      "op": "",
      "upperValue": 22200,
      "actualExpression": "ds=20230625",
      "externalId": "123112232",
      "timeCost": "10",
      "trend": "up",
      "externalType": "CWF2",
      "bizDate": 1600704000000,
      "checkResult": 2,
      "matchExpression": "ds=$[yyyymmdd]",
      "checkerType": 0,
      "projectName": "auto_test",
      "beginTime": 1600704000000,
      "dateType": "YMD",
      "criticalThreshold": "0.6",
      "isPrediction": false,
      "ruleName": "Rule name",
      "checkerId": 7,
      "discreteCheck": true,
      "endTime": 1600704000000,
      "MethodName": "max",
      "lowerValue": 2344,
      "entityId": 12142421,
      "whereCondition": "type!='type2'",
      "expectValue": 90,
      "templateId": 5,
      "taskId": "16008552981681a0d6",
      "id": 234241453,
      "open": true,
      "referenceValue": [
        {
          "discreteProperty": "type1",
          "value": 20,
          "bizDate": "1600704000000",
          "singleCheckResult": 2,
          "threshold": 0.2
        }
      ],
      "sampleValue": [
        {
          "discreteProperty": "type2",
          "bizDate": "1600704000000",
          "value": 23
        }
      ]
    }
  ]
}
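On the receiving side, a webhook endpoint can triage a payload like the sample above. The sketch below uses the field names from the sample message; treating checkResult == 2 as a failed check and blocking on strongRuleAlarmNum > 0 are illustrative assumptions, not documented DataWorks semantics.

```python
# Minimal sketch of triaging a Custom Webhook payload on the receiving side.
# Field names follow the sample message above; the checkResult == 2 and
# strongRuleAlarmNum interpretations are assumptions for illustration.
import json

def triage(payload: str) -> dict:
    msg = json.loads(payload)
    quality = msg["dqcEntityQuality"]
    # Assumption: checkResult == 2 marks a failed rule check.
    failed = [c for c in msg.get("ruleChecks", []) if c.get("checkResult") == 2]
    return {
        "table": quality["entityName"],
        "partition": quality["actualExpression"],
        # Assumption: any strong-rule alarm should block downstream tasks.
        "should_block": quality["strongRuleAlarmNum"] > 0,
        "failed_rules": [c.get("ruleName") for c in failed],
    }
```

For the sample message above, this returns should_block=True because strongRuleAlarmNum is 1.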

Parameter description

Parameter names below match the keys in the sample message.

| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| projectName | String | autotest | The name of the compute engine instance or data source that is monitored. |
| actualExpression | String | ds=20200925 | The partition in the monitored data source table. |
| ruleChecks | Array | | The list of validation results. |
| blockType | Integer | 1 | The rule strength. 1: strong rule. 0: weak rule. Strong rules block scheduling tasks when the critical threshold is exceeded. |
| warningThreshold | Float | 0.1 | The warning threshold, which indicates the allowable deviation from the expected value. |
| property | String | type | The column in the data source table that the rule checks. |
| tableName | String | dual | The name of the validated table. |
| comment | String | | The description of the validation rule. |
| checkResultStatus | Integer | 2 | The status of the check result. |
| templateName | String | Compare the Number of Unique Field Values Against Expectation | The name of the validation template. |
| checkerName | String | fulx | The name of the checker. |
| ruleId | Long | 123421 | The ID of the rule. |
| fixedCheck | Boolean | false | Indicates whether a fixed value is used for the check. true: fixed value. false: no fixed value. |
| op | String | > | The comparison operator. |
| upperValue | Float | 22200 | The predicted upper limit, automatically generated based on the configured threshold. |
| actualExpression | String | ds=20200925 | The actual partition in the data source table that was checked. |
| externalId | String | 123112232 | The ID of the node in the scheduling system. |
| timeCost | String | 10 | The duration of the verification task. |
| trend | String | up | The trend of the monitoring results. |
| externalType | String | CWF2 | The scheduling system to which the node ID maps. Only CWF is supported. |
| bizDate | Long | 1600704000000 | The data timestamp. For offline data, this is typically one day before the check runs. |
| checkResult | Integer | 2 | The verification result. |
| matchExpression | String | ds=$[yyyymmdd] | The partition filter expression. |
| checkerType | Integer | 0 | The type of the checker. |
| projectName | String | autotest | The name of the compute engine or data source for the data quality check. |
| beginTime | Long | 1600704000000 | The start time of the verification task. |
| dateType | String | YMD | The scheduling cycle type. YMD indicates yearly, monthly, and daily tasks. |
| criticalThreshold | Float | 0.6 | The critical threshold, which indicates the degree of deviation from the expected value. If a strong rule exceeds this threshold, scheduling tasks are blocked. |
| isPrediction | Boolean | false | Indicates whether the result is a prediction. true: prediction. false: not a prediction. |
| ruleName | String | | The name of the rule. |
| checkerId | Integer | 7 | The ID of the checker. |
| discreteCheck | Boolean | true | Indicates whether discrete validation is used. true: discrete. false: non-discrete. |
| endTime | Long | 1600704000000 | The end time of the verification task. |
| MethodName | String | max | The method used to collect sample data. Supported values: avg, count, sum, min, max, count_distinct, user_defined, table_count, table_size, table_dt_load_count, table_dt_refuseload_count, null_value, null_value/table_count, (table_count-count_distinct)/table_count, and table_count-count_distinct. |
| lowerValue | Float | 2344 | The predicted lower limit, automatically generated based on the configured threshold. |
| entityId | Long | 14534343 | The ID of the partition filter expression. |
| whereCondition | String | type!='type2' | The filter condition of the validation task. |
| expectValue | Float | 90 | The expected value. |
| templateId | Integer | 5 | The ID of the validation template. |
| taskId | String | 16008552981681a0d6**** | The ID of the verification task. |
| id | Long | 2231123 | The primary key ID. |
| referenceValue | Array | | The historical sample values. |
| discreteProperty | String | type1 | The values of the sample field after GROUP BY grouping. For example, grouping by Gender returns Male, Female, and null. |
| value | Float | 20 | The sample value. |
| bizDate | String | 1600704000000 | The data timestamp. For offline data, typically one day before the check runs. |
| singleCheckResult | Integer | 2 | The verification result of a single check. |
| threshold | Float | 0.2 | The threshold value. |
| sampleValue | Array | | The current sample values. |
| open | Boolean | true | Indicates whether the rule is enabled. |