DataWorks: Monitor data quality

Last Updated: Mar 26, 2026

Use Data Quality to catch data problems as soon as a scheduling node finishes, before those problems propagate to downstream computations. This tutorial walks through configuring two monitoring rules for the ods_user_info_d_starrocks table: a strong rule that checks whether the row count is greater than 0, and a weak rule that checks whether the business primary key is unique.

When the data synchronization node runs each day, both rules are triggered automatically. The strong rule blocks all descendant nodes if the row count is 0; the weak rule sends an alert if duplicate primary keys are detected.
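The difference between the two rule types can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not DataWorks code: the names Importance, RuleResult, and handle_results are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Importance(Enum):
    STRONG = "strong"
    WEAK = "weak"


@dataclass
class RuleResult:
    name: str               # hypothetical rule identifier
    importance: Importance
    passed: bool


def handle_results(results):
    """Return (send_alert, block_descendants) for a list of rule results."""
    failed = [r for r in results if not r.passed]
    # Any failed rule, strong or weak, produces an alert.
    send_alert = bool(failed)
    # Only a failed strong rule blocks the descendant nodes.
    block_descendants = any(r.importance is Importance.STRONG for r in failed)
    return send_alert, block_descendants


# Example: the row count check passes, but duplicate primary keys are found.
results = [
    RuleResult("row_count_gt_0", Importance.STRONG, passed=True),
    RuleResult("uid_unique", Importance.WEAK, passed=False),
]
print(handle_results(results))  # (True, False)
```

In this example an alert is sent but the descendant nodes still run, because only the weak rule failed.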

Prerequisites

Before you begin, make sure you have:

  • Synced basic user information from the ApsaraDB RDS for MySQL table ods_user_info_d to the ods_user_info_d_starrocks table in an E-MapReduce (EMR) Serverless StarRocks instance via Data Integration

  • Synced website access logs from user_log.txt in Object Storage Service (OSS) to the ods_raw_log_d_starrocks table in an EMR Serverless StarRocks instance via Data Integration

  • Processed the collected data into basic user profile data in Data Studio

Monitoring requirements

The following table summarizes the monitoring rules for each table in the user profile analysis pipeline. Monitoring runs after each daily sync completes, catching issues before downstream extract, transform, and load (ETL) operations begin.

| Table | Rule | Consequence |
| --- | --- | --- |
| ods_raw_log_d_starrocks | Strong rule: row count > 0 daily | Blocks descendant nodes if row count = 0 |
| ods_user_info_d_starrocks | Strong rule: row count > 0 daily | Blocks descendant nodes if row count = 0 |
| ods_user_info_d_starrocks | Weak rule: business primary key is unique daily | Sends an alert if duplicates are found; does not block nodes |
| dwd_log_info_di_starrocks | No rule | |
| dws_user_info_all_di_starrocks | No rule | |
| ads_user_info_1d_starrocks | Rule: monitors row count fluctuation daily | Helps observe daily unique visitor (UV) trends |

This tutorial covers the ods_user_info_d_starrocks table.
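Although the fluctuation rule on ads_user_info_1d_starrocks is outside the scope of this tutorial, the idea behind it can be sketched as a day-over-day comparison. The formula below is an assumption for illustration only; this page does not state how Data Quality computes fluctuation, and the function name and row counts are hypothetical.

```python
def fluctuation_rate(current_rows: int, previous_rows: int) -> float:
    """Relative day-over-day change in row count, for example 0.25 for +25%."""
    if previous_rows == 0:
        raise ValueError("previous_rows must be greater than 0")
    return (current_rows - previous_rows) / previous_rows


# Hypothetical row counts for two consecutive daily partitions.
rate = fluctuation_rate(current_rows=1250, previous_rows=1000)
print(f"{rate:+.1%}")  # +25.0%
```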

Step 1: Open the Configure by Table page

  1. Log on to the DataWorks console. In the top navigation bar, select the target region. In the left-side navigation pane, choose Data Governance > Data Quality, select the target workspace from the drop-down list, and click Go to Data Quality.

  2. In the left-side navigation pane of the Data Quality page, choose Configure Rules > Configure by Table.

  3. On the Configure by Table page, filter by connection:

    • In the Connection section, select StarRocks.

    • On the right side of the page, specify filter conditions to find the ods_user_info_d_starrocks table.

  4. In the search results, click Rule Management in the Actions column for the ods_user_info_d_starrocks table. The Table Quality Details page opens.

Step 2: Configure monitoring rules

Configure a monitor that checks the daily partition of ods_user_info_d_starrocks. The monitor combines two rules: a strong rule for row count and a weak rule for primary key uniqueness. For more information about how to configure monitoring rules, see Configure a monitoring rule for a single table.

  1. On the Monitor tab, click Create Monitor.

  2. Set Data Range to dt=$[yyyymmdd-1].

    Note

    Make sure that the value of the Data Range parameter corresponds to the partition generated for the table on the current day.

  3. Click Create Rule. The Create Rule panel opens.

  4. Configure the strong rule (row count check): On the System Template tab, find the Table is not empty rule and click Use. On the right side of the panel, set Degree of Importance to Strong Rule.

    Note

    If the checked partition of the ods_user_info_d_starrocks table contains 0 rows, an alert is triggered and the descendant nodes are blocked from running.

  5. Configure the weak rule (primary key uniqueness check): On the System Template tab, find the Unique value, fixed value rule and click Use. Configure the following settings on the right side of the panel:

    | Setting | Value |
    | --- | --- |
    | Rule Scope | uid(STRING) |
    | Monitoring Threshold (Normal) | Comparison operator: =, value: 0 |
    | Degree of Importance | Weak rules |

  6. Click Determine to save both rules.

  7. Set the trigger and handling policy:

    • Set Trigger Method to Triggered by Node Scheduling in Production Environment and select the ods_user_info_d_starrocks node created during data synchronization.

    • Set the handling policy to either block the node from running or send an alert notification to the recipient, depending on your requirements.

  8. Click Save.
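To make the two checks concrete, the following sketch reproduces them as plain SQL queries run from Python. It rests on several assumptions: an in-memory SQLite database stands in for the StarRocks instance, the table is reduced to the uid and dt columns, the partition value is hypothetical, and the uniqueness metric is assumed to count rows in excess of one per uid. The queries that Data Quality actually generates may differ.

```python
import sqlite3

PARTITION = "20260325"  # hypothetical value that dt=$[yyyymmdd-1] resolves to

# SQLite is only a stand-in for StarRocks in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ods_user_info_d_starrocks (uid TEXT, dt TEXT)")
conn.executemany(
    "INSERT INTO ods_user_info_d_starrocks VALUES (?, ?)",
    [("u001", PARTITION), ("u002", PARTITION), ("u002", PARTITION)],
)


def row_count(conn, dt):
    """Number of rows in the given partition."""
    sql = "SELECT COUNT(*) FROM ods_user_info_d_starrocks WHERE dt = ?"
    return conn.execute(sql, (dt,)).fetchone()[0]


def duplicate_uid_count(conn, dt):
    """Rows in excess of one per uid in the given partition (assumed metric)."""
    sql = (
        "SELECT COUNT(uid) - COUNT(DISTINCT uid) "
        "FROM ods_user_info_d_starrocks WHERE dt = ?"
    )
    return conn.execute(sql, (dt,)).fetchone()[0]


# Strong rule: the partition must not be empty.
strong_rule_passed = row_count(conn, PARTITION) > 0
# Weak rule: the duplicate count must equal the threshold of 0.
weak_rule_passed = duplicate_uid_count(conn, PARTITION) == 0
print(strong_rule_passed, weak_rule_passed)  # True False
```

With this sample data the strong rule passes because the partition has three rows, and the weak rule fails because u002 appears twice.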

Step 3: Run a test

After saving, verify that the rules behave as expected before the monitor runs in production.

  1. In the Monitor Perspective section of the Rule Management tab, select the monitor you created.

  2. Click Test Run on the right side of the tab.

  3. In the Test Run dialog box, set the Scheduling Time parameter and click Test Run.

  4. After the test run completes, click View Details to check whether the data passes each rule.
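The Scheduling Time you choose for the test run determines which partition is checked. The sketch below shows how a Data Range of dt=$[yyyymmdd-1] could resolve, assuming the expression is evaluated relative to the scheduling time and shifted back by one day. The function name resolve_data_range and the sample timestamp are hypothetical.

```python
from datetime import datetime, timedelta


def resolve_data_range(scheduling_time: datetime, day_offset: int = -1) -> str:
    """Resolve dt=$[yyyymmdd-1] for a scheduling time (assumed semantics)."""
    partition_day = scheduling_time + timedelta(days=day_offset)
    return f"dt={partition_day:%Y%m%d}"


# A test run scheduled for Mar 26, 2026 checks the Mar 25 partition.
print(resolve_data_range(datetime(2026, 3, 26, 0, 30)))  # dt=20260325
```

If the test run reports an empty partition, confirm that the partition resolved this way actually exists in the table for the Scheduling Time you selected.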

Step 4: Subscribe to alerts

Configure who receives notifications when a rule is triggered.

  1. In the Monitor Perspective section of the Rule Management tab, select the monitor.

  2. Click Alert Subscription on the right side of the tab.

  3. In the Alert Subscription dialog box, configure the Notification Method and Recipient parameters, then click Save in the Actions column.

  4. To view or modify your subscriptions later, go to Quality O&M > Monitor in the left-side navigation pane and select My Subscriptions.

What to do next

After the data is processed, use DataAnalysis to visualize it on a dashboard. For details, see Visualize data.