Dataphin:Create Data Source Quality Rules

Last Updated:Jan 23, 2025

Dataphin enables monitoring of data source connectivity and table structure changes. It regularly checks for changes and supports setting up alerts for abnormalities, helping you stay informed about the status of data source connectivity and table structure in real time. This topic describes how to set up quality rules for data sources.

Prerequisites

You must add the monitored object before configuring quality rules. For more information, see Add Monitored Object.

Permission Description

  • Super administrators, quality administrators, custom global roles with Quality Rule-Management permissions, and data source owners are authorized to configure scheduling, alerts, and more for quality rules.

  • Quality owners and regular users require additional permissions to review data sources. For instructions on how to obtain these permissions, see Request Data Source Permissions.

  • Operation permissions vary by object type. For more information, see Quality Rule Operation Permissions.

Differences Between Trial Run and Run

A Trial Run simulates the execution of a quality rule to verify its correctness and operational status, with results not displayed in the quality report. A Run executes the quality rule at a specified time, with results published in the quality report for user review and analysis.
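The difference can be pictured with a minimal Python sketch. It is not Dataphin code: `execute_rule`, the `mode` strings, and the list used as a quality report are hypothetical names chosen for illustration only.

```python
from typing import Callable


def execute_rule(check: Callable[[], bool], mode: str, quality_report: list) -> bool:
    """Execute a check; only a formal run publishes its result to the report."""
    if mode not in ("trial_run", "run"):
        raise ValueError(f"unknown mode: {mode}")
    passed = check()
    if mode == "run":
        # A Run publishes its result to the quality report.
        quality_report.append({"passed": passed})
    # A Trial Run returns the outcome without touching the report.
    return passed


report = []
execute_rule(lambda: True, "trial_run", report)   # report stays empty
execute_rule(lambda: False, "run", report)        # report gains one entry
```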

Quality Rule Configuration

  1. Navigate to Administration > Data Quality from the top menu bar on the Dataphin home page.

  2. Select Quality Rule in the left-side navigation pane. On the Datasource page, click the name of the target object to access the Quality Rule Details page and configure quality rules.

  3. On the Quality Rule Details page, click the Create Quality Rule button.

  4. In the Create Quality Rule dialog box, set the necessary parameters.

    Parameter

    Description

    Basic Information

    Rule Name

    Custom name for the quality rule, not exceeding 256 characters.

    Rule Strength

    Supports Weak Rule and Strong Rule.

    • Weak Rule: When the verification result is abnormal, an alert is sent but downstream task nodes are not blocked.

    • Strong Rule: When the verification result is abnormal, an alert is sent. If the rule has downstream tasks (code check scheduling or task trigger scheduling), those tasks are blocked to prevent data pollution. If it has no downstream tasks (for example, periodic quality scheduling), only an alert is sent.

    Description

    Custom description of the quality rule. Not exceeding 128 characters.

    Rule Template

    Only supports Stability, including Data Source Connectivity Monitoring and Table Structure Change Monitoring.

    • Connectivity Monitoring: Alerts when changes to the network, username, password, or similar settings cause connections to data sources configured in Dataphin to fail, which leads to task errors.

    • Table Structure Change: Monitors upstream (ancestor) tables and alerts on structure changes, such as renamed, deleted, or added fields, that may cause downstream errors.

    Rule Type

    The rule type is determined by the template and is the template's most basic property. It is used to describe and filter rules.

    Rule Configuration

    Select Verification Table

    When the rule template is set to Table Structure Change Monitoring, you need to select the data table to be verified.

    Business Property Configuration

    Property Information

    The way you fill in a business property depends on how that quality rule property is configured. For example, if the field value type of the department-in-charge property is an enumeration (multiple choice) with the values Big Data Department, Business Department, and Technical Department, then this property appears as a multi-select drop-down list offering those three values when you create a quality rule.

    The field value type corresponding to the rule owner is custom input, and the property field length is 256. Therefore, when creating a quality rule, this property value can be entered with no more than 256 characters.

    If the filling method of the property field is Range Interval, the configuration method is as follows:

    Range Interval: Commonly used when the value range is continuous numbers or dates. You can select four symbols: >, >=, <, <=. For more property configurations, see Create and Manage Quality Rule Properties.

    Scheduling Property Configuration

    Scheduling Method

    Select an existing scheduling configuration. If you have not yet decided on a scheduling method, you can configure it after creating the quality rule. To create a new scheduling configuration, see Create Scheduling.

  5. Click OK to finalize the rule configuration.
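The Rule Strength behavior described in the table above can be summarized in a short Python sketch. This is an illustration under stated assumptions, not Dataphin's implementation: `RuleStrength` and `handle_abnormal_result` are hypothetical names.

```python
from enum import Enum


class RuleStrength(Enum):
    WEAK = "weak"
    STRONG = "strong"


def handle_abnormal_result(strength: RuleStrength, has_downstream_tasks: bool) -> dict:
    """Return the actions taken when a rule's verification result is abnormal."""
    # Both weak and strong rules alert on an abnormal result.
    actions = {"alert": True, "block_downstream": False}
    # Only a strong rule that has downstream tasks blocks them.
    if strength is RuleStrength.STRONG and has_downstream_tasks:
        actions["block_downstream"] = True
    return actions


# A strong rule under periodic quality scheduling has no downstream tasks,
# so it only alerts.
print(handle_abnormal_result(RuleStrength.STRONG, has_downstream_tasks=False))
```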

Rule Configuration List

The rule configuration list page displays the data source rule information, where you can view, edit, trial run, run, and delete rules.

[Figure: rule configuration list]

Area

Description

Filter and Search Area

Supports quick search by object or rule name.

Supports filtering by rule type, rule template, rule strength, trial run status, and effective status.

Note

If the quality rule property is configured with searchable and filterable business properties and is enabled, you can search or filter based on this property.

List Area

Displays the object type/name, rule name/ID, trial run status, effective status, rule type, rule template, rule strength, schedule type, and related knowledge base document for each rule. Click the icon to the left of the refresh button to select which fields the list displays.

  • Effective Status: It is recommended to perform a trial run before enabling the effective status of a rule. Enable the effective status for rules that pass the trial run to avoid blocking online tasks with incorrect rules.

    • After enabling the effective status, the selected rules will be automatically executed according to the configured scheduling.

    • After disabling the effective status, the selected rules will not be automatically executed but can be manually executed.

  • Related Knowledge Base Document: Click View Details to view the knowledge base information associated with the rule. This includes table names, verification objects, rules, and related knowledge base document information. You can also perform search, view, edit, and delete operations on the knowledge base. For more details, see View Knowledge Base.

Operation Area

You can perform view, clone, edit, trial run, run, scheduling configuration, associate knowledge base document, and delete operations.

  • View: View the details of the rule configuration.

  • Clone: Quickly clone a rule.

  • Edit: After editing a rule, a trial run is required again.

  • Trial Run: After the trial run, click the corresponding icon to view the trial run logs.

  • Run: After running, you can view the verification results in Quality Record.

  • Scheduling Configuration: In the dialog box, you can filter by scheduling type or search by scheduling name. You can also edit the scheduling.

  • Associate Knowledge Base Document: After associating a rule with a knowledge base, you can view the associated knowledge in the quality rule and administration workbench. You can select unassociated knowledge bases. To create, see Create and Manage Knowledge Base.

  • Delete: Deleting this quality rule object will delete all quality rules under the object. This action cannot be undone.

Batch Operation Area

You can perform batch trial run, run, configure scheduling, enable, disable, modify business properties, associate knowledge base document, and delete operations.

  • Trial Run: Supports batch trial run of rules. After the trial run, click the corresponding icon to view the trial run logs.

  • Run: Supports batch running of rules. After running, you can view the verification results in Quality Record.

  • Scheduling Configuration: Configures scheduling for multiple quality rules at once. In the dialog box, you can filter by scheduling type or search by scheduling name, and you can also edit the scheduling. Only rules that can be edited on the quality rule list page can be modified.

  • Enable: After batch enabling the effective status, the selected rules will be automatically executed according to the configured scheduling. Only supports enabling rules that can be edited on the quality rule list page.

  • Disable: After batch disabling the effective status, the selected rules will not be automatically executed but can be manually executed. Only supports disabling rules that can be edited on the quality rule list page.

  • Modify Business Properties: When the field value type corresponding to the business property is single choice or multiple choice, batch modification of business properties is supported.

    • When the field value type corresponding to the business property is multiple choice, appending or modifying property values is supported.

    • When the field value type corresponding to the business property is single choice, direct modification of property values is supported.

  • Associate Knowledge Base Document: After associating a rule with knowledge, you can view the associated knowledge in the quality rule and administration workbench. Supports batch configuration of knowledge bases for monitored objects. To create, see Create and Manage Knowledge Base.

  • Delete: Supports batch deletion of quality rule objects. This action cannot be undone. Please operate with caution. Only supports deleting rules that can be edited on the quality rule list page.

Create Scheduling

Note
  • When configuring scheduling for a rule, you can quickly select from the scheduling already configured for the table. Each table supports up to 20 scheduling configurations.

  • A single rule can have up to 10 scheduling configurations.

  • Automatic deduplication is supported when the scheduling configuration is fully consistent.
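The per-rule limit and the deduplication behavior in the note above can be sketched in Python. The `Schedule` fields and the `attach_schedules` helper are hypothetical, and the sketch assumes that "fully consistent" means every field of the configuration is equal.

```python
from dataclasses import dataclass

MAX_SCHEDULES_PER_RULE = 10  # limit stated in the note above


@dataclass(frozen=True)
class Schedule:
    # Hypothetical fields; two schedules are equal only if all fields match.
    schedule_type: str  # e.g. "recurrency" or "task"
    cycle: str          # e.g. "Day", "Hour"
    time: str           # e.g. "02:00"


def attach_schedules(existing: list, new: list) -> list:
    """Add schedules to a rule, skipping exact duplicates and enforcing the limit."""
    result = list(existing)
    for schedule in new:
        if schedule in result:
            continue  # identical configuration: deduplicated automatically
        if len(result) >= MAX_SCHEDULES_PER_RULE:
            raise ValueError("a single rule can have up to 10 scheduling configurations")
        result.append(schedule)
    return result


daily = Schedule("recurrency", "Day", "02:00")
hourly = Schedule("recurrency", "Hour", "00:30")
# The second copy of the daily schedule is dropped; the hourly one is added.
print(attach_schedules([daily], [Schedule("recurrency", "Day", "02:00"), hourly]))
```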

  1. On the Quality Rule Details page, select the Scheduling Configuration tab, then click the Create Scheduling button to open the Create Scheduling dialog box.

  2. In the Create Scheduling dialog box, configure the required parameters.

    Parameter

    Description

    Scheduling Name

    Custom scheduling name, not exceeding 64 characters.

    Scheduling Type

    Supports Recurrency Triggered and Task Triggered.

    • Recurrency Triggered: Supports timed and periodic quality checks on data based on the set scheduling time, suitable for scenarios where data production time is relatively fixed.

      • Recurrence: Running quality rules consumes computing resources. To avoid affecting production tasks, avoid scheduling multiple quality rules to run at the same time. Five scheduling cycle types are supported: Day, Week, Month, Hour, and Minute.

    • Task Triggered: Executes the configured quality rule before the specified task runs or after it runs successfully. Triggering tasks can be of types such as engine SQL, offline pipeline, Python, Shell, Virtual, Datax, Spark_jar, Hive_MR, and database SQL nodes. Suitable when the tasks that modify the table are fixed.

      Note

      Only production environment tasks can be selected as triggering tasks. If the rule is a strong rule, a failed verification may affect online tasks. Configure this carefully according to your business needs.

      • Trigger Timing: Select the trigger timing for quality checks. Supports selecting Trigger After All Tasks Run Successfully, Trigger After Each Task Runs Successfully, and Trigger Before Each Task Runs.

      • Triggering Task: Supports selecting production task nodes with maintenance permissions for the current user. You can search by node output name.

        Note

        When the trigger timing is Trigger After All Tasks Run Successfully, select tasks with the same scheduling cycle. Otherwise, differing cycles can delay rule execution and the quality check results.

    Schedule Condition

    Disabled by default. When enabled, the system checks whether the scheduling conditions are met before each scheduled execution of the quality rule. The rule is scheduled only if the conditions are met; otherwise, that scheduling is skipped.

    • Business Date/Executed On: If the scheduling type is set to Recurrency Triggered (timed scheduling does not support execution date), Code Check Triggered Scheduling, or Task Triggered, date configuration is supported. You can select Regular Calendar or Custom Calendar. For how to customize a calendar, see Create Public Calendar.

      • If you select Regular Calendar, the conditions can be Month, Week, or Date.

        [Figure: example of a Regular Calendar condition]

      • If you select Custom Calendar, the conditions can be Date Type or Tag.

        [Figure: example of a Custom Calendar condition]

    • Instance Type: If the scheduling type is set to Code Check Triggered Scheduling or Task Triggered, instance type configuration is supported. You can select Recurring Instance, Data Backfill Instance, or One-time Instance.

      [Figure: example of an Instance Type condition]

    Note
    • At least one rule must be configured. To add a rule, click the + Add Rule button.

    • Up to 10 scheduling conditions can be configured.

    • The relationship between scheduling conditions can be configured as and, or.

  3. Click OK to complete the scheduling setup.
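How the Schedule Condition gates a scheduled execution can be sketched in Python. This is a hedged illustration, not Dataphin's evaluator: `should_schedule`, the context dictionary keys, and the two example predicates are hypothetical.

```python
from datetime import date
from typing import Callable, Optional

MAX_CONDITIONS = 10  # limit stated in the note above


def should_schedule(ctx: dict,
                    conditions: Optional[list],
                    relation: str = "and") -> bool:
    """Decide whether a scheduled execution proceeds or is skipped."""
    if not conditions:
        return True  # Schedule Condition is disabled by default
    if len(conditions) > MAX_CONDITIONS:
        raise ValueError("up to 10 scheduling conditions can be configured")
    results = [condition(ctx) for condition in conditions]
    if relation == "and":
        return all(results)
    if relation == "or":
        return any(results)
    raise ValueError(f"unknown relation: {relation}")


# Hypothetical conditions: a regular-calendar check and an instance-type check.
is_monday: Callable[[dict], bool] = lambda ctx: ctx["business_date"].weekday() == 0
is_recurring: Callable[[dict], bool] = lambda ctx: ctx["instance_type"] == "recurring"

ctx = {"business_date": date(2025, 1, 20), "instance_type": "backfill"}
print(should_schedule(ctx, [is_monday, is_recurring], "and"))  # False: skipped
print(should_schedule(ctx, [is_monday, is_recurring], "or"))   # True: scheduled
```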

Scheduling Configuration List

The scheduling configuration list allows for viewing, editing, cloning, and deleting of scheduling configurations after their creation.

[Figure: scheduling configuration list]

Area

Description

Filter and Search Area

Supports quick search by scheduling name.

Supports filtering by scheduling type: Recurrency Triggered or Task Triggered.

List Area

Displays the Schedule Name, Schedule Type, Last Updated By, and Last Updated Time of each scheduling configuration.

Operation Area

You can perform edit, clone, and delete operations on scheduling.

  • Edit: Modify the configured scheduling information.

    Important

    All rule configurations that reference this scheduling will change synchronously. Please operate with caution.

  • Clone: Quickly copy scheduling configuration.

  • Delete: Scheduling referenced by rule configuration cannot be deleted.

Set Alerts

Configure different alert methods for various rules to distinguish between them. For instance, set phone alerts for critical rule exceptions and text message alerts for minor ones. If a rule triggers multiple alert configurations at once, you can determine the effective alert policy.

Note

A single monitored object can have no more than 20 alert configurations.

  1. On the Quality Rule Details page, click the Alert Configuration tab, then click the Create Alert Configuration button to open the Create Alert Configuration dialog box.

  2. In the Create Alert Configuration dialog box, set the necessary parameters.

    Parameter

    Description

    Coverage

    Supports selecting All Rules, All Strong Rules, All Weak Rules, and Custom.

    Note
    • Under a single monitored object, each of the three ranges (All Rules, All Strong Rules, and All Weak Rules) supports one alert configuration. Newly added rules automatically match the corresponding alert configuration based on their rule strength. To change one of these alert configurations, modify the existing one.

    • With Custom, you can select up to 200 of the rules configured under the current monitored object.

    Alert Configuration Name

    The alert configuration name under a single monitored object is unique and does not exceed 256 characters.

    Alert Recipients

    Configure alert recipients and alert methods. At least one alert recipient and alert method must be selected.

    • Alert Recipients: Supports selecting custom, shift schedule, and quality owner as alert recipients.

      Supports configuring up to 5 custom alert recipients and up to 3 shift schedules.

    • Alert Method: Supports receiving methods such as phone, email, text message, DingTalk, Lark, WeCom, and custom channel. The available methods are controlled in Configure Channel Settings.

  3. Click OK to finalize the alert configuration.

Alert Configuration List

Upon completing the alert configuration, you can sort, edit, and delete configurations in the alert configuration list.

[Figure: alert configuration list]

Area

Description

① Sorting Area

Supports configuring the alert effective policy when a quality rule matches multiple alert configurations:

  • The First Alert Configuration Hit Takes Effect: Only the first alert configuration that the rule matches takes effect; the others are ignored. With this policy, you can reorder the alert configurations. Click Rule Sorting, then either drag a configuration by the icon in front of its name or use the icons in the operation column (from left to right: move to top, move to bottom). After adjusting the order, click the Sorting Complete button.

    [Figure: sorting alert configurations]

  • All Alert Configurations Take Effect: Every alert configuration in the list that matches a quality rule under the current monitored object takes effect.

    For example, when you configure multiple alert configurations and select this policy, the system merges alerts by alert receiving method + alert recipient + alert rule. Specifically, when the same person is the recipient and the recipient type is custom or quality owner, the alert messages are merged according to the merge policy.

    Note

    Shift schedules do not support alert merging.

② List Area

Displays the name, effective range, specific recipients of each alert type, and corresponding alert receiving methods of the alert configuration.

Scope Of Effect: Click the View icon next to the scope of effect to see which rules it covers. Only custom alert configurations show the object name and rule name. If a rule has been deleted, its object name cannot be viewed; in that case, update the alert configuration.

③ Operation Area

You can perform editing and deleting operations on the configured alerts.

  • Edit: Supports modifying the configured alert information. If you modify alert recipients and alert methods, please synchronize with relevant personnel in a timely manner to avoid missing business alert information.

  • Delete: After deletion, this alert configuration no longer takes effect for the rules it matched. Please operate with caution.
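The two alert effective policies from the Sorting Area can be sketched in Python. This is an illustration under assumptions rather than Dataphin's logic: `AlertConfig`, `effective_alerts`, `merge_notifications`, and the policy strings are hypothetical, and merging is modeled simply as removing duplicate (method, recipient, rule) combinations. Shift schedules, which do not support merging, are left out.

```python
from dataclasses import dataclass


@dataclass
class AlertConfig:
    name: str
    coverage: str             # "all", "strong", "weak", or "custom"
    recipients: list
    methods: list
    custom_rule_ids: frozenset = frozenset()  # used only when coverage is "custom"


def matches(config: AlertConfig, rule_id: str, strength: str) -> bool:
    """Check whether an alert configuration covers the given rule."""
    if config.coverage == "all":
        return True
    if config.coverage in ("strong", "weak"):
        return config.coverage == strength
    return rule_id in config.custom_rule_ids


def effective_alerts(configs: list, rule_id: str, strength: str, policy: str) -> list:
    """Apply the effective policy to the ordered list of alert configurations."""
    hits = [c for c in configs if matches(c, rule_id, strength)]
    # "first_hit": only the first matching configuration in list order applies.
    return hits[:1] if policy == "first_hit" else hits


def merge_notifications(hits: list, rule_id: str) -> list:
    """Merge on method + recipient + rule so each combination is sent once."""
    return sorted({(m, r, rule_id) for c in hits for m in c.methods for r in c.recipients})


configs = [
    AlertConfig("critical", "strong", ["alice"], ["phone"]),
    AlertConfig("everything", "all", ["alice", "bob"], ["phone", "email"]),
]
first = effective_alerts(configs, "r1", "strong", "first_hit")
every = effective_alerts(configs, "r1", "strong", "all")
print([c.name for c in first])
print(merge_notifications(every, "r1"))
```

Because the first-hit policy depends on list order, reordering the configurations changes which one applies to a rule that matches several of them.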

View Quality Report

Click Quality Report to access the Rule Verification Overview and Rule Verification Details for the current quality rule.

  • Quickly filter verification details by abnormal results, partition time, rule, or object name keyword.

  • In the operation column of the rule verification details list, click the corresponding icon to view detailed verification information for the quality rule.

  • In the operation column of the rule verification details list, click the corresponding icon to view the execution logs for the quality rule.

Set Quality Rule Permission Management

  1. Click Permission Management to set up View Details, which allows specified members to view verification records, quality rule details, and quality reports.

    View Details: Choose between All Members or Only Members With Current Object Quality Management Permissions.

  2. Click OK to finalize the permission management settings.

What to do next

Once you have completed the quality rule configuration, you can view it on the data source rule list page. For more details, see View Monitored Object List.