Dataphin: Create Dataphin Table Quality Rules

Last Updated: Jan 21, 2025

Dataphin supports the creation of quality rules to ensure data table integrity, offering a streamlined approach to monitoring data table quality. This topic describes the steps to configure quality rules for Dataphin tables.

Prerequisites

Before configuring quality rules, you must add monitored objects. For more information, see Add Monitored Objects.

Permission Description

  • Super administrators, quality administrators, custom global roles with Quality Rule-Management permissions, and custom project roles with Project Quality Management-Quality Rule Management permissions for the project where the table resides can configure scheduling, alerts, exception archive tables, and scoring weights for quality rules.

  • Quality owners can configure scheduling, alerts, exception archive tables, and scoring weights for quality rules under the monitored objects they oversee.

  • Quality owners and regular users must also have read permissions for Dataphin tables. To request permissions, see Apply, Renew, and Return Table Permissions.

  • Operation permissions vary depending on the object. For details, see Quality Rule Operation Permissions.

Quality Rule Configuration Description

Quality rules support Custom Configuration and From Standard options, with the latter requiring the data standard module to be enabled.

  • Custom Configuration enables the rapid creation of quality rules using built-in or custom templates and supports custom SQL for flexible monitoring needs.

  • From Standard leverages quality rules configured in the data standards associated with the current asset object, ensuring adherence to standard constraints.

Validation Rule Description

When a data table is subject to quality rule validation, triggering a Weak Monitoring Rule prompts an alert message to quickly identify and rectify anomalies. Conversely, triggering a Strong Monitoring Rule halts the associated task to prevent the propagation of erroneous data, alongside an alert message for prompt resolution.

Differences Between Trial Run and Execution

The distinction between a Trial Run and Execution is based on the method of execution and the presentation of results. A Trial Run simulates the application of a quality rule to verify its accuracy and functionality, but its results are not included in the quality report. Conversely, Execution involves the assessment of quality rules during a defined period, with the outcomes being recorded in the quality report for user review and analysis.

Quality Rule Configuration

  1. Navigate to the Dataphin home page and select Administration > Data Quality from the top menu bar.

  2. Click Quality Rule in the left-side navigation pane. On the Dataphin Table page, select the target object to access the Quality Rule Details page and begin configuring quality rules.

    • Custom Configuration

      1. Hover over the data table, click Create Quality Rule, and choose Custom Configuration, or click the Create Quality Rule button to open the Create Quality Rule dialog box.

      2. In the Create Quality Rule dialog box, set the necessary parameters.

        Parameter

        Description

        Basic Information

        Rule Name

        The name of the custom quality rule, not exceeding 256 characters.

        Rule Strength

        Supports Weak Rule and Strong Rule.

        • Weak Rule: If you select Weak Rule, an alert is triggered when the quality rule validation result is abnormal, but downstream task nodes are not blocked.

        • Strong Rule: If you select Strong Rule, an alert is triggered when the quality rule validation result is abnormal. If there are downstream tasks (code inspection scheduling, task-triggered scheduling), the downstream tasks are blocked to prevent the spread of polluted data. If there are no downstream tasks (such as periodic quality scheduling), only an alert is triggered.

        Description

        Description of the custom quality rule. Not exceeding 128 characters.

        Configuration Method

        • Template Creation: Use general system templates and custom business templates to quickly create quality rules.

          • System Template: The parameters built into the template can be configured, suitable for creating general rules.

          • Custom Template: The template has preset parameters and does not require configuration, generally used for creating rules with business logic.

        • SQL: You can flexibly customize quality monitoring rules through SQL, suitable for flexible and complex scenarios.

        Rule Template

        You can select rule templates from the dropdown, such as Completeness, Uniqueness, Timeliness, Validity, Consistency, Stability, SQL.

        • Completeness: Includes Null Value Validation and Empty String Validation.

        • Uniqueness: Includes Uniqueness Validation, Field Group Count Validation, and Duplicate Value Count Validation.

        • Timeliness: Includes Time Comparison With Expression, Time Interval Comparison, and Time Interval Comparison In Two Tables.

        • Validity: Includes Column Format Validation, Column Length Validation, Column Value Domain Validation, Reference Table Validation, and Standard Reference Table Validation (requires activation of the data standard module).

        • Consistency: Includes Columns Value Consistency Validation, Columns Statistical Consistency Validation, Single Field Business Logic Consistency Comparison, Columns In Two Tables Value Consistency Validation, Columns In Two Tables Statistical Consistency Validation, Columns In Two Tables Processing Logic Consistency Validation, and Cross-Source Columns Statistical Consistency Validation.

        • Stability: Includes Table Stability Validation, Table Volatility Validation, Column Stability Validation, and Column Volatility Validation.

        • SQL: Includes Custom Statistic Validation and Custom Detail Value Validation.

        For more information, see the referenced document.

        Rule Type

        The rule type is related to the template and is the most basic property of the template. It can be used for description and filtering functions.

        Monitoring Granularity

        When the configuration type is Use Custom SQL Statement to Create a Dataset, you can configure the monitoring granularity: either the Entire Table or specific monitoring fields.

        Template Configuration

        Template Information

        When you select a quality rule template, the configuration information of the template is displayed for you. If you need to modify the configuration information, you can do so in Quality Rule Template.

        Rule Configuration

        Rule Configuration

        The rule configuration varies depending on the selected rule template. For more information, see Data Table Parameter Configuration.

        Special configurations are as follows:

        • Validation Table Data Filtering: Disabled by default. When enabled, you can configure a filter condition for the validation table, either a partition filter or a general data filter. The filter condition is appended directly to the validation SQL, as illustrated in the sketch after this procedure. If the validation table requires partition filtering, we recommend configuring the partition filter expression in the scheduling configuration. After configuration, the validation partition becomes the minimum granularity for viewing the quality report.

        • When the rule template is Consistency/Columns In Two Tables Statistical Consistency Validation or Consistency/Cross-Source Columns Statistical Consistency Validation, you can choose whether to enable Comparison Table Data Filtering. When enabled, you can configure a filter condition for the comparison table, either a partition filter or a general data filter. The filter condition is appended directly to the validation SQL.

        Validation Configuration

        Rule Validation

        • After data quality rule validation, the result is compared with the abnormal validation configuration. If the conditions are met, the validation result is considered failed, and alerts and other follow-up processes are triggered.

        • The indicators available for abnormal validation are determined by the template and configuration content. Multiple conditions can be combined with and/or; we recommend configuring fewer than three.

        For more information, see Validation Configuration Description.

        Archive Configuration

        Archive

        Disabled by default. When enabled, you can archive abnormal data into files or tables. After quality validation, you can download and parse the archived abnormal data.

        • Archive Mode supports Archive Only Abnormal Fields and Archive Complete Records.

          • Archive Only Abnormal Fields: Only deduplicate and archive the current monitoring fields. It is suitable for situations where abnormal data can be fully determined by a single field.

          • Archive Complete Records: Archive the entire record where the abnormal data is located. It is suitable for situations where the complete record is necessary to locate abnormal data.

            Note

            Because archiving complete records will significantly increase the amount of archived data, it is recommended to use Archive Only Abnormal Fields under normal circumstances.

        • Archive Location supports Default File Server and Archive Table For Exception Data. If an exception archive table has not been created, you can click Manage Exception Archive Table to create one. For more information, see Add Exception Archive Table.

          • Default File Server: Refers to the system file server configured when Dataphin is deployed. You can directly download abnormal data on the Validation Record-Validation Details page or directly access the default file server to obtain abnormal data. When using the default file server, a maximum of 100 abnormal data records can be archived per validation execution, suitable for scenarios with small data volume validation.

          • Archive Table For Exception Data: If you want to store more abnormal data or place abnormal data from different validation records together for subsequent comparison and analysis, it is recommended to specify an archive table. Each quality rule can record up to 10,000 abnormal data records per run. In addition to quickly downloading the abnormal data of a single validation on the validation record page, you can also directly access the archive table and customize the table lifecycle for greater flexibility.

            Note
            • When you download abnormal data from a validation record, the data generated by all rules involved in that run is summarized, and at most 1,000 records can be downloaded. To view more data, archive it to the specified exception archive table and then query the archive table directly.

            • The exception archive table needs to meet specific format requirements. Otherwise, errors may occur when writing data, affecting usage. For more information, see Add Exception Archive Table.

        Business Property Configuration

        Property Information

        The specification for filling in business properties depends on the configuration of quality rule properties. For example, if the field value type of the department in charge is an enumeration value (multiple choice) whose value range includes the Big Data Department, Business Department, and Technical Department, then when you create a quality rule, this property is a multiple-choice dropdown with those departments as options.

        If the field value type of the rule owner is custom input with a property field length of 256, this property value can be up to 256 characters when you create a quality rule.

        If the filling method of the property field is Range Interval, the configuration method is as follows:

        Range Interval: Commonly used when the value range is continuous numbers or dates. You can choose from four symbols: >, >=, <, <=. For more property configurations, see Create and Manage Quality Rule Properties.

        Scheduling Property Configuration

        Scheduling Method

        You can choose from the configured scheduling options. If you have not yet decided on a scheduling method, you can configure it after creating the quality rule. For more information on creating a new schedule, see Create New Schedule.

        Quality Score Configuration

        Scoring Method

        Supports two scoring methods: quality validation status and data qualification ratio.

        • Quality Validation Status: Scores based on the validation status of the most recent successful validation record of the current rule. A successful validation scores 100 points, while a failed validation scores 0 points.

        • Data Qualification Ratio: Uses the proportion of normal data (i.e., normal rate) in the most recent successful validation record of the current rule as the score. For example, if the data format validity is 80%, the quality score is 80 points.

        Different rule templates support different scoring methods. The following templates support only the Quality Validation Status scoring method:

        • Field group count validation and field duplicate value count validation in the uniqueness rule category.

        • Single table field statistical value consistency comparison and cross-source two tables field statistical value consistency comparison in the consistency rule category.

        • Stability rule category.

        • Custom statistical indicator validation in the custom SQL rule category.

        Quality Score Weight

        The quality score weight of the quality rule is used to calculate the quality score of the monitored object. You can choose an integer between 1 and 10.
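
        The documentation does not state how the weights are aggregated, but a weighted average is a natural reading. As a hedged illustration only (the two rules, their scores, and their weights below are hypothetical), the monitored object's score could be computed as follows:

          -- Hypothetical: rule A passed validation (score 100, weight 10);
          -- rule B has a data qualification ratio of 80% (score 80, weight 5).
          -- Assuming a weighted average: (100*10 + 80*5) / (10 + 5) = 93.3.
          select sum(score * weight) / sum(weight) as object_quality_score
          from (select 100 as score, 10 as weight
                union all
                select 80 as score, 5 as weight) t;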

      3. Click OK to finalize the custom rule configuration.

        Select Preview SQL to review SQL changes by comparing the current configuration with the last saved version.

        Note
        • If key information is incomplete, the SQL preview may be unavailable.

        • The left side displays the SQL preview of the last saved configuration, which is empty if not previously set. The right side shows the SQL preview based on the current configuration.

  • From Standard

    1. Hover over the data table and click Create Quality Rule, then select From Standard.

    2. In the From Standard dialog box, select the desired data standard rules to reference. Filter standard rules by templates such as Validity, Uniqueness, Completeness, and Stability, or search by object name.

      In the referenced data standard rules dialog box, you can modify the rule name, toggle the effective status, view standard details by clicking the associated standard code, or click the corresponding icon to inspect the quality rule.

      Note

      After a standard rule is referenced, its rule detail configuration cannot be altered. However, you can configure the schedule or adjust the rule strength and other parameters as needed.

    3. Click Add Selected Rules to complete the data standard rule referencing process.
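
Whether created through Custom Configuration or From Standard, a quality rule is ultimately executed as SQL against the monitored table, which is what Preview SQL displays. As a minimal sketch only (the table dws_order, the column buyer_id, and the filter condition are hypothetical, and the SQL that Dataphin actually generates may differ), a null-value completeness rule with Validation Table Data Filtering enabled might resolve to something like:

    -- Hypothetical validation SQL for a null-value completeness rule.
    -- The partition filter comes from the validation range of the schedule;
    -- the rule's filter condition is appended with AND, as described under
    -- Rule Configuration above.
    select count(*)                                          as total_cnt,
           sum(case when buyer_id is null then 1 else 0 end) as null_cnt
    from   dws_order
    where  ds = '${yyyyMMdd}'
    and    order_status != 'CANCELLED';
    -- The result is then compared with the abnormal validation configuration,
    -- for example: the validation fails if null_cnt > 0.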

Rule configuration list

Once a quality rule is established, you can view, edit, test, execute, or delete it from the rule configuration list.

Area

Description

Filter and Search Area

Supports quick search by object or rule name.

Supports filtering by rule type, rule template, rule intensity, trial run status, effective status, and rule source.

Note

If a quality rule property is configured as searchable and filterable and is enabled, you can search or filter by that property.

List Area

Displays the object type/name, rule name/ID, trial run status, effective status, rule type, rule template, rule intensity, schedule type, and related knowledge base document information for each rule. Click the icon in front of the refresh button to select the rule list fields you want to display.

  • If the referenced rule is a data standard rule, you can click the icon next to the rule name to view the standard details.

  • If the rule is a standard constraint on a development field, an icon is displayed after the rule name.

  • Effective Status: It is recommended to conduct a trial run before enabling the effective status of the rule. Enable the effective status for rules that pass the trial run to avoid blocking online tasks with incorrect rules.

    • After enabling the effective status, the selected rules will be automatically executed according to the configured schedule.

    • After disabling the effective status, the selected rules will not be automatically executed but can be executed manually.

  • Related Knowledge Base Document: Click View Details to view the knowledge base information associated with the rule. This includes table name, validation object, rule, and related knowledge base document information. You can also perform search, view, edit, and delete operations on the knowledge base. For more information, see the referenced document.

Operation Area

You can perform view, clone, edit, trial run, run, configure schedule, associate knowledge base document, configure quality score, and delete operations.

  • View: View the details of the rule configuration.

  • Clone: Quickly copy the rule.

  • Edit: After editing the rule, a trial run is required again. The information that can be edited for quality rules generated by different features varies. For more information, see the referenced document.

  • Trial Run: Supports selecting Existing Schedule or Custom Validation Scope for a trial run of rules. After the trial run, you can click the View Trial Run Log icon to view the log.

  • Run: Supports selecting Existing Schedule or Custom Validation Scope to run rules. After running, you can view the validation results in Quality Record.

  • Configure Schedule: Supports filtering by schedule type or searching by schedule name in the dialog box. It also supports editing schedules.

  • Associate Knowledge Base Document: After associating knowledge with the rule, you can view the associated knowledge in the quality rule and administration workbench. Supports selecting unassociated knowledge bases. For creation, see the referenced document.

  • Quality Score Configuration: Supports modifying the scoring method and quality score weight of quality rules.

    Important

    The quality score weight of the quality rule is used to calculate the quality score of the monitored object. Modifying it will affect the quality score results. Please proceed with caution.

  • Delete: Deleting this quality rule object will delete all quality rules under the object. This action cannot be undone. Please proceed with caution.

Batch Operation Area

You can perform batch trial run, run, configure schedule, enable, disable, modify business attributes, associate knowledge base document, configure quality score, export rules, and delete operations.

  • Trial Run: Supports selecting Existing Schedule or Custom Validation Scope for a batch trial run of rules. After the trial run, you can click the View Trial Run Log icon to view the log.

  • Run: Supports selecting Existing Schedule or Custom Validation Scope to batch run rules. After running, you can view the validation results in Quality Record.

    Note

    When running in batch, it is recommended to select tables with the same partition, because the partition information is issued directly for execution. If partitions are inconsistent, errors may occur.

  • Configure Schedule: Supports filtering by schedule type or searching by schedule name in the dialog box. It also supports editing schedules and batch configuring schedules for quality rules. Among the selected rules, only those that can be edited on the quality rule list page can be modified.

  • Enable: After batch enabling the effective status, the selected rules are automatically executed according to the configured schedule. Among the selected rules, only those that can be edited on the quality rule list page can be enabled.

  • Disable: After batch disabling the effective status, the selected rules are not automatically executed but can still be executed manually. Among the selected rules, only those that can be edited on the quality rule list page can be disabled.

  • Modify Business Attributes: When the field value type corresponding to the business attribute is single or multiple choice, batch modification of business attributes is supported.

    • When the field value type corresponding to the business attribute is multiple choice, appending or modifying attribute values is supported.

    • When the field value type corresponding to the business attribute is single choice, direct modification of attribute values is supported.

  • Associate Knowledge Base Document: After associating knowledge with the rule, you can view the associated knowledge in the quality rule and administration workbench. Supports batch configuration of knowledge bases for monitored objects. For creation, see the referenced document.

  • Quality Score Configuration: Supports batch modification of the scoring method and quality score weight of quality rules.

    Important

    The quality score weight of the quality rule is used to calculate the quality score of the monitored object. Modifying it will affect the quality score results. Please proceed with caution.

  • Export Rules: Exports the selected custom SQL quality rules under the current monitored object for which you have view permissions.

  • Delete: Supports batch deletion of quality rule objects. This action cannot be undone. Please proceed with caution. Only rules for which you have edit permissions can be deleted.

Create a schedule

Note
  • You can swiftly set up a schedule using existing templates, with the ability to apply up to 20 scheduling rules per table.

  • A maximum of 10 schedules can be configured for the same rule.

  • Automatic deduplication is supported when the schedule configurations are identical.

  • For Hologres partitioned tables, it is advisable to use a fixed task to trigger the schedule.

  • The validation range, issued as a filter condition in the quality validation statement, determines the scope of each validation and serves as the smallest unit for quality reports and downstream processes.

  1. On the Quality Rule Details page, select the Schedule Configuration tab, then click Create Schedule to open the Create Schedule dialog box.

  2. In the Create Schedule dialog box, enter the required parameters.

    Parameter

    Description

    Schedule Name

    Custom schedule name, not exceeding 64 characters.

    Schedule Type

    Supports Recurrency Triggered, Data Update Triggered, and Task Triggered.

    • Recurrency Triggered: Based on the set schedule time, it supports timed, periodic quality checks on data. It is suitable for scenarios where data production time is relatively fixed.

      • Recurrence: Running quality rules occupies computing resources. To avoid affecting the normal operation of production tasks, we recommend not executing multiple quality rules concurrently. The scheduling cycle includes five types: Day, Week, Month, Hour, and Minute.

      • Fill In Recommended Time: You can click Fill In Recommended Time, and the system recommends a time based on the average end time of the current table's production tasks.

    • Data Update Triggered: After each code task is executed, the system parses whether the specified validation range of the current table was updated during the task run. This is suitable for tables modified by non-fixed tasks or tables that require focused monitoring, where every change needs to be validated.

      Note

      It is recommended to select the partition updated by the task as the validation range (non-partitioned tables will validate the entire table). The system will automatically detect all data changes and perform validation to avoid omissions.

    • Task Triggered: Executes the configured quality rules before the specified task runs or after it runs successfully. Supports selecting task types such as engine SQL, offline pipeline, Python, Shell, Virtual, Datax, Spark_jar, Hive_MR, and database SQL nodes as trigger tasks. This is suitable for scenarios where table modification tasks are fixed.

      Note

      Fixed task triggers can only select production environment tasks. If the rule intensity is configured as a strong rule and the scheduled task validation fails, online tasks may be affected. Operate cautiously according to business needs.

      • Trigger Timing: Select the trigger timing for quality checks. It supports selecting Trigger After All Tasks Run Successfully, Trigger After Each Task Runs Successfully, and Trigger Before Each Task Runs.

      • Triggering Task: Supports selecting production task nodes for which the current user has maintenance permissions. You can search by node output name or select from recommended tasks or all tasks.

        • Recommended Tasks: Displays lineage tasks where the current table serves as the output table, with node output name = section name/project name.table name, i.e., the task range of Asset Directory - Asset Details - Output Information.

        • All Tasks: Displays all production tasks for which the current user has maintenance permissions.

          Note
          • When the trigger timing is set to trigger after all tasks run successfully, it is recommended to select tasks with the same scheduling cycle to avoid delayed rule execution and delayed quality check results due to different scheduling cycles.

          • When the trigger timing is set to trigger before each task runs, the recommended tasks will display lineage tasks where the current table serves as input.

    Schedule Condition

    Disabled by default. When enabled, the system first checks whether the scheduling conditions are met before the quality rule is officially scheduled. If the conditions are met, the rule is scheduled; if not, this schedule is skipped.

    • Data Timestamp/Executed On: If the schedule type is Recurrency Triggered (timed scheduling does not support execution date), Data Update Triggered, or Task Triggered, date configuration is supported. You can choose Regular Calendar or Custom Calendar. For how to customize a calendar, see Create a Public Calendar.

      • If you choose Regular Calendar, the conditions can be Month, Week, and Date.

      • If you choose Custom Calendar, the conditions can be Date Type and Tag.

    • Instance Type: If the schedule type is Data Update Triggered or Task Triggered, instance type configuration is supported. You can choose Recurring Instance, Data Backfill Instance, or One-time Instance.

    Note
    • At least one rule must be configured. To add a rule, click the + Add Rule button.

    • A maximum of 10 scheduling conditions can be configured.

    • The relationship between scheduling conditions can be configured as "and" or "or".

    Validation Range

    When the schedule type is Recurrency Triggered or Task Triggered, the validation range supports Custom Validation Range. When the schedule type is Data Update Triggered, the validation range supports Updated Partition and Custom Validation Range.

    • Updated Partition: If a partition was updated by the triggering task, the validation is issued directly for the updated partition.

      Note
      • In dynamic partition scenarios, the partition may not be parsed, and quality validation will not be performed.

      • Volatility validation rules (such as checking partition size, partition row count, field statistics) require specifying a partition and do not support task-updated partition validation range.

      • For non-partitioned tables, if there is data update, the entire table will be validated.

    • Custom Validation Range: For scenarios that cannot be parsed, you can use a custom validation range to specify the validation range expression based on the data timestamp or execution date.

      • Validation Range Expression: A dropdown box that also accepts direct input of the required validation range, such as ds='${yyyyMMdd}'. You can also select a built-in validation range expression and then modify it for quick configuration. For details on partition expressions, see Built-in Partition Expression Types and the resolution sketch after this procedure.

        Note
        • If there are multiple conditions for validation, you can use and or or to connect them, such as province='Zhejiang' and ds<='${yyyyMMdd}'.

        • If a filter condition is configured in the quality rule, the relationship between the validation range expression and the filter condition is "and". Both conditions will be filtered together during data validation.

        • The validation range expression supports full table scans.

          Note: A full table scan consumes significant resources, and some data sources do not support it. We recommend configuring partition expressions to avoid full table scans.

      • Validation Range Preview: By default, the preview uses the current day's data timestamp.

  3. Click OK to finalize the schedule setup.
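
As a hedged illustration of how a custom validation range expression might resolve at run time (the table, the date, and the extra filter condition are hypothetical; see Built-in Partition Expression Types for the authoritative variable list):

    -- Expression as configured:               ds <= '${yyyyMMdd}'
    -- Resolved for data timestamp 20250120:   ds <= '20250120'
    -- The rule's own filter condition (here, order_amount > 0) is combined
    -- with the resolved validation range using AND, as noted above.
    select count(*)
    from   dws_order
    where  province = 'Zhejiang'
    and    ds <= '20250120'
    and    order_amount > 0;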

Schedule Configuration List

After creating a schedule, you can view, edit, clone, delete, and perform other actions within the schedule configuration list.

Area

Description

Filter and Search Area

Supports quick search by schedule name.

Supports filtering by Recurrency Triggered, Data Update Triggered, and Task Triggered.

List Area

Displays the Schedule Name, Schedule Type, Last Updated By, and Last Updated Time of each schedule in the schedule configuration list.

Operation Area

You can edit, clone, or delete the schedule.

  • Edit: You can modify the configured schedule information.

    Important

    All rule configurations referencing this schedule will change synchronously. Please operate cautiously.

  • Clone: Quickly copy the schedule configuration.

  • Delete: Schedules referenced by rule configurations cannot be deleted.

Alert configurations

Configure distinct alert methods for various rules to differentiate alerts, such as phone alerts for strong rule anomalies and text message alerts for weak rule anomalies. If a rule triggers multiple alert configurations simultaneously, set the effective policy for the alert.

Note

A single monitored object can have up to 20 alert configurations.

  1. On the Quality Rule Details page, click the Alert Configurations tab, then click the Create Alert Configuration button to open the Create Alert Configuration dialog box.

  2. In the Create Alert Configuration dialog box, input the necessary parameters.

    Parameter

    Description

    Coverage

    Supports selecting All Rules, All Strong Rules, All Weak Rules, and Custom.

    Note
    • For a single monitored object, the three scopes (all rules, all strong rules, and all weak rules) each support one alert configuration. Newly added rules automatically match the corresponding alert according to their rule strength. If you need to change one of these alert configurations, modify the existing configuration.

    • The custom scope can select all configured rules under the current monitored object, not exceeding 200.

    Alert Configuration Name

    For a single monitored object, the alert configuration name is unique and does not exceed 256 characters.

    Alert Recipient

    Configure the alert recipient and alert method. You need to select at least one alert recipient and alert method.

    • Alert Recipient: Supports selecting custom, shift schedule, and quality owner as alert recipients.

      Supports configuring no more than 5 custom alert recipients and no more than 3 shift schedules.

    • Alert Method: Supports selecting different receiving methods such as phone, email, text message, DingTalk, Lark, WeCom, and custom channel. The available receiving methods are controlled by the channel settings.

  3. Click OK to complete the alert configuration.

Alert configuration list

After setting up the alert configuration, you can manage the configurations by sorting, editing, or deleting them in the alert configuration list.

Number

Description

① Sort Area

Supports configuring the effective policy for alerts when a quality rule meets multiple alert configurations:

  • The First Hit Alert Configuration Takes Effect: When this policy is selected, only the first alert configuration hit by the rule takes effect; other configurations do not. You can sort the configured alerts: click Rule Sort, then drag the icon in front of an alert configuration name, or use the move-to-top and move-to-bottom icons in the operation column. After adjusting the alert order, click Sort Complete to finish sorting.

  • All Alert Configurations Take Effect: The alerts in the current alert configuration list take effect for the quality rules under the current monitored object.

    For example, when you configure multiple alert configurations and choose All Alert Configurations Take Effect, the system merges alerts by alert receiving method + alert recipient + alert rule. In particular, if the alert recipient is the same and the alert methods are custom and quality owner, alerts are merged according to the merge policy.

    Note

    Shift schedules do not support alert merging.

② List Area

Displays the name of the alert configuration, the effective scope, the specific recipients of each alert type, and the corresponding alert receiving method.

Effective Scope: Custom alerts support viewing the configured object name and rule name. If the rule is deleted, the object name cannot be viewed. It is recommended to update the alert configuration.

③ Operation Area

You can perform edit and delete operations on the configured alerts.

  • Edit: Supports modifying the configured alert information. If you modify the alert recipient or alert method, notify the relevant personnel promptly to avoid missed business alerts.

  • Delete: After deletion, alerts for the rules matched by this configuration no longer take effect. Please operate with caution.

Add abnormal data archiving table

The abnormal data archiving table archives records that do not meet quality rule standards.

  1. On the Quality Rule Details page, click the Archive tab, then click the + Add Abnormal Data Archiving Table button to open the Add Abnormal Data Archiving Table dialog box.

  2. In the Add Abnormal Data Archiving Table dialog box, configure the parameters.

    The Add Method offers two options: Create New Table and Select Existing Table. The archiving table will not overwrite the original data table and will include special quality validation fields.

    • Create New Table: Customize the table name, which must belong to the project or module of the monitored table. By default, the name is current_table_name_exception_data. Upon successful creation, a new table is generated in the same database or data source. The name can include letters, numbers, underscores (_), and periods (.), with a maximum length of 128 characters.

      • If the monitored table is a physical table, the archiving table will be created in the project where the monitored table resides.

      • If the monitored table is a logical dimension table or logical fact table, the archiving table will default to the project of the monitored table. You can also specify a project under the monitored table module, such as projectA.table_name.

      • For logical aggregate tables, it is recommended to specify a project name under the same module for the archiving table; otherwise, it defaults to a project under the module of the monitored table.

      • The archiving table must include all fields and validation fields of the quality monitoring table. The script format is as follows:

        create table current_table_name_exception_data
         (dataphin_quality_tenant_id      varchar(64)   comment 'Tenant ID' , 
          dataphin_quality_rule_id        varchar(64)   comment 'Quality Rule ID', 
          dataphin_quality_rule_name      varchar(256)  comment 'Quality Rule Name', 
          dataphin_quality_column_name    varchar(1024) comment 'Validation Field Name', 
          dataphin_quality_watch_task_id  varchar(128)  comment 'Monitored Object Task ID', 
          dataphin_quality_rule_task_id   varchar(64)   comment 'Rule Task ID', 
          dataphin_quality_validate_time  varchar(64)   comment 'Quality Validation Time', 
          dataphin_quality_archive_mode   varchar(32)   comment 'Abnormal Data Archiving Mode, ONLY_ERROR_FIELD/FULL_RECORD', 
          dataphin_quality_error_data     string        comment 'Abnormal Data', 
          ljba_id                         bigint        comment  'ljba Primary Key', 
          ljb_id                          bigint        comment  'ljb Primary Key', 
          col_tinyint                     tinyint       comment 'Field type is TINYINT and lowercase',
          col_tinyint_02                  tinyint       comment '2',
          col_smallint                    smallint      comment 'Field type is SMALLINT and lowercase',
          col_smallint_02                 smallint      comment '4',
          col_int                         int           comment 'Field type is INT and lowercase',
          col_int_02                      int           comment '6',
          col_bigint                      bigint        comment 'Field type is BIGINT and lowercase',
          col_bigint_02                   bigint        comment '8',
          col_float                       float         comment 'Field type is FLOAT and lowercase',
          col_float_02                    float         comment '10',
          col_double                      double        comment 'Field type is DOUBLE and lowercase',
          col_double_02                   double        comment '11',
          col_decimal                     decimal(38,18) comment 'Field type is DECIMAL(38,18) and lowercase',
          col_decimal_02                  decimal(38,18) comment '12',
          col_varchar                     varchar(500)   comment 'Field type is VARCHAR(500) and lowercase',
          col_varchar_02                  varchar(500)   comment '13',
          col_char                        char(10)       comment 'Field type is CHAR(10) and lowercase',
          col_char_02                     char(10)       comment '14',
          col_string                      string         comment 'Field type is STRING and lowercase',
          col_string_02                   string         comment '15',
          col_date                        date           comment 'Field type is DATE and lowercase',
          col_date_02                     date           comment '16',
          col_datetime                    datetime       comment 'Field type is DATETIME and lowercase',
          col_datetime_02                 datetime       comment '17',
          col_timestamp                   timestamp      comment 'Field type is TIMESTAMP and lowercase',
          col_timestamp_02                timestamp      comment '18',
          col_boolean                     boolean        comment 'Field type is BOOLEAN and lowercase',
          col_boolean_02                  boolean        comment '19',
          col_binary                      binary         comment 'Field type is BINARY and lowercase',
          col_binary_02                   binary         comment '20',
          col_array                       array<int>     comment 'Field type is ARRAY<int> and lowercase',
          col_array_02                    array<string>  comment '21',
          col_map                         map<string,string>  comment 'Field type is MAP<string, string> and lowercase',
          col_map_02                      map<string,int>     comment '22',
          ds                              string              comment 'Date partition, yyyyMMdd'
         ) 
        partitioned by 
        (dataphin_quality_validate_date string comment 'Validation Date (Partition Field)');
    • Select Existing Table: Choose a table from the same project or data source. The archiving table must include all fields and validation fields of the quality monitoring table. Click View Abnormal Data Archiving Table DDL to see the table creation statement; the script format is the same as the Create New Table example above.
  3. Click OK to add the abnormal data archiving table.

    Choose Automatically Set As Effective Archiving Table After Creation for convenience when creating future quality rules.
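
Because the archiving table is a regular table with the schema shown above, you can query the archived abnormal data directly. A minimal sketch, assuming the default table name and the DDL above (the validation date and rule ID are placeholders):

    -- Fetch the abnormal records archived by one quality rule on one validation date.
    select dataphin_quality_rule_name,
           dataphin_quality_column_name,
           dataphin_quality_archive_mode,  -- ONLY_ERROR_FIELD or FULL_RECORD
           dataphin_quality_error_data
    from   current_table_name_exception_data
    where  dataphin_quality_validate_date = '20250120'  -- partition field
    and    dataphin_quality_rule_id = '123';            -- placeholder rule ID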

View abnormal data archiving table list

Following successful addition, the first table becomes the default active archiving table. Click the table name to view its structure. You can also set other tables as the active archiving table or delete them.

  • Set As Effective Archiving Table: Designating this table as the active archiving table ensures all new quality rules under this monitored object that select the custom archiving option will archive data to this table.

  • Delete: This action removes only the reference to the archiving table, not the table itself. It can be re-added if necessary.

View quality report

Click Quality Report to access the Rule Validation Overview and Rule Validation Details for the current quality rules.

  • Filter validation details quickly by abnormal results, partition time, rule, or object name keywords.

  • In the rule validation details list, click the details icon to view detailed validation information for the quality rule.

  • In the rule validation details list, click the log icon to examine the execution log of the quality rule.

Set quality rule permission management

  1. Select Permission Management to configure View Details, which allows specified members to access validation records, quality rule details, and quality reports.

    View Details: Choose between All Members or Only Members With Current Object Quality Management Permission.

  2. Click OK to finalize the permission management settings.

What to do next

After configuring the quality rules, you can manage and view them on the Dataphin table rule list page. For more information, see the monitored object list.