Dataphin: Configure integrated pipeline quality monitoring

Last Updated: Jan 21, 2025

Dataphin provides quality monitoring for offline integration pipelines and automatically monitors the quality of your data tables. When a quality issue is detected, the system alerts the designated recipients so that they learn of it promptly. This topic describes how to configure quality rules.

Permission description

  • Project developers have the authority to create integration tasks, set up quality rules, and submit tasks for publication.

  • Quality administrators, data source/table quality owners, and super administrators are permitted to create and publish quality rules within integration tasks.

Create offline pipeline quality monitoring

  1. Navigate to the Dataphin home page and select Development from the top menu bar.

  2. Follow the path below to open the New Quality Rule dialog box, where you can configure quality rules and monitor data table quality. Configuring quality rules requires the asset quality module to be enabled.

    Select Integration -> select the project -> click Offline Integration -> select the offline pipeline -> click Quality Monitoring -> click New Quality Rule.

  3. In the New Quality Rule dialog box, input the necessary parameters for the quality rule.

    Parameter

    Description

    Rule Template

    Choose a rule template from options such as Table Structure Change Monitoring, Table Stability Validation, and Table Volatility Validation.

    Data Table

    Select the data table from the production environment within the integration pipeline that corresponds to the chosen rule template. Click More Rules to navigate to the Data Quality module for additional rule template configurations. For further details, see create data table quality rules.

    Rule Strength

    Determine the rule's enforcement level:

    • Choosing Strong Rule means an alert will be issued and downstream task nodes will be halted if the rule validation detects an anomaly.

    • Opting for Weak Rule will trigger an alert without interrupting downstream task nodes, even if the rule validation uncovers an anomaly.

    Rule Configuration

    Configuration is necessary when selecting the Table Stability Validation or Table Volatility Validation templates. The specifics of the configuration will vary based on the chosen template. For more information, see data table parameter configuration.

    Rule Validation

    Configuration is also required for the Table Stability Validation or Table Volatility Validation templates.

    • After validation, the result is compared against the abnormal validation conditions. If the conditions are met, the validation is deemed failed, which triggers subsequent processes such as alerts.

    • The metrics available for abnormal validation depend on the template and its configuration. Multiple conditions can be combined with AND/OR; in practice, it is advisable to use no more than three.

    For a detailed explanation, refer to validation configuration description.

    Scheduling Method

    Choose from scheduling options such as Recurrency Triggered, Scheduled Before Task Execution, and Scheduled After Task Completion.

    • Recurrency Triggered: Executes periodic quality assessments based on a predefined schedule, ideal for scenarios with regular data production.

      • Recurrence: Includes five cycle types: Day, Week, Month, Hour, and Minute.

    • Scheduled Before Task Execution: Initiates quality rule checks prior to the execution of the integration task.

    • Scheduled After Task Completion: Conducts quality rule checks following the successful completion of the integration task.

    Partition Filter Expression

    For partitioned tables, a partition filter expression is mandatory. Configure both the Partition Filter Expression Type and the Custom Partition Filter Expression. For additional information, see built-in partition filter expression types.

    Partition Budget

    Currently, partition calculations are based on the analysis expression.

  4. To finalize the quality rule setup, click OK.

    Note

    After setting the quality rules, verify in the Asset Quality module that the monitored object is configured to receive alerts for the current rule. If necessary, consult alert configuration for guidance.

Once configured, the rule details can be viewed in the Quality Monitoring section. Additionally, you can edit or delete unpublished rules using the respective Edit and Delete options.
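
The interaction between Rule Strength and Rule Validation described in step 3 can be sketched in a few lines of Python. This is only an illustrative model, not Dataphin's implementation or API: the names QualityRule, Condition, Outcome, evaluate, and the metric keys are hypothetical, and the sketch assumes a single AND/OR mode per rule for simplicity.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Strength(Enum):
    STRONG = "strong"  # alert and halt downstream task nodes
    WEAK = "weak"      # alert only, downstream task nodes keep running


@dataclass
class Condition:
    """One abnormal-validation condition on a named metric (hypothetical)."""
    metric: str
    predicate: Callable[[float], bool]


@dataclass
class QualityRule:
    name: str
    strength: Strength
    conditions: List[Condition]
    combine: str = "or"  # "and" or "or"; assumed simplification

    def is_abnormal(self, metrics: Dict[str, float]) -> bool:
        # Evaluate every condition, then combine the results with AND or OR.
        hits = [c.predicate(metrics[c.metric]) for c in self.conditions]
        return all(hits) if self.combine == "and" else any(hits)


@dataclass
class Outcome:
    passed: bool
    alert: bool
    block_downstream: bool


def evaluate(rule: QualityRule, metrics: Dict[str, float]) -> Outcome:
    """Map a validation result to the behavior described for each strength."""
    if not rule.is_abnormal(metrics):
        return Outcome(passed=True, alert=False, block_downstream=False)
    # Abnormal conditions met: both strengths alert, only strong rules block.
    return Outcome(
        passed=False,
        alert=True,
        block_downstream=rule.strength is Strength.STRONG,
    )


# Example: a strong rule with two conditions combined with OR.
rule = QualityRule(
    name="orders_volatility",
    strength=Strength.STRONG,
    conditions=[
        Condition("row_count_change_pct", lambda v: abs(v) > 50),
        Condition("row_count", lambda v: v == 0),
    ],
    combine="or",
)
result = evaluate(rule, {"row_count_change_pct": 72.0, "row_count": 1200})
```

In this example the first condition is met, so the rule fails, an alert is raised, and downstream nodes are blocked because the rule is strong. Changing the strength to Strength.WEAK would keep the alert but leave downstream nodes running.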

View published quality rules

Upon opening the quality rule dialog box within the integration pipeline, Dataphin automatically retrieves all configured and published quality rules for the pipeline's data tables. The integration pipeline is designed solely for the creation of quality rules; it does not support the editing or modification of published rules. To make changes, please visit the Asset Quality page. For more details, refer to create data table quality rules.