Data Management (DMS) allows you to create quality rules for the tables in a database based on the tables or the columns of the tables so as to ensure data consistency, uniqueness, and availability for the tables. This also improves the quality of secondary analysis and development.

Prerequisites

  • The database instance for which you want to enable metadata access control is managed in Security Collaboration mode.
  • You have one of the following permissions:
    • You are a DMS administrator or a database administrator (DBA) in DMS. Alternatively, you assume the schema read-only role. For more information, see System roles.
    • You are the owner or the DBA of the database instance that you want to manage. For more information, see Modify an instance.
    • You have the permissions to query, change, or export the data of a database or a table in the database. For more information, see View owned permissions.

Background information

During the digital transformation of enterprises, more and more production data can be used to perform secondary analysis and development. However, the production data usually comes from various sources and has diverse structures. In addition, the production data has large volumes and is mutually dependent. In this case, you must ensure that the data is valid and meets requirements as expected.

To resolve the preceding issue, DMS provides the Data quality feature. You can use the feature to customize quality rules. For example, create a quality rule where the number of null values in a column must be less than zero. In addition, you can use the Audit Task feature of a task flow to periodically inspect the data quality of development tasks based on your quality rules. This can ensure the consistency, uniqueness, and availability of data.

The following table provides an overview of the operation process.

Step Description
Step 1: Create a quality rule Create quality rules for the table that you want to manage as needed.
Step 2: Create an audit task
  • Create an audit task for the quality rules that you create to periodically inspect the quality of data.Audit Task
  • Create an audit task for a specific development task and specify the quality rules that you create to periodically inspect the quality of data.Audit Task For example, before the obtained production data is used for secondary analysis, you can check whether the data quality is as expected. This ensures a successful data analysis.

Step 1: Create a quality rule

  1. Log on to the DMS console.
  2. In the search box at the top of the DMS console, enter the name of the table that you want to query. Click Details to go to the Tables page.
    Global search: Details
    Note To go to the Tables page, you can also click the Table List icon (Table details) in the SQLConsole and click the name of the table that you want to query.
  3. Click the Data quality tab.
  4. Click Create Rule. In the panel that appears, set the parameters that are described in the following table.
    Parameter Description
    Rule Name The name of the quality rule.
    Rule Type The type of the quality rule. Valid values:
    • Table Rule: the quality rule based on a table. For example, create a quality rule based on a table to check whether the number of rows in the table is as expected.
    • Column Rule: the quality rule based on a column of a table. For example, create a quality rule based on a column of a table to check the number of null values in the column.
    Rule Template The rule template that is preset by DMS. For more information, see Quality rule templates.
    Column Name The name of the column for which the quality rule is created.

    This parameter is displayed only when you set the Rule Type parameter to Column Rule.

    Comparison Method The comparison method. You can compare the actual value of data in the table with the value of the Expected Value parameter that you specify.

    Valid values: Greater, Greater than or equal, Equal, Is less than or equal, Less, and Not equal.

    Expected Value The value that you expect.
    For example, you want to create a quality rule where the number of null values in the column named newtest must be less than or equal to 10. You can set the following parameters:
    • Set the Rule Name parameter to The number of null values in the column named newtest must be less than or equal to 10.
    • Set the Rule Type parameter to Column Rule.
    • Set the Rule Template parameter to The number of hollow values in the column, fixed value.
    • Set the Column Name parameter to newtest.
    • Set the Comparison Method parameter to Is less than or equal.
    • Set the Expected Value parameter to 10.
  5. Click OK.
    The quality rule is created. By default, the rule is in the Enabled state. You can create an audit task based on the rule to check the data quality of the table. For more information, see Step 2: Create an audit task.
    Note You can also perform the following operations:
    • To modify a quality rule, move the pointer over More in the Actions column and select Edit.
    • To view the earlier versions of a quality rule, move the pointer over More in the Actions column and select Logs.
    • To disable a quality rule, click Disable in the Actions column. After the quality rule is disabled, you cannot create an audit task for the quality rule.

      If the quality rule is used by an audit task, the quality rule cannot be disabled. Delete the audit task and try again.

Step 2: Create an audit task

An audit task is used to check the data quality of a table. After you specify a quality rule for the table and a scheduling cycle for the audit task, DMS checks the data quality of the table and generates a report.Audit Task

  1. Log on to the DMS console.
  2. In the search box at the top of the DMS console, enter the name of the table that you want to query. Click Details to go to the Tables page.
    Global search: Details
    Note To go to the Tables page, you can also click the Table List icon (Table details) in the SQLConsole and click the name of the table that you want to query.
  3. Click the Data quality tab.
  4. Find the quality rule for which you want to create an audit task, move the pointer over More in the Actions column, and then select New Task Flow.
    New Task Flow
    A task orchestration page appears.
  5. On the task orchestration page, configure an Audit Task node.
    1. Click the task node that you want to configure. In the right-side pane, click the Content tab and configure the task node.
      Configure the audit taskBy default, the database, the name of the table in the database, and the quality rule for the table are selected.
    2. Click Save.
    Note You can also create quality rules for the task node or add other task nodes to the same task flow. For more information, see Create and connect task nodes in a task flow.
  6. Configure scheduling properties for the task flow.
    You can schedule the task flow to be run as required. For example, you can schedule the task flow to be run by the day, week, or month. For more information, see Configure scheduling properties for a task flow.
  7. Click the Properties tab and turn on Enable message notification.
    After you turn on this switch, the owner and relevant users that you specify for the Owner and Stakeholders parameters can receive the notifications about the status of the task flow. For more information, see Manage notification rules.
    Note You can also configure basic properties as needed, such as Task Flow Name, Owner, and Stakeholders.
  8. Publish the task flow with the latest configurations. For more information, see Publish a task flow.
    After the task flow is published, DMS checks the data quality of the table based on the specified quality rule for the table and the specified scheduling cycle for the audit task.

Quality rule templates

Rule Type Rule Template
Table Rule The number of rows in the table, fixed value: checks whether the number of rows in a table is as expected. For example, specify that the number of rows in a table must be less than 100,000.
Column Rule
  • The maximum value of the column: checks whether the maximum value of a column is as expected. For example, specify that the maximum value of the primary key in a table cannot exceed 2147483647, which is the result of 2³¹ minus 1.
  • Minimum value of column: checks whether the minimum value of a column is as expected. For example, specify that the minimum value of a field in a table cannot be smaller than 3.
  • The number of expressions that contain wildcards, fixed value: checks whether the number of expressions that contain wildcards in a column is as expected. For example, specify that the number of expressions that start with a in a field must be less than zero.
  • The number of regular expressions, fixed value: checks whether the number of regular expressions in a column that meet the format requirements is as expected. For example, specify that the number of regular expressions in a column that meet the format requirements must be greater than 100.
  • The number of hollow values in the column, fixed value: checks whether the number of null values in a column is as expected. For example, specify that the number of null values in a column must be less than or equal to zero.
  • The number of unique values in the column, fixed value: checks whether the number of unique values in a column is as expected. For example, specify that the number of unique values in a column must be greater than or equal to 100. This ensures that 100 different values appear among all data.
  • The number of duplicate values in the column, fixed value: checks whether the number of duplicate values in a column is as expected. For example, specify that the number of duplicate values in a column must be less than 100. This ensures that the number of duplicate values among all data does not exceed 100.