DataWorks: Create a data masking rule

Last Updated: Dec 25, 2025

DataWorks supports multiple data masking scenarios. This topic describes how to select a scenario, create a data masking rule, and perform masked queries in DataWorks.

Background

DataWorks provides two types of data masking: static and dynamic.

  • Dynamic data masking includes scenarios such as Data development / Data map display desensitization, Data analysis and display desensitization, Layer masking of the MaxCompute engine, and Hologres layer masking.

  • Static data masking is used for data integration scenarios.

By default, a data masking rule is inactive after it is created. You must manually activate the rule to enable automatic data masking in the relevant scenarios.

Prerequisites

  • (Optional, for dynamic data masking only) Configure sensitive data detection rules as needed. This lets you associate the fields that require masking when you create data masking rules. For more information, see Sensitive data detection rules.

  • (Optional, for dynamic data masking only) You can use a whitelist to allow specific users to view raw data for a specified period. To do this, you must add the users to a user group. For more information, see Configure user groups.

  • (Optional, for data masking at the MaxCompute engine layer only) To configure data masking at the MaxCompute engine layer, you must add the IP address of Data Security Guard to the MaxCompute network whitelist. This lets you call data masking functions to mask sensitive data in query results that you obtain from sources other than DataWorks, such as the MaxCompute command line client (odpscmd) or Logview. For more information, see Example: Use underlying data masking in E-MapReduce.

Access control

  • Permissions for configuring data masking rules (create, edit, and delete):

    • Tenant administrators and tenant security administrators can manage data masking rules in all scenarios.

    • Workspace administrators and workspace security administrators can manage data masking rules only in scenarios where they have permissions.

  • Permissions for configuring whitelists (create, edit, and delete):

    • Tenant administrators and tenant security administrators can manage whitelists in all scenarios.

    • Workspace administrators and workspace security administrators can manage whitelists only in scenarios where they have permissions.

You must be granted the required role permissions to perform these operations. For more information about authorization, see Manage permissions on workspace-level modules and Manage permissions on global-level modules.

Entry point for data masking rule configuration

  1. Go to the DataStudio page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and O&M > Data Development. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.

  2. Click the icon in the upper-left corner. Then, choose All Products > Data Governance > Data Security Guard. On the page that appears, click Try Now to go to the Data Security Guard page.

    Note
    • If your Alibaba Cloud account is granted the required permissions, you can directly access the homepage of Data Security Guard.

    • If your Alibaba Cloud account is not granted the required permissions, you are redirected to the authorization page of Data Security Guard. You can use the features of Data Security Guard only after your Alibaba Cloud account is granted the required permissions.

  3. In the navigation pane on the left, click Rule Configuration > Data Masking Management.

  4. In the navigation pane on the left, select a data masking scenario, and then click Masking Rule on the right to create a rule for that scenario.

Create a dynamic data masking rule: Masking of displayed data in Data Development and Data Map scenario

  1. Select a data masking scenario.

    On the Data Masking Management page, in the Masking Scene section, select Data development / Data map display desensitization > Default Scenario. Then, click + Masking Rule on the right.

  2. Create a data masking rule.

    1. Open the Create Data Masking Rule dialog box.

    2. Configure the parameters for the rule.

      1. Select a sensitive field type and specify the rule name.

        Parameter

        Description

        Sensitive Field Type

        Select the type of field to mask.

        • You can select built-in sensitive field types or custom sensitive field types that you added in sensitive data detection. For more information about how to add a sensitive field type, see Sensitive data detection rules.

        • If a data masking rule already exists for a sensitive field type in the same scenario, DataWorks removes that type from the list. This prevents inconsistent rules for the same sensitive field in the same scenario.

        Data Masking Rule Name

        By default, this is the same as the Sensitive Field Type. You can also specify a custom name. The rule name must be unique.

      2. Configure data masking scenarios.

        Select the scenarios to which this rule applies. By default, the scenario that you selected in Step 1 is used. You can change the scenario or add more scenarios as needed.

      3. Configure the data masking method.

        DataWorks supports several methods, such as Pseudonym, Masking out, HASH, Characters to replace, Range transform, Integer, and Empty. You can select a method based on your requirements.

        Pseudonym

        This method replaces a value with a masked value that has the same characteristics, preserving the data format. The following parameters are available.

        Parameter

        Description

        (Optional) Data watermark

        Watermarks help trace the source of data. If a data breach occurs, you can locate the potential source of the leak. Enable or disable Data watermark as needed.

        Note

        Only DataWorks Enterprise Edition supports the data watermark feature.

        Masking characteristic value

        Different characteristic values result in different masking policies. This means the same source data produces different masked results for different characteristic values. If the characteristic value is the same, the same source data always produces the same masked result.

        For example, if the raw data is a123:

        • If the characteristic value is set to 0, the data is masked to b124.

        • If the characteristic value is set to 1, the data is masked to c234.

        The default value is 5. The value range is 0 to 9.

        Substitution character set

        If the Sensitive field type you selected is not a built-in type, you must configure a Substitution character set. Characters in the source data that match this set are replaced with other characters of the same type.

        For example, if the sensitive data before masking consists of numbers from 0 to 3 and letters from a to d, the masked data will also consist of numbers and letters within that range.

        Note

        Characters in the set are replaced with characters from the same range. The character set supports uppercase letters, lowercase letters, and numbers. Separate multiple characters with commas (,). Chinese characters are not supported. If the data to be masked does not match the character set, it is not masked.
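The behavior described for Pseudonym (same input and same characteristic value always give the same output, and characters are replaced only by characters of the same type from the set) can be illustrated with a small sketch. This is not the algorithm that DataWorks uses. The function name `pseudonymize`, the HMAC-based step selection, and the treatment of the characteristic value as a key are all assumptions made for illustration.

```python
import hashlib
import hmac


def pseudonymize(value: str, charset: str, characteristic: int) -> str:
    """Hypothetical sketch: replace each character that appears in `charset`
    with a different character of the same class (digit, lowercase,
    uppercase) taken from `charset`. Other characters are left unchanged."""
    members = set(charset)
    # Commas and any unsupported characters fall into no class and are ignored.
    classes = [
        sorted(c for c in members if c.isdigit()),
        sorted(c for c in members if c.islower()),
        sorted(c for c in members if c.isupper()),
    ]
    key = str(characteristic).encode("utf-8")  # assumption: value 0-9 acts as a key
    out = []
    for pos, ch in enumerate(value):
        pool = next((p for p in classes if ch in p), None)
        if pool is None or len(pool) < 2:
            out.append(ch)  # not in the set, or nothing else to swap with
            continue
        digest = hmac.new(key, f"{pos}:{value}".encode("utf-8"), hashlib.sha256).digest()
        step = 1 + digest[0] % (len(pool) - 1)  # 1..len-1, so the character always changes
        out.append(pool[(pool.index(ch) + step) % len(pool)])
    return "".join(out)


masked = pseudonymize("a1-d3", "0,1,2,3,a,b,c,d", characteristic=5)
```

Because the step is derived from the input and the characteristic value only, repeated calls with the same arguments return the same masked string, while the hyphen, which is outside the set, passes through unchanged.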

        Masking out

        This method conceals parts of the information by replacing characters at specific positions with an asterisk (*). When you use this method, you must select a masking mode. DataWorks provides several built-in masking modes and supports custom modes.

        Parameter (select one)

        Description

        Recommended method

        Select a recommended masking method from the drop-down list. The available methods vary depending on the field to be masked.

        DataWorks provides three built-in methods. They include Show only the first and last characters, Show only the first three and last two characters, and Show only the first three and last four characters. Select a method from the drop-down list as needed.

        Custom

        This provides a more flexible way to configure masking. Configure segments from left to right and specify whether to mask each segment and the number of characters to mask (or not mask). You can add up to 10 segments. You must have at least one segment, and exactly one segment must be Remaining characters.

        For example, mask the first 3 characters and leave the remaining characters unmasked.
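The custom segment configuration can be sketched as follows. This is an illustrative model only: the `Segment` tuple, the `mask_out` function, and the handling of values shorter than the configured segments are assumptions, not the DataWorks implementation. Segments before the Remaining characters segment are counted from the left, and segments after it are counted from the right.

```python
from typing import List, Optional, Tuple

# (length, mask) pairs; a length of None marks the "Remaining characters" segment.
Segment = Tuple[Optional[int], bool]


def mask_out(value: str, segments: List[Segment]) -> str:
    """Hypothetical sketch of segment-based masking with asterisks."""
    if not 1 <= len(segments) <= 10:
        raise ValueError("configure between 1 and 10 segments")
    remaining = [i for i, (length, _) in enumerate(segments) if length is None]
    if len(remaining) != 1:
        raise ValueError("exactly one segment must be Remaining characters")
    r = remaining[0]
    # One flag per character: True means "replace with *".
    left = [mask for length, mask in segments[:r] for _ in range(length)]
    right = [mask for length, mask in segments[r + 1:] for _ in range(length)]
    # Assumption: when the value is too short, left segments win, then right ones.
    left = left[:len(value)]
    room = len(value) - len(left)
    keep = min(room, len(right))
    right = right[len(right) - keep:]
    middle = [segments[r][1]] * (room - keep)
    flags = left + middle + right
    return "".join("*" if flag else ch for ch, flag in zip(value, flags))


# Mask the first 3 characters and leave the remaining characters unmasked.
print(mask_out("13812345678", [(3, True), (None, False)]))  # ***12345678
# Show only the first three and last four characters.
print(mask_out("13812345678", [(3, False), (None, True), (4, False)]))  # 138****5678
```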

        HASH

        When you use HASH encryption for data masking, you must configure the following parameters.

        Parameter

        Description

        Data watermark

        Watermarks help trace the source of data. If a data breach occurs, you can locate the potential source of the leak. Enable or disable Data watermark as needed.

        Note

        Only DataWorks Enterprise Edition supports the data watermark feature.

        Encryption algorithm

        Includes MD5, SHA256, SHA512, and SM3.

        Salt Value

        Set a salt value for the encryption algorithm. The default value is 5. The value range is 0 to 9.

        Note

        A salt is a specific string that is inserted into the data. In cryptography, inserting a specific string at a fixed position in a password makes the resulting hash different from the hash of the original password. This process is called salting.
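A minimal sketch of salted hashing with Python's standard `hashlib` module follows. It assumes that the salt (0 to 9) is prepended to the value as a string, which this topic does not specify, and the function name `hash_mask` is hypothetical. SM3 is available through `hashlib.new("sm3")` only when the underlying OpenSSL build provides it.

```python
import hashlib

# Maps the algorithm names used in this topic to hashlib names.
ALGORITHMS = {"MD5": "md5", "SHA256": "sha256", "SHA512": "sha512", "SM3": "sm3"}


def hash_mask(value: str, algorithm: str = "SHA256", salt: int = 5) -> str:
    """Hypothetical sketch: hash `value` with a one-digit salt prepended."""
    if not 0 <= salt <= 9:
        raise ValueError("salt must be between 0 and 9")
    # hashlib.new raises ValueError if the algorithm is unavailable (possible for sm3).
    hasher = hashlib.new(ALGORITHMS[algorithm])
    hasher.update(f"{salt}{value}".encode("utf-8"))  # assumption: salt is a prefix
    return hasher.hexdigest()


digest_default = hash_mask("a123")            # salt 5, SHA256
digest_other_salt = hash_mask("a123", salt=0)  # different salt, different digest
```

The same value and salt always give the same digest, so masked columns can still be joined or grouped, while a different salt gives an unrelated digest.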

        Characters to replace

        The Characters to replace method replaces characters at specified positions based on the substitution method that you select. The following parameters are available.

        Parameter

        Description

        (Required) Substitution position

        From the drop-down list, you can select Substitute all, Substitute first 3 characters, or Substitute last 4 characters. You can also select Custom to define a custom substitution position.

        If you select Custom, you can define segments from left to right and configure the number of characters to substitute and the substitution method for each segment. You can add up to 10 segments. You must have at least one segment, and exactly one segment must be Remaining characters.

        (Required) Substitution method

        Includes Random substitution, Sample value substitution, and Static field substitution.

        • Random substitution: Randomly substitutes characters at the specified positions. The number of characters remains the same after substitution.

        • Sample value substitution: Select a sample library. The characters at the specified positions are substituted with values from the selected sample library.

        • Static field substitution: In the Substitution value text box, enter the characters to use for substitution. The value can be 1 to 100 characters long and cannot contain null characters. The characters at the specified positions are substituted with this value.
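The built-in substitution positions and two of the substitution methods can be modeled roughly as shown below. The function `replace_chars`, the position keys, and the rule that random substitution keeps digits as digits and letters as letters are assumptions for illustration. The sketch also assumes that static field substitution replaces the whole selected span with the substitution value once.

```python
import random
import string
from typing import Optional


def _random_like(ch: str, rng: random.Random) -> str:
    # Assumption: a digit becomes a random digit, a letter a random letter.
    if ch.isdigit():
        return rng.choice(string.digits)
    if ch.isalpha():
        return rng.choice(string.ascii_letters)
    return ch


def replace_chars(value: str, position: str, method: str,
                  static_value: str = "",
                  rng: Optional[random.Random] = None) -> str:
    """Hypothetical sketch of 'Characters to replace'."""
    rng = rng or random.Random()
    bounds = {
        "all": (0, len(value)),
        "first3": (0, min(3, len(value))),
        "last4": (max(0, len(value) - 4), len(value)),
    }
    start, end = bounds[position]
    if method == "random":
        # The number of characters stays the same.
        replacement = "".join(_random_like(c, rng) for c in value[start:end])
    elif method == "static":
        if not 1 <= len(static_value) <= 100:
            raise ValueError("substitution value must be 1 to 100 characters")
        replacement = static_value
    else:
        raise ValueError(f"unsupported method: {method}")
    return value[:start] + replacement + value[end:]


print(replace_chars("13812345678", "last4", "static", static_value="XXXX"))  # 1381234XXXX
```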

        Range transform

        Range transform is used to mask numeric data. This method replaces values within a specified numeric range with a fixed value. You can define up to 10 ranges.

        Parameter

        Description

        Original value range [m,n)

        The numeric range of the raw data. The value must be greater than or equal to 0 and can have up to two decimal places.

        Masked Value

        The value after masking. The value must be greater than or equal to 0 and can have up to two decimal places.
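The half-open range notation [m,n) means that m is included and n is excluded. A small sketch of the lookup, using hypothetical names and assuming that a value outside every configured range is returned unchanged (this topic does not state what happens in that case):

```python
from decimal import Decimal
from typing import List, Tuple

# (m, n, masked value): raw values in [m, n) are replaced with the masked value.
Range = Tuple[Decimal, Decimal, Decimal]


def range_transform(value: Decimal, ranges: List[Range]) -> Decimal:
    """Hypothetical sketch of Range transform."""
    if len(ranges) > 10:
        raise ValueError("at most 10 ranges can be defined")
    for lower, upper, masked in ranges:
        if lower <= value < upper:  # m inclusive, n exclusive
            return masked
    return value  # assumption: values outside all ranges are left as-is


age_ranges = [
    (Decimal("0"), Decimal("18"), Decimal("0")),
    (Decimal("18"), Decimal("60"), Decimal("30")),
]
print(range_transform(Decimal("25"), age_ranges))  # 30
print(range_transform(Decimal("18"), age_ranges))  # 30, because 18 belongs to [18, 60)
```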

        Integer

        Integer is used only to mask numeric data.

        Parameter

        Description

        Raw data type

        Only numeric types are supported.

        Decimal places to keep

        The value range is 0 to 5. The remaining part is rounded. For example, if the raw value is 3.1415 and you keep 2 decimal places, the masked value is 3.14.
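The rounding example above can be reproduced with Python's `decimal` module. The sketch assumes round-half-up behavior, which this topic does not confirm, and the function name `keep_decimals` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def keep_decimals(value: str, places: int) -> Decimal:
    """Hypothetical sketch: keep `places` decimal places and round the rest."""
    if not 0 <= places <= 5:
        raise ValueError("decimal places to keep must be between 0 and 5")
    quantum = Decimal(1).scaleb(-places)  # 1, 0.1, 0.01, ...
    # Assumption: half-up rounding; the product may use a different mode.
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


print(keep_decimals("3.1415", 2))  # 3.14
```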

        Empty

        The Empty method sets the corresponding sensitive field to an empty string.

    3. Verify the masking result.

      Enter sample raw data (0 to 100 characters) in the Sample data text box. Click Verify. The masked result is displayed in the Data Masking effect field.

    4. Click Save or Save and Apply to create the data masking rule.

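After you create the rule, it is inactive by default. You must manually activate the rule before it takes effect in the selected scenarios. For more information, see Activate or deactivate a data masking rule.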

Create a static data masking rule: Static data masking in Data Integration scenario

  1. On the Data Masking Management page, under Masking Scene, choose Static desensitization of data integration > Default Scenario and click + Masking Rule on the right.

  2. Create a data masking rule.

    1. In the Create Data Masking Rule dialog box, configure the parameters for the rule.

      1. Select a sensitive data type and specify the rule name.

        Parameter

        Description

        Sensitive Data Type

        • Existing: Select an existing sensitive data type (built-in or custom).

        • New type: Enter a name for the new sensitive data type. The name must be unique and cannot be the same as an existing type.

        Note

        Built-in sensitive data types include the following: Mobile phone number, ID card number, Bank card number, Email_Built-in, IP, License plate number, Postal code, Landline number, MAC address, Address, Name, Company name, Ethnicity, Zodiac sign, Gender, and Nationality.

        Data Masking Rule Name

        By default, this is the same as the Sensitive Data Type. You can also specify a custom name. The rule name must be unique.

      2. Configure the data masking method.

        DataWorks supports three methods: Pseudonym, Hash, and Mask out. You can select an appropriate method.

        Pseudonym

        This method replaces a value with a masked value that has the same characteristics, preserving the data format. Pseudonymization is supported for only some existing fields.

        • If the selected Sensitive data type is a built-in type (such as Mobile Phone Number, ID Card Number, Bank Card Number, Email_Built-in, IP, License Plate Number, Postal Code, Landline Number, MAC Address, Address, Name, or Company Name), you must configure the Security domain.

          Security domain: The value is an integer from 0 to 9. Different security domains use different masking policies, so the same source data produces different masked results in different security domains. For example, if the raw data is a123, the data is masked to b124 in security domain 0, but is masked to c234 in security domain 1. Within the same security domain, the same source data is always masked to the same result.

        • If the selected Sensitive data type is not a built-in type, you need to configure the Substitution character set.

          Substitution character set: Characters in the source data that match this set are replaced with other characters of the same type. The character set supports uppercase letters, lowercase letters, and numbers. Use commas (,) to separate multiple characters. Chinese characters are not supported. If the data to be masked contains no characters from this set, it is not masked. For example, if the substitution character set consists of numbers from 0 to 3 and letters from a to d, a matching number in the source data is replaced by another number from 0 to 3, and a matching letter is replaced by another letter from a to d.

        Hash

        The Hash method converts raw data into a fixed-length value and requires you to select a Security domain.

        Security domain: The value is an integer from 0 to 9. Each security domain uses a different masking policy. This means that for the same source data, the masked result varies depending on the security domain. However, within the same security domain, the same source data always produces the same masked result.

        For example, if the raw data is a123:

        • If the security domain is set to 0, the data is masked to b124.

        • If the security domain is set to 1, the data is masked to c234.

        Mask out

        The Mask out method conceals partial information by replacing specific characters with asterisks (*). This method requires you to select a masking mode. DataWorks provides several built-in masking modes and supports custom modes.

        • Recommended method: For some fields, you can select a recommended masking method from the drop-down list. The available methods vary depending on the selected field. DataWorks provides three built-in methods: Show only the first and last characters, Show only the first three and last two characters, and Show only the first three and last four characters. You can select the method that you need. For some fields, only the default method can be selected.

        • Custom: This option provides a flexible way to configure masking. You can define up to 10 segments from left to right. For each segment, you can specify whether to apply masking and the number of characters to mask or leave unmasked. You must configure at least one segment, and exactly one segment must be set to Remaining characters.

          • Example 1: Mask the first 3 characters and leave the remaining characters unmasked.

          • Example 2: Mask the last 3 characters and leave the remaining characters unmasked.

    2. Verify the masking result.

      In the Sample data text box, enter sample raw data (0 to 100 characters) and click Verify. The masked result is displayed in the Data Masking effect field.

    3. Click OK to create the data masking rule.

After you create the rule:

  • By default, a data masking rule is inactive after it is created. You must manually activate the rule before it can be applied in the corresponding scenarios. For more information about how to change the rule status, see Activate or deactivate a data masking rule.

  • After you create a data masking rule for data integration, you can use the rule when you create a real-time synchronization task for a single table. For more information, see Configure data masking.

Configure a whitelist for a data masking rule (dynamic data masking only)

In dynamic data masking scenarios, you can configure a whitelist for a data masking rule. After the rule is activated, whitelisted users are not affected by the rule for a specified period and can view the raw, unmasked data.

Note

Before you create a whitelist, you must add the users to a user group. For more information, see Configure user groups.

To add a whitelist, perform the following steps:

  1. On the Data Masking Management page, click Configure Whitelist.

  2. In the upper-right corner, click Whitelist.

  3. In the Create Whitelist dialog box, configure the parameters.

    Note
    • Whitelist configuration is not supported in the Hologres layer masking or Static desensitization of data integration scenario.

    • After you set an effective period for a whitelist, sensitive data that meets the whitelist conditions is not masked during this period.

    The parameters are as follows.

    Parameter

    Description

    Sensitive Field Type

    You can select only sensitive field types that are active in the currently selected data masking scenario.

    User Group Range

    Select configured user groups. You can select up to 50 user groups. After you add user groups to the whitelist, the accounts in those groups can retrieve the original, unmasked data. For more information about how to configure a user group, see Configure user groups.

    Effective Time

    Set the effective period for the whitelist. You can choose a short-term or permanent period. Short-term options include 30, 90, 180, and 365 days, or a custom period. For a custom period, you can select the current day or a future time range.

    If a user queries the sensitive information outside the effective period of the whitelist, the data is masked.

    Note

    If you select a short-term period, the data will not be masked from the current time until the specified number of days has passed.

  4. Click Save to save the whitelist configuration.
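The whitelist semantics described above (a sensitive field type, up to 50 user groups, and a short-term or permanent effective period) can be expressed as a simple check. The `Whitelist` class and the `is_masked` function are hypothetical names for illustration and do not correspond to a DataWorks API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Set


@dataclass
class Whitelist:
    sensitive_field_type: str
    user_groups: Set[str]
    start: datetime
    end: Optional[datetime]  # None models a permanent whitelist

    def __post_init__(self) -> None:
        if len(self.user_groups) > 50:
            raise ValueError("a whitelist can include up to 50 user groups")


def short_term(field_type: str, groups: Set[str], start: datetime, days: int) -> Whitelist:
    # Short-term period: unmasked from `start` until `days` days have passed.
    return Whitelist(field_type, set(groups), start, start + timedelta(days=days))


def is_masked(wl: Whitelist, field_type: str, groups: Set[str], now: datetime) -> bool:
    """Return True if the data should still be masked for this query."""
    if field_type != wl.sensitive_field_type:
        return True
    if not wl.user_groups & groups:
        return True  # the user is in none of the whitelisted groups
    if now < wl.start:
        return True
    return wl.end is not None and now >= wl.end


wl = short_term("Mobile phone number", {"auditors"}, datetime(2025, 1, 1), days=30)
print(is_masked(wl, "Mobile phone number", {"auditors"}, datetime(2025, 1, 15)))  # False
print(is_masked(wl, "Mobile phone number", {"auditors"}, datetime(2025, 3, 1)))   # True
```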

Activate or deactivate a data masking rule

On the Data Masking Rule page, find the rule and click the Status switch to set it to Enable or Disable.

Depending on the status, you can then edit or delete the rule, or view its details.

Note
  • You cannot edit or delete a data masking rule while it is in the Enable state. To do so, first change its status to Disable. Before you change the status, check whether any tasks use the rule and confirm the change with the security administrator.

  • When a rule is in the Disable state, you can edit or delete it, but you cannot change its Sensitive data type or Data masking rule name.

  • After you make changes, switch the status back to Enable so that tasks that use this rule resume data masking.
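The status rules in the preceding note amount to a small state model, sketched below. The `MaskingRule` class and its field names are hypothetical and serve only to make the constraints concrete: a rule starts inactive, cannot be edited while enabled, and never allows its name or sensitive data type to change.

```python
from dataclasses import dataclass


@dataclass
class MaskingRule:
    name: str
    sensitive_type: str
    method: str
    enabled: bool = False  # a rule is inactive right after creation

    def edit(self, **changes: str) -> None:
        if self.enabled:
            raise PermissionError("disable the rule before editing it")
        locked = {"name", "sensitive_type"} & changes.keys()
        if locked:
            raise ValueError(f"cannot change: {sorted(locked)}")
        unknown = changes.keys() - {"method"}
        if unknown:
            raise ValueError(f"unknown fields: {sorted(unknown)}")
        for field_name, new_value in changes.items():
            setattr(self, field_name, new_value)


rule = MaskingRule("Mobile phone number", "Mobile phone number", "Masking out")
rule.edit(method="HASH")  # allowed: the rule is still disabled
rule.enabled = True       # activate so that masking takes effect
```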

Examples of data masking rule applications