DataWorks: Create a data masking rule

Last Updated: Dec 16, 2025

DataWorks supports multiple data masking scenarios. This topic describes how to create data masking rules for different scenarios and perform masked queries in DataWorks.

Background information

DataWorks provides two types of data masking: static data masking and dynamic data masking.

  • Dynamic data masking supports scenarios such as display masking in Data Development and Data Map, display masking in DataAnalysis, engine-layer masking in MaxCompute, and engine-layer masking in Hologres.

  • Static data masking covers a single scenario: data integration static data masking.

Data masking rules are disabled by default after they are created. To automatically mask data in a specific scenario, you must manually enable the corresponding rule.

Prerequisites

  • (Optional, for dynamic data masking only) You can configure sensitive data detection rules as needed. This helps you associate fields that require masking when you create data masking rules. For more information, see Sensitive data detection rules.

  • (Optional, for dynamic data masking only) You can use a whitelist to allow specific users to bypass data masking rules during a specified period. Add the users to a user group to allow them to view raw data. For more information, see Configure a user group.

  • (Optional, for MaxCompute engine-layer masking only) If you configure the MaxCompute engine-layer masking scenario, sensitive data is masked when it is queried from outside DataWorks, such as from the MaxCompute command line client (odpscmd) or from Logview. To do this, you must request a network whitelist for MaxCompute to call the masking functions. For more information, see Example: Use E-MapReduce for underlying data masking.

Access control

  • Configure data masking rules (create, edit, and delete):

    • Tenant administrators and tenant security administrators can perform operations on data masking rules in all scenarios.

    • Workspace administrators and workspace security administrators can perform operations on data masking rules only in scenarios where they have permissions.

  • Configure a data masking whitelist (create, edit, and delete):

    • Tenant administrators and tenant security administrators can configure whitelists for all scenarios.

    • Workspace administrators and workspace security administrators can configure whitelists only for scenarios for which they have permissions.

You must be granted the required role permissions to perform these operations. For more information about authorization, see Workspace-level module access control and Global module access control.

Go to the data masking rule configuration page

  1. Go to the DataStudio page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and O&M > Data Development. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.

  2. Click the icon in the upper-left corner. Then, choose All Products > Data Governance > Data Security Guard. On the page that appears, click Try Now to go to the Data Security Guard page.

    Note
    • If your Alibaba Cloud account is granted the required permissions, you can directly access the homepage of Data Security Guard.

    • If your Alibaba Cloud account is not granted the required permissions, you are redirected to the authorization page of Data Security Guard. You can use the features of Data Security Guard only after your Alibaba Cloud account is granted the required permissions.

  3. In the navigation pane on the left, choose Rule Configuration > Data Masking Management.

  4. Select a data masking scenario on the left. Then, on the right, click the Data Masking Rules tab to create a rule for that scenario.

Create a dynamic data masking rule: Data Development/Data Map display masking scenario

  1. Select a data masking scenario.

    On the Data Masking Management page, set Data Masking Scenario to Data Development/Data Map display masking > Default Scenario, and then click the Data Masking Rules tab.

  2. Create a data masking rule.

    1. In the Create Data Masking Rule dialog box, configure the parameters for the rule.

      1. Select a sensitive field and specify the rule name.

        Sensitive field type: Select the type of sensitive field to which this rule applies.

        • You can select built-in sensitive field types or sensitive field types that you manually added in sensitive data detection. For more information about how to add a sensitive field, see Sensitive data detection rules.

        • If you have already created a data masking rule for the same scenario, DataWorks filters out the sensitive field types that have been selected. This prevents inconsistent data masking rules for the same sensitive field in the same scenario.

        Data masking rule name: By default, the rule name is the same as the Sensitive field type. You can also specify a custom name. The rule name must be unique.

      2. Configure the data masking scenario.

        Select the data masking scenarios to which this rule applies. By default, the scenario that you selected in Step 1 is used. You can change the scenario or add multiple scenarios as needed.

      3. Configure the data masking method.

        DataWorks supports data masking methods such as Format-preserving encryption, Masking, HASH encryption, Character substitution, Range transformation, Rounding, and Set to null. You can select a method as needed.

        Format-preserving encryption (formerly pseudonymization algorithm)

        Format-preserving encryption replaces a value with a masked value that has the same characteristics. The data format remains the same after masking. The following parameters are related to this data masking method.

        (Optional) Data watermark: A data watermark provides data traceability. If a data breach occurs, the watermark can help you locate the possible source of the leak. You can enable or disable Data watermark as needed.

        Note

        Only DataWorks Enterprise Edition and higher editions support the data watermark feature.

        (Optional) Substitution feature value: Different substitution feature values correspond to different masking policies. For the same raw data, different substitution feature values produce different masked results, and the same substitution feature value always produces the same masked data.

        For example, if the raw data is a123:

        • If you set the substitution feature value to 0, the data is masked as b124.

        • If you set the substitution feature value to 1, the data is masked as c234.

        The default substitution feature value is 5. The valid values are 0 to 9.

        (Optional) Substitution character set: If the detection rule for the selected Sensitive field type is not a built-in rule, you must configure a Substitution character set. After you configure a substitution character set, any character in the set is replaced with another character of the same type.

        For example, if the sensitive data before masking consists of digits from 0 to 3 and letters from a to d, the masked data also consists of digits and letters within this range.

        Note

        Characters in the character set are replaced with characters from the same range. The substitution character set supports uppercase letters, lowercase letters, and digits. Separate multiple characters with commas (,). Chinese characters are not supported. If the data to be masked does not fall within the character set range, it is not masked.
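The behavior described above (same input and same substitution feature value give the same output, and each character is replaced by one of the same type) can be illustrated with a minimal Python sketch. The algorithm DataWorks actually uses is not documented in this topic, so the function name `substitute_same_class` and its position-dependent shift are assumptions made for illustration. The sketch is not a cryptographic format-preserving encryption scheme and does not reproduce the b124 and c234 outputs shown above.

```python
import hashlib
import string

# Default character classes; a custom substitution character set can be passed instead.
DEFAULT_CLASSES = (string.digits, string.ascii_lowercase, string.ascii_uppercase)


def substitute_same_class(value, feature_value=5, classes=DEFAULT_CLASSES):
    """Deterministically replace each character with another one from its own class.

    Hypothetical helper: a keyed shift, not the algorithm DataWorks uses.
    """
    if not 0 <= feature_value <= 9:
        raise ValueError("feature_value must be between 0 and 9")
    out = []
    for pos, ch in enumerate(value):
        alphabet = next((a for a in classes if ch in a and len(a) > 1), None)
        if alphabet is None:
            out.append(ch)  # outside every class: left unmasked
            continue
        digest = hashlib.sha256(f"{feature_value}:{pos}".encode()).digest()
        # Shift is in 1..len(alphabet)-1, so the character always changes.
        shift = 1 + digest[0] % (len(alphabet) - 1)
        out.append(alphabet[(alphabet.index(ch) + shift) % len(alphabet)])
    return "".join(out)


# Custom character set: digits 0 to 3 and letters a to d.
masked = substitute_same_class("a1z9", 5, classes=("0123", "abcd"))
```

In the last line, `a` and `1` are replaced with characters from their own sets, while `z` and `9` fall outside the configured sets and are left as they are.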

        Masking

        Masking conceals part of the information by replacing characters at specific positions with asterisks (*). When you use this method, you must select a masking mode. DataWorks provides several built-in masking modes and supports custom modes.

        Select one of the following masking modes:

        Recommended method: Select a recommended masking method from the drop-down list. The available masking methods vary depending on the field to be masked. DataWorks provides three built-in masking methods: Show Only The First And Last Characters, Show Only The First Three And Last Two Characters, and Show Only The First Three And Last Four Characters.

        Custom: This mode provides a more flexible way to configure masking. Configure the segments from left to right. For each segment, specify whether it is masked and how many characters it covers. You can add up to 10 segments. You must have at least one segment, and exactly one segment must be set to Remaining characters.

        For example, mask the first 3 characters and do not mask the remaining characters.
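The segment model described above can be sketched in a few lines of Python. The function `mask_by_segments` and the `(masked, length)` tuple layout are hypothetical names chosen for this illustration. How DataWorks treats values shorter than the configured segments is not stated here, so the sketch simply masks whatever characters are available.

```python
from typing import List, Optional, Tuple

# (masked?, length) pairs, left to right. A length of None marks the single
# "Remaining characters" segment.
Segment = Tuple[bool, Optional[int]]


def mask_by_segments(value: str, segments: List[Segment]) -> str:
    if not 1 <= len(segments) <= 10:
        raise ValueError("between 1 and 10 segments are required")
    if sum(1 for _, n in segments if n is None) != 1:
        raise ValueError("exactly one segment must be 'Remaining characters'")
    fixed = sum(n for _, n in segments if n is not None)
    remaining = max(len(value) - fixed, 0)
    out, pos = [], 0
    for masked, n in segments:
        width = remaining if n is None else n
        chunk = value[pos:pos + width]
        out.append("*" * len(chunk) if masked else chunk)
        pos += width
    return "".join(out)


# Mask the first 3 characters and keep the rest.
print(mask_by_segments("13812345678", [(True, 3), (False, None)]))              # ***12345678
# Equivalent of "show only the first three and last four characters".
print(mask_by_segments("13812345678", [(False, 3), (True, None), (False, 4)]))  # 138****5678
```

Each built-in recommended method can be expressed as a three-segment custom configuration in which the middle segment is the Remaining characters segment.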

        HASH encryption

        When you use HASH encryption for data masking, you must configure the following parameters.

        (Optional) Data watermark: A data watermark provides data traceability. If a data breach occurs, the watermark can help you locate the possible source of the leak. You can enable or disable Data watermark as needed.

        Note

        Only DataWorks Enterprise Edition and higher editions support the data watermark feature.

        (Required) Encryption algorithm: The supported algorithms are MD5, SHA256, SHA512, and SM3.

        (Required) Salt value: Set the salt value for each encryption algorithm. The default value is 5. The valid values are 0 to 9.

        Note

        A salt is a string that is combined with the raw data before hashing. In cryptography, salting inserts a specific string at a fixed position in the input, such as a password, so that the hashed result differs from the hash of the original input alone.
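The effect of the salt value can be seen with Python's standard `hashlib` module. This is a sketch under stated assumptions: the helper name `hash_mask` is hypothetical, and the salt is prepended to the raw value, although this topic does not say at which position DataWorks inserts it. SM3 is left out because its availability in `hashlib` depends on the underlying OpenSSL build.

```python
import hashlib

# Algorithm names from the rule dialog mapped to hashlib names.
SUPPORTED = {"MD5": "md5", "SHA256": "sha256", "SHA512": "sha512"}


def hash_mask(value: str, algorithm: str = "SHA256", salt: int = 5) -> str:
    """Hash a raw value together with a one-digit salt (assumed to be prepended)."""
    if not 0 <= salt <= 9:
        raise ValueError("salt must be between 0 and 9")
    hasher = hashlib.new(SUPPORTED[algorithm])
    hasher.update(f"{salt}{value}".encode("utf-8"))
    return hasher.hexdigest()


# Same raw value, different salts: the digests differ.
print(hash_mask("a123", salt=0))
print(hash_mask("a123", salt=1))
```

The output length depends only on the algorithm (32 hexadecimal characters for MD5, 64 for SHA256, 128 for SHA512), not on the length of the raw value.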

        Character substitution

        Character substitution is a data masking rule that replaces characters at a specified position based on the selected substitution method. This rule has the following parameters.

        (Required) Substitution position: From the drop-down list, you can select Replace all, Replace first 3, or Replace last 4. You can also select Custom to define a custom substitution position.

        When you select Custom, you can define custom segments. You must define the segments in order from left to right and configure the number of digits to replace and the replacement method for each segment. You can add from 1 to 10 segments, and exactly one segment must be Remaining Digits.

        (Required) Substitution method: The supported methods are Random substitution, Sample value substitution, and Static value substitution.

        • Random substitution: Randomly replaces characters at the specified position. The number of characters remains the same after substitution.

        • Sample value substitution: You must select a sample library. The characters at the specified position are replaced with values from the selected sample library.

        • Static value substitution: In the Substitution value text box, enter the characters to use for substitution. The string can be 1 to 100 characters long and cannot contain null characters. The characters at the specified position are replaced with this substitution value.
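The following Python sketch illustrates the three built-in substitution positions combined with random and static value substitution. The names `split_target` and `substitute` are hypothetical. The sketch assumes that random substitution draws from letters and digits and that static substitution replaces the whole target span with the substitution value; neither detail is specified in this topic. Sample value substitution is omitted because it depends on a sample library.

```python
import random
import string
from typing import Optional


def split_target(value: str, position: str):
    """Return (prefix, target, suffix) for a built-in substitution position."""
    if position == "all":
        return "", value, ""
    if position == "first3":
        return "", value[:3], value[3:]
    if position == "last4":
        cut = max(len(value) - 4, 0)
        return value[:cut], value[cut:], ""
    raise ValueError(f"unknown position: {position}")


def substitute(value: str, position: str, method: str,
               static_value: str = "", rng: Optional[random.Random] = None) -> str:
    prefix, target, suffix = split_target(value, position)
    if method == "random":
        rng = rng or random.Random()
        pool = string.ascii_letters + string.digits
        # One random character per original character, so the length is preserved.
        replaced = "".join(rng.choice(pool) for _ in target)
    elif method == "static":
        if not 1 <= len(static_value) <= 100:
            raise ValueError("static_value must be 1 to 100 characters long")
        replaced = static_value
    else:
        raise ValueError(f"unknown method: {method}")
    return prefix + replaced + suffix


print(substitute("13812345678", "last4", "static", static_value="XXXX"))  # 1381234XXXX
```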

        Range transformation

        Range transformation applies only to numeric data. It masks data within a specified numeric range with a fixed value. You can add from 1 to 10 ranges.

        Original value range [m,n): The numeric range of the data before masking. The value must be greater than or equal to 0 and can have up to two decimal places.

        Masked value: The value after masking. The value must be greater than or equal to 0 and can have up to two decimal places.
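Because each range is half-open, a value equal to the upper bound n belongs to the next range, not the current one. The sketch below shows this with a hypothetical `range_transform` helper. Returning values that match no range unchanged is an assumption, because this topic does not describe that case.

```python
from decimal import Decimal
from typing import List, Tuple

# (m, n, masked value) for the half-open range [m, n).
Range = Tuple[Decimal, Decimal, Decimal]


def range_transform(value, ranges: List[Range]) -> Decimal:
    if not 1 <= len(ranges) <= 10:
        raise ValueError("between 1 and 10 ranges are required")
    v = Decimal(str(value))
    for low, high, masked in ranges:
        if low <= v < high:  # m is included, n is excluded
            return masked
    return v  # assumption: values outside every range pass through unchanged


ranges = [
    (Decimal("0"), Decimal("100"), Decimal("50")),
    (Decimal("100"), Decimal("1000"), Decimal("500")),
]
print(range_transform(99.99, ranges))  # 50
print(range_transform(100, ranges))    # 500
```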

        Rounding

        Rounding applies only to numeric data.

        Raw data type: Only numeric types are supported.

        Decimal places to keep: You can keep 0 to 5 decimal places. The remaining part is rounded. For example, if the original value is 3.1415 and you keep two decimal places, the masked value is 3.14.
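The 3.1415 example above can be reproduced with Python's `decimal` module. The helper name `round_mask` is hypothetical, and the rounding mode is an assumption: this topic only says that the remaining part is rounded, so the sketch uses round-half-up.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_mask(value: str, places: int) -> Decimal:
    """Keep `places` decimal places (0 to 5); round-half-up is an assumed mode."""
    if not 0 <= places <= 5:
        raise ValueError("places must be between 0 and 5")
    quantum = Decimal(10) ** -places  # for example, 0.01 when places is 2
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


print(round_mask("3.1415", 2))  # 3.14
```

Passing the raw value as a string avoids binary floating-point artifacts before the value is rounded.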

        Set to null

        The Set to null masking method sets the corresponding sensitive field to an empty string.

    3. Verify the masking result.

      Enter sample raw data (up to 100 characters) in the Sample Data text box and click Verify. The masked data is displayed in the Masking Effect field.

    4. Click Save or Save and Enable to create the data masking rule.

After you create the data masking rule:

  • In dynamic data masking scenarios, you can configure a whitelist for the data masking rule. Whitelisted users can view unmasked data within a specified time frame. For more information about how to add a whitelist, see Configure a data masking whitelist (for dynamic data masking only).

  • Data masking rules are disabled by default after they are created. You must manually enable a rule for it to be applied in the corresponding data masking scenario. For more information about how to change the status of a data masking rule, see Enable or disable a data masking rule.

Create a static data masking rule: Data integration static data masking scenario

  1. On the Data Masking Management page, set the Data Masking Scenario to Data integration static data masking > Default Scenario, and click + Data Masking Rule on the right.

  2. Create a data masking rule.

    1. In the Create Data Masking Rule dialog box, configure the parameters for the rule.

      1. Select a sensitive data type and specify the rule name.

        Sensitive data type: Choose one of the following options.

        • Select existing: Select an existing sensitive data type as needed. This includes built-in and custom sensitive data types.

        • Add type: Enter a name for the sensitive data type. The name must be unique and cannot be the same as an existing type.

        Note

        Built-in sensitive data types include the following: Phone Number, ID Card Number, Bank Card Number, Mailbox_Built-in, IP, License Plate Number, Postal Code, Landline Number, MAC address, Address, Name, Company Name, Ethnicity, Zodiac Sign, Gender, and Nationality.

        Data masking rule name: By default, the rule name is the same as the Sensitive data type. You can also specify a custom name. The data masking rule name must be unique.

      2. Configure the data masking method.

        DataWorks supports three data masking methods: Pseudonymization, Hashing, and Masking. You can select a method based on your requirements.

        Pseudonymization

        Pseudonymization replaces a value with a masked value that has the same characteristics. The data format remains the same after masking. Only some existing fields support pseudonymization.

        • If the selected Sensitive data type is a built-in type (such as Phone Number, ID Card Number, Bank Card Number, Email Address, IP, License Plate Number, Postal Code, Landline Number, MAC address, Address, Name, or Company Name), you must configure a Security domain.

          Security domain: The value must be an integer from 0 to 9. Different security domains use different masking policies. This means the same raw data produces different masked results in different security domains. For example, if the raw data is a123, setting the security domain to 0 masks it as b124, and setting it to 1 masks it as c234. If the security domain is the same, the same raw data always produces the same masked result.

        • If the selected Sensitive data type is not a built-in type, you must configure a Substitution character set.

          Substitution character set: Replaces characters in sensitive data with other characters of the same type from this set. The set supports uppercase letters, lowercase letters, and digits. Use a comma (,) to separate multiple characters. Chinese characters are not supported. If a character in the sensitive data is of a type not represented in the substitution set, that character is not masked. For example, if the substitution set is defined with digits from 0 to 3 and letters from a to d, the masked sensitive data will also be composed of digits and letters from within this range.

        Hashing

        The Hashing method encrypts raw data into fixed-length data and requires you to select a Security domain.

        Security domain: The value is an integer from 0 to 9. Different security domains have different masking policies. This means that for the same raw data, the masking result is consistent within the same security domain but varies across different security domains.

        For example, if the raw data is `a123`:

        • If you set the security domain to 0, the data is masked as `b124`.

        • If you set the security domain to 1, the data is masked as `c234`.

        Masking

        Masking conceals information by replacing specific characters with asterisks (*). To use this method, you must select a masking mode. DataWorks provides several built-in masking modes and supports custom modes.

        • Recommended Method: For some fields, you can select a recommended masking method from the drop-down list. The available methods vary depending on the field. DataWorks provides three built-in masking methods: Show Only The First And Last Characters, Show Only The First Three And Last Two Characters, and Show Only The First Three And Last Four Characters. You can select the method that you need. For some fields, you can select only the default method.

        • Custom: This option provides a flexible way to configure masking. You must configure each segment from left to right, specifying whether to apply masking and the number of characters to mask or retain. You can add up to 10 segments. You must have at least one segment, and exactly one segment must be set to Remaining characters.

          • Example 1: Mask the first three characters and leave the remaining characters unmasked.

          • Example 2: Mask the last three characters and leave the remaining characters unmasked.

    2. Verify the masking result.

      In the Sample Data text box, enter sample raw data (0 to 100 characters). Click Verify Masking. The masked data appears in the Masking Effect field.

    3. Click OK to create the data masking rule.

After you create the data masking rule:

  • Data masking rules are disabled by default after they are created. You must manually enable a rule for it to be applied in the corresponding data masking scenario. For more information about how to change the status of a data masking rule, see Enable or disable a data masking rule.

  • After you create a data integration masking rule, you can use it when you create a real-time synchronization task for a single table. For more information, see Configure data masking.

Configure a data masking whitelist (for dynamic data masking only)

In dynamic data masking scenarios, you can configure a whitelist of users for a data masking rule. After the rule is enabled, whitelisted users are not subject to the rule during a specified period and can view the raw, unmasked data.

Note

Before you create a whitelist, you must add the users that you want to whitelist to a user group. For more information about how to configure a user group, see Configure a user group.

To add a whitelist:

  1. On the Data Masking Management page, click the Whitelist Configuration tab.

  2. In the upper-right corner, click Whitelist.

  3. In the Create Whitelist dialog box, configure the required parameters.

    Note
    • Whitelists are not supported in the Hologres engine-layer masking or Data integration static data masking scenarios.

    • After you set an effective period for a whitelist, sensitive data that meets the whitelist conditions is not masked during the specified period.

    The following describes the parameter settings.

    Sensitive field type: You can select only sensitive field types that are enabled in the current data masking scenario.

    User group scope: Select a configured user group. You can select up to 50 user groups. After a user group is added to the whitelist, the accounts in the group can retrieve raw, unmasked data. For more information about how to configure a user group, see Configure a user group.

    Effective period: Set the effective period for the whitelist as needed. You can choose a short-term or permanent period. Short-term options include 30 days, 90 days, 180 days, and 365 days. You can also set a custom period, which can start on the current day or a future date.

    After you set the period, if a user queries sensitive information outside the effective period of the whitelist, the data is masked.

    Note

    If you select a short-term period, data is not masked from the current time until the specified number of days has passed.

  4. Click Save to apply the whitelist.
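The interaction between the user group scope and the effective period can be summarized as a small decision function. This is an illustrative sketch only: the names `whitelist_window` and `is_masked` are hypothetical, and treating the end of a short-term period as exclusive is an assumption.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

Window = Tuple[datetime, Optional[datetime]]


def whitelist_window(start: datetime, days: Optional[int]) -> Window:
    """Return (start, end). An end of None represents a permanent whitelist."""
    return start, (None if days is None else start + timedelta(days=days))


def is_masked(query_time: datetime, user_groups: Iterable[str],
              whitelisted_groups: Iterable[str], window: Window) -> bool:
    start, end = window
    in_window = query_time >= start and (end is None or query_time < end)
    in_group = bool(set(user_groups) & set(whitelisted_groups))
    # Raw data is returned only to whitelisted groups inside the effective period.
    return not (in_window and in_group)


window = whitelist_window(datetime(2025, 1, 1), days=30)
print(is_masked(datetime(2025, 1, 15), {"analysts"}, {"analysts"}, window))  # False
print(is_masked(datetime(2025, 2, 15), {"analysts"}, {"analysts"}, window))  # True
```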

Enable or disable a data masking rule

On the Data Masking Rules page, you can set the status of a rule to Enabled or Disabled by clicking the Status switch.

You can then edit, delete, or view the details of the rule.

Note
  • You cannot delete or edit a data masking rule that is enabled. You must first set the rule to Disabled. Before you do so, check whether the rule is used by any related tasks and contact the security administrator for confirmation.

  • When a rule is Disabled, you can edit or delete it. However, the Sensitive field type and Data masking rule name cannot be modified.

  • After you modify the rule, you can set its status to Enabled. Tasks that use this rule can then resume data masking.
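The editing constraints in the preceding notes can be modeled as a small state check. The `MaskingRule` class below is a hypothetical illustration and not a DataWorks API. It only captures three rules stated in this topic: a rule is disabled when created, it must be disabled before it is edited, and its sensitive field type and rule name cannot be modified.

```python
from dataclasses import dataclass

LOCKED_FIELDS = {"field_type", "name"}  # cannot be modified after creation
EDITABLE_FIELDS = {"method"}


@dataclass
class MaskingRule:
    field_type: str
    name: str
    method: str
    enabled: bool = False  # rules are disabled by default after creation

    def edit(self, **changes):
        if self.enabled:
            raise PermissionError("disable the rule before editing it")
        if LOCKED_FIELDS & set(changes):
            raise ValueError("sensitive field type and rule name cannot be modified")
        unknown = set(changes) - EDITABLE_FIELDS
        if unknown:
            raise ValueError(f"unknown fields: {sorted(unknown)}")
        for key, value in changes.items():
            setattr(self, key, value)


rule = MaskingRule("Phone Number", "Phone Number", "Masking")
rule.edit(method="HASH encryption")  # allowed while the rule is disabled
rule.enabled = True                  # further edits now raise PermissionError
```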

Data masking rule application examples