DataWorks: Create a data masking rule

Last Updated: Mar 25, 2026

When a support agent looks up a customer record, they should see enough of a phone number to confirm identity — but not the full number. When an analyst queries a sales table, salary columns should appear as asterisks. DataWorks Data Security Guard lets you enforce these controls by creating data masking rules that automatically hide or transform sensitive field values before they reach the user.

This topic explains how to create dynamic and static masking rules, configure whitelists, and manage rule lifecycle.

How data masking works

DataWorks supports two masking categories:

| Category | When masking occurs | Applicable scenarios |
| --- | --- | --- |
| Dynamic masking | At query time, in real time | Data Development, Data Map display, Data Analysis display, MaxCompute engine layer, Hologres layer |
| Static masking | At data extraction time | Data Integration tasks only |

To put a masking rule into effect, complete these steps in order:

  1. Open Data Security Guard and navigate to the masking scenario that matches your use case.

  2. Create a masking rule — select the sensitive field type and choose a masking method.

  3. (Dynamic only, optional) Configure a whitelist to let specific users view raw data for a limited period.

  4. Activate the rule. Rules are inactive by default after creation.

Prerequisites

Before you begin, make sure you have:

Open Data Security Guard

  1. Go to the DataStudio page. Log on to the DataWorks console. In the top navigation bar, select the target region. In the left-side navigation pane, choose Data Development and O&M > Data Development. Select the target workspace from the drop-down list and click Go to Data Development.

  2. Click the icon in the upper-left corner. Choose All Products > Data Governance > Data Security Guard. On the page that appears, click Try Now.

    If your Alibaba Cloud account already has the required permissions, you go directly to the Data Security Guard homepage. Otherwise, you are redirected to the authorization page first.
  3. In the left navigation pane, click Rule Configuration > Data Masking Management.

  4. In the left navigation pane, select a masking scenario, then click Masking Rule on the right.

Create a dynamic data masking rule

Dynamic masking rules share the same configuration steps across all scenarios. The following example uses the Data development / Data map display desensitization > Default Scenario. Adjust the scenario selection to match your requirements.

For a full list of available scenarios, see Data masking scenarios.

  1. On the Data Masking Management page, under Masking Scene, select Data development / Data map display desensitization > Default Scenario. Click + Masking Rule.

  2. In the Create Data Masking Rule dialog box, configure the rule.

    Step 1: Select the sensitive field type and set the rule name.

    | Parameter | Description |
    | --- | --- |
    | Sensitive Field Type | The type of field to mask. Select from built-in types or custom types you added in sensitive data detection. If a masking rule for the same sensitive field type already exists in this scenario, DataWorks filters it out to prevent conflicting rules. |
    | Data Masking Rule Name | Defaults to the sensitive field type name. Enter a custom name if needed. Must be unique. |

    Step 2: Configure masking scenarios. Select the scenarios this rule applies to. The scenario you selected on the Data Masking Management page is pre-filled. Add or change scenarios as needed.

    Step 3: Choose a masking method. Select one of the following methods based on how you want sensitive data to appear after masking. The table below shows what each method does and what the output looks like.

    | Method | What it does | Example (raw → masked) |
    | --- | --- | --- |
    | Pseudonym | Replaces each character with a substitute that preserves the original format (digit for digit, letter for letter). | a123 → b124 |
    | Masking out | Replaces characters at specified positions with asterisks (*). Choose a built-in mode or define a custom pattern. | 13800138000 → 138****8000 |
    | HASH | Encrypts data into a fixed-length hash value. Identical source values always produce the same hash for the same salt value. | a123 → 5d41402abc... |
    | Characters to replace | Replaces characters at specified positions with substitute values. | John Smith → **** Smith |
    | Range transform | Replaces numeric values within a defined range with a fixed value. Supports up to 10 ranges. | Values in [0, 1000) → 500 |
    | Integer | Rounds numeric data to a specified number of decimal places. | 3.1415 → 3.14 |
    | Empty | Replaces the sensitive field value with an empty string. | secret → (empty string) |

    Pseudonym

    This method replaces a value with a masked value that has the same characteristics, preserving the data format. The following parameters are available.

    | Parameter | Description |
    | --- | --- |
    | Data watermark (optional) | Embeds a hidden watermark to help trace the source of a data breach. Only available in DataWorks Enterprise Edition. |
    | Masking characteristic value | Controls which masking variant to apply. The same source data always produces the same masked result for a given value, but different values produce different results. For example, with raw data a123: value 0 produces b124; value 1 produces c234. Default: 5. Range: 0–9. |
    | Substitution character set | Required if the selected sensitive field type is not a built-in type. Defines the character pool for substitution: uppercase letters, lowercase letters, and numbers are supported (separate multiple characters with commas). Characters in source data that match this set are replaced with other characters of the same type. Chinese characters are not supported. If source data contains no matching characters, it is not masked. |

    Masking out

    This method conceals parts of the information by replacing characters at specific positions with an asterisk (*). When you use this method, you must select a masking mode. DataWorks provides several built-in masking modes and supports custom modes.

    | Parameter | Description |
    | --- | --- |
    | Recommended method | Select from three built-in modes: Show only the first and last characters, Show only the first three and last two characters, or Show only the first three and last four characters. |
    | Custom | Define up to 10 segments from left to right. For each segment, specify whether to mask it and how many characters to include. You must have at least one segment, and exactly one segment must be set to Remaining characters. |

    HASH

    When you use HASH encryption for data masking, you must configure the following parameters.

    | Parameter | Description |
    | --- | --- |
    | Data watermark | Embeds a hidden watermark to trace data leaks. Only available in DataWorks Enterprise Edition. |
    | Encryption algorithm | Select MD5, SHA256, SHA512, or SM3. |
    | Salt Value | A string inserted into the data before hashing, making the result differ from a plain hash of the original value. Default: 5. Range: 0–9. |

    Characters to replace

    The Characters to replace method replaces characters at specified positions based on your selected substitution method. The following parameters are available.

    | Parameter | Description |
    | --- | --- |
    | Substitution position | Select Substitute all, Substitute first 3 characters, Substitute last 4 characters, or Custom. Custom mode supports up to 10 segments; exactly one must be Remaining characters. |
    | Substitution method | Random substitution: replaces with random characters of the same count. Sample value substitution: replaces with values from a sample library. Static field substitution: replaces with a fixed string you enter (1–100 characters, no null characters). |

    Range transform

    Range transform is used to mask numeric data. This method replaces values within a specified numeric range with a fixed value. You can define up to 10 ranges.

    | Parameter | Description |
    | --- | --- |
    | Original value range [m,n) | The numeric range of raw data to match. Must be ≥ 0 with up to 2 decimal places. |
    | Masked Value | The fixed value to display in place of any raw value in the range. Must be ≥ 0 with up to 2 decimal places. |

    Integer

    Integer is used only to mask numeric data.

    | Parameter | Description |
    | --- | --- |
    | Raw data type | Only numeric types are supported. |
    | Decimal places to keep | Range: 0–5. The remaining digits are rounded. For example, raw value 3.1415 with 2 decimal places kept produces 3.14. |

    Empty

    The Empty masking sets the corresponding sensitive field to an empty string.

  3. Verify the masking result. In the Sample data text box, enter up to 100 characters of sample raw data. Click Verify. The masked output appears in the Data Masking effect field.

  4. Click Save or Save and Apply to create the rule.
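
The masking methods configured in step 2 can be approximated in a few lines of Python. The following is an illustrative sketch only, not the DataWorks implementation: the function names (`mask_segments`, `hash_mask`, `range_transform`, `keep_decimals`) are hypothetical, and it assumes that the salt is prepended to the value, that values outside every range pass through unchanged, and that rounding is half-up. SM3 is left out because the Python standard library does not guarantee it.

```python
import hashlib
from decimal import Decimal, ROUND_HALF_UP


def mask_segments(value, segments):
    """Masking out: segments is a left-to-right list of (masked, length).

    Exactly one segment uses length=None, meaning "Remaining characters".
    """
    if not 1 <= len(segments) <= 10:
        raise ValueError("define between 1 and 10 segments")
    if sum(1 for _, length in segments if length is None) != 1:
        raise ValueError("exactly one segment must be 'Remaining characters'")
    fixed = sum(length for _, length in segments if length is not None)
    remaining = max(len(value) - fixed, 0)
    out, pos = [], 0
    for masked, length in segments:
        width = remaining if length is None else length
        chunk = value[pos:pos + width]
        out.append("*" * len(chunk) if masked else chunk)
        pos += width
    return "".join(out)


def hash_mask(value, salt=5, algorithm="sha256"):
    """HASH: fixed-length digest. Prepending the salt is an assumption."""
    if algorithm not in {"md5", "sha256", "sha512"}:
        raise ValueError("algorithm not covered by this sketch")
    if not 0 <= salt <= 9:
        raise ValueError("salt must be in the range 0-9")
    return hashlib.new(algorithm, f"{salt}{value}".encode("utf-8")).hexdigest()


def range_transform(value, ranges):
    """Range transform: ranges is a list of (m, n, masked_value) for [m, n)."""
    if len(ranges) > 10:
        raise ValueError("up to 10 ranges are supported")
    for low, high, masked_value in ranges:
        if low <= value < high:
            return masked_value
    return value  # assumption: unmatched values are returned unchanged


def keep_decimals(value, places):
    """Integer: keep 0-5 decimal places. Half-up rounding is an assumption."""
    if not 0 <= places <= 5:
        raise ValueError("places must be in the range 0-5")
    quantum = Decimal(1).scaleb(-places)
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


# "Show only the first three and last four characters"
print(mask_segments("13800138000", [(False, 3), (True, None), (False, 4)]))
print(range_transform(250, [(0, 1000, 500)]))
print(keep_decimals("3.1415", 2))
```

Running a sample value through functions like these mirrors what the Verify button does in step 3: the same input and the same parameters always give the same masked output.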

After creation:

Create a static data masking rule

Static data masking applies during data extraction in Data Integration scenarios. It supports three masking methods: Pseudonym, Hash, and Mask out.

  1. On the Data Masking Management page, under Masking Scene, select Static desensitization of data integration > Default Scenario. Click + Masking Rule.

  2. In the Create Data Masking Rule dialog box, configure the rule.

    Step 1: Select the sensitive data type and set the rule name.

    | Parameter | Description |
    | --- | --- |
    | Sensitive Data Type | Select Existing to use a built-in or custom sensitive data type, or select New type to define a new one with a unique name. Built-in types include: Mobile phone number, ID card number, Bank card number, Email_Built-in, IP, License plate number, Postal code, Landline number, MAC address, Address, Name, Company name, Ethnicity, Zodiac sign, Gender, and Nationality. |
    | Data Masking Rule Name | Defaults to the sensitive data type name. Enter a custom name if needed. Must be unique. |

    Step 2: Choose a masking method.

    • Pseudonym: Replaces each character with a substitute that preserves the original format.

    • Hash: Encrypts raw data into a fixed-length value. Requires a Security domain (integer 0–9). The same source data produces the same result within a given domain but different results across domains.

    • Mask out: Replaces characters at specified positions with asterisks (*).

    Pseudonym

    This method replaces a value with a masked value that has the same characteristics, preserving the data format. Pseudonymization is supported for only some existing fields.

    • If the selected Sensitive data type is a built-in type (such as Mobile Phone Number, ID Card Number, Bank Card Number, Email_Built-in, IP, License Plate Number, Postal Code, Landline Number, MAC Address, Address, Name, or Company Name), you must configure the Security domain.

      Security domain: The value is an integer from 0 to 9. Different security domains use different masking policies, so the same source data produces different masked results in different security domains. For example, if the raw data is a123, the data is masked to b124 in security domain 0, but is masked to c234 in security domain 1. Within the same security domain, the same source data is always masked to the same result.

    • If the selected Sensitive data type is not a built-in type, you need to configure the Substitution character set.

      Substitution character set: Characters in the source data that match this set are replaced with other characters of the same type. The character set supports uppercase letters, lowercase letters, and numbers. Use commas (,) to separate multiple characters. Chinese characters are not supported. If the data to be masked contains no characters from this set, it is not masked. For example, if the substitution character set consists of numbers from 0 to 3 and letters from a to d, a matching number in the source data is replaced by another number from 0 to 3, and a matching letter is replaced by another letter from a to d.

    Hash

    The Hash method encrypts raw data into a fixed-length value and requires you to select a Security domain.

    Security domain: The value is an integer from 0 to 9. Each security domain uses a different masking policy. This means that for the same source data, the masked result varies depending on the security domain. However, within the same security domain, the same source data always produces the same masked result.

    For example, if the raw data is a123:

    • If the security domain is set to 0, the data is masked to b124.

    • If the security domain is set to 1, the data is masked to c234.

    Mask out

    The Mask out method conceals partial information by replacing specific characters with asterisks (*). This method requires you to select a masking mode. DataWorks provides several built-in masking modes and supports custom modes.

    • Recommended method: Select from three built-in modes: Show only the first and last characters, Show only the first three and last two characters, or Show only the first three and last four characters. For some fields, only the default method is available.

    • Custom: Define up to 10 segments from left to right. You must have at least one segment, and exactly one segment must be set to Remaining characters. Example configurations:

      • Example 1: Mask the first 3 characters and leave the remaining characters unmasked.

      • Example 2: Mask the last 3 characters and leave the remaining characters unmasked.

  3. Verify the masking result. In the Sample data text box, enter up to 100 characters of sample raw data. Click Verify. The masked output appears in the Data Masking effect field.

  4. Click OK to create the rule.
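
The Substitution character set behavior described for the Pseudonym method in step 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not how DataWorks chooses substitutes: the function name `pseudonymize` is hypothetical, and the sketch rotates each matching character to the next character of the same type in the set, whereas the actual substitution policy is not documented here.

```python
import string

# One character class per supported type: numbers, lowercase, uppercase.
CLASSES = (string.digits, string.ascii_lowercase, string.ascii_uppercase)


def pseudonymize(value, charset):
    """Replace characters found in the comma-separated charset with another
    character of the same type from that set. Other characters are kept."""
    allowed = {item.strip() for item in charset.split(",") if item.strip()}
    # Build one substitution pool per character type.
    pools = [sorted(allowed & set(cls)) for cls in CLASSES]
    out = []
    for ch in value:
        pool = next((p for p in pools if ch in p), None)
        if pool is None or len(pool) < 2:
            out.append(ch)  # not in the set, or no alternative available
        else:
            # Assumption: deterministic rotation to the next pool member.
            out.append(pool[(pool.index(ch) + 1) % len(pool)])
    return "".join(out)


# Set of numbers 0-3 and letters a-d, as in the example above.
print(pseudonymize("a123-xyz", "0,1,2,3,a,b,c,d"))
```

With this set, digits outside 0 to 3 and letters outside a to d are left untouched, and a value with no matching characters is returned unmasked, which matches the behavior the parameter description states.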

After creation:

Configure a whitelist for a data masking rule (dynamic data masking only)

A whitelist lets designated users query raw, unmasked data for a specified period, even after the masking rule is active. Whitelists are not supported in the Hologres layer masking or Static desensitization of data integration scenarios.

Permissions for configuring whitelists (create, edit, and delete):

  • Tenant administrators and tenant security administrators can manage whitelists in all scenarios.

  • Workspace administrators and workspace security administrators can manage whitelists only in scenarios where they have permissions.

Add users to a user group before configuring the whitelist. See Configure user groups.

  1. On the Data Masking Management page, click Configure Whitelist.

  2. In the upper-right corner, click Whitelist.

  3. In the Create Whitelist dialog box, configure the following parameters.

    | Parameter | Description |
    | --- | --- |
    | Sensitive Field Type | Select from sensitive field types that are active in the current masking scenario. |
    | User Group Range | Select user groups whose members can view raw data. Up to 50 user groups per whitelist. See Configure user groups. |
    | Effective Time | The period during which whitelisted users can view raw data. Options: 30, 90, 180, or 365 days; a custom date range (current day or future); or permanent. For the short-term options (30, 90, 180, or 365 days), the whitelist takes effect immediately and expires after the specified number of days. Outside this period, data is masked normally. |

  4. Click Save.
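
The interaction between User Group Range and Effective Time can be sketched as a simple check. This is a hedged illustration, not a DataWorks API: `whitelist_window` and `can_view_raw` are hypothetical names, and treating the expiry day as inclusive is an assumption.

```python
from datetime import date, timedelta

SHORT_TERM_DAYS = (30, 90, 180, 365)
MAX_USER_GROUPS = 50


def whitelist_window(created, option, custom=None):
    """Return (start, end) for a whitelist. end is None when permanent."""
    if option == "permanent":
        return created, None
    if option == "custom":
        start, end = custom
        if start < created or end < start:
            raise ValueError("custom range must start today or later")
        return start, end
    if option in SHORT_TERM_DAYS:
        # Takes effect immediately and expires after the given number of days.
        return created, created + timedelta(days=option)
    raise ValueError("unknown Effective Time option")


def can_view_raw(user_groups, whitelist_groups, window, today):
    """True if the user is in a whitelisted group and the window is open."""
    if len(whitelist_groups) > MAX_USER_GROUPS:
        raise ValueError("a whitelist supports up to 50 user groups")
    start, end = window
    in_group = bool(set(user_groups) & set(whitelist_groups))
    in_window = start <= today and (end is None or today <= end)
    return in_group and in_window


window = whitelist_window(date(2026, 1, 1), 30)
print(can_view_raw({"analysts"}, {"analysts"}, window, date(2026, 1, 15)))
print(can_view_raw({"analysts"}, {"analysts"}, window, date(2026, 3, 1)))
```

Outside the window, or for users in no whitelisted group, the check fails and the masking rule applies as usual.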

Activate or deactivate a data masking rule

On the Data Masking Rule page, find the rule and toggle the Status switch to Enable or Disable.

Rules must be in the Enable state to apply in their configured scenarios.

| State | What you can do |
| --- | --- |
| Enable | Rule is active. Cannot be deleted or edited. Disable it first, and confirm with the security administrator that no active tasks depend on it. |
| Disable | Rule is inactive. You can edit or delete it, but you cannot change the Sensitive data type or Data masking rule name. After editing, switch back to Enable for changes to take effect. |
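
The state rules in the table above amount to a small set of guards. The following sketch models them with a hypothetical `MaskingRule` class. It is only a way to reason about the constraints, not part of any DataWorks SDK, and the field names are assumptions.

```python
from dataclasses import dataclass

# Fields that cannot change after creation, even while the rule is disabled.
LOCKED_FIELDS = {"name", "sensitive_data_type"}


@dataclass
class MaskingRule:
    name: str
    sensitive_data_type: str
    method: str
    enabled: bool = False  # rules are inactive by default after creation

    def edit(self, **changes):
        if self.enabled:
            raise PermissionError("disable the rule before editing it")
        locked = LOCKED_FIELDS.intersection(changes)
        if locked:
            raise ValueError(f"cannot change: {sorted(locked)}")
        for key, new_value in changes.items():
            if not hasattr(self, key):
                raise AttributeError(f"unknown field: {key}")
            setattr(self, key, new_value)

    def check_deletable(self):
        if self.enabled:
            raise PermissionError("disable the rule before deleting it")
        return True


rule = MaskingRule("Phone", "Mobile phone number", "Masking out")
rule.edit(method="HASH")  # allowed while disabled
rule.enabled = True       # re-enable so the change takes effect
print(rule)
```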

What's next