DataWorks supports multiple data masking scenarios. This topic describes how to select a scenario, create a data masking rule, and perform masked queries in DataWorks.
Background
DataWorks provides two types of data masking: static and dynamic.
Dynamic data masking includes scenarios such as Data development / Data map display desensitization, Data analysis and display desensitization, Layer masking of the MaxCompute engine, and Hologres layer masking.
Static data masking is used for data integration scenarios.
By default, a data masking rule is inactive after it is created. You must manually activate the rule to enable automatic data masking in the relevant scenarios.
For more information about how to activate or deactivate a data masking rule, see Activate or deactivate a data masking rule.
For more information about data masking scenarios, see Data masking scenarios.
Prerequisites
(Optional, for dynamic data masking only) Configure sensitive data detection rules as needed. This lets you associate the fields that require masking when you create data masking rules. For more information, see Sensitive data detection rules.
(Optional, for dynamic data masking only) You can use a whitelist to allow specific users to view raw data for a specified period. To do this, you must add the users to a user group. For more information, see Configure user groups.
(Optional, for data masking at the MaxCompute engine layer only) To configure data masking at the MaxCompute engine layer, you must add the IP address of Data Security Guard to the MaxCompute network whitelist. This lets you call data masking functions to mask sensitive data in query results that you obtain from sources other than DataWorks, such as the MaxCompute command line client (odpscmd) or Logview. For more information, see Example: Use underlying data masking in E-MapReduce.
Access control
Permissions for configuring data masking rules (create, edit, and delete):
Tenant administrators and tenant security administrators can manage data masking rules in all scenarios.
Workspace administrators and workspace security administrators can manage data masking rules only in scenarios where they have permissions.
Permissions for configuring whitelists (create, edit, and delete):
Tenant administrators and tenant security administrators can manage whitelists in all scenarios.
Workspace administrators and workspace security administrators can manage whitelists only in scenarios where they have permissions.
You must be granted the required role permissions to perform these operations. For more information about authorization, see Manage permissions on workspace-level modules and Manage permissions on global-level modules.
Entry point for data masking rule configuration
Go to the DataStudio page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.
Click the icon in the upper-left corner. Then, choose . On the page that appears, click Try Now to go to the Data Security Guard page.
Note: If your Alibaba Cloud account is granted the required permissions, you can directly access the homepage of Data Security Guard. If it is not, you are redirected to the authorization page of Data Security Guard, and you can use the features of Data Security Guard only after the account is granted the required permissions.
In the navigation pane on the left, click .
In the navigation pane on the left, select a data masking scenario, and then click Masking Rule on the right to create a rule for that scenario.
Dynamic data masking: The rule configuration is similar across all dynamic data masking scenarios. This topic uses the Masking of displayed data in Data Development and Data Map scenario as an example to describe the key configuration steps. Select the data masking scenario that meets your requirements. For more information, see Create a dynamic data masking rule: Masking of displayed data in Data Development and Data Map scenario.
Static data masking: For more information, see Create a static data masking rule: Static data masking in Data Integration scenario.
Create a dynamic data masking rule: Masking of displayed data in Data Development and Data Map scenario
Select a data masking scenario.
On the Data Masking Management page, in the Masking Scene section, select . Then, click + Masking Rule on the right.
Create a data masking rule.
In the Create Data Masking Rule dialog box, configure the parameters for the rule.

Select a sensitive field type and specify the rule name.
Parameter
Description
Sensitive Field Type
Select the type of field to mask.
You can select built-in sensitive field types or custom sensitive field types that you added in sensitive data detection. For more information about how to add a sensitive field type, see Sensitive data detection rules.
If a data masking rule already exists for a sensitive field type in the same scenario, DataWorks removes that type from the list. This prevents conflicting rules for the same sensitive field in the same scenario.
Data Masking Rule Name
By default, this is the same as the Sensitive Field Type. You can also specify a custom name. The rule name must be unique.
Configure data masking scenarios.
Select the scenarios to which this rule applies. By default, the scenario that you selected in Step 1 is used. You can change the scenario or add more scenarios as needed.
Configure the data masking method.
DataWorks supports several methods, such as Pseudonym, Masking out, HASH, Characters to replace, Range transform, Integer, and Empty. You can select a method based on your requirements.
Pseudonym
This method replaces a value with a masked value that has the same characteristics, preserving the data format. The following parameters are available.
Parameter
Description
(Optional) Data watermark
Watermarks help trace the source of data. If a data breach occurs, you can locate the potential source of the leak. Enable or disable Data watermark as needed.
Note: Only DataWorks Enterprise Edition supports the data watermark feature.
Masking characteristic value
Different characteristic values result in different masking policies. This means the same source data produces different masked results for different characteristic values. If the characteristic value is the same, the same source data always produces the same masked result.
For example, if the raw data is a123:
If the characteristic value is set to 0, the data is masked to b124.
If the characteristic value is set to 1, the data is masked to c234.
The default value is 5. The value range is 0 to 9.
Substitution character set
If the Sensitive field type you selected is not a built-in type, you must configure a Substitution character set. Characters in the source data that match this set are replaced with other characters of the same type.
For example, if the sensitive data before masking consists of numbers from 0 to 3 and letters from a to d, the masked data will also consist of numbers and letters within that range.
Note: Characters in the set are replaced with characters from the same range. The character set supports uppercase letters, lowercase letters, and numbers. Separate multiple characters with commas (,). Chinese characters are not supported. If the data to be masked does not match the character set, it is not masked.
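The two properties described for the masking characteristic value (the format is preserved, and the same characteristic value always yields the same result for the same input) can be illustrated with a short Python sketch. This is not the DataWorks algorithm, which is not described in this topic. The function name pseudonymize, the use of HMAC-SHA256 to derive per-character shifts, and the three character classes are assumptions made for illustration, so the outputs do not match the b124 and c234 examples above.

```python
import hashlib
import hmac
import string


def pseudonymize(value: str, characteristic: int = 5) -> str:
    """Hypothetical format-preserving substitution (not the DataWorks algorithm)."""
    if not 0 <= characteristic <= 9:
        raise ValueError("characteristic value must be in the range 0 to 9")
    # Assumption: derive deterministic per-character shifts from the
    # characteristic value (used as the key) and the input value.
    digest = hmac.new(str(characteristic).encode(), value.encode(), hashlib.sha256).digest()
    classes = (string.digits, string.ascii_lowercase, string.ascii_uppercase)
    out = []
    for i, ch in enumerate(value):
        for alphabet in classes:
            if ch in alphabet:
                shift = digest[i % len(digest)]
                # Rotate within the same class so digits stay digits and letters stay letters.
                out.append(alphabet[(alphabet.index(ch) + shift) % len(alphabet)])
                break
        else:
            out.append(ch)  # characters outside the three classes are left unchanged
    return "".join(out)
```

Calling pseudonymize("a123", 0) twice returns the same string both times, and the result is always one lowercase letter followed by three digits.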
Masking out
This method conceals parts of the information by replacing characters at specific positions with an asterisk (*). When you use this method, you must select a masking mode. DataWorks provides several built-in masking modes and supports custom modes.
Parameter (select one)
Description
Recommended method
Select a recommended masking method from the drop-down list. The available methods vary depending on the field to be masked.
DataWorks provides three built-in methods. They include Show only the first and last characters, Show only the first three and last two characters, and Show only the first three and last four characters. Select a method from the drop-down list as needed.
Custom
This provides a more flexible way to configure masking. Configure segments from left to right and specify whether to mask each segment and the number of characters to mask (or not mask). You can add up to 10 segments. You must have at least one segment, and exactly one segment must be Remaining characters.
For example, mask the first 3 characters and leave the remaining characters unmasked.
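The segment model described above can be sketched in Python as follows. The function name mask_out and the tuple representation of a segment are hypothetical. How DataWorks treats values shorter than the configured segments is not stated in this topic, so the handling of short inputs here is an assumption.

```python
from typing import List, Optional, Tuple

# Each segment is (length, masked). A length of None marks the single
# "Remaining characters" segment.
Segment = Tuple[Optional[int], bool]


def mask_out(value: str, segments: List[Segment], mask_char: str = "*") -> str:
    """Apply left-to-right masking segments to a value (illustrative sketch)."""
    if not 1 <= len(segments) <= 10:
        raise ValueError("between 1 and 10 segments are required")
    if sum(1 for length, _ in segments if length is None) != 1:
        raise ValueError("exactly one segment must be Remaining characters")
    fixed = sum(length for length, _ in segments if length is not None)
    # Assumption: if the value is shorter than the fixed segments, the
    # remaining segment is simply empty.
    remaining = max(len(value) - fixed, 0)
    out, pos = [], 0
    for length, masked in segments:
        width = remaining if length is None else length
        chunk = value[pos:pos + width]
        out.append(mask_char * len(chunk) if masked else chunk)
        pos += width
    return "".join(out)


# Mask the first 3 characters and leave the remaining characters unmasked.
print(mask_out("13812345678", [(3, True), (None, False)]))  # ***12345678
# Show only the first three and last four characters.
print(mask_out("13812345678", [(3, False), (None, True), (4, False)]))  # 138****5678
```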
HASH
When you use the HASH method for data masking, you must configure the following parameters.
Parameter
Description
Data watermark
Watermarks help trace the source of data. If a data breach occurs, you can locate the potential source of the leak. Enable or disable Data watermark as needed.
Note: Only DataWorks Enterprise Edition supports the data watermark feature.
Encryption algorithm
Includes MD5, SHA256, SHA512, and SM3.
Salt Value
Set a salt value for the hash algorithm. The default value is 5. The value range is 0 to 9.
Note: A salt is a specific string that is inserted into the data. In cryptography, inserting a specific string at a fixed position in a password makes the resulting hash different from the hash of the original password. This process is called salting.
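The effect of the salt value can be illustrated with the Python standard library. The function name hash_mask is hypothetical, and prepending the salt to the value is an assumption because this topic does not state where DataWorks inserts the salt. SM3 is omitted because its availability in hashlib depends on the local OpenSSL build.

```python
import hashlib

# UI algorithm names mapped to hashlib algorithm names.
ALGORITHMS = {"MD5": "md5", "SHA256": "sha256", "SHA512": "sha512"}


def hash_mask(value: str, algorithm: str = "SHA256", salt: int = 5) -> str:
    """Return the hex digest of the salted value (illustrative sketch)."""
    if not 0 <= salt <= 9:
        raise ValueError("salt value must be in the range 0 to 9")
    h = hashlib.new(ALGORITHMS[algorithm])
    # Assumption: the salt is prepended to the raw value before hashing.
    h.update(f"{salt}{value}".encode("utf-8"))
    return h.hexdigest()
```

With this sketch, hash_mask("a123", salt=0) and hash_mask("a123", salt=1) return different digests, while repeated calls with the same salt return the same digest.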
Characters to replace
The Characters to replace method replaces characters at specified positions based on the substitution method that you select. The following parameters are available.
Parameter
Description
(Required) Substitution position
From the drop-down list, you can select Substitute all, Substitute first 3 characters, or Substitute last 4 characters. You can also select Custom to define a custom substitution position.
If you select Custom, you can define segments from left to right and configure the number of characters to substitute and the substitution method for each segment. You can add up to 10 segments. You must have at least one segment, and exactly one segment must be Remaining characters.
(Required) Substitution method
Includes Random substitution, Sample value substitution, and Static field substitution.
Random substitution: Randomly substitutes characters at the specified positions. The number of characters remains the same after substitution.
Sample value substitution: Select a sample library. The characters at the specified positions are substituted with values from the selected sample library.
Static field substitution: In the Substitution value text box, enter the characters to use for substitution. The value can be 1 to 100 characters long and cannot contain null characters. The characters at the specified positions are substituted with this value.
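The built-in substitution positions and two of the substitution methods can be sketched in Python as follows. The names replace_chars, first3, last4, random, and static are hypothetical. Replacing each character with a random character of the same class (digit, lowercase, or uppercase) is an assumption, because this topic states only that the number of characters stays the same. Sample value substitution is not shown because it depends on a sample library.

```python
import random
import string
from typing import Optional


def _span(value: str, position: str) -> slice:
    """Translate a substitution position into a slice of the value."""
    if position == "all":
        return slice(0, len(value))
    if position == "first3":
        return slice(0, min(3, len(value)))
    if position == "last4":
        return slice(max(len(value) - 4, 0), len(value))
    raise ValueError(f"unknown position: {position}")


def _random_like(ch: str, rng: random.Random) -> str:
    """Pick a random character from the same class as ch (assumption)."""
    for alphabet in (string.digits, string.ascii_lowercase, string.ascii_uppercase):
        if ch in alphabet:
            return rng.choice(alphabet)
    return ch


def replace_chars(value: str, position: str = "all", method: str = "random",
                  static_value: str = "", rng: Optional[random.Random] = None) -> str:
    rng = rng or random.Random()
    span = _span(value, position)
    head, target, tail = value[:span.start], value[span], value[span.stop:]
    if method == "random":
        # Random substitution keeps the number of characters unchanged.
        target = "".join(_random_like(c, rng) for c in target)
    elif method == "static":
        if not 1 <= len(static_value) <= 100:
            raise ValueError("substitution value must be 1 to 100 characters long")
        target = static_value
    else:
        raise ValueError(f"unknown method: {method}")
    return head + target + tail
```

For example, replace_chars("13812345678", "first3", "static", "XXX") returns XXX12345678.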
Range transform
Range transform is used to mask numeric data. This method replaces values within a specified numeric range with a fixed value. You can define up to 10 ranges.
Parameter
Description
Original value range [m,n)
The numeric range of the raw data. The value must be greater than or equal to 0 and can have up to two decimal places.
Masked Value
The value after masking. The value must be greater than or equal to 0 and can have up to two decimal places.
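The half-open range [m,n) means that m is included in the range and n is excluded. A minimal Python sketch of this lookup follows. The function name range_transform is hypothetical. Returning None when no range matches is an assumption, because this topic does not describe how unmatched values are handled.

```python
from decimal import Decimal
from typing import List, Optional, Tuple

# (m, n, masked value) for the half-open range [m, n)
Range = Tuple[Decimal, Decimal, Decimal]


def range_transform(value: str, ranges: List[Range]) -> Optional[Decimal]:
    """Return the masked value of the first range that contains the value."""
    if len(ranges) > 10:
        raise ValueError("at most 10 ranges can be defined")
    x = Decimal(value)
    for low, high, masked in ranges:
        if low <= x < high:  # m is inclusive, n is exclusive
            return masked
    return None  # assumption: the caller decides what to do with unmatched values


ranges = [
    (Decimal("0"), Decimal("100"), Decimal("50")),
    (Decimal("100"), Decimal("1000"), Decimal("500")),
]
print(range_transform("99.99", ranges))  # 50
print(range_transform("100", ranges))    # 500
```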
Integer
Integer is used only to mask numeric data.
Parameter
Description
Raw data type
Only numeric types are supported.
Decimal places to keep
The value range is 0 to 5. The remaining part is rounded. For example, if the raw value is 3.1415 and you keep 2 decimal places, the masked value is 3.14.
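The rounding in this example can be reproduced with the Python decimal module. The function name keep_decimals is hypothetical, and the half-up rounding mode is an assumption, because this topic says only that the remaining part is rounded.

```python
from decimal import Decimal, ROUND_HALF_UP


def keep_decimals(value: str, places: int) -> Decimal:
    """Round a numeric value to the given number of decimal places."""
    if not 0 <= places <= 5:
        raise ValueError("decimal places to keep must be in the range 0 to 5")
    quantum = Decimal(1).scaleb(-places)  # places=2 gives Decimal("0.01")
    # Assumption: ties are rounded half up.
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


print(keep_decimals("3.1415", 2))  # 3.14
```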
Empty
The Empty method sets the corresponding sensitive field to an empty string.
Verify the masking result.
Enter sample raw data (0 to 100 characters) in the Sample data text box. Click Verify. The masked result is displayed in the Data Masking effect field.
Click Save or Save and Apply to create the data masking rule.
After you create the rule:
In dynamic data masking scenarios, you can configure a whitelist for the rule. Whitelisted users can query raw data within a specified period. For more information about how to add a whitelist, see Configure a whitelist for a data masking rule (dynamic data masking only).
By default, a data masking rule is inactive after it is created. You must manually activate the rule before it can be applied in the corresponding scenarios. For more information about how to change the rule status, see Activate or deactivate a data masking rule.
Create a static data masking rule: Static data masking in Data Integration scenario
On the Data Masking Management page, under Masking Scene, choose and click + Masking Rule on the right.
Create a data masking rule.
In the Create Data Masking Rule dialog box, configure the parameters for the rule.

Select a sensitive data type and specify the rule name.
Parameter
Description
Sensitive Data Type
Existing: Select an existing sensitive data type (built-in or custom).
New type: Enter a name for the new sensitive data type. The name must be unique and cannot be the same as an existing type.
Note: Built-in sensitive data types include the following: Mobile phone number, ID card number, Bank card number, Email_Built-in, IP, License plate number, Postal code, Landline number, MAC address, Address, Name, Company name, Ethnicity, Zodiac sign, Gender, and Nationality.
Data Masking Rule Name
By default, this is the same as the Sensitive Data Type. You can also specify a custom name. The rule name must be unique.
Configure the data masking method.
DataWorks supports three methods: Pseudonym, Hash, and Mask out. You can select an appropriate method.
Pseudonym
This method replaces a value with a masked value that has the same characteristics, preserving the data format. Pseudonymization is supported for only some existing fields.
If the selected Sensitive data type is a built-in type (such as Mobile Phone Number, ID Card Number, Bank Card Number, Email_Built-in, IP, License Plate Number, Postal Code, Landline Number, MAC Address, Address, Name, or Company Name), you must configure the Security domain.
Security domain: The value is an integer from 0 to 9. Different security domains use different masking policies, so the same source data produces different masked results in different security domains. For example, if the raw data is a123, the data is masked to b124 in security domain 0, but is masked to c234 in security domain 1. Within the same security domain, the same source data is always masked to the same result.
If the selected Sensitive data type is not a built-in type, you need to configure the Substitution character set.
Substitution character set: Characters in the source data that match this set are replaced with other characters of the same type. The character set supports uppercase letters, lowercase letters, and numbers. Use commas (,) to separate multiple characters. Chinese characters are not supported. If the data to be masked contains no characters from this set, it is not masked. For example, if the substitution character set consists of numbers from 0 to 3 and letters from a to d, a matching number in the source data is replaced by another number from 0 to 3, and a matching letter is replaced by another letter from a to d.
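The behavior of a substitution character set can be sketched in Python as follows. The function name substitute_from_set is hypothetical. Rotating each matching character to the next character of the same type in the set is a placeholder for the real substitution policy, which is not described in this topic.

```python
import string


def substitute_from_set(value: str, charset: str) -> str:
    """Replace characters that appear in the set with another character of the same type."""
    # Keep single characters only, deduplicated, in the order given.
    allowed = list(dict.fromkeys(c.strip() for c in charset.split(",") if len(c.strip()) == 1))
    groups = [
        [c for c in allowed if c in string.digits],
        [c for c in allowed if c in string.ascii_lowercase],
        [c for c in allowed if c in string.ascii_uppercase],
    ]
    out = []
    for ch in value:
        group = next((g for g in groups if ch in g), None)
        if group is None or len(group) < 2:
            out.append(ch)  # not in the set, or no other character of that type to use
            continue
        # Assumption: rotate by one position within the same type.
        out.append(group[(group.index(ch) + 1) % len(group)])
    return "".join(out)


print(substitute_from_set("a1z9", "0,1,2,3,a,b,c,d"))  # b2z9
```

In this example, a and 1 are in the set and are replaced with b and 2. The characters z and 9 are outside the set and are not masked.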
Hash
The Hash method converts raw data into a fixed-length value and requires you to select a Security domain.
Security domain: The value is an integer from 0 to 9. Each security domain uses a different masking policy. This means that for the same source data, the masked result varies depending on the security domain. However, within the same security domain, the same source data always produces the same masked result.
For example, if the raw data is a123:
If the security domain is set to 0, the data is masked to b124.
If the security domain is set to 1, the data is masked to c234.
Mask out
The Mask out method conceals partial information by replacing specific characters with asterisks (*). This method requires you to select a masking mode. DataWorks provides several built-in masking modes and supports custom modes.
Recommended method: For some fields, you can select a recommended masking method from the drop-down list. The available methods vary depending on the selected field. DataWorks provides three built-in methods: Show only the first and last characters, Show only the first three and last two characters, and Show only the first three and last four characters. You can select the method that you need. For some fields, only the default method can be selected.
Custom: This option provides a flexible way to configure masking. You can define up to 10 segments from left to right. For each segment, you can specify whether to apply masking and the number of characters to mask or leave unmasked. You must configure at least one segment, and exactly one segment must be set to Remaining characters.
Example 1: Mask the first 3 characters and leave the remaining characters unmasked.
Example 2: Mask the last 3 characters and leave the remaining characters unmasked.
Verify the masking result.
In the Sample data text box, enter sample raw data (0 to 100 characters) and click Verify. The masked result is displayed in the Data Masking effect field.
Click OK to create the data masking rule.
After you create the rule:
By default, a data masking rule is inactive after it is created. You must manually activate the rule before it can be applied in the corresponding scenarios. For more information about how to change the rule status, see Activate or deactivate a data masking rule.
After you create a data masking rule for data integration, you can use the rule when you create a real-time synchronization task for a single table. For more information, see Configure data masking.
Configure a whitelist for a data masking rule (dynamic data masking only)
In dynamic data masking scenarios, you can configure a whitelist for a data masking rule. After the rule is activated, whitelisted users are not affected by the rule for a specified period and can view the raw, unmasked data.
Before you create a whitelist, you must add the users to a user group. For more information, see Configure user groups.
To add a whitelist, perform the following steps:
On the Data Masking Management page, click Configure Whitelist.
In the upper-right corner, click Whitelist.
In the Create Whitelist dialog box, configure the parameters.
Note: Whitelist configuration is not supported in the Hologres layer masking and Static desensitization of data integration scenarios.
After you set an effective period for a whitelist, sensitive data that meets the whitelist conditions is not masked during this period.

The parameters are as follows.
Parameter
Description
Sensitive Field Type
You can select only sensitive field types that are active in the currently selected data masking scenario.
User Group Range
Select configured user groups. You can select up to 50 user groups. After you add user groups to the whitelist, the accounts in those groups can retrieve the original, unmasked data. For more information about how to configure a user group, see Configure user groups.
Effective Time
Set the effective period for the whitelist. You can choose a short-term or permanent period. Short-term options include 30, 90, 180, and 365 days, or a custom period. For a custom period, you can select the current day or a future time range.
If a user queries the sensitive information outside the effective period of the whitelist, the data is masked.
Note: If you select a short-term period, the data will not be masked from the current time until the specified number of days has passed.
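The effective period check can be sketched in Python as follows. The function name is_whitelisted and its parameters are hypothetical. Treating the end of the period as exclusive is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_whitelisted(query_time: datetime, start: datetime, days: Optional[int]) -> bool:
    """Return True if the query falls within the whitelist effective period.

    days=None represents a permanent whitelist.
    """
    if query_time < start:
        return False
    if days is None:
        return True
    # Assumption: the period ends exactly `days` days after it starts (exclusive).
    return query_time < start + timedelta(days=days)


start = datetime(2024, 1, 1)
print(is_whitelisted(datetime(2024, 1, 15), start, 30))  # True: raw data is returned
print(is_whitelisted(datetime(2024, 3, 1), start, 30))   # False: data is masked
```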
Click Save to save the whitelist configuration.
Activate or deactivate a data masking rule
On the Data Masking Rule page, find the rule and click the Status switch to set it to Enable or Disable.
Depending on the status of the rule, you can edit or delete the rule, or view its details.
You cannot delete or edit a data masking rule while it is in the Enable state. To do so, you must first change its status to Disable. Before you change the status, check whether the rule is used by any tasks and contact the security administrator for confirmation.
When a rule is in the Disable state, you can edit or delete it, but you cannot change its Sensitive data type or Data masking rule name.
After you make changes, switch the status back to Enable. This allows tasks that use this rule to resume data masking.
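The editing constraints above can be summarized as a small Python model. The class MaskingRule and its field names are hypothetical and do not correspond to a DataWorks API.

```python
from dataclasses import dataclass

LOCKED_FIELDS = {"name", "sensitive_type"}  # cannot be changed after creation
EDITABLE_FIELDS = {"method"}                # hypothetical editable field


@dataclass
class MaskingRule:
    name: str
    sensitive_type: str
    method: str
    enabled: bool = False  # a rule is inactive by default after it is created

    def edit(self, **changes: str) -> None:
        if self.enabled:
            raise PermissionError("disable the rule before you edit it")
        locked = LOCKED_FIELDS.intersection(changes)
        if locked:
            raise ValueError(f"these fields cannot be changed: {sorted(locked)}")
        unknown = set(changes) - EDITABLE_FIELDS
        if unknown:
            raise ValueError(f"unknown fields: {sorted(unknown)}")
        for field_name, new_value in changes.items():
            setattr(self, field_name, new_value)
```

In this model, calling edit on an enabled rule raises PermissionError, and calling edit(name=...) on a disabled rule raises ValueError.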
Examples of data masking rule applications
After you create a data masking rule for data integration, you can use the rule when you create a real-time synchronization task for a single table. For more information, see Configure data masking.