DataWorks supports multiple data masking scenarios. This topic describes how to create data masking rules for different scenarios and perform masked queries in DataWorks.
Background information
DataWorks provides two types of data masking: static data masking and dynamic data masking.
Dynamic data masking supports scenarios such as display masking in Data Development and Data Map, display masking in DataAnalysis, engine-layer masking in MaxCompute, and engine-layer masking in Hologres.
Static data masking refers to the data integration static data masking scenario.
Data masking rules are disabled by default after they are created. To automatically mask data in a specific scenario, you must manually enable the corresponding rule.
To enable or disable a data masking rule, see Enable or disable a data masking rule.
For more information about data masking scenarios, see Data masking scenarios.
Prerequisites
(Optional, for dynamic data masking only) You can configure sensitive data detection rules as needed. This helps you associate fields that require masking when you create data masking rules. For more information, see Sensitive data detection rules.
(Optional, for dynamic data masking only) You can use a whitelist to allow specific users to bypass data masking rules during a specified period. Add the users to a user group to allow them to view raw data. For more information, see Configure a user group.
(Optional, for MaxCompute engine-layer masking only) If you configure the MaxCompute engine-layer masking scenario, sensitive data is masked when it is queried from outside DataWorks, such as from the MaxCompute command line client (odpscmd) or from Logview. To do this, you must request a network whitelist for MaxCompute to call the masking functions. For more information, see Example: Use E-MapReduce for underlying data masking.
Access control
Configure data masking rules (create, edit, and delete):
Tenant administrators and tenant security administrators can perform operations on data masking rules in all scenarios.
Workspace administrators and workspace security administrators can perform operations on data masking rules only in scenarios where they have permissions.
Configure a data masking whitelist (create, edit, and delete):
Tenant administrators and tenant security administrators can configure whitelists for all scenarios.
Workspace administrators and workspace security administrators can configure whitelists only for scenarios for which they have permissions.
You must be granted the required role permissions to perform these operations. For more information about authorization, see Workspace-level module access control and Global module access control.
Go to the data masking rule configuration page
Go to the DataStudio page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.
Click the icon in the upper-left corner. Then, choose . On the page that appears, click Try Now to go to the Data Security Guard page.
Note: If your Alibaba Cloud account is granted the required permissions, you can directly access the homepage of Data Security Guard. If your Alibaba Cloud account is not granted the required permissions, you are redirected to the authorization page of Data Security Guard. You can use the features of Data Security Guard only after your Alibaba Cloud account is granted the required permissions.
In the navigation pane on the left, choose .
Select a data masking scenario on the left. Then, on the right, click the Data Masking Rules tab to create a rule for that scenario.
Dynamic data masking: The rule configuration is similar for all scenarios. This topic uses the Data Development/Data Map display masking scenario as an example to describe the key configuration parameters. When you use this feature, select a data masking scenario as needed. For more information, see Create a dynamic data masking rule: Data Development/Data Map display masking scenario.
Static data masking: For more information, see Create a static data masking rule: Data integration static data masking scenario.
Create a dynamic data masking rule: Data Development/Data Map display masking scenario
Select a data masking scenario.
On the Data Masking Management page, set Data Masking Scenario to , and then click the Data Masking Rules tab.
Create a data masking rule.
In the Create Data Masking Rule dialog box, configure the parameters for the rule.

Select a sensitive field and specify the rule name.
Parameter
Description
Sensitive field type
Select the type of sensitive field to which this rule applies.
You can select built-in sensitive field types or sensitive field types that you manually added in sensitive data detection. For more information about how to add a sensitive field, see Sensitive data detection rules.
If you have already created a data masking rule for the same scenario, DataWorks filters out the sensitive field types that have been selected. This prevents inconsistent data masking rules for the same sensitive field in the same scenario.
Data masking rule name
By default, the rule name is the same as the Sensitive field type. You can also specify a custom name. The rule name must be unique.
Configure the data masking scenario.
Select the data masking scenarios to which this rule applies. By default, the scenario that you selected in Step 1 is used. You can change the scenario or add multiple scenarios as needed.
Configure the data masking method.
DataWorks supports data masking methods such as Format-preserving encryption, Masking, HASH encryption, Character substitution, Range transformation, Rounding, and Set to null. You can select a method as needed.
Format-preserving encryption (formerly pseudonymization algorithm)
Format-preserving encryption replaces a value with a masked value that has the same characteristics. The data format remains the same after masking. The following parameters are related to this data masking method.
Parameter
Description
(Optional) Data watermark
A data watermark provides data traceability. If a data breach occurs, the watermark can help you locate the possible source of the leak. You can enable or disable Data watermark as needed.
Note: Only DataWorks Enterprise Edition and higher editions support the data watermark feature.
(Optional) Substitution feature value
Different substitution feature values correspond to different masking policies. This means that for the same raw data, different substitution feature values produce different masked results. If the substitution feature value is the same, the same raw data produces the same masked data.
For example, if the raw data is a123:
If you set the substitution feature value to 0, the data is masked as b124.
If you set the substitution feature value to 1, the data is masked as c234.
The default substitution feature value is 5. The valid values are 0 to 9.
(Optional) Substitution character set
If the detection rule for the selected Sensitive field type is not a built-in rule, you must configure a Substitution character set. After you configure a substitution character set, any character in the set is replaced with another character of the same type.
For example, if the sensitive data before masking consists of digits from 0 to 3 and letters from a to d, the masked data will also consist of digits and letters within this range.
Note: Characters in the character set are replaced with characters from the same range. The substitution character set supports uppercase letters, lowercase letters, and digits. Separate multiple characters with commas (,). Chinese characters are not supported. If the data to be masked does not fall within the character set range, it is not masked.
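The interaction between the substitution feature value and the substitution character set can be illustrated with a small sketch. This is not the algorithm that DataWorks uses. The function name `substitute` and the HMAC-derived shift are assumptions made only for this example. The sketch shows two properties described above: the same input and feature value always produce the same output, and each character is replaced by a character of the same type from the configured set.

```python
import hashlib
import hmac


def substitute(value: str, feature: int, charset: str) -> str:
    """Replace each character found in `charset` with a character of the
    same type (digit, lowercase, uppercase) taken from `charset`.

    Hypothetical helper for illustration; not the DataWorks implementation.
    """
    if not 0 <= feature <= 9:
        raise ValueError("feature value must be between 0 and 9")
    # Group the configured characters by type so replacements stay in-type.
    unique = set(charset)
    groups = [
        sorted(c for c in unique if c.isdigit()),
        sorted(c for c in unique if c.islower()),
        sorted(c for c in unique if c.isupper()),
    ]
    out = []
    for pos, ch in enumerate(value):
        group = next((g for g in groups if ch in g), None)
        if group is None:
            out.append(ch)  # outside the character set: left unmasked
            continue
        # Assumption: derive a deterministic shift from the feature value
        # and the character position.
        digest = hmac.new(str(feature).encode(), str(pos).encode(),
                          hashlib.sha256).digest()
        shift = digest[0] % len(group)
        out.append(group[(group.index(ch) + shift) % len(group)])
    return "".join(out)


masked = substitute("a1-d3", 5, "0,1,2,3,a,b,c,d")
print(masked)  # same length; "-" is untouched; other characters stay in range
```

Because the shift depends only on the feature value and the position, repeated runs give identical results, which mirrors the statement that the same substitution feature value produces the same masked data.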
Masking
Masking conceals part of the information by replacing characters at specific positions with asterisks (*). When you use this method, you must select a masking mode. DataWorks provides several built-in masking modes and supports custom modes.
Parameter (select one)
Description
Recommended method
Select a recommended masking method from the drop-down list. The available masking methods vary depending on the field to be masked.
DataWorks provides three built-in masking methods: Show Only The First And Last Characters, Show Only The First Three And Last Two Characters, and Show Only The First Three And Last Four Characters. You can select a method from the drop-down list.
Custom
This option provides a more flexible way to configure masking. Configure the segments from left to right. For each segment, specify whether to mask it and how many characters it covers. You can add up to 10 segments. You must have at least one segment, and exactly one segment must be set to Remaining characters.
For example, mask the first 3 characters and do not mask the remaining characters.
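The custom segment configuration can be sketched as follows. This is an illustrative model, not DataWorks code. The function `mask_segments` and the `(mask, length)` tuple layout are hypothetical, and the sketch assumes that segments listed after the Remaining characters segment are counted from the right end of the value.

```python
from typing import List, Optional, Tuple

# (mask?, length); a length of None marks the "Remaining characters" segment.
Segment = Tuple[bool, Optional[int]]


def mask_segments(value: str, segments: List[Segment]) -> str:
    """Apply left-to-right masking segments to `value` (illustrative only)."""
    if not 1 <= len(segments) <= 10:
        raise ValueError("between 1 and 10 segments are allowed")
    rest = [i for i, (_, n) in enumerate(segments) if n is None]
    if len(rest) != 1:
        raise ValueError("exactly one segment must be 'Remaining characters'")
    split = rest[0]
    out = list(value)

    # Segments before the remaining one are measured from the left.
    left = 0
    for mask, n in segments[:split]:
        end = min(left + n, len(value))
        if mask:
            out[left:end] = "*" * (end - left)
        left = end

    # Assumption: segments after it are measured from the right.
    right = len(value)
    for mask, n in reversed(segments[split + 1:]):
        start = max(right - n, left)
        if mask:
            out[start:right] = "*" * (right - start)
        right = start

    # Whatever lies in between is the remaining segment.
    if segments[split][0]:
        out[left:right] = "*" * (right - left)
    return "".join(out)


# Mask the first 3 characters and leave the remaining characters unmasked.
print(mask_segments("13812345678", [(True, 3), (False, None)]))  # ***12345678
# Show only the first three and last four characters.
print(mask_segments("13812345678", [(False, 3), (True, None), (False, 4)]))  # 138****5678
```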
HASH encryption
When you use HASH encryption for data masking, you must configure the following parameters.
Parameter
Description
(Optional) Data watermark
A data watermark provides data traceability. If a data breach occurs, the watermark can help you locate the possible source of the leak. You can enable or disable Data watermark as needed.
Note: Only DataWorks Enterprise Edition and higher editions support the data watermark feature.
(Required) Encryption algorithm
Includes MD5, SHA256, SHA512, and SM3.
(Required) Salt value
Set the salt value for each encryption algorithm. The default value is 5. The valid values are 0 to 9.
Note: A salt is a specific string that is inserted into the data before it is hashed. In cryptography, salting is the process of inserting a specific string at a fixed position in a password. This makes the hashed result different from the result of hashing the original password.
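The effect of the salt value can be illustrated with Python's standard `hashlib` module. This is a minimal sketch, not the DataWorks implementation. The function name `hash_mask` is hypothetical, and the position of the salt (prepended to the value here) is an assumption because this topic does not specify it. SM3 is left out because its availability in `hashlib` depends on the underlying OpenSSL build.

```python
import hashlib


def hash_mask(value: str, algorithm: str = "sha256", salt: int = 5) -> str:
    """Return a salted hex digest of `value` (illustrative only)."""
    if algorithm not in {"md5", "sha256", "sha512"}:
        raise ValueError("unsupported algorithm in this sketch")
    if not 0 <= salt <= 9:
        raise ValueError("salt value must be between 0 and 9")
    # Assumption: the salt is prepended to the raw value before hashing.
    salted = f"{salt}{value}".encode("utf-8")
    return hashlib.new(algorithm, salted).hexdigest()


# The same raw value hashes differently under different salt values.
print(hash_mask("a123", "sha256", 5))
print(hash_mask("a123", "sha256", 6))
```

The output length depends only on the algorithm, not on the input: 64 hexadecimal characters for SHA256 and 128 for SHA512.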
Character substitution
Character substitution is a data masking rule that replaces characters at a specified position based on the selected substitution method. This rule has the following parameters.
Parameter
Description
(Required) Substitution position
From the drop-down list, you can select Replace all, Replace first 3, or Replace last 4. You can also select Custom to define a custom substitution position.
When you select Custom, you can define custom segments. You must define the segments in order from left to right and configure the number of digits to replace and the replacement method for each segment. You can add from 1 to 10 segments, and exactly one segment must be Remaining Digits.
(Required) Substitution method
Includes Random substitution, Sample value substitution, and Static value substitution.
Random substitution: Randomly replaces characters at the specified position. The number of characters remains the same after substitution.
Sample value substitution: You must select a sample library. The characters at the specified position are replaced with values from the selected sample library.
Static value substitution: In the Substitution value text box, enter the characters to use for substitution. The string can be 1 to 100 characters long and cannot contain null characters. The characters at the specified position are replaced with this substitution value.
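The difference between Random substitution and Static value substitution can be sketched as follows. Both helper functions are hypothetical and are not DataWorks APIs. The sketch assumes that random substitution draws from ASCII letters and digits, and that static value substitution replaces the whole selected span with the substitution value rather than repeating it per character.

```python
import random
import string


def random_substitute(value: str, start: int, end: int,
                      rng: random.Random) -> str:
    """Replace value[start:end] with random characters, keeping the length."""
    alphabet = string.ascii_letters + string.digits  # assumed alphabet
    span = "".join(rng.choice(alphabet) for _ in value[start:end])
    return value[:start] + span + value[end:]


def static_substitute(value: str, start: int, end: int,
                      replacement: str) -> str:
    """Replace value[start:end] with a fixed substitution value."""
    if not 1 <= len(replacement) <= 100:
        raise ValueError("substitution value must be 1 to 100 characters")
    return value[:start] + replacement + value[end:]


phone = "13812345678"
# "Replace first 3" with random characters: the length is unchanged.
print(random_substitute(phone, 0, 3, random.Random()))
# "Replace last 4" with a static value.
print(static_substitute(phone, len(phone) - 4, len(phone), "XXXX"))  # 1381234XXXX
```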
Range transformation
Range transformation applies only to numeric data. It masks data within a specified numeric range with a fixed value. You can add from 1 to 10 ranges.
Parameter
Description
Original value range [m,n)
The numeric range of the data before masking. The value must be greater than or equal to 0 and can have up to two decimal places.
Masked value
The value after masking. The value must be greater than or equal to 0 and can have up to two decimal places.
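Range transformation can be modeled as a lookup over half-open intervals. The sketch below is illustrative only. The function `range_transform` and the tuple layout are hypothetical, and returning values that fall outside every range unchanged is an assumption, because this topic does not describe that case.

```python
from decimal import Decimal
from typing import List, Tuple

# (m, n, masked value) for the half-open range [m, n).
Range = Tuple[Decimal, Decimal, Decimal]


def range_transform(value: str, ranges: List[Range]) -> Decimal:
    """Map a numeric value to the fixed masked value of its range."""
    if not 1 <= len(ranges) <= 10:
        raise ValueError("between 1 and 10 ranges are allowed")
    number = Decimal(value)
    for low, high, masked in ranges:
        if low <= number < high:  # lower bound inclusive, upper exclusive
            return masked
    return number  # assumption: values outside all ranges are not masked


rules = [
    (Decimal("0"), Decimal("30"), Decimal("15")),
    (Decimal("30"), Decimal("60"), Decimal("45")),
]
print(range_transform("25", rules))     # 15
print(range_transform("30", rules))     # 45
```

The second call shows the `[m,n)` boundary: 30 is excluded from the first range and included in the second.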
Rounding
Rounding applies only to numeric data.
Parameter
Description
Raw data type
Only numeric types are supported.
Decimal places to keep
You can keep 0 to 5 decimal places. The remaining part is rounded. For example, if the original value is 3.1415 and you keep two decimal places, the masked value is 3.14.
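The rounding example above can be reproduced with Python's `decimal` module. This is a minimal sketch with a hypothetical function name, `round_mask`. The rounding mode (half up) is an assumption, because this topic does not state which mode is used.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_mask(value: str, places: int) -> Decimal:
    """Round `value` to `places` decimal places (illustrative only)."""
    if not 0 <= places <= 5:
        raise ValueError("decimal places must be between 0 and 5")
    # Build the quantum, for example places=2 gives Decimal("0.01").
    quantum = Decimal(1).scaleb(-places)
    # Assumption: ties are rounded half up.
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


print(round_mask("3.1415", 2))  # 3.14
```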
Set to null
The Set to null masking method sets the corresponding sensitive field to an empty string.
Verify the masking result.
Enter sample raw data (up to 100 characters) in the Sample Data text box and click Verify. The masked data is displayed in the Masking Effect field.
Click Save or Save and Enable to create the data masking rule.
After you create the data masking rule:
In dynamic data masking scenarios, you can configure a whitelist for the data masking rule. Whitelisted users can view unmasked data within a specified time frame. For more information about how to add a whitelist, see Configure a data masking whitelist (for dynamic data masking only).
Data masking rules are disabled by default after they are created. You must manually enable a rule for it to be applied in the corresponding data masking scenario. For more information about how to change the status of a data masking rule, see Enable or disable a data masking rule.
Create a static data masking rule: Data integration static data masking scenario
On the Data Masking Management page, set the Data Masking Scenario to , and click + Data Masking Rule on the right.
Create a data masking rule.
In the Create Data Masking Rule dialog box, configure the parameters for the rule.

Select a sensitive data type and specify the rule name.
Parameter
Description
Sensitive data type
Select existing: Select an existing sensitive data type as needed. This includes built-in and custom sensitive data types.
Add type: Enter a name for the sensitive data type. The name must be unique and cannot be the same as an existing type.
Note: Built-in sensitive data types include the following: Phone Number, ID Card Number, Bank Card Number, Mailbox_Built-in, IP, License Plate Number, Postal Code, Landline Number, MAC address, Address, Name, Company Name, Ethnicity, Zodiac Sign, Gender, and Nationality.
Data masking rule name
By default, the rule name is the same as the Sensitive data type. You can also specify a custom name. The data masking rule name must be unique.
Configure the data masking method.
DataWorks supports three data masking methods: Pseudonymization, Hashing, and Masking. You can select a method based on your requirements.
Pseudonymization
Pseudonymization replaces a value with a masked value that has the same characteristics. The data format remains the same after masking. Only some existing fields support pseudonymization.
If the selected Sensitive data type is a built-in type (such as Phone Number, ID Card Number, Bank Card Number, Email Address, IP, License Plate Number, Postal Code, Landline Number, MAC address, Address, Name, or Company Name), you must configure a Security domain.
Security domain: The value must be an integer from 0 to 9. Different security domains use different masking policies. This means the same raw data produces different masked results in different security domains. For example, if the raw data is a123, setting the security domain to 0 masks it as b124, and setting it to 1 masks it as c234. If the security domain is the same, the same raw data always produces the same masked result.
If the selected Sensitive data type is not a built-in type, you must configure a Substitution character set.
Substitution character set: Replaces characters in sensitive data with other characters of the same type from this set. The set supports uppercase letters, lowercase letters, and digits. Use a comma (,) to separate multiple characters. Chinese characters are not supported. If a character in the sensitive data is of a type not represented in the substitution set, that character is not masked. For example, if the substitution set is defined with digits from 0 to 3 and letters from a to d, the masked sensitive data will also be composed of digits and letters from within this range.
Hashing
The Hashing method encrypts raw data into fixed-length data and requires you to select a Security domain.
Security domain: The value is an integer from 0 to 9. Different security domains have different masking policies. This means that for the same raw data, the masking result is consistent within the same security domain but varies across different security domains.
For example, if the raw data is `a123`:
If you set the security domain to 0, the data is masked as `b124`.
If you set the security domain to 1, the data is masked as `c234`.
Masking
Masking conceals information by replacing specific characters with asterisks (*). To use this method, you must select a masking mode. DataWorks provides several built-in masking modes and supports custom modes.
Recommended Method: For some fields, you can select a recommended masking method from the drop-down list. The available methods vary depending on the field. DataWorks provides three built-in masking methods: Show Only The First And Last Characters, Show Only The First Three And Last Two Characters, and Show Only The First Three And Last Four Characters. You can select the method that you need. For some fields, you can select only the default method.
Custom: This option provides a flexible way to configure masking. You must configure each segment from left to right, specifying whether to apply masking and the number of characters to mask or retain. You can add up to 10 segments. You must have at least one segment, and exactly one segment must be set to Remaining characters.
Example 1: Mask the first three characters and leave the remaining characters unmasked.
Example 2: Mask the last three characters and leave the remaining characters unmasked.
Verify the masking result.
In the Sample Data text box, enter sample raw data (0 to 100 characters). Click Verify Masking. The masked data appears in the Masking Effect field.
Click OK to create the data masking rule.
After you create the data masking rule:
Data masking rules are disabled by default after they are created. You must manually enable a rule for it to be applied in the corresponding data masking scenario. For more information about how to change the status of a data masking rule, see Enable or disable a data masking rule.
After you create a data integration masking rule, you can use it when you create a real-time synchronization task for a single table. For more information, see Configure data masking.
Configure a data masking whitelist (for dynamic data masking only)
In dynamic data masking scenarios, you can configure a whitelist of users for a data masking rule. After the rule is enabled, whitelisted users are not subject to the rule during a specified period and can view the raw, unmasked data.
Before you create a whitelist, you must add the users that you want to whitelist to a user group. For more information about how to configure a user group, see Configure a user group.
To add a whitelist:
On the Data Masking Management page, click the Whitelist Configuration tab.
In the upper-right corner, click Whitelist.
In the Create Whitelist dialog box, configure the required parameters.
Note: Whitelists are not supported in the Hologres engine-layer masking or Data integration static data masking scenarios.
After you set an effective period for a whitelist, sensitive data that meets the whitelist conditions is not masked during the specified period.

The following describes the parameter settings.
Parameter
Description
Sensitive field type
You can only select sensitive field types that are enabled in the current data masking scenario.
User group scope
Select a configured user group. You can select up to 50 user groups. After a user group is added to the whitelist, the accounts in the group can retrieve raw, unmasked data. For more information about how to configure a user group, see Configure a user group.
Effective period
Set the effective period for the whitelist as needed. You can choose a short-term or permanent period. Short-term options include 30 days, 90 days, 180 days, and 365 days. You can also set a custom period, which can start on the current day or a future date.
After you set the period, if a user queries sensitive information outside the effective period of the whitelist, the data will be masked.
Note: If you select a short-term period, data will not be masked from the current time until the specified number of days has passed.
Click Save to apply the whitelist.
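The effective-period behavior described in the parameter settings can be sketched as a simple time check. This is an illustrative model and not DataWorks code. The function `whitelist_applies` and its parameters are hypothetical, and `days=None` is used here to represent a permanent period.

```python
from datetime import datetime, timedelta
from typing import Optional


def whitelist_applies(query_time: datetime, start: datetime,
                      days: Optional[int]) -> bool:
    """Return True if a whitelisted user sees raw data at `query_time`."""
    if query_time < start:
        return False  # the whitelist has not taken effect yet
    if days is None:
        return True   # permanent period (representation assumed here)
    return query_time < start + timedelta(days=days)


start = datetime(2024, 1, 1)
print(whitelist_applies(datetime(2024, 1, 15), start, 30))  # True
print(whitelist_applies(datetime(2024, 2, 15), start, 30))  # False
```

A query that falls outside the period returns `False`, which corresponds to the data being masked again for that user.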
Enable or disable a data masking rule
On the Data Masking Rules page, you can set the status of a rule to Enabled or Disabled by clicking the Status switch.
You can then edit, delete, or view the details of the rule.
You cannot delete or edit a data masking rule that is enabled. You must first set the rule to Disabled. Before you do so, check whether the rule is used by any related tasks and contact the security administrator for confirmation.
When a rule is Disabled, you can edit or delete it. However, the Sensitive field type and Data masking rule name cannot be modified.
After you modify the rule, you can set its status to Enabled. Tasks that use this rule can then resume data masking.
Data masking rule application examples
After you create a data integration masking rule, you can use it when you create a real-time synchronization task for a single table. For more information, see Configure data masking.