Dynamic data masking lets you hide sensitive column values in query results without modifying the underlying stored data. If a MaxCompute project user has permission to query sensitive data but should not see the full values, configure underlying data masking to automatically mask those values at query time.
How it works
MaxCompute does not have a built-in dynamic data masking feature. Instead, it integrates with DataWorks Data Security Guard, which supplies the masking engine.
After you activate Data Security Guard and configure a data masking rule, the rule intercepts query results at the MaxCompute engine layer before they reach the client. The masked results are returned to the query tool — whether Java Database Connectivity (JDBC) or the local client (odpscmd) — while the original data in storage remains unchanged.
The following data types can be masked: phone numbers, ID card numbers, bank card numbers, license plate numbers, and IP addresses.
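To make the effect concrete, here is a minimal sketch of what query-time masking of a phone number might look like. It is illustrative only: the function name `mask_phone`, the sample value, and the keep-first-three/keep-last-four pattern are assumptions, not the algorithm Data Security Guard applies. The actual output depends on the masking rule you configure.

```python
def mask_phone(value: str, keep_head: int = 3, keep_tail: int = 4) -> str:
    """Return a masked copy of value; the input string itself is not modified."""
    if len(value) <= keep_head + keep_tail:
        # Too short to leave any characters visible: mask everything.
        return "*" * len(value)
    hidden = len(value) - keep_head - keep_tail
    # Slice from an explicit index so keep_tail=0 does not return the whole string.
    return value[:keep_head] + "*" * hidden + value[len(value) - keep_tail:]


stored = "13812345678"          # hypothetical value held in storage
returned = mask_phone(stored)   # what a query result might show
print(returned)                 # 138****5678
print(stored)                   # 13812345678 (the stored value is unchanged)
```

The point of the sketch is the separation the section describes: masking produces a new value for the result set and leaves the stored value as it was.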
When to use underlying masking vs. upper-layer masking
DataWorks provides two masking modes. Choose based on where in the stack you want masking to apply.
| | Underlying masking (MaxCompute engine layer) | Upper-layer masking |
|---|---|---|
| Where it applies | At the MaxCompute engine layer, before results reach the client | At the DataWorks interface layer (Data Development, Data Map) |
| Entry points covered | JDBC and odpscmd | DataWorks interface only |
| When it activates | When underlying data masking is enabled and rules are configured | When no underlying masking rule is active for the project |
| Best for | Enforcing masking across all query entry points | Masking only within the DataWorks interface |
If underlying data masking is enabled and rules are configured, the Underlying data masking for MaxCompute rule takes precedence. If underlying data masking is not enabled or no rules are configured, the Upper-layer masking scenario rule applies instead.
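The precedence rule can be restated as a small decision function. This is only a sketch of the logic described above; the function name and the returned strings are hypothetical and do not correspond to any DataWorks API.

```python
def effective_masking_mode(underlying_enabled: bool, rules_configured: bool) -> str:
    """Return which masking mode applies to a project, per the precedence rule."""
    # Underlying masking wins only when it is enabled AND has rules configured.
    if underlying_enabled and rules_configured:
        return "underlying"
    # Otherwise the upper-layer masking scenario rule applies.
    return "upper-layer"


print(effective_masking_mode(True, True))    # underlying
print(effective_masking_mode(True, False))   # upper-layer
print(effective_masking_mode(False, True))   # upper-layer
```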
Limitations
- Available in DataWorks Professional Edition and higher only. DataWorks Basic Edition is not supported. To upgrade, see Upgrade versions.
- Supported regions: China (Beijing), China (Shanghai), China (Hangzhou), China (Chengdu), China (Shenzhen), China (Beijing) Gov Cloud, China (Shanghai) Finance Cloud, China (Hong Kong), Singapore, Germany (Frankfurt), Malaysia (Kuala Lumpur), US (Silicon Valley), and Indonesia (Jakarta).
- Does not support masking primary key fields in MaxCompute tables.
- Only applies to MaxCompute projects that contain data created more than 24 hours ago.
- Underlying data masking for MaxCompute is supported only at the session level.
Prerequisites
Before you begin, make sure you have:
- A MaxCompute project with the data to be masked. See Create a MaxCompute project and Import data.
- Data Security Guard activated in DataWorks. See Go to Data Security Guard. On the Terms of Service page, select I have read and agree to the preceding terms, then click Activate Now.
Enable underlying data masking
The setup has four steps: select a masking scenario, create a data masking rule, optionally configure a whitelist, and enable masking for the target project.
Step 1: Select a masking scenario
- In the navigation pane, click Rule Configuration > Data Masking Management.
- In the Masking Scenario area, select Underlying masking scenario > MaxCompute engine layer masking_New.
To also see masking applied within the DataWorks interface (Data Development and Data Map), enable Data Development/Data Map Display Masking. For details on creating masking scenarios, see Create a data masking scenario.
Step 2: Create a data masking rule
Create a new data masking rule to define which sensitive fields to mask and how.
Step 3: (Optional) Configure a whitelist
A whitelist exempts specific users from masking rules during a defined effective period. Outside that period, queries run by whitelisted users still return masked results.
- On the Data Masking Management page, click Whitelist Configuration.
- In the upper-right corner, click +Whitelist.
- In the Create Whitelist dialog box, set Sensitive Field Type, User Group Scope, and Effective Period.
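The three whitelist settings combine into a single exemption check, which the following sketch models. Everything here is hypothetical: the `WhitelistEntry` class, the `is_exempt` function, and the choice to treat both ends of the effective period as inclusive are assumptions made for illustration, not the behavior of Data Security Guard.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class WhitelistEntry:
    sensitive_field_type: str   # mirrors the Sensitive Field Type setting
    users: frozenset            # stand-in for the User Group Scope setting
    start: datetime             # start of the Effective Period
    end: datetime               # end of the Effective Period


def is_exempt(entry: WhitelistEntry, user: str, field_type: str, at: datetime) -> bool:
    """Return True if the user may see unmasked values of field_type at time `at`."""
    return (
        field_type == entry.sensitive_field_type
        and user in entry.users
        and entry.start <= at <= entry.end  # inclusive bounds are an assumption
    )


entry = WhitelistEntry(
    sensitive_field_type="phone number",
    users=frozenset({"alice"}),
    start=datetime(2024, 1, 1),
    end=datetime(2024, 1, 31),
)
print(is_exempt(entry, "alice", "phone number", datetime(2024, 1, 15)))  # True
print(is_exempt(entry, "alice", "phone number", datetime(2024, 2, 1)))   # False
```

The second call shows the case called out above: a whitelisted user who queries outside the effective period is not exempt, so the results are masked.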
Step 4: Enable masking for the project
- Click MaxCompute engine layer masking_New to show all MaxCompute projects for which underlying data masking is enabled.
- Toggle the Status switch for the target project to enable the underlying data masking rule.
Verify masking results
After enabling masking, run a query to confirm that sensitive values are masked in the results.
The following example uses the odpscmd client in the China (Hangzhou) region.
Underlying data masking for MaxCompute is applied at the session level when querying through odpscmd.
- Run a query against a table that contains sensitive data, replacing <table_name> with the name of your table:
    select * from <table_name>;
- Review the query results. Sensitive field values should appear masked.
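If you know the unmasked values of a test row, the review can be done programmatically by comparing them with what the query returned. The helper below is a sketch under that assumption. The function name, column names, and sample rows are hypothetical, and the code does not connect to MaxCompute; it only compares two rows you supply.

```python
def unmasked_columns(known_row: dict, returned_row: dict, sensitive_columns: list) -> list:
    """Return the sensitive columns whose returned value still equals the known original."""
    return [
        col for col in sensitive_columns
        if returned_row.get(col) == known_row.get(col)
    ]


# Hypothetical test row: values you inserted, and values copied from the query output.
known = {"id": 1, "phone": "13812345678", "ip": "192.0.2.10"}
returned = {"id": 1, "phone": "138****5678", "ip": "192.0.2.10"}

leaks = unmasked_columns(known, returned, ["phone", "ip"])
print(leaks)  # ['ip']
```

An empty list means every listed column came back different from its stored value. A non-empty list points to columns to recheck against the masking rule, the whitelist, and the project's Status switch.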
