All Products
Search
Document Center

MaxCompute:Dynamic data masking

Last Updated:Dec 15, 2025

To allow specific users to view only masked sensitive data in a MaxCompute project, you can enable dynamic data masking. This feature prevents sensitive data leakage by hiding or replacing sensitive data in real time during data access or display. This topic uses examples to describe how to enable dynamic data masking in MaxCompute.

Feature description

MaxCompute provides a dynamic data masking feature to protect sensitive data, such as personally identifiable information (PII), by masking the data when it is displayed. This feature is applicable in scenarios such as business development testing, data sharing, and operations and maintenance (O&M). Dynamic data masking differs from Access Control List (ACL) column-level permissions. When you use ACL column-level permissions, users must modify their SQL queries to exclude columns that they do not have permission to access. In contrast, data masking policies do not require users to alter their existing SQL queries. When a user accesses data, the system automatically masks the data in the relevant columns based on the user's identity and role. This ensures that data remains masked during queries, downloads, associations, and user-defined function (UDF) computations, which prevents sensitive data leakage.

Data masking policies support various methods, such as masking, hashing, character replacement, numerical rounding, and date rounding, to meet various masking requirements for identity information, bank card numbers, addresses, and phone numbers. MaxCompute implements data masking at the point closest to data retrieval from storage, which ensures high performance and robust security.

image

Scope

  • Supported regions

    This feature is in public preview and is available in the following regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Ulanqab), China (Shenzhen), China (Chengdu), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Germany (Frankfurt), and US (Silicon Valley).

  • Supported driver versions

    MaxCompute connection method

    Driver version

    Support for data masking

    Java SDK

    0.48.0-public and later

    Support

    odpscmd

    0.47.1 and later

    Support

    JDBC

    3.4.3 and later

    Support

    MaxFrame

    Unlimited

    Support

    PyODPS

    Unlimited

    Support

    Go SDK

    Unlimited

    Support

  • Internal tables

    • MaxCompute supports data masking policies for internal tables, but not for foreign tables.

    • The dynamic data masking feature and row-level permissions feature are mutually exclusive. You cannot configure both a data masking policy and a row-level permission policy for the same table.

    • If you use a masking policy to mask Chinese characters, the character encoding must be UTF-8.

  • Views

    • Traditional views support data masking policies. The data masking policies for traditional views are synchronized with those of the source tables. If you attach or detach data masking policies from the source tables, the changes also take effect on the traditional views.

    • When you create a materialized view, the data masking policy of the source table is applied to the materialized view. If you attach a new masking policy to or detach an existing masking policy from the source table after the materialized view is created, the masking policy of the materialized view remains unchanged.

  • Data masking policies

    When users access sensitive data, if multiple data masking policies are in effect at the same time, the system masks the sensitive data based on the policy with the highest priority. For more information, see Priorities of predefined data masking policies.

Data masking flow

Project owners and users who are assigned the Super_Administrator or Admin role can manage data masking policies. When a user accesses a table that contains sensitive data, the system checks the masking policies that are associated with the user or role and returns either the masked data or the plaintext data.

image

Data masking commands

Enable or disable the data masking feature for a project

odps.data.masking.policy.enable is a project-level property that is used to enable or disable the data masking feature. Only the project owner or a user who is assigned the project-level Super_Administrator or Admin role can enable or disable this feature. For more information, see Assign a built-in management role to a user.

  • Enable the data masking feature for a project.

    setproject odps.data.masking.policy.enable=true;
  • Disable the data masking feature for a project.

    setproject odps.data.masking.policy.enable=false;

Create and delete a data masking policy

  • Command format

    • Create a data masking policy.

      CREATE DATA MASKING POLICY [IF NOT EXISTS] <policy_name> 
      TO { USER <user_list> | ROLE <role_list> | default } 
      USING <Predefined Masking Policy>;
    • Delete a data masking policy.

      DROP DATA MASKING POLICY <policy_name>;
  • Parameter descriptions

    Parameter

    Required

    Description

    policy_name

    Yes

    The name of the data masking policy. The policy name is case-insensitive and cannot contain special characters. It can contain only letters, digits, and underscores (_). We recommend that the name starts with a letter. The name cannot exceed 128 bytes in length.

    USER | ROLE | default

    Yes

    Select one of the three parameters.

    • USER: Masks data for specific users. For <user_list>, specify the names of the users. You can run the list users; command in MaxCompute to obtain role information.

    • ROLE: Masks data for specific roles. For <role_list>, specify the names of the roles. You can run the list roles; command in MaxCompute to obtain role information.

    • default: The default data masking policy. When a user or role accesses a sensitive column, if no data masking policy is matched, the default policy is used to mask the data in the sensitive column.

    Predefined Masking Policy

    Yes

    The predefined data masking policy. For more information, see Predefined data masking policies.

  • Examples

    • Example 1: Create an MD5 hash masking policy for the users userA, userB, and userC.

      CREATE data masking policy IF NOT EXISTS masking_test_001
      TO USER (userA, userB, userC)
      USING MASKED_MD5(0);
    • Example 2: Create an MD5 hash masking policy for the project developer and O&M roles.

      CREATE data masking policy IF NOT EXISTS masking_test_001
      TO ROLE (role_project_deploy, role_project_dev)
      USING MASKED_MD5(0);

Apply a data masking policy to a sensitive data column

  • Command format

    --Apply a data masking policy to a sensitive data column of a table.
    APPLY DATA MASKING POLICY <policy_name> BIND TO TABLE <table_name> COLUMN <column_name>;
    
    --Cancel a specified data masking policy for a sensitive data column of a table.
    APPLY DATA MASKING POLICY <policy_name> UNBIND FROM TABLE <table_name> COLUMN <column_name>;
    
    --Cancel all data masking policies for a sensitive data column of a table.
    APPLY DATA MASKING POLICY UNBIND ALL FROM TABLE <table_name> COLUMN <column_name>;
    
    --Cancel all data masking policies for all sensitive data columns of a table.
    APPLY DATA MASKING POLICY UNBIND ALL FROM TABLE <table_name>;
  • Parameter descriptions

    Parameter

    Required

    Description

    policy_name

    Yes

    The name of the data masking policy.

    table_name

    Yes

    The name of the table that contains sensitive data.

    column_name

    Yes

    The name of the column that contains sensitive data.

View data masking policies

  • Command format

    --View the creation information of a data masking policy.
    DESC DATA MASKING POLICY <policy_name>;
    
    --View the extended information about a table, including the information about sensitive data columns and applied data masking policies.
    DESC EXTENDED <table_name>;
    
    --Display the names of all data masking policies in the current project.
    LIST DATA MASKING POLICY;
    
    --Display the names of the data masking policies that are attached to a specified user.
    LIST DATA MASKING POLICY TO USER <user_name>;
    
    --Display the names of the data masking policies that are attached to a specified role.
    LIST DATA MASKING POLICY TO ROLE <role_name>;
    
    --Display the names of all data masking policies that are attached to a specified table.
    LIST DATA MASKING POLICY ON <table_name>;
    
    --Display the names of all data masking policies that are attached to a specified column of a specified table.
    LIST DATA MASKING POLICY ON <table_name> TO COLUMN <column_name>;
  • Parameter descriptions

    Parameter

    Required

    Description

    policy_name

    Yes

    The name of the data masking policy.

    table_name

    Yes

    The name of the table that contains sensitive data.

    column_name

    Yes

    The name of the column that contains sensitive data.

    user_name

    Yes

    The username.

    role_name

    Yes

    The name of the role.

Predefined data masking policies

Predefined data masking policies include masking, hashing, character replacement, and rounding. To protect data, you can select a predefined data masking policy based on the data type of the sensitive data.

Policy type

Policy name

Command format

Description

General

No masking

UNMASKED

Does not mask data.

Supported types: all types.

Set to null

MASKED_NULLIFY

Sets the value to NULL. After data masking, NULL is returned.

  • Supported types: all types.

  • Example

    --Before masking, data type is string
    +------------+
    | col_string |
    +------------+
    | Michael Johnson |
    +------------+
    
    --Data masking policy
    MASKED_NULLIFY
    
    --After masking
    +------------+
    | col_string | 
    +------------+
    | NULL       | 
    +------------+

Assign default value

MASKED_DV

Uses default values to mask data. For more information about the default value of each data type, see Default values of data types for the MASKED_DV policy.

  • Supported types: all types.

  • Example

    --Before masking, data type is timestamp
    +---------------+
    | col_timestamp |
    +---------------+
    | 2024-05-01 11:12:13 |
    +---------------+
    --Data masking policy
    MASKED_DV
    --After masking, the default time zone of the project is China Standard Time (UTC+8)
    +---------------+
    | col_timestamp | 
    +---------------+
    | 1970-01-01 08:00:00 | 
    +---------------+

Keep only year

MASKED_DATE_YEAR

Keeps only the year part of a time value. The non-year parts are set to January 1, 00:00 (UTC).

  • Supported types: DATE, DATETIME, TIMESTAMP_NTZ, TIMESTAMP.

  • Example

    --Before masking, data type is timestamp
    +---------------+
    | col_timestamp |
    +---------------+
    | 2024-05-01 11:12:13 |
    +---------------+
    --Data masking policy
    MASKED_DATE_YEAR
    --After masking, the default time zone of the project is China Standard Time (UTC+8)
    +---------------+
    | col_timestamp | 
    +---------------+
    | 2024-01-01 08:00:00 | 
    +---------------+

Rounding

MASKED_POINT_RESERVE(<num>)

Rounds the value and keeps 0 to 5 decimal places.

  • num: the number of decimal places to keep. The value of num is an integer from 0 to 5.

  • Supported types: DECIMAL, FLOAT, DOUBLE.

  • Example

    --Before masking, data type is float
    +-----------+
    | col_float |
    +-----------+
    | 1.12345   |
    +-----------+
    --Data masking policy, keep 2 decimal places
    MASKED_POINT_RESERVE(2)
    --After masking
    +------------+
    | col_float  | 
    +------------+
    | 1.12       | 
    +------------+

Mask

Mask start and end of string

MASKED_STRING_MASKED_BA(<before>, <after>)

Masks the start and end parts of a string with *, and displays the middle part in plaintext.

  • before: The first N characters of a string are replaced by asterisks (*). N is specified by the before parameter. The value of before is an integer greater than or equal to 0. If the value of before is 0, the start part of the string is not masked.

  • after: The end of the string is replaced by *. The mask length is specified by the after parameter. The value of after is an integer greater than or equal to 0.

  • Supported types: STRING, VARCHAR, CHAR, BINARY.

  • Example

    --Before masking, data type is string
    +------------+
    | col_string |
    +------------+
    | Michael Johnson |
    +------------+
    --Data masking policy, replace the first 3 and last 3 characters with asterisks, and display the middle part in plaintext
    MASKED_STRING_MASKED_BA(3, 3)
    --After masking
    +------------+
    | col_string | 
    +------------+
    | ***hael John*** | 
    +------------+

Mask middle of string

MASKED_STRING_UNMASKED_BA(<before>, <after>)

Displays the start and end parts of a string in plaintext and replaces the characters in the middle part with asterisks (*).

  • before: the number of plaintext characters at the start of the string. The value of before is an integer greater than or equal to 0.

  • after: the number of plaintext characters at the end of the string. The value of after is an integer greater than or equal to 0.

  • Supported types: STRING, VARCHAR, CHAR, BINARY.

  • Example

    --Before masking, data type is string
    +------------+
    | col_string |
    +------------+
    | Michael Johnson |
    +------------+
    --Data masking policy, keep the first and last characters in plaintext
    MASKED_STRING_UNMASKED_BA(1, 1)
    --After masking
    +------------+
    | col_string | 
    +------------+
    | M*************n | 
    +------------+

Hash

SHA256 hashing

MASKED_SHA256(<salt>)

Masks data using the SHA256 hash algorithm.

  • salt: a salt value. The value of num is an integer from 0 to 9.

  • Supported types: STRING, VARCHAR, CHAR, BINARY.

  • Example

    --Before masking, data type is string
    +------------+
    | col_string | 
    +------------+
    | 4562-1234-5678-9123 | 
    +------------+
    --Data masking policy
    MASKED_SHA256(0)
    --After masking
    +------------+
    | col_string | 
    +------------+
    | zwGMB1aCF1t705EfcwdDorql4MZb46XBqQJw/2RVx8U= | 
    +------------+

SHA512 hashing

MASKED_SHA512(<salt>)

Masks data using the SHA512 hash algorithm.

  • salt: a salt value. The value of num is an integer from 0 to 9.

  • Supported types: STRING, VARCHAR, CHAR, BINARY.

  • Example

    --Before masking, data type is string
    +------------+
    | col_string | 
    +------------+
    | 4562-1234-5678-9123 | 
    +------------+
    --Data masking policy
    MASKED_SHA512(0)
    --After masking
    +------------+
    | col_string | 
    +------------+
    | 3PPywfEIp08WuTUI8FZCCfdVuRu68wZTVwWWVAf4pboACUnH6w9kFMLpl2AARaGW/mvWvg26p0EIqmE0fAEiuA== | 
    +------------+

MD5 hashing

MASKED_MD5(<salt>)

Masks data using the MD5 hash algorithm.

  • salt: a salt value. The value of num is an integer from 0 to 9.

  • Supported types: STRING, VARCHAR, CHAR, BINARY.

  • Example

    --Before masking, data type is string
    +------------+
    | col_string | 
    +------------+
    | 4562-1234-5678-9123 | 
    +------------+
    --Data masking policy
    MASKED_MD5(0)
    --After masking
    +------------+
    | col_string | 
    +------------+
    | mK/o08tew5g7S3XV/BkFfw== | 
    +------------+

SM3 hashing

MASKED_SM3(<salt>)

Masks data using the SM3 hash algorithm.

  • salt: a salt value. The value of num is an integer from 0 to 9.

  • Supported types: STRING, VARCHAR, CHAR, BINARY.

  • Example

    --Before masking, data type is string
    +------------+
    | col_string | 
    +------------+
    | 4562-1234-5678-9123 | 
    +------------+
    --Data masking policy
    MASKED_SM3(0)
    --After masking
    +------------+
    | col_string | 
    +------------+
    | Q2TfwUh4B8QQH8jPL6DfdoGysx/CXBxn2T14dDwQtQw= | 
    +------------+

Character replacement

Random character replacement

MASKED_REPLACE_RANDOM(<position>)

Replaces data with random characters. The string length does not change after replacement. Random values include digits and letters.

  • position: the position for character replacement. The value is an integer.

    • position = 0: All data is replaced with random values.

    • position > 0: The first `position` characters are replaced with random values.

    • position < 0: The last `position` characters are replaced with random values.

  • Supported types: STRING, VARCHAR, CHAR, BINARY.

  • Example

    --Before masking, data type is string
    +------------+
    | col_string |
    +------------+
    | Michael Johnson |
    +------------+
    --Data masking policy, replace the first 7 characters of the string with random characters
    MASKED_REPLACE_RANDOM(7)
    --After masking
    +------------+
    | col_string | 
    +------------+
    | 4DlJQxi Johnson | 
    +------------+

Random character replacement at start and end

MASKED_REPLACE_RANDOM_BA(<before>, <after>)

Replaces the start and end of the data with random characters. The string length does not change after replacement. Random values include digits and letters.

  • before: The start of the data is replaced with random characters. The replacement length is specified by the before parameter. The value of before is an integer greater than or equal to 0. If the value of before is 0, the start is not replaced.

  • after: The end of the data is replaced with random characters. The replacement length is specified by the after parameter. The value of after is an integer greater than or equal to 0.

  • Supported types: STRING, VARCHAR, CHAR, BINARY.

  • Example

    --Before masking, data type is string
    +------------+
    | col_string |
    +------------+
    | Michael Johnson |
    +------------+
    --Data masking policy, replace the first 4 and last 4 characters of the string with random characters
    MASKED_REPLACE_RANDOM_BA(4, 4)
    --After masking
    +------------+
    | col_string | 
    +------------+
    | r0xEael JohnWNr | 
    +------------+

Fixed character replacement

MASKED_REPLACE_FIXED(<position>, <fixed_string>)

  • Replaces data with a fixed string.

  • position: the position for character replacement. The value is an integer.

    • position = 0: All data is replaced with the fixed_string.

    • position > 0: The first `position` characters are replaced with fixed_string.

    • position < 0: The last `-position` characters are replaced with fixed_string.

  • fixed_string: a fixed string. The length of the fixed string must be 100 characters or less. The fixed string cannot contain spaces.

  • Supported types: STRING, VARCHAR, CHAR, BINARY.

  • Example

    --Before masking, data type is string
    +------------+
    | col_string |
    +------------+
    | Michael Johnson |
    +------------+
    --Data masking policy, replace the first 7 characters of the string with the fixed string "Oli"
    MASKED_REPLACE_FIXED(7, "Oli")
    --After masking
    +------------+
    | col_string | 
    +------------+
    | Oli Johnson | 
    +------------+

Examples

Mask sensitive personal information

This example shows how to configure a data masking policy to mask sensitive personal information.

  1. Prepare data.

    Create a table for personal information and insert sensitive data.

    -- Create a table for sensitive information.
    CREATE TABLE if NOT EXISTS personal_info (
      id bigint COMMENT 'The unique ID of the user.',
      name string COMMENT 'The name of the user.',
      age int COMMENT 'The age of the user.',
      gender string COMMENT 'The gender of the user.',
      height float COMMENT 'The height of the user.',
      birthday date COMMENT 'The birth date of the user.',
      phone_number string COMMENT 'The phone number of the user.',
      email string COMMENT 'The email address of the user.',
      address string COMMENT 'The address of the user.',
      salary decimal(18, 2) COMMENT 'The salary of the user.',
      create_time timestamp COMMENT 'The time when the user information was created.',
      update_time timestamp COMMENT 'The time when the user information was updated.',
      is_deleted boolean COMMENT 'A flag that indicates whether the user information is deleted.'
    );
    -- Insert sensitive data.
    INSERT INTO personal_info VALUES
        (1, 'Zhang San', 18, 'Male', 178.56, '1990-01-01', '13800000000', 'zhangsan@example.com', 'Haidian District, Beijing', 5000.00, '2023-04-19 11:32:00', '2023-04-19 11:32:00', false),
        (2, 'Li Si', 20, 'Female', 162.70, '1992-02-02', '13900000000', 'lisi@example.com', 'Pudong New Area, Shanghai', 6000.00, '2023-04-19 11:32:00', '2023-04-19 11:32:00',false),
        (3, 'Wang Wu', 22, 'Male', 185.21, '1994-03-03', '14000000000', 'wangwu@example.com', 'Nanshan District, Shenzhen', 7000.00, '2023-04-19 11:32:00', '2023-04-19 11:32:00', false);
  2. Configure data masking policies.

    • Retain only the first character of the name, and replace the other characters with *.

      CREATE data masking policy IF NOT EXISTS masking_name
      TO USER (RAM$xxx@test.aliyunid.com:xxx)
      USING MASKED_STRING_UNMASKED_BA(1, 0);
      
      apply data masking policy masking_name bind TO
      TABLE personal_info COLUMN name;
    • Round the values in the height field.

      CREATE data masking policy IF NOT EXISTS masking_height
      TO USER (RAM$xxx@test.aliyunid.com:xxx)
      USING MASKED_POINT_RESERVE(0);
      
      apply data masking policy masking_height bind TO
      TABLE personal_info COLUMN height;
    • Keep only the year of the birth date.

      CREATE data masking policy IF NOT EXISTS masking_birthday
      TO USER (RAM$xxx@test.aliyunid.com:xxx)
      USING MASKED_DATE_YEAR;
      
      apply data masking policy masking_birthday bind TO
      TABLE personal_info COLUMN birthday;
    • Use the SM3 algorithm to hash phone numbers for default users.

      CREATE DATA MASKING POLICY default_sm3
      TO DEFAULT
      USING MASKED_SM3(0);
      
      apply data masking policy default_sm3 bind TO
      TABLE personal_info COLUMN phone_number;
  3. Use an account that is subject to the data masking policy to query the masked data.

    SELECT id, name, height, birthday, phone_number FROM personal_info;
    
    -- Before masking
    +------------+------+--------+----------+--------------+
    | id         | name | height | birthday | phone_number |
    +------------+------+--------+----------+--------------+
    | 1          | Zhang San | 178.56 | 1990-01-01 | 13800000000  |
    | 2          | Li Si | 162.7  | 1992-02-02 | 13900000000  |
    | 3          | Wang Wu | 185.21 | 1994-03-03 | 14000000000  |
    +------------+------+--------+----------+--------------+
    
    -- After masking
    +------------+------------+------------+------------+--------------+
    | id         | name       | height     | birthday   | phone_number | 
    +------------+------------+------------+------------+--------------+
    | 1          | Zhang*     | 179.0      | 1990-01-01 | lvYJaH4ElL2ilpQx/8tfMUw7xP22yblIgmfWp0/msUQ= | 
    | 2          | Li*        | 163.0      | 1992-01-01 | 9fFWacNSwCRZLAjMHqunlfwkqhTbP2ubuDOeOSh4N1c= | 
    | 3          | Wang*      | 185.0      | 1994-01-01 | k/0JoQCSarJg9ATJ5tyVnhQf1jIBxHXRbB+cvUm4OmE= | 
    +------------+------------+------------+------------+--------------+

Mask data for all users and roles by default

The following example shows that when a user or role matches multiple data masking policies, the policy with the highest priority takes effect.

Use the MASKED_SHA256(5) masking policy for default users.

CREATE DATA MASKING POLICY  default_hash_policy
TO DEFAULT
USING MASKED_SHA256(5);

Use the UNMASKED policy for special users A and B.

CREATE DATA MASKING POLICY  ab_unmask_policy
TO USER (A, B)
USING UNMASKED;

Result: Users A and B can access plaintext data. Other users can access only the data that is masked using the SHA256 hash algorithm.

Note

Users A and B hit both the MASKED_SHA256(5) and UNMASKED policies. The UNMASKED policy takes effect because it has a higher priority (for more information, see Priorities of predefined masking policies). Other users hit the MASKED_SHA256(5) policy.

Appendixes

Priorities of predefined masking policies

When a user accesses sensitive data, if multiple data masking policies are in effect, the policy with the higher priority is executed.

For example, if User A accesses the col_string column and matches two data masking policies, MASKED_REPLACE_RANDOM(3) with a priority of 3 and MASKED_SM3 with a priority of 4, the system executes the policy with the higher priority. In this case, MASKED_REPLACE_RANDOM(3) is used, and User A sees the data with random characters replaced.

Priority

Predefined data masking policy

0 (High)

UNMASKED

1

MASKED_POINT_RESERVE(num)

2

MASKED_DATE_YEAR

3

MASKED_STRING_MASKED_BA(before, after)

MASKED_STRING_UNMASKED_BA(before, after)

MASKED_REPLACE_RANDOM(position)

MASKED_REPLACE_RANDOM_BA(before, after)

MASKED_REPLACE_FIXED(position)

4

MASKED_SHA256

MASKED_SHA512

MASKED_MD5

MASKED_SM3

5

MASKED_DV

6 (Low)

MASKED_NULLIFY

Default values for the MASKED_DV data masking policy

Data type

Default value

bigint

0

double

0.0

decimal

0

string

""

datetime

DATETIME'1970-01-01 00:00:00' (UTC)

boolean

false

tinyint

0

smallint

0

int

0

binary

''

float

0.0

double

0.0

decimal

0

varchar(n)

""

char(n)

" " (n spaces)

date

DATE'1970-01-01'

timestamp

TIMESTAMP'1970-01-01 00:00:00' (UTC)

timestamp_ntz

TIMESTAMP'1970-01-01 00:00:00' (UTC)

array

{Default value of the child class}

map

{key:value} (key and value correspond to the default values of the child class)

json

""

struct

(Default value of the child class)