This topic describes the dynamic data masking feature of the PolarDB proxy.
Prerequisites
The version of the PolarDB proxy must be V2.4.12 or later. For more information about how to view and upgrade the version of the PolarDB proxy, see Upgrade the cluster version.
Data masking solutions
Data masking solution | Description | Advantage | Limits |
---|---|---|---|
Dynamic data masking | When your application initiates a data query request, the PolarDB proxy masks the queried sensitive data before it returns the data to the application. Before your application queries data, you need only to specify the database account and the name of the database, table, or column that requires data masking. | | Compared with mirror databases, production databases have lower query performance because the PolarDB proxy masks the sensitive data in the production databases in real time. |
Static data masking | The PolarDB proxy exports all data in a production database to a mirror database, and encrypts or masks the sensitive data during the export. | Your application queries data from mirror databases instead of production databases. In this case, data masking does not affect the services that require access to production databases. | |
How it works

Assume that the following data masking rule is configured:

- The data masking rule takes effect only when you use the `testAcc` account to query data from a database.
- The PolarDB proxy masks only the data that is queried in the `name` and `age` columns.

If your application uses the `testAcc` account to connect to a database and queries data in the `name`, `age`, and `hobby` columns of a table, the PolarDB proxy masks data in the `name` and `age` columns and returns the masked data together with the unmasked data in the `hobby` column.
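
The behavior described above can be sketched in a few lines of Python. This is only an illustration of the rule semantics, not the proxy's implementation: the `MaskingRule` class, the `apply_rule` function, the `***` placeholder, and the sample row are all hypothetical names and values introduced for this example.

```python
from dataclasses import dataclass

MASK = "***"  # placeholder only; the real masked value depends on the data type


@dataclass(frozen=True)
class MaskingRule:
    """Hypothetical rule shape: one account plus the column names to mask."""
    account: str
    columns: frozenset


def apply_rule(rule, account, column_names, rows):
    """Return the rows with rule-matched columns masked for the given account."""
    if account != rule.account:
        # The rule does not apply to this account: rows pass through unchanged.
        return [tuple(row) for row in rows]
    masked_idx = {i for i, name in enumerate(column_names) if name in rule.columns}
    return [
        tuple(MASK if i in masked_idx else value for i, value in enumerate(row))
        for row in rows
    ]


rule = MaskingRule(account="testAcc", columns=frozenset({"name", "age"}))
columns = ["name", "age", "hobby"]
rows = [("Alice", 30, "chess")]  # made-up sample row

print(apply_rule(rule, "testAcc", columns, rows))   # name and age masked
print(apply_rule(rule, "otherAcc", columns, rows))  # nothing masked
```

Only the `hobby` value is returned unchanged for `testAcc`, while a connection that uses any other account sees the raw row.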
The PolarDB proxy uses different methods to mask different types of data. The following table describes the data masking methods.
Data type | Data masking method | Example |
---|---|---|
Integer data types: TINYINT, SMALLINT, MEDIUMINT, INT, and BIGINT | The PolarDB proxy returns a random value in the format defined in the data type of the raw data. | |
Decimal data types: DECIMAL, FLOAT, and DOUBLE | The PolarDB proxy returns a random value in the format defined in the data type of the raw data. | |
Date and time data types: DATE, TIME, DATETIME, TIMESTAMP, and YEAR | The PolarDB proxy returns a random value in the format defined in the data type of the raw data. | |
Other data types | The PolarDB proxy replaces the data with asterisks (*). | |
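
As a rough illustration of the per-type methods in the table, the following Python sketch picks a masking method from the type of the value. It uses Python types as stand-ins for the SQL data types. The `mask_value` function, the value ranges of the random data, and the choice of one asterisk per character are assumptions made for this example; the source does not specify them.

```python
import datetime
import random


def mask_value(value, rng=random):
    """Mask one value based on its Python type (stand-in for the SQL type)."""
    if isinstance(value, int):
        # Integer types: random integer (range is an assumption).
        return rng.randint(0, 2**31 - 1)
    if isinstance(value, float):
        # Decimal types: random decimal value (range and scale are assumptions).
        return round(rng.uniform(0, 1000), 2)
    if isinstance(value, datetime.datetime):
        # Checked before date because datetime is a subclass of date.
        return datetime.datetime(2000, 1, 1) + datetime.timedelta(
            seconds=rng.randrange(10**9)
        )
    if isinstance(value, datetime.date):
        return datetime.date(2000, 1, 1) + datetime.timedelta(
            days=rng.randrange(10000)
        )
    if isinstance(value, datetime.time):
        return datetime.time(rng.randrange(24), rng.randrange(60), rng.randrange(60))
    # Other types: replace the data with asterisks (length is an assumption).
    return "*" * len(str(value))


print(mask_value(42))
print(mask_value(3.14))
print(mask_value(datetime.date(2020, 1, 1)))
print(mask_value("Alice"))
```

Numeric and date values keep their type so that the application still receives data in the expected format, while all other values lose their content entirely.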
Additional considerations
- The dynamic data masking feature applies only to cluster endpoints. Cluster endpoints consist of the default cluster endpoint and custom cluster endpoints. If you use the primary endpoint to connect to a database and query data from the database, the dynamic data masking feature does not take effect. For more information about how to view a cluster endpoint, see View an endpoint.
- If query results contain data that must be masked and the size of a single row exceeds 16 MB, the query session is closed.

  For example, you want to query data in the `name` and `description` columns of the `person` table. In this table, the sensitive data in the `name` column must be masked. The size of the data in a row of the `description` column exceeds 16 MB. In this case, when you execute the `SELECT name, description FROM person` statement, the query session is closed.
- If a column in which you want to mask the sensitive data is used as the value of an input parameter in a function, data masking does not take effect.

  For example, a data masking rule is created to mask the sensitive data in the `name` column. When you execute the `SELECT CONCAT(name, '') FROM person` statement, your application can still read the raw values of the `name` column.
- If a column in which you want to mask the sensitive data is used together with the UNION operator, data masking may not take effect.

  For example, a data masking rule is created to mask the sensitive data in the `name` column. When you execute the `SELECT hobby FROM person UNION SELECT name FROM person` statement, your application can still read the raw values of the `name` column.
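
One plausible way to picture the function and UNION cases is to assume that the rule is matched against the column names reported in the result set. That matching logic is an assumption made for this sketch, not a documented description of the proxy, and the `masked_columns` function is hypothetical. The labels themselves follow standard MySQL behavior: an unaliased expression is labeled with its expression text, and a UNION result takes its column names from the first SELECT statement.

```python
def masked_columns(result_labels, rule_columns):
    """Return the result-set labels that a name-based rule would match."""
    return [label for label in result_labels if label in rule_columns]


rule_columns = {"name"}

# Result-set labels as MySQL reports them for each statement.
plain = ["name", "description"]      # SELECT name, description FROM person
via_function = ["CONCAT(name, '')"]  # SELECT CONCAT(name, '') FROM person
via_union = ["hobby"]                # SELECT hobby FROM person UNION SELECT name FROM person

print(masked_columns(plain, rule_columns))
print(masked_columns(via_function, rule_columns))
print(masked_columns(via_union, rule_columns))
```

Under this assumption, only the plain query exposes a column labeled `name`, which is consistent with the raw values remaining readable in the other two statements.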
Enable the dynamic data masking feature
For more information, see Manage data masking rules.
Appendix: Impacts on cluster performance
The dynamic data masking feature affects the performance of clusters in the following scenarios.
Account included in the data masking rule | Query hits the data masking rule | Impact on performance |
---|---|---|
No | No | Data masking does not take effect on queries made by your account. The performance of your cluster is not affected. |
Yes | No | The PolarDB proxy analyzes only the column definition data in the result set and does not mask the raw data in the query results. This results in a performance overhead of approximately 6%: after the dynamic data masking feature is enabled, the read-only QPS decreases by approximately 6%. |
Yes | Yes | The PolarDB proxy analyzes the column definition data in the result set and masks the raw data in the query results. In this case, the performance overhead depends on the size of the result set. A larger number of rows in the query results causes greater performance overhead. If the query returns a single row, the performance overhead is approximately 6%. |
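
To make the 6% figure concrete, the following sketch estimates read-only QPS from a baseline. The `estimated_read_qps` function and the baseline of 10,000 QPS are hypothetical. The estimate covers only the cases for which the table gives a number: queries from an account that is included in a rule, where the result set is a single row or the rule is not hit. For larger masked result sets, the table gives no figure, so treat the result as an upper bound.

```python
def estimated_read_qps(baseline_qps, account_in_rule, overhead_percent=6):
    """Estimate read-only QPS after dynamic data masking is enabled."""
    if not account_in_rule:
        # Accounts outside every rule are not affected.
        return float(baseline_qps)
    # Roughly 6% overhead for column-definition analysis or single-row masking.
    return baseline_qps * (100 - overhead_percent) / 100


print(estimated_read_qps(10000, account_in_rule=False))
print(estimated_read_qps(10000, account_in_rule=True))
```

With a baseline of 10,000 read-only QPS, an account that is included in a rule would be expected to reach about 9,400 QPS in these cases.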