Data Security Guard is a service in DataWorks that ensures data security. It can be used to identify and mask sensitive data, add watermarks to data, manage data permissions, identify data risks, and trace leak sources. This topic describes how to activate and use Data Security Guard.

Limits

In Data Security Guard, the sensitive data identification and dynamic data masking features work only with the following compute engines: E-MapReduce (EMR), MaxCompute, Cloudera's Distribution including Apache Hadoop (CDH), and Hologres. The following limits apply when you use Data Security Guard with an EMR compute engine:
  • You can use the data masking feature only when you preview data in DataMap. The data masking feature is not supported in DataStudio or DataAnalysis. The sensitive data identification and data masking features are supported only for specific types of EMR clusters and EMR tables. The following table lists the details.
    Note In the table, Supported indicates that the data preview feature is supported, and Not supported indicates that the data preview feature is not supported.
    EMR cluster type  | Metadata storage type     | Data storage type: OSS | Data storage type: OSS-HDFS | Data storage type: HDFS
    DataLake clusters | Data Lake Formation (DLF) | Supported              | Supported                   | Not supported
    DataLake clusters | RDS instance              | Supported              | Supported                   | Supported
    DataLake clusters | MySQL                     | Supported              | Supported                   | Supported
    Custom clusters   | DLF                       | Supported              | Supported                   | Not supported
    Custom clusters   | RDS instance              | Supported              | Supported                   | Supported
    Custom clusters   | MySQL                     | Supported              | Supported                   | Supported
    Other clusters    | -                         | -                      | Not supported
    Note The sensitive data identification and data masking features are available only in the following regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), China (Hong Kong), and Germany (Frankfurt).
  • If you want to use Data Security Guard in an EMR cluster, you must upgrade your exclusive resource groups for scheduling. To request an upgrade, join the DataWorks DingTalk group and contact technical support.
  • By default, Data Security Guard uses an Alibaba Cloud account for data sampling. If LDAP authentication is enabled for your EMR cluster and Ranger or DLF-Auth is used to manage table permissions, you must configure mappings between the Alibaba Cloud account and the cluster account. This ensures that the Alibaba Cloud account has the required permissions to access tables in the EMR cluster. For more information, see Configure mappings between workspace members and cluster accounts.
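
The EMR support matrix and the region restriction above can be encoded as a small lookup, which may be convenient for checking a cluster configuration before you rely on these features. The following is a minimal sketch only: the names SUPPORTED_REGIONS, SUPPORT_MATRIX, and is_supported, as well as the short labels used for cluster and storage types, are hypothetical and are not part of any DataWorks API. Clusters that are not listed in the matrix are treated as not supported.

```python
# Illustrative sketch only: all names here are hypothetical, not a DataWorks API.

# Regions in which sensitive data identification and data masking are available.
SUPPORTED_REGIONS = {
    "China (Hangzhou)", "China (Shanghai)", "China (Beijing)",
    "China (Zhangjiakou)", "China (Shenzhen)", "China (Chengdu)",
    "China (Hong Kong)", "Germany (Frankfurt)",
}

ALL_STORAGE = {"OSS", "OSS-HDFS", "HDFS"}
NO_HDFS = {"OSS", "OSS-HDFS"}

# (EMR cluster type, metadata storage type) -> supported data storage types.
SUPPORT_MATRIX = {
    ("DataLake", "DLF"): NO_HDFS,
    ("DataLake", "RDS instance"): ALL_STORAGE,
    ("DataLake", "MySQL"): ALL_STORAGE,
    ("Custom", "DLF"): NO_HDFS,
    ("Custom", "RDS instance"): ALL_STORAGE,
    ("Custom", "MySQL"): ALL_STORAGE,
}


def is_supported(cluster_type, metadata_storage, data_storage, region):
    """Return True if the combination appears as Supported in the table."""
    if region not in SUPPORTED_REGIONS:
        return False
    # Any cluster type that is not in the matrix counts as "Other clusters".
    allowed = SUPPORT_MATRIX.get((cluster_type, metadata_storage), set())
    return data_storage in allowed
```

For example, is_supported("DataLake", "DLF", "HDFS", "China (Hangzhou)") evaluates to False, because DataLake clusters that use DLF metadata do not support HDFS data storage.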

Go to the Data Security Guard page

  1. Log on to the DataWorks console.
  2. In the left-side navigation pane, click Workspaces.
  3. In the top navigation bar, select the region where your workspace resides. On the Workspaces page, find your workspace and click DataStudio in the Actions column.
  4. Click the icon in the upper-left corner and choose All Products > Data Governance > Data Security Guard.
  5. Click Try now to go to the Data Security Guard page.
    Note
    • If you have activated Data Security Guard by using your Alibaba Cloud account, the Data Security Guard homepage appears.
    • If you have not activated Data Security Guard by using your Alibaba Cloud account, the page for activating Data Security Guard appears.

Activate Data Security Guard

Log on with your Alibaba Cloud account. On the Terms of Service page, select I have read and agree to all the preceding terms and click Activate.

Important You must use an Alibaba Cloud account to activate Data Security Guard.

Use Data Security Guard

After you activate Data Security Guard, you can use the service. The following table describes the elements of the Data Security Guard page.
No. | GUI element | Description
1 | More icon | Provides access to the services that you can use, such as DataStudio, Data Integration, Operation Center, and Data Security Guard.
2 | User information | The user who is logged on. You can view and modify the user information, including the email address, mobile phone number, AccessKey ID, and AccessKey secret.
3 | Left-side navigation pane | The navigation pane for the features of Data Security Guard. For more information about the features, see Identify sensitive data, Create a data masking rule, View data activities, View data risks (old version), and Trace leak sources.
4 | Data Security Guard homepage | Displays the following sections:
  • Data Recognition: displays the total number of fields in a selected project that triggered the configured data identification rules in the last seven days.
  • Data Activities: displays the number of data access activities detected on each day in a specified time range for the fields that trigger the configured data identification rules.
    Note The supported time ranges are the last seven days and the last 30 days.

    You can click View Details to go to the Data Activities page.

  • Data Risks: displays the number of data risks that were identified in the last seven days and the number of data risks that have not been handled.
  • Data traceability: allows you to upload files that contain leaked data, and extracts the watermarks from the leaked data to find the users who may have leaked the data.
5 | Switch to the guide page | Click Guide in the upper-right corner to go to the service guide page and view the service information.
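
To make the Data Activities numbers more concrete, the per-day count over a seven-day or 30-day range can be sketched as follows. This is an illustration under stated assumptions, not how Data Security Guard computes the values: the function daily_activity_counts and its parameters are hypothetical, the input is assumed to be a list of access dates for fields that already matched an identification rule, and the range is assumed to include the current day.

```python
from collections import Counter
from datetime import date, timedelta

# The two time ranges that the homepage supports.
ALLOWED_WINDOWS = (7, 30)


def daily_activity_counts(access_dates, today, window_days=7):
    """Count access activities per day for the last `window_days` days.

    Assumption: the window ends on `today` and includes it.
    """
    if window_days not in ALLOWED_WINDOWS:
        raise ValueError("window_days must be 7 or 30")
    start = today - timedelta(days=window_days - 1)
    counts = Counter(d for d in access_dates if start <= d <= today)
    # Report every day in the window, including days with no activity.
    days = (start + timedelta(days=i) for i in range(window_days))
    return {day: counts.get(day, 0) for day in days}
```

Days without any detected activity are reported as 0 so that a chart built from the result has one point for each day in the range.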