Data Security Guard overview and workflow

Workflow

Data Security Guard follows a three-phase workflow: preparation, active protection, and post-event audit.

Step 1: Preparation

Classify and grade data assets and configure detection rules before sensitive data is generated.

Operation	Description	References
Configure data classification and grading	Classify data into sensitivity levels based on value, content sensitivity, and impact scope. Management principles and development requirements vary by level. DataWorks provides built-in templates. You can also create custom classification and grading names.	Configure sensitive data classification and categorization
Configure sensitive data detection rules	Define sensitive column types and configure rules based on data source and purpose. Content matching a rule is flagged as sensitive. Supported detection methods: Data content detection: Uses built-in rules, custom models, sample libraries, or regular expressions. Metadata detection: Matches column names and comments using wildcards, prefixes, suffixes, or containment patterns. Combined detection: Combines multiple conditions with AND/OR operators.	Configure data detection rules and run detection tasks; Detection with a custom model; Identify data with a sample library
Other configurations	System settings: Configure watermark tracing periods, tag classification results to MaxCompute column labels, specify email addresses and webhook URLs for receiving alert notifications on detection results, and enable real-time detection for unidentified columns. User group settings: Batch-add accounts with identical permissions to a group. You can then allowlist the group to let members access unmasked data.	System configuration; Configure user groups
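The three detection methods in the table above can be pictured with a small sketch. This is not the DataWorks rule engine or its API. All function names (`name_wildcard`, `content_regex`, `all_of`, and so on) and the `min_ratio` hit threshold are hypothetical and exist only to illustrate how metadata conditions and content conditions might be combined with AND/OR.

```python
import re
from fnmatch import fnmatchcase
from typing import Callable, Sequence

# A condition receives (column_name, column_comment, sampled_values)
# and returns True when the column matches.
Condition = Callable[[str, str, Sequence[str]], bool]

def name_wildcard(pattern: str) -> Condition:
    """Metadata detection: wildcard match on the column name."""
    return lambda name, comment, samples: fnmatchcase(name.lower(), pattern.lower())

def name_prefix(prefix: str) -> Condition:
    """Metadata detection: column name starts with the prefix."""
    return lambda name, comment, samples: name.lower().startswith(prefix.lower())

def name_suffix(suffix: str) -> Condition:
    """Metadata detection: column name ends with the suffix."""
    return lambda name, comment, samples: name.lower().endswith(suffix.lower())

def comment_contains(text: str) -> Condition:
    """Metadata detection: column comment contains the text."""
    return lambda name, comment, samples: text.lower() in comment.lower()

def content_regex(pattern: str, min_ratio: float = 0.8) -> Condition:
    """Data content detection: enough sampled values fully match the regex.

    min_ratio is an assumed threshold, not a documented DataWorks setting.
    """
    compiled = re.compile(pattern)

    def check(name: str, comment: str, samples: Sequence[str]) -> bool:
        if not samples:
            return False
        hits = sum(1 for value in samples if compiled.fullmatch(value))
        return hits / len(samples) >= min_ratio

    return check

def all_of(*conditions: Condition) -> Condition:
    """Combined detection: AND."""
    return lambda name, comment, samples: all(c(name, comment, samples) for c in conditions)

def any_of(*conditions: Condition) -> Condition:
    """Combined detection: OR."""
    return lambda name, comment, samples: any(c(name, comment, samples) for c in conditions)

# Example rule: (name or comment looks like an email column) AND (content looks like emails).
email_rule = all_of(
    any_of(name_wildcard("*email*"), name_suffix("_mail"), comment_contains("e-mail")),
    content_regex(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
)
```

Requiring both a metadata hit and a content hit, as `email_rule` does, is one way to reduce false positives from columns that merely have a suggestive name.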

Step 2: Active protection

After rules are enabled, DataWorks automatically detects matching sensitive data. View results in the Data Security Guard modules below.

Operation	Description	References
Data masking management	Configure masking rules for detected sensitive data. Masking policies vary by sensitivity level. Masking types: Dynamic masking: Masks data at query time on the results page. Static masking: Masks and stores data in a specified location. Methods include format-preserving encryption, redaction, hashing, character replacement, interval transformation, rounding, and nullification. To grant access to raw data, configure an allowlist. Select the masking type and method that fit your requirements.	Create a data masking rule
Risk identification management	Built-in risk rules take effect immediately. You can also create custom rules with threshold comparisons such as data volume and frequency. Active rules automatically detect risky operations and send alerts.	Risk identification management
Risk monitoring and handling	View detected risk details and mark them as risk-free or handled.	View data risks
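To make the masking methods and the allowlist behavior above more concrete, here is a minimal sketch. It is not how DataWorks implements masking. The helper names, default parameters, and the `query_view` function are hypothetical, and SHA-256 is used only as an example hash. Format-preserving encryption is omitted because it needs a dedicated cryptographic library.

```python
import hashlib
from typing import Any, Callable, Dict, Set

def redact(value: str, keep_start: int = 3, keep_end: int = 4, mask_char: str = "*") -> str:
    """Character replacement: keep both ends and replace the middle."""
    if len(value) <= keep_start + keep_end:
        return mask_char * len(value)
    middle = mask_char * (len(value) - keep_start - keep_end)
    return value[:keep_start] + middle + value[len(value) - keep_end:]

def hash_mask(value: str, salt: str = "") -> str:
    """Hashing: replace the value with a salted SHA-256 hex digest."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()

def interval_mask(value: float, width: int = 10) -> str:
    """Interval transformation: report only the bucket the value falls in."""
    lower = int(value // width) * width
    return f"[{lower}, {lower + width})"

def round_mask(value: float, digits: int = -2) -> float:
    """Rounding: drop precision (digits=-2 rounds to the nearest hundred)."""
    return round(value, digits)

def nullify(value: Any) -> None:
    """Nullification: discard the value entirely."""
    return None

def query_view(
    row: Dict[str, Any],
    rules: Dict[str, Callable[[Any], Any]],
    user: str,
    allowlist: Set[str],
) -> Dict[str, Any]:
    """Dynamic masking sketch: mask at read time unless the user is allowlisted."""
    if user in allowlist:
        return dict(row)
    return {col: rules[col](val) if col in rules else val for col, val in row.items()}

# Usage: the stored row is unchanged; only the query result differs per user.
row = {"name": "Alice", "phone": "13812345678", "age": 37}
rules = {"phone": redact, "age": interval_mask}
masked = query_view(row, rules, user="analyst", allowlist={"auditor"})
raw = query_view(row, rules, user="auditor", allowlist={"auditor"})
```

In this sketch, dynamic masking leaves the stored data untouched and transforms it per query, whereas static masking would apply the same functions once and write the masked copy to a separate location.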

Step 3: Post-event audit and tracing

Process sensitive data and perform security management based on risk monitoring results.

Operation	Description	References
Data operation audit	Data Security Guard logs all sensitive data operations, including IP addresses, database users, timestamps, and lineage information. You can manually correct misidentified sensitive data.	
Data watermark tracing	Extract watermark information from leaked data files to trace the source of a data leak.	Data traceability
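As an illustration of how audit records like those described above might be used during an investigation, the following sketch filters operation logs for accesses to one sensitive column within a time window. The `AuditRecord` structure and its field names are assumptions based only on the attributes listed above (IP address, database user, timestamp, lineage). They do not reflect the actual log schema or any Data Security Guard API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Tuple

@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical shape of one logged operation on sensitive data."""
    timestamp: datetime
    ip_address: str
    db_user: str
    table: str
    columns: Tuple[str, ...]
    upstream_tables: Tuple[str, ...] = ()  # simplified lineage information

def find_accesses(
    records: Iterable[AuditRecord],
    table: str,
    column: str,
    start: datetime,
    end: datetime,
) -> List[AuditRecord]:
    """Return operations that touched table.column in the window [start, end)."""
    return [
        r for r in records
        if r.table == table and column in r.columns and start <= r.timestamp < end
    ]

# Usage with made-up sample records.
records = [
    AuditRecord(datetime(2024, 5, 1, 9, 0), "10.0.0.5", "etl_job", "users", ("phone", "name")),
    AuditRecord(datetime(2024, 5, 1, 23, 30), "10.0.0.9", "analyst_a", "users", ("phone",)),
    AuditRecord(datetime(2024, 5, 2, 8, 0), "10.0.0.9", "analyst_a", "orders", ("amount",)),
]
hits = find_accesses(
    records, "users", "phone",
    start=datetime(2024, 5, 1, 18, 0),
    end=datetime(2024, 5, 2, 0, 0),
)
users_involved = sorted({r.db_user for r in hits})
```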

Limitations

Edition limits

Data Security Guard requires DataWorks Standard Edition or higher. For information about how to activate DataWorks, see Purchase and activate DataWorks. Supported features vary by edition, as listed in the Comparison of DataWorks editions.

Permission limits

Only Alibaba Cloud accounts or RAM users with one of the following permissions can enable Data Security Guard:

Tenant administrator
Security administrator (tenant level)
RAM users with the AdministratorAccess or AliyunDataWorksFullAccess role. For more information, see Grant permissions to a RAM user.

Note

Tenant administrators and tenant-level security administrators can use all Data Security Guard features.
Workspace-level security administrators can only access features for their assigned workspaces. Grant additional workspace permissions through Manage members and roles.
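The scoping described in the note can be summarized as a small check. This is only a sketch of the stated rule, not DataWorks authorization code. The role identifiers and the `Principal` type are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Hypothetical role identifiers for roles that apply to the whole tenant.
TENANT_WIDE_ROLES = frozenset({"tenant_admin", "tenant_security_admin"})

@dataclass(frozen=True)
class Principal:
    """A user with tenant-level roles and per-workspace security admin grants."""
    tenant_roles: FrozenSet[str] = frozenset()
    security_admin_workspaces: FrozenSet[str] = frozenset()

def can_use_features(principal: Principal, workspace: str) -> bool:
    """Tenant-wide roles cover every workspace; workspace roles cover only their own."""
    if principal.tenant_roles & TENANT_WIDE_ROLES:
        return True
    return workspace in principal.security_admin_workspaces

# Usage
tenant_admin = Principal(tenant_roles=frozenset({"tenant_admin"}))
ws_admin = Principal(security_admin_workspaces=frozenset({"ws_finance"}))
```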

Feature usage

For EMR, MaxCompute, and Hologres engines, only data detection and dynamic masking are supported.

EMR engine limits:

Sensitive data detection and masking support only the following EMR cluster types and storage types:

Note

In the following table, each cell indicates whether the feature is supported or not supported for that combination of cluster type, metadata storage type, and data storage type.

EMR cluster type	Metadata storage type	Data storage type: OSS	Data storage type: OSS-HDFS	Data storage type: HDFS
New DataLake cluster	Data Lake Formation (DLF)
	RDS instance
	MySQL
Custom cluster	Data Lake Formation (DLF)
	RDS instance
	MySQL
Other clusters	--

Note

This feature is currently available only in the following regions: China (Hangzhou), China (Shanghai), China East 2 Finance, China (Beijing), China North 2 Finance, China (Zhangjiakou), China (Ulanqab), China (Shenzhen), China South 1 Finance, China (Chengdu), China North 2 Ali Gov 1, China (Hong Kong), US (Silicon Valley), Singapore, Malaysia (Kuala Lumpur), Germany (Frankfurt), Indonesia (Jakarta), SAU (Riyadh - Partner Region), Thailand (Bangkok), and US (Virginia).

To use Data Security Guard with EMR clusters, you must upgrade the exclusive resource group for scheduling. You can join the DataWorks DingTalk group and contact technical support to request an upgrade.
Data Security Guard uses the Alibaba Cloud account for data sampling by default. If your cluster uses LDAP authentication with Ranger or DLF-Auth for table permissions, configure account mapping and ensure that the mapped account can access EMR tables. For more information, see Configure access identity mapping.

Access Data Security Guard

Log on to the DataWorks console. In the target region, click Data Governance > Security Center in the left-side navigation pane. On the page that appears, click Go to Security Center.
In the left-side navigation pane, click Data Security > Sensitive Data Management and then click Try Now to access Data Security Guard.
Note
- If your Alibaba Cloud account is already authorized, you are directed to the Data Security Guard homepage.
- If your Alibaba Cloud account is not authorized, you are redirected to the Data Security Guard authorization page. To use Data Security Guard features for the first time, go to Data Security > Sensitive Data Management, select Data Security Guard in the pop-up dialog, and then complete the authorization.