Configure Risk Detection Rules to Protect Data Security - DataWorks

Risk identification rules use multi-dimensional association analysis to proactively detect risky operations on sensitive data — such as bulk queries outside business hours or mass data exports. DataWorks includes built-in rules for the most common risk scenarios that you can enable immediately, and supports custom rules for organization-specific needs.

Limitations

Limitation	Details
Edition requirement	Risk identification rules require DataWorks Professional Edition or later. Built-in rules are only available in Enterprise Edition.
Alerting methods	Only Email and WebHook alerting are supported. WebHook supports DingTalk groups, WeCom, and Lark. Pushing alerts to WeCom or Lark requires Enterprise Edition.

How risk identification rules work

All risk identification rules use statistical association rules: DataWorks aggregates events and compares the count against a threshold within a time window. A risk is triggered only when the threshold is exceeded — for example, a rule can be configured to detect a risk if a low-privilege user accesses more than 10,000 sensitive data entries outside of work hours.
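
The threshold-within-a-window idea can be sketched in a few lines of Python. This is only an illustration, not DataWorks' implementation: the `WindowedCounter` class is hypothetical, and it assumes a sliding window, whereas the text does not say whether DataWorks uses sliding or fixed windows.

```python
from collections import deque
from datetime import datetime, timedelta


class WindowedCounter:
    """Hypothetical helper: sums the entries touched by events in a sliding window."""

    def __init__(self, window: timedelta, threshold: int) -> None:
        self.window = window
        self.threshold = threshold
        self._events = deque()  # (timestamp, entries) pairs, oldest first
        self._total = 0

    def record(self, when: datetime, entries: int) -> bool:
        """Add one event; return True if the windowed total now exceeds the threshold."""
        self._events.append((when, entries))
        self._total += entries
        # Drop events that have fallen out of the window ending at `when`.
        cutoff = when - self.window
        while self._events and self._events[0][0] <= cutoff:
            _, old_entries = self._events.popleft()
            self._total -= old_entries
        # A risk is raised only when the threshold is exceeded, not merely reached.
        return self._total > self.threshold
```

With a 10-minute window and a threshold of 10,000 entries, two events of 6,000 and 4,000 entries reach the threshold without exceeding it, so no risk is raised until one more entry is accessed inside the same window.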

Each rule can combine up to 10 detection conditions across the following dimensions. All conditions within a rule use AND logic.

Detection dimension	What it scopes
Data location	The MaxCompute engine, project, and tables where the operation occurs
Data properties	The classification level, category, or sensitive field type of the data
User information	A specific user group, RAM role, or username
Operation time	The day of the week and time range when the operation occurs
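
As a rough model of how conditions combine, the sketch below treats each detection condition as a predicate over an event and requires all of them to hold. The `Event` fields, the `Condition` alias, and the `matches` function are hypothetical names chosen for illustration; only the AND logic and the limit of 10 conditions come from the text.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable


@dataclass(frozen=True)
class Event:
    """Hypothetical record of one operation on sensitive data."""
    project: str
    table: str
    sensitivity_level: int
    username: str
    when: datetime


Condition = Callable[[Event], bool]
MAX_CONDITIONS = 10  # per-rule limit stated in the text


def matches(event: Event, conditions: list[Condition]) -> bool:
    """Return True only if every condition holds (AND logic)."""
    if len(conditions) > MAX_CONDITIONS:
        raise ValueError(f"a rule supports at most {MAX_CONDITIONS} conditions")
    return all(condition(event) for condition in conditions)
```

For example, a rule scoped to one project, a minimum sensitivity level, and weekends would pass three predicates; adding a fourth predicate that the event fails makes the whole rule stop matching.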

Navigate to risk identification rules

Log on to the DataWorks console. In the top navigation bar, select the target region. In the left-side navigation pane, choose Data Development and O&M > Data Development. Select the target workspace and click Go to Data Development.
Click the icon in the upper-left corner. Choose All Products > Data Governance > Data Security Guard, then click Try Now.
If your Alibaba Cloud account already has the required permissions, you are taken directly to the Data Security Guard homepage. If not, you are redirected to an authorization page and must obtain the required permissions before proceeding.
In the left-side navigation pane, choose Rule Configuration > Risk Identification Rules.

Built-in risk identification rules

Enterprise Edition includes five built-in rules covering the most common sensitive data risk scenarios. These rules are ready to use without configuration.

Rule name	Type	Level	When it triggers
Querying large volumes of sensitive data outside of work hours	Data access risk	Low	A query returns more than 10,000 entries during off-hours (Mon–Fri: 19:00–24:00; Sat–Sun: 00:00–24:00)
Similar SQL queries	Data access risk	Low	Five or more similar SQL queries run within 10 minutes
Batch querying large volumes of sensitive data	Data access risk	Medium	A single query returns more than 10,000 entries
Batch exporting large volumes of sensitive data	Data export risk	High	A single export contains more than 10,000 entries
Exporting large volumes of sensitive data outside of work hours	Data export risk	High	An export contains more than 10,000 entries during off-hours (Mon–Fri: 22:00–24:00; Sat–Sun: 00:00–24:00)
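
To make the off-hours windows in the table concrete, here is a minimal Python sketch. The function names are hypothetical; the hours and the 10,000-entry limit are taken from the table, with the weekday start hour passed as a parameter because the two off-hours rules use different values (19:00 for queries, 22:00 for exports).

```python
from datetime import datetime


def is_off_hours(when: datetime, weekday_start_hour: int) -> bool:
    """Sat-Sun count as off-hours all day; Mon-Fri from weekday_start_hour to 24:00."""
    if when.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return when.hour >= weekday_start_hour


def off_hours_query_risk(when: datetime, entries: int) -> bool:
    """Mirrors the off-hours query rule: more than 10,000 entries after 19:00."""
    return entries > 10_000 and is_off_hours(when, weekday_start_hour=19)


def off_hours_export_risk(when: datetime, entries: int) -> bool:
    """Mirrors the off-hours export rule: more than 10,000 entries after 22:00."""
    return entries > 10_000 and is_off_hours(when, weekday_start_hour=22)
```

Under this reading, a 20,000-entry operation at 21:00 on a Wednesday would count as an off-hours query but not as an off-hours export.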

Create a risk identification rule

Before creating a rule, define what your rule needs to detect:

What operation triggers the risk (data access, export, deletion, update, table operations, or authorization)
Where the risky operation occurs (which engine, project, or tables)
Who performs the operation (a specific user group, RAM role, or username)
When the operation occurs (specific days or hours)
How much constitutes a risk (data volume or frequency threshold)

Step 1: Prepare detection condition prerequisites

Some detection dimensions require you to configure dependent resources in advance. Check the following table and complete any setup that applies to your rule.

Detection dimension	Subcategory	What to configure in advance
Data properties	Data classification level	Configure sensitive data classification and levels
Data properties	Data category	Configure sensitive data detection rules and run detection tasks
Data properties	Sensitive field type	Configure sensitive data detection rules and run detection tasks
User information	User group	Configure user groups
User information	RAM role	Create a RAM user

Step 2: Set basic information

In the upper-right corner of the Risk Identification Rules page, click + Risk Identification Rules. In the Create Risk Identification Rule dialog box, configure the following parameters.

Parameter	Description
Rule Name (required)	A name for the rule. Must be 1–30 characters and cannot contain special characters.
Rule Type (required)	The type of operation to monitor. Valid values: Data Access, Data Export, Data Deletion, Data Update, Library Table Operations, Data Authorization.
Rule Level (required)	The severity level: Low, Medium, or High. Set to High for rules covering critical or highly sensitive data.
Description (optional)	A description of 1–100 characters.

Click Next.

Step 3: Configure detection conditions and thresholds

Detection conditions

Add one or more conditions to scope the rule. Click + Add Comparison Relationship within a dimension to add multiple conditions. All conditions use AND logic, and up to 10 conditions can be added per rule.

Data location — Specify the engine, project, and tables to monitor.

Parameter	Description	Required
Filter selected location	= detects risks only in the selected location. ≠ excludes the selected location from detection.	Yes
Compute engine name	Currently, only MaxCompute is supported. Each comparison supports one engine. Click + Add Comparison to specify multiple engines.	Yes
Project name	The project within the selected engine. The drop-down list shows up to 100 projects and supports fuzzy matching. Each comparison supports one project. Click + Add Comparison to specify multiple projects.	Yes
Table name	One or more table names, separated by commas. Each name can be up to 30 characters; all names combined cannot exceed 100 characters. The wildcard `*` is supported (for example, `*name` matches all tables ending in `name`). If left blank, the rule applies to all tables in the selected project.	No
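
The Table name constraints can be checked before a rule is saved. The sketch below is illustrative only and makes several assumptions: the wildcard is `*` and matches any run of characters, the 100-character limit counts the names without the separating commas, and matching is case-sensitive. The function names are hypothetical, and Python's `fnmatchcase` also treats `?` and `[...]` as pattern characters, which may not match the console's behavior.

```python
from fnmatch import fnmatchcase

MAX_NAME_LEN = 30    # per-name limit from the table
MAX_TOTAL_LEN = 100  # combined limit from the table


def parse_table_names(raw: str) -> list[str]:
    """Split a comma-separated Table name value and enforce the length limits."""
    names = [part.strip() for part in raw.split(",") if part.strip()]
    for name in names:
        if len(name) > MAX_NAME_LEN:
            raise ValueError(f"table name longer than {MAX_NAME_LEN} characters: {name}")
    # Assumption: the combined limit excludes the comma separators.
    if sum(len(name) for name in names) > MAX_TOTAL_LEN:
        raise ValueError(f"table names exceed {MAX_TOTAL_LEN} characters in total")
    return names


def table_in_scope(table: str, patterns: list[str]) -> bool:
    """An empty pattern list means the rule covers every table in the project."""
    if not patterns:
        return True
    return any(fnmatchcase(table, pattern) for pattern in patterns)
```

With the value `*name, orders`, the tables `customer_name` and `orders` are in scope, while `name_history` is not.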

Data properties — Specify which data properties to target.

Parameter	Description
Property Type	The property to filter on: Data classification level, Data category, or Sensitive field type. Each property type must be configured in advance. See Step 1.
Filter selected property	= detects risks only for data with the selected property. ≠ excludes data with the selected property.

User information — Scope the rule to specific users or groups.

Parameter	Description
Information category	The user scope to target: User group, RAM role, or Username.
Filter selected user information	= detects risks only for the selected user. ≠ excludes the selected user from detection.

Operation time — Restrict detection to specific time ranges.

Parameter	Description
Select time range	Select a day of the week and an hour range to define the time range. Ranges are precise to the hour. You can add multiple time ranges, but they must not overlap: for example, if Monday is selected in one condition, it cannot be selected in another.
Filter selected time	= detects risks only during the selected time. ≠ excludes the selected time from detection.
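
The operation time settings can be modeled as follows. This is a sketch under stated assumptions rather than the console's actual validation: `TimeRange` and both functions are hypothetical, and the overlap check works at the level of whole days because the Monday example above suggests that a day cannot appear in two conditions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TimeRange:
    """Hypothetical time-range condition with hourly precision."""
    days: frozenset[int]  # 0 = Monday ... 6 = Sunday, as in datetime.weekday()
    start_hour: int       # inclusive
    end_hour: int         # exclusive, at most 24

    def contains(self, when: datetime) -> bool:
        return when.weekday() in self.days and self.start_hour <= when.hour < self.end_hour


def validate_no_shared_days(ranges: list[TimeRange]) -> None:
    """Reject configurations where the same day appears in two time ranges."""
    seen: set[int] = set()
    for time_range in ranges:
        clash = seen & time_range.days
        if clash:
            raise ValueError(f"day(s) {sorted(clash)} selected in more than one time range")
        seen |= time_range.days


def time_condition_holds(when: datetime, ranges: list[TimeRange], equals: bool = True) -> bool:
    """equals=True models the = filter; equals=False models the ≠ filter."""
    inside = any(time_range.contains(when) for time_range in ranges)
    return inside if equals else not inside
```

A weekday range of 19:00 to 24:00 plus a full-day weekend range passes validation, while adding a second range that also includes Monday would be rejected.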

Thresholds

Thresholds define when a risk is triggered. Click + Add Threshold Comparison to add multiple threshold conditions.

Threshold category	Triggers when	Range	Default
Single data volume	A single operation's data volume exceeds the threshold	1–10,000,000 entries	1
Cumulative occurrences	The same event occurs more than the threshold count within the time window	1–10,000 times	10
Cumulative data volume	The total data volume of operations within the time window exceeds the threshold. DataWorks automatically categorizes and tracks individual events.	1–10,000,000 entries	1

Time window — The period over which events are counted. Default: 10 minutes. Valid ranges: 1–59 minutes, 1–23 hours, or 1–7 days. Required only when Threshold category is set to Cumulative occurrences.
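
The valid ranges above lend themselves to a small validation helper. The following Python sketch is an assumption-laden illustration: the category keys, `make_window`, and `check_threshold` are hypothetical names, and only the numeric ranges and the 10-minute default come from the text.

```python
from datetime import timedelta

# Ranges taken from the threshold table and the time window description.
THRESHOLD_RANGES = {
    "single_data_volume": (1, 10_000_000),
    "cumulative_occurrences": (1, 10_000),
    "cumulative_data_volume": (1, 10_000_000),
}
WINDOW_LIMITS = {"minutes": (1, 59), "hours": (1, 23), "days": (1, 7)}


def make_window(value: int = 10, unit: str = "minutes") -> timedelta:
    """Build a time window, defaulting to 10 minutes, and reject out-of-range values."""
    if unit not in WINDOW_LIMITS:
        raise ValueError(f"unsupported unit: {unit}")
    low, high = WINDOW_LIMITS[unit]
    if not low <= value <= high:
        raise ValueError(f"{unit} must be between {low} and {high}")
    return timedelta(**{unit: value})


def check_threshold(category: str, value: int) -> int:
    """Return the threshold unchanged if it lies within the range for its category."""
    if category not in THRESHOLD_RANGES:
        raise ValueError(f"unknown threshold category: {category}")
    low, high = THRESHOLD_RANGES[category]
    if not low <= value <= high:
        raise ValueError(f"{category} must be between {low} and {high}")
    return value
```

Note that 60 minutes is not a valid window in this model; the equivalent setting is 1 hour.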

Click Next.

Step 4: Configure alerting

Select an alerting method to receive notifications when a risk is detected. Available options: Email and WebHook.

Configure email and WebHook settings in System Settings before selecting an alerting method here.

Click Save. The rule is created.

Important

Custom rules are disabled by default after creation. On the Risk Identification Rules page, click Re-enable next to the rule to activate it.

Manage risk identification rules

The Risk Identification Rules page lists all rules with their type, level, and status.

Area	Available operations
Filter bar	Filter rules by Risk Type, Risk Level, Built-in or Not Built-in, or Risk Rule Name. Name search supports fuzzy matching.
Rule list	View basic information: risk type, level, status, and detection statistics (Risks Hit, Pending Risks, and Handled Risks). View and edit details: click View Details to see the full rule configuration and make changes. Re-enable a rule: click the re-enable icon to activate a rule; this operation is available only for disabled rules.
Batch operations	Select multiple rules and perform Batch Enable, Batch Disable, or Batch Delete. Built-in rules cannot be deleted. Custom rules can be deleted only when in the Disabled state.
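
The deletion constraints can be expressed as a simple eligibility check. This sketch uses a hypothetical `Rule` record and assumes that a batch delete skips ineligible rules rather than rejecting the whole selection; the text does not say which of the two the console does.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical in-memory view of a risk identification rule."""
    name: str
    built_in: bool
    enabled: bool


def can_delete(rule: Rule) -> bool:
    """Built-in rules are never deletable; custom rules only while disabled."""
    return not rule.built_in and not rule.enabled


def split_for_batch_delete(rules: list[Rule]) -> tuple[list[Rule], list[Rule]]:
    """Partition a selection into deletable rules and rules that must be skipped."""
    deletable = [rule for rule in rules if can_delete(rule)]
    skipped = [rule for rule in rules if not can_delete(rule)]
    return deletable, skipped
```

In practice this means an enabled custom rule has to be disabled first before it can be removed.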

What's next

After a rule is enabled, go to the Data Risks page to view detected risks and take action. For details, see View data risks.