Risk identification rules use multi-dimensional association analysis to proactively detect risky operations on sensitive data — such as bulk queries outside business hours or mass data exports. DataWorks includes built-in rules for the most common risk scenarios that you can enable immediately, and supports custom rules for organization-specific needs.
Limitations
| Limitation | Details |
|---|---|
| Edition requirement | Risk identification rules require DataWorks Professional Edition or later. Built-in rules are only available in Enterprise Edition. |
| Alerting methods | Only Email and WebHook alerting are supported. WebHook supports DingTalk groups, WeCom, and Lark. Pushing alerts to WeCom or Lark requires Enterprise Edition. |
How risk identification rules work
All risk identification rules are statistical association rules: DataWorks aggregates the events that match a rule within a time window and compares the resulting count or data volume against a threshold. A risk is triggered only when the threshold is exceeded. For example, a rule can be configured to detect a risk if a low-privilege user accesses more than 10,000 sensitive data entries outside of work hours.
Each rule can combine up to 10 detection conditions across the following dimensions. All conditions within a rule use AND logic.
| Detection dimension | What it scopes |
|---|---|
| Data location | The MaxCompute engine, project, and tables where the operation occurs |
| Data properties | The classification level, category, or sensitive field type of the data |
| User information | A specific user group, RAM role, or username |
| Operation time | The day of the week and time range when the operation occurs |
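The evaluation model described above can be sketched in a few lines of Python. This is an illustrative sketch only, not DataWorks code: the `Event` fields, the `matches_all` and `risk_triggered` helpers, and the sample conditions are all hypothetical, and the events passed in are assumed to already fall inside the rule's time window.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List

MAX_CONDITIONS = 10  # per-rule limit stated in this document


@dataclass
class Event:
    """Hypothetical audit event; field names are illustrative only."""
    user: str
    project: str
    entries: int  # sensitive data entries touched by the operation
    timestamp: datetime


Condition = Callable[[Event], bool]


def matches_all(event: Event, conditions: List[Condition]) -> bool:
    """AND logic: the event is in scope only if every condition holds."""
    return all(cond(event) for cond in conditions)


def risk_triggered(events: List[Event], conditions: List[Condition],
                   threshold_entries: int) -> bool:
    """Sum entries over in-scope events and compare with the threshold."""
    if len(conditions) > MAX_CONDITIONS:
        raise ValueError("a rule can combine at most 10 detection conditions")
    total = sum(e.entries for e in events if matches_all(e, conditions))
    return total > threshold_entries


# Example: a hypothetical low-privilege user querying after 19:00.
conditions = [
    lambda e: e.user == "intern",
    lambda e: e.timestamp.hour >= 19,
]
events = [
    Event("intern", "sales_prj", 6000, datetime(2024, 1, 1, 20, 0)),
    Event("intern", "sales_prj", 5000, datetime(2024, 1, 1, 21, 0)),
    Event("admin", "sales_prj", 50000, datetime(2024, 1, 1, 20, 0)),
]
print(risk_triggered(events, conditions, threshold_entries=10_000))
```

In this example only the two `intern` events satisfy both conditions. Their combined 11,000 entries exceed the 10,000 threshold, so the call prints `True`; the `admin` event is out of scope and does not count.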
Navigate to risk identification rules
Log on to the DataWorks console. In the top navigation bar, select the target region. In the left-side navigation pane, choose Data Development and O&M > Data Development. Select the target workspace and click Go to Data Development.
Click the icon in the upper-left corner. Choose All Products > Data Governance > Data Security Guard, then click Try Now. If your Alibaba Cloud account already has the required permissions, you are taken directly to the Data Security Guard homepage. If not, you are redirected to an authorization page and must obtain the required permissions before proceeding.
In the left-side navigation pane, choose Rule Configuration > Risk Identification Rules.
Built-in risk identification rules
Enterprise Edition includes five built-in rules covering the most common sensitive data risk scenarios. These rules are ready to use without configuration.
| Rule name | Type | Level | When it triggers |
|---|---|---|---|
| Querying large volumes of sensitive data outside of work hours | Data access risk | Low | A query returns more than 10,000 entries during off-hours (Mon–Fri: 19:00–24:00; Sat–Sun: 00:00–24:00) |
| Similar SQL queries | Data access risk | Low | Five or more similar SQL queries run within 10 minutes |
| Batch querying large volumes of sensitive data | Data access risk | Medium | A single query returns more than 10,000 entries |
| Batch exporting large volumes of sensitive data | Data export risk | High | A single export contains more than 10,000 entries |
| Exporting large volumes of sensitive data outside of work hours | Data export risk | High | An export contains more than 10,000 entries during off-hours (Mon–Fri: 22:00–24:00; Sat–Sun: 00:00–24:00) |
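The off-hours windows in the table can be expressed as a small predicate. The sketch below is a hypothetical helper, not part of DataWorks; because the two off-hours rules list different weekday start times (19:00 and 22:00), the start hour is a parameter rather than a fixed value.

```python
from datetime import datetime


def is_off_hours(ts: datetime, weekday_start_hour: int = 19) -> bool:
    """Return True if ts falls inside the off-hours window.

    Saturday and Sunday count as off-hours all day (00:00-24:00).
    Monday to Friday count from weekday_start_hour until midnight.
    """
    if ts.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return ts.hour >= weekday_start_hour


# 2024-01-01 is a Monday, 2024-01-06 is a Saturday.
print(is_off_hours(datetime(2024, 1, 1, 20, 0)))       # query rule window
print(is_off_hours(datetime(2024, 1, 1, 20, 0), 22))   # export rule window
print(is_off_hours(datetime(2024, 1, 6, 9, 0)))
```

A Monday 20:00 operation is off-hours under the 19:00 window but not under the 22:00 window, while any Saturday operation is off-hours under both.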
Create a risk identification rule
Before creating a rule, define what your rule needs to detect:
What operation triggers the risk (data access, export, deletion, update, table operations, or authorization)
Where the risky operation occurs (which engine, project, or tables)
Who performs the operation (a specific user group, RAM role, or username)
When the operation occurs (specific days or hours)
How much constitutes a risk (data volume or frequency threshold)
Step 1: Prepare detection condition prerequisites
Some detection dimensions require you to configure dependent resources in advance. Check the following table and complete any setup that applies to your rule.
| Detection dimension | Subcategory | What to configure in advance |
|---|---|---|
| Data properties | Data classification level | Configure sensitive data classification and levels |
| Data properties | Data category | Configure sensitive data detection rules and run detection tasks |
| Data properties | Sensitive field type | Configure sensitive data detection rules and run detection tasks |
| User information | User group | Configure user groups |
| User information | RAM role | Create a RAM user |
Step 2: Set basic information
In the upper-right corner of the Risk Identification Rules page, click + Risk Identification Rules. In the Create Risk Identification Rule dialog box, configure the following parameters.

| Parameter | Description |
|---|---|
| Rule Name (required) | A name for the rule. Must be 1–30 characters and cannot contain special characters. |
| Rule Type (required) | The type of operation to monitor. Valid values: Data Access, Data Export, Data Deletion, Data Update, Library Table Operations, Data Authorization. |
| Rule Level (required) | The severity level: Low, Medium, or High. Set to High for rules covering critical or highly sensitive data. |
| Description (optional) | A description of 1–100 characters. |
Click Next.
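If rule definitions are drafted or reviewed outside the console, the constraints in the table above can be checked up front. The following is a minimal sketch: `validate_basic_info` is a hypothetical helper, and because this document does not define "special characters", the sketch assumes they are anything other than letters, digits, underscores, and spaces.

```python
from typing import List

# Valid values taken from the parameter table in this document.
RULE_TYPES = {
    "Data Access", "Data Export", "Data Deletion",
    "Data Update", "Library Table Operations", "Data Authorization",
}
RULE_LEVELS = {"Low", "Medium", "High"}


def validate_basic_info(name: str, rule_type: str, level: str,
                        description: str = "") -> List[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    errors = []
    if not 1 <= len(name) <= 30:
        errors.append("Rule Name must be 1-30 characters")
    # Assumption: only letters, digits, underscores, and spaces are allowed.
    if any(not (ch.isalnum() or ch in "_ ") for ch in name):
        errors.append("Rule Name cannot contain special characters")
    if rule_type not in RULE_TYPES:
        errors.append("Unknown Rule Type")
    if level not in RULE_LEVELS:
        errors.append("Rule Level must be Low, Medium, or High")
    if len(description) > 100:  # the description is optional
        errors.append("Description must be at most 100 characters")
    return errors


print(validate_basic_info("Off hours export", "Data Export", "High"))
print(validate_basic_info("bad@name", "Data Export", "Critical"))
```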
Step 3: Configure detection conditions and thresholds
Detection conditions
Add one or more conditions to scope the rule. Click + Add Comparison Relationship within a dimension to add multiple conditions. All conditions use AND logic, and up to 10 conditions can be added per rule.
Data location — Specify the engine, project, and tables to monitor.
| Parameter | Description | Required |
|---|---|---|
| Filter selected location | = detects risks only in the selected location. ≠ excludes the selected location from detection. | Yes |
| Compute engine name | Currently, only MaxCompute is supported. Each comparison supports one engine. Click + Add Comparison to specify multiple engines. | Yes |
| Project name | The project within the selected engine. The drop-down list shows up to 100 projects and supports fuzzy matching. Each comparison supports one project. Click + Add Comparison to specify multiple projects. | Yes |
| Table name | One or more table names, separated by commas. Each name can be up to 30 characters; all names combined cannot exceed 100 characters. The wildcard * is supported (for example, *name matches all tables ending in name). If left blank, the rule applies to all tables in the selected project. | No |
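The Table name behavior can be illustrated with a short sketch. The helpers below are hypothetical and make two assumptions that this document does not state: matching is case-sensitive, and the 100-character combined limit is measured on the whole comma-separated string. Only `*` is treated as a wildcard, as described in the table.

```python
import re
from typing import List


def parse_table_names(raw: str) -> List[str]:
    """Split a comma-separated Table name value and check the length limits."""
    if len(raw) > 100:  # assumption: limit applies to the whole field
        raise ValueError("all names combined cannot exceed 100 characters")
    names = [part.strip() for part in raw.split(",") if part.strip()]
    for name in names:
        if len(name) > 30:
            raise ValueError(f"table name longer than 30 characters: {name}")
    return names


def table_in_scope(table: str, patterns: List[str]) -> bool:
    """Return True if the table matches any pattern; blank means all tables."""
    if not patterns:
        return True
    for pattern in patterns:
        # Escape everything except '*', which becomes "match anything".
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        if re.fullmatch(regex, table):
            return True
    return False


patterns = parse_table_names("orders, *name")
print(table_in_scope("user_name", patterns))   # ends in "name"
print(table_in_scope("name_index", patterns))  # does not end in "name"
print(table_in_scope("anything", []))          # blank filter: all tables
```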
Data properties — Specify which data properties to target.
| Parameter | Description |
|---|---|
| Property Type | The property to filter on: Data classification level, Data category, or Sensitive field type. Each property type must be configured in advance. See Step 1. |
| Filter selected property | = detects risks only for data with the selected property. ≠ excludes data with the selected property. |
User information — Scope the rule to specific users or groups.
| Parameter | Description |
|---|---|
| Information category | The user scope to target: User group, RAM role, or Username. |
| Filter selected user information | = detects risks only for the selected user. ≠ excludes the selected user from detection. |
Operation time — Restrict detection to specific time ranges.
| Parameter | Description |
|---|---|
| Select time range | Click a day of the week and an hour to define the time range. Precision is to the hour. Multiple time ranges can be added, but they must not overlap — for example, if Monday is selected in one condition, Monday cannot be selected in another condition. |
| Filter selected time | = detects risks only during the selected time. ≠ excludes the selected time from detection. |
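One way to read the non-overlap constraint, following the Monday example in the table, is that no day of the week may appear in more than one Operation time condition. The sketch below checks a draft set of conditions under that reading; `find_day_conflicts` and the day abbreviations are hypothetical, and the console may apply a different or finer-grained check.

```python
from typing import List, Set, Tuple


def find_day_conflicts(conditions: List[Set[str]]) -> List[Tuple[str, int, int]]:
    """Return (day, first_index, later_index) for every day used twice.

    Each element of `conditions` is the set of weekdays selected in one
    Operation time condition.
    """
    first_seen = {}
    conflicts = []
    for index, days in enumerate(conditions):
        for day in sorted(days):  # sorted for a deterministic result
            if day in first_seen:
                conflicts.append((day, first_seen[day], index))
            else:
                first_seen[day] = index
    return conflicts


print(find_day_conflicts([{"Mon", "Tue"}, {"Wed"}]))
print(find_day_conflicts([{"Mon"}, {"Mon", "Sat"}]))
```

The first call reports no conflicts. The second reports that Monday is selected in both condition 0 and condition 1.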
Thresholds
Thresholds define when a risk is triggered. Click + Add Threshold Comparison to add multiple threshold conditions.
| Threshold category | Triggers when | Range | Default |
|---|---|---|---|
| Single data volume | A single operation's data volume exceeds the threshold | 1–10,000,000 entries | 1 |
| Cumulative occurrences | The same event occurs more than the threshold count within the time window | 1–10,000 times | 10 |
| Cumulative data volume | The total data volume of operations within the time window exceeds the threshold. DataWorks automatically categorizes and tracks individual events. | 1–10,000,000 entries | 1 |
Time window — The period over which events are counted. Default: 10 minutes. Valid ranges: 1–59 minutes, 1–23 hours, or 1–7 days. Required only when Threshold category is set to Cumulative occurrences.
Click Next.
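The three threshold categories can be sketched as follows. This is a simplified illustration under stated assumptions, not the DataWorks implementation: events are modeled as hypothetical `(timestamp, entries)` pairs, both cumulative categories are evaluated over the same trailing window, and "exceeds" is taken to mean strictly greater than the threshold.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Valid time-window ranges listed in this document, keyed by unit.
WINDOW_LIMITS = {"minutes": (1, 59), "hours": (1, 23), "days": (1, 7)}


def make_window(value: int, unit: str) -> timedelta:
    """Build a time window, rejecting values outside the documented ranges."""
    low, high = WINDOW_LIMITS[unit]  # KeyError for an unsupported unit
    if not low <= value <= high:
        raise ValueError(f"{unit} must be between {low} and {high}")
    return timedelta(**{unit: value})


def single_volume_exceeded(entries: int, threshold: int) -> bool:
    """Single data volume: one operation alone exceeds the threshold."""
    return entries > threshold


def cumulative_exceeded(events: List[Tuple[datetime, int]], now: datetime,
                        window: timedelta, threshold: int,
                        by_volume: bool) -> bool:
    """Cumulative occurrences (by_volume=False) or cumulative data volume."""
    start = now - window
    in_window = [(ts, n) for ts, n in events if start < ts <= now]
    measured = sum(n for _, n in in_window) if by_volume else len(in_window)
    return measured > threshold


now = datetime(2024, 1, 1, 12, 0)
events = [
    (now - timedelta(minutes=5), 400),
    (now - timedelta(minutes=8), 700),
    (now - timedelta(minutes=20), 9000),  # outside a 10-minute window
]
window = make_window(10, "minutes")  # the documented default
print(cumulative_exceeded(events, now, window, threshold=1, by_volume=False))
print(cumulative_exceeded(events, now, window, threshold=1000, by_volume=True))
```

With the default 10-minute window, only the first two events count: two occurrences exceed a threshold of 1, and their combined 1,100 entries exceed a threshold of 1,000, so both calls print `True`.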
Step 4: Configure alerting
Select an alerting method to receive notifications when a risk is detected. Available options: Email and WebHook.
Configure email and WebHook settings in System Settings before selecting an alerting method here.
Click Save. The rule is created.
Custom rules are disabled by default after creation. On the Risk Identification Rules page, click Re-enable next to the rule to activate it.
Manage risk identification rules
The Risk Identification Rules page lists all rules with their type, level, and status.

| Area | Available operations |
|---|---|
| 1 — Filter bar | Filter rules by Risk Type, Risk Level, Built-in or Not Built-in, or Risk Rule Name. Name search supports fuzzy matching. |
| 2 — Rule list | View basic information: See risk type, level, status, and detection statistics including Risks Hit, Pending Risks, and Handled Risks. View and edit details: Click View Details to see the full rule configuration and make changes. Re-enable a rule: Click the re-enable icon to activate a disabled rule. This operation is only available for disabled rules. |
| 3 — Batch operations | Select multiple rules and perform Batch Enable, Batch Disable, or Batch Delete. Built-in rules cannot be deleted. Custom rules can only be deleted when in the Disabled state. |
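The deletion constraints for batch operations can be summarized in a short sketch. The `Rule` class and `split_deletable` helper are hypothetical and only model the two constraints stated in the table: built-in rules cannot be deleted, and custom rules must be disabled first.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Rule:
    """Hypothetical view of a rule as shown in the rule list."""
    name: str
    built_in: bool
    enabled: bool


def split_deletable(selected: List[Rule]) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Return (names that can be deleted, [(name, reason)] for the rest)."""
    deletable, rejected = [], []
    for rule in selected:
        if rule.built_in:
            rejected.append((rule.name, "built-in rules cannot be deleted"))
        elif rule.enabled:
            rejected.append((rule.name, "disable the rule before deleting it"))
        else:
            deletable.append(rule.name)
    return deletable, rejected


selection = [
    Rule("Similar SQL queries", built_in=True, enabled=True),
    Rule("Finance export watch", built_in=False, enabled=True),
    Rule("Old test rule", built_in=False, enabled=False),
]
print(split_deletable(selection))
```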
What's next
After a rule is enabled, go to the Data Risks page to view detected risks and take action. For details, see View data risks.