Fraud Detection management uses multi-dimensional association analysis and algorithms. This intelligent technology helps you proactively identify risky operations and receive alerts. You can use fraud detection rules and perform comprehensive auditing with visualization tools. DataWorks includes built-in fraud detection rules for many scenarios. You can use these rules out of the box or create custom rules as needed. This topic describes how to create and manage fraud detection rules.
Background information
Data entered into DataWorks is filtered by Data Security Guard. DataWorks provides the comprehensive Fraud Detection management feature to detect sensitive data in various scenarios. This feature offers the following benefits:
Ease of use
The feature includes four risk types: Data access risk, Data export risk, Data operation risk, and Other risk types. It also supports combining multiple dimensions, such as Access time, Sensitivity type, and Access volume, to detect various types of risks.
High accuracy
The feature uses event aggregation and statistical comparison. By comparing the number of event occurrences within a time window against a threshold, the feature detects risks more accurately and reduces false positives. For example, a risk is detected only if the same event occurs more than three times within 10 minutes.
Fine-grained management
The feature supports configuring High, Medium, and Low risk levels for fine-grained risk management.
Flexible rules
The feature has built-in rules for common scenarios that you can use directly. You can also create custom fraud detection rules as needed. For more information, see Built-in fraud detection rules and Create a fraud detection rule.
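The High accuracy example above (a risk is detected only if the same event occurs more than three times within 10 minutes) can be illustrated with a small sketch of window-based counting. This is not the DataWorks implementation. The class name WindowCounter and its methods are hypothetical, and the sketch assumes that events are recorded in chronological order and that an event exactly one window old no longer counts.

```python
from collections import deque
from datetime import datetime, timedelta


class WindowCounter:
    """Counts events in a trailing time window (hypothetical helper)."""

    def __init__(self, window: timedelta, threshold: int):
        self.window = window
        self.threshold = threshold
        self._times = deque()

    def record(self, when: datetime) -> bool:
        """Record one event; return True if the count now exceeds the threshold."""
        self._times.append(when)
        # Drop events that have fallen out of the window ending at `when`.
        # Assumption: an event exactly `window` old is no longer counted.
        while self._times and when - self._times[0] >= self.window:
            self._times.popleft()
        return len(self._times) > self.threshold


# "More than three times within 10 minutes"
counter = WindowCounter(window=timedelta(minutes=10), threshold=3)
start = datetime(2024, 1, 1, 9, 0)
hits = [counter.record(start + timedelta(minutes=m)) for m in (0, 2, 4, 6, 20)]
print(hits)  # [False, False, False, True, False]
```

The fourth event is the first to exceed the threshold. The event at minute 20 starts a fresh count because the earlier events have left the window.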
Limits
Version limits
Only DataWorks Professional Edition and later versions support the Fraud Detection management feature.
Only DataWorks Enterprise Edition supports built-in fraud detection rules.
Alerting methods
Only email and WebHook alerting methods are supported.
Note: DataWorks supports WebHook URLs for DingTalk groups, WeCom, and Lark. Only the Enterprise Edition supports pushing alert information to WeCom or Lark.
Go to Fraud Detection management
Go to Data Security Guard.
Go to the DataStudio page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.
Click the icon in the upper-left corner. Then, choose . On the page that appears, click Try Now to go to the Data Security Guard page.
Note: If your Alibaba Cloud account is granted the required permissions, you can directly access the homepage of Data Security Guard.
If your Alibaba Cloud account is not granted the required permissions, you are redirected to the authorization page of Data Security Guard. You can use the features of Data Security Guard only after your Alibaba Cloud account is granted the required permissions.
Go to the Fraud Detection management page.
On the Data Security Guard page, choose in the navigation pane on the left. You are redirected to the Fraud Detection management page where you can create and manage fraud detection rules.
Fraud Detection management has built-in rules for many common scenarios that you can use directly. You can also create custom fraud detection rules as needed. For more information, see Built-in fraud detection rules and Create a fraud detection rule.
Built-in fraud detection rules
The Fraud Detection management feature supports the built-in rules listed in the following table.
Rule name | Rule type | Rule level | Rule configuration |
Querying large volumes of sensitive data outside of work hours | Data access risk | Low | This rule is hit when the data volume of a query exceeds 10,000 during the following time periods. |
Similar SQL queries | Data access risk | Low | This rule is hit when five or more similar SQL queries are run within 10 minutes. |
Batch querying large volumes of sensitive data | Data access risk | Medium | This rule is hit when the data volume of a single query exceeds 10,000. |
Batch exporting large volumes of sensitive data | Data export risk | High | This rule is hit when the data volume of a single export exceeds 10,000. |
Exporting large volumes of sensitive data outside of work hours | Data export risk | High | This rule is hit when the data volume of an export exceeds 10,000 during the following time periods. |
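The built-in rules in the table use two slightly different comparisons: the volume rules are hit only when a value exceeds 10,000, while the Similar SQL queries rule is hit at five or more queries. The following sketch shows that difference for three of the rules. The BuiltInRule class and its fields are hypothetical illustration names, not a DataWorks API, and the time-period rules are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuiltInRule:
    """Hypothetical in-memory representation of one row of the table."""
    name: str
    rule_type: str
    level: str
    limit: int
    inclusive: bool  # True: hit at `limit` or more; False: hit only above `limit`

    def is_hit(self, observed: int) -> bool:
        return observed >= self.limit if self.inclusive else observed > self.limit


similar_sql = BuiltInRule("Similar SQL queries", "Data access risk", "Low", 5, True)
batch_query = BuiltInRule(
    "Batch querying large volumes of sensitive data",
    "Data access risk", "Medium", 10_000, False,
)
batch_export = BuiltInRule(
    "Batch exporting large volumes of sensitive data",
    "Data export risk", "High", 10_000, False,
)

print(similar_sql.is_hit(5))        # True: five or more similar queries
print(batch_query.is_hit(10_000))   # False: the volume must exceed 10,000
print(batch_export.is_hit(10_001))  # True
```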
Create a fraud detection rule
Plan and prepare to create the rule.
Based on your scenario, you can detect risky data across dimensions such as Data location, Data properties, User information, and Operation time to configure more fine-grained detection conditions. When you use subcategories of Data properties and User information to configure detection conditions, perform the following preparatory steps.
Detection dimension | Subcategory | Description |
Data properties | Data classification level | To detect risky data of a specific level, you must define data classification levels in advance. For more information, see Configure sensitive data classification and levels. |
Data properties | Data category | To detect risky data of a specific category, you must define data categories in advance. For more information, see Configure data detection rules and run detection tasks. |
Data properties | Sensitive field type | To detect risky data in specific sensitive fields, you must define sensitive field types in advance. For more information, see Configure data detection rules and run detection tasks. |
User information | User group | To detect risky data for a specific user group under the current logon account, you must configure user groups in advance. For more information, see Configure user groups. |
User information | RAM role | To detect risky data for a RAM user under the current logon account, you must add a RAM user to your Alibaba Cloud account in advance. For more information, see Create a RAM user. |
In the upper-right corner of the Fraud Detection Management page, click + Fraud Detection Rule.
In the Create Fraud Detection Rule dialog box, configure the parameters for the rule.
Note: Currently, you can create only statistical association rules. A statistical association rule aggregates and counts single events and compares the count against a threshold. A risk is detected if the number of events exceeds the threshold. For example, a rule can be configured to detect a risk if a low-privilege user accesses more than 10,000 sensitive data entries outside of work hours.
Configure the basic information for the rule.

Parameter | Description |
(Required) Rule Name | The name of the new fraud detection rule. The name must be 1 to 30 characters in length and cannot contain special characters. |
(Required) Rule Type | The type of the fraud detection rule. Valid values: Data access risk (a potential risk exists when data is accessed), Data export risk (a potential risk exists when data is exported), Data deletion risk (a potential risk exists when data is deleted), Data update risk (a potential risk exists when data is updated), Table and library operation risk (a potential risk exists when operations are performed on tables and libraries), and Data authorization risk (a potential risk exists when data permissions are granted). |
(Required) Rule Level | The level of the fraud detection rule. Valid values are Low, Medium, and High. You can set the rule level to High for rules that detect important data. |
(Optional) Description | The description of the fraud detection rule. The description can be 1 to 100 characters in length. |
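The constraints on the basic information can be checked before you open the dialog box, for example when rule definitions are kept in a spreadsheet or a script. The following is a minimal sketch, not a DataWorks API. The function name validate_basic_info is hypothetical, and because the documentation does not define "special characters", the sketch assumes that only letters, digits, underscores, and spaces are allowed in a rule name.

```python
import re
from typing import List, Optional

RULE_TYPES = {
    "Data access risk",
    "Data export risk",
    "Data deletion risk",
    "Data update risk",
    "Table and library operation risk",
    "Data authorization risk",
}
RULE_LEVELS = {"Low", "Medium", "High"}
# Assumption: "special characters" means anything other than
# letters, digits, underscores, and spaces.
NAME_PATTERN = re.compile(r"[\w ]{1,30}")


def validate_basic_info(name: str, rule_type: str, level: str,
                        description: Optional[str] = None) -> List[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    errors = []
    if not NAME_PATTERN.fullmatch(name):
        errors.append("Rule Name must be 1 to 30 characters without special characters")
    if rule_type not in RULE_TYPES:
        errors.append(f"unknown Rule Type: {rule_type}")
    if level not in RULE_LEVELS:
        errors.append(f"unknown Rule Level: {level}")
    # Description is optional, but if given it must be 1 to 100 characters.
    if description is not None and not 1 <= len(description) <= 100:
        errors.append("Description must be 1 to 100 characters")
    return errors


print(validate_basic_info("night export check", "Data export risk", "High"))  # []
print(validate_basic_info("bad/name!", "Data export risk", "Critical"))
```

The second call reports two problems: the rule name contains characters outside the assumed set, and Critical is not one of the three documented levels.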
Click Next.
Configure detection conditions and thresholds.
Configure detection conditions.
DataWorks lets you detect risky data across dimensions such as Data location, Data properties, User information, and Operation time. This lets you configure more fine-grained detection conditions based on your scenario.
Note: You can add up to 10 conditions. Click + Add Comparison within a selected dimension to add multiple detection conditions. The logical relationship between multiple conditions is AND.
Data location
Used to specify the location scope for detecting risky data.

Parameter | Description | Required |
Filter selected location | Specifies whether to filter risky data in the selected location. Valid values: ≠ (filters out the destination location; the rule does not detect risky data in the selected location) and = (detects only in the destination location; the rule detects risky data only in the selected location). | Yes |
Data engine name | Select the engine scope for the rule. Note: Currently, only risky data in the MaxCompute engine can be detected. You can select only one engine for each comparison. To specify multiple engines, click + Add Comparison to configure multiple detection conditions. | Yes |
Project name | Select the destination project for the rule. The project must be a project within the selected engine. You can select a project from the drop-down list or enter a project name to search. Note: The drop-down list displays up to 100 project names. The search supports fuzzy matching. Enter a keyword to search for projects whose names contain the keyword. You can select only one project for each comparison. To specify multiple projects, click + Add Comparison to configure multiple detection conditions. | Yes |
Table name | Enter the destination tables for the rule. You can enter one or more table names, separated by commas (,). A single table name can be up to 30 characters long. The total length of all table names cannot exceed 100 characters. The wildcard character (*) is supported. For example, *name matches all tables with names ending in name. | No. If you do not configure this parameter, the rule detects risky data in all tables within the selected project by default. |
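The Table name behavior (comma-separated names, the * wildcard, the length limits, and the "all tables" default when the field is empty) can be sketched as follows. The function names are hypothetical and this is not how DataWorks implements matching. The sketch assumes that the 100-character total includes the separators and that matching is case-sensitive; the documentation states neither.

```python
import re
from typing import List

MAX_NAME_LEN = 30
MAX_TOTAL_LEN = 100


def parse_table_names(raw: str) -> List[str]:
    """Split the Table name field into individual patterns and check lengths."""
    # Assumption: the 100-character limit applies to the whole input string.
    if len(raw) > MAX_TOTAL_LEN:
        raise ValueError("table names exceed 100 characters in total")
    names = [part.strip() for part in raw.split(",") if part.strip()]
    for name in names:
        if len(name) > MAX_NAME_LEN:
            raise ValueError(f"table name longer than 30 characters: {name}")
    return names


def matches(table: str, pattern: str) -> bool:
    """Match a table name against a pattern where * stands for any characters."""
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.fullmatch(regex, table) is not None


def table_in_scope(table: str, raw: str) -> bool:
    """An empty field means all tables in the selected project are in scope."""
    names = parse_table_names(raw)
    return not names or any(matches(table, pattern) for pattern in names)


print(table_in_scope("user_name", "*name, orders"))  # True: ends in "name"
print(table_in_scope("name_list", "*name, orders"))  # False
print(table_in_scope("anything", ""))                # True: no table filter
```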
Data properties
Used to specify the property scope for detecting risky data.

Parameter | Description |
Property | Select the property category for detecting risky data based on your business needs. The following property categories are supported: Data classification level (specifies which level of risky data to detect; you must define data classification levels in advance, see Configure sensitive data classification and levels), Data category (specifies which category of risky data to detect; you must define data categories in advance, see Configure data detection rules and run detection tasks), and Sensitive field type (specifies which type of sensitive field to detect risky data in; you must define sensitive field types in advance, see Configure data detection rules and run detection tasks). |
Filter selected property | Specifies whether to filter risky data with the selected property. Valid values: ≠ (filters out the destination property; the rule does not detect risky data with the selected property) and = (detects only the destination property; the rule detects risky data only with the selected property). |
User information
Used to specify the user information scope for detecting risky data.

Parameter | Description |
Information category | Select the user information category for detecting risky data. Valid values: User group (the name of a user group under the current logon account; you must configure user groups in advance, see Configure user groups), RAM role (a RAM user under the current logon account; you must add a RAM user to your Alibaba Cloud account in advance, see Create a RAM user), and Username (the current logon user). |
Filter selected user information | Valid values: ≠ (filters out the destination user information; the rule does not detect risky data for the selected user) and = (detects only the destination user information; the rule detects risky data only for the selected user). |
Operation time
Used to specify the operation time scope for detecting risky data.

Parameter | Description |
Select time range | Click a day of the week and an hour to select the desired time range. You can select any time from Monday to Sunday, with precision to the hour. You can add multiple time ranges. The added time ranges are mutually exclusive. For example, if you select Monday in Condition 1, you cannot select Monday in Condition 2. |
Filter selected time | Valid values: ≠ (filters out the destination operation time; the rule does not detect risky data during the selected operation time) and = (detects only the destination operation time; the rule detects risky data only during the selected operation time). |
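The way detection conditions combine (each condition uses = or ≠, up to 10 conditions, all joined by AND) can be pictured with the following sketch. It is an illustration only. The Condition class, the event dictionary, and the project and user group names are hypothetical, and operation time is modeled as a (day, hour) pair to reflect the hour-level precision.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, List

MAX_CONDITIONS = 10


@dataclass(frozen=True)
class Condition:
    """One comparison: '=' keeps selected values, '≠' excludes them."""
    field: str
    operator: str
    selected: FrozenSet[Any]

    def holds(self, event: Dict[str, Any]) -> bool:
        if self.operator not in ("=", "≠"):
            raise ValueError(f"unsupported operator: {self.operator}")
        in_selection = event.get(self.field) in self.selected
        return in_selection if self.operator == "=" else not in_selection


def in_scope(event: Dict[str, Any], conditions: List[Condition]) -> bool:
    """An event is in scope only if every condition holds (logical AND)."""
    if len(conditions) > MAX_CONDITIONS:
        raise ValueError("a rule supports at most 10 conditions")
    return all(condition.holds(event) for condition in conditions)


# Hypothetical rule scope: one project, everyone except one user group,
# and only late weekend hours.
weekend_nights = frozenset((day, hour) for day in ("Sat", "Sun") for hour in (22, 23))
conditions = [
    Condition("project", "=", frozenset({"sales_prod"})),
    Condition("user_group", "≠", frozenset({"dba"})),
    Condition("operation_time", "=", weekend_nights),
]
event = {"project": "sales_prod", "user_group": "analyst", "operation_time": ("Sat", 23)}

print(in_scope(event, conditions))                           # True
print(in_scope({**event, "user_group": "dba"}, conditions))  # False: excluded by ≠
```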
Configure thresholds.
DataWorks supports event aggregation and statistics. You can detect risky data by comparing the number of event occurrences within a time window against a threshold. Click + Add Threshold Comparison to configure multiple threshold conditions.

Parameter | Description |
Threshold category | Single data volume: Detects risky data based on the volume of data in an operation. An operation hits the risk if the data volume exceeds the set threshold. The data volume is an integer from 1 to 10,000,000. The unit is entries. The default value is 1. Cumulative occurrences: Detects risky data based on the number of times a single event occurs within a specified time range. A risk is hit if the number of occurrences of a single event exceeds the set threshold within the specified time range. The number of occurrences is an integer from 1 to 10,000. The unit is times. The default value is 10. Cumulative data volume: Detects risky data based on the volume of data operated on within a specified time range. A risk is hit if the data volume within the specified time range exceeds the set threshold. The data volume is an integer from 1 to 10,000,000. The unit is entries. The default value is 1. Note: DataWorks automatically categorizes and detects single events. |
Time window | The time range that limits the number of event occurrences. The default value is 10 minutes. Valid values: Minute (the value ranges from 1 to 59), Hour (the value ranges from 1 to 23), and Day (the value ranges from 1 to 7). Note: This parameter is required only when Threshold category is set to Cumulative occurrences. |
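The value ranges for thresholds and time windows can be summarized in a small validation sketch. The function name validate_threshold is hypothetical and this is not a DataWorks API. Following the note on the Time window parameter, the sketch requires a time window only for Cumulative occurrences; whether a window may be set for the other categories is not stated, so the sketch only range-checks it when one is supplied.

```python
from typing import List, Optional

VALUE_RANGES = {
    "Single data volume": (1, 10_000_000),
    "Cumulative occurrences": (1, 10_000),
    "Cumulative data volume": (1, 10_000_000),
}
WINDOW_RANGES = {"Minute": (1, 59), "Hour": (1, 23), "Day": (1, 7)}


def validate_threshold(category: str, value: int,
                       window_unit: Optional[str] = None,
                       window_size: Optional[int] = None) -> List[str]:
    """Return a list of problems; an empty list means the settings are in range."""
    if category not in VALUE_RANGES:
        return [f"unknown threshold category: {category}"]
    errors = []
    low, high = VALUE_RANGES[category]
    if not low <= value <= high:
        errors.append(f"{category} must be between {low} and {high}")
    if window_unit is None:
        # The time window is documented as required only for Cumulative occurrences.
        if category == "Cumulative occurrences":
            errors.append("Time window is required for Cumulative occurrences")
    elif window_unit not in WINDOW_RANGES:
        errors.append(f"unknown time window unit: {window_unit}")
    else:
        low, high = WINDOW_RANGES[window_unit]
        if window_size is None or not low <= window_size <= high:
            errors.append(f"{window_unit} window must be between {low} and {high}")
    return errors


print(validate_threshold("Cumulative occurrences", 10, "Minute", 10))  # []
print(validate_threshold("Cumulative occurrences", 5, "Hour", 24))
```

The second call reports one problem because an Hour window accepts values from 1 to 23; a 24-hour window would be expressed as 1 Day.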
Click Next.
Configure the alerting method.
After a data risk is detected, you can promptly receive alert information based on the configured alerting method to handle the risk. You can select Email and WebHook as alerting methods.
Note: Before you select an alerting method, make sure that you have configured email and WebHook settings in System Settings.
Click Save. The rule is created.
Custom rules are disabled by default after they are created. On the Fraud Detection Management page, you must click Re-enable next to the destination rule to manually enable it.
Manage fraud detection rules
On the Fraud Detection Management page, you can view the list of created rules and their details. You can also edit a specific rule. 
Area | Description |
1 | In this area, you can filter the rule list by conditions such as Risk Type, Risk Level, Is Built-in, and Risk Rule Name. Note: The search by name supports fuzzy matching. Enter a keyword to search for fraud detection rules whose names contain the keyword. |
2 | In this area, you can perform the following operations: |
3 | In this area, you can perform batch operations on destination rules. Currently, you can perform batch operations such as Batch Enable, Batch Disable, and Batch Delete. Note: DataWorks does not support deleting built-in fraud detection rules. You can only delete custom rules that are in the Disabled state. |
What to do next
After a fraud detection rule is created and enabled, you can navigate to the Data Risks page to view the details of risks detected by the rule and handle them promptly. For more information, see View data risks.