The Fraud Detection management feature provides multi-dimensional association analysis and algorithms. This intelligent analysis helps you use fraud detection rules to identify risky operations and receive alerts. You can perform centralized audits through visualization. DataWorks includes built-in fraud detection rules for many scenarios. You can use these rules directly or create custom rules as needed. This topic describes how to create and manage fraud detection rules.
Background information
After data is ingested into DataWorks, it is filtered and processed by Data Security Guard. To address the challenges of sensitive data detection in various scenarios, DataWorks provides the comprehensive Fraud Detection management feature. This feature offers the following benefits:
Easy to use
This feature identifies four risk types: Data access risks, Data export risks, Data operation risks, and Other risk types. It detects various risks by combining multiple dimensions, such as access time, sensitivity level, and access volume.
High accuracy
The feature uses event aggregation and statistical comparison. By comparing the number of event occurrences within a time window against a threshold, it detects risks more accurately and reduces false positives. For example, a risk is hit only if the same event occurs more than three times within 10 minutes.
Fine-grained management
You can configure High, Medium, and Low risk levels for fine-grained risk management.
Flexible rules
The feature provides built-in rules for common scenarios that you can use directly. You can also create custom fraud detection rules as needed. For more information, see Built-in fraud detection rules and Create a fraud detection rule.
Limits
Version limits
The Fraud Detection management feature is available only in DataWorks Professional Edition and later versions.
Built-in fraud detection rules are available only in DataWorks Enterprise Edition and later versions.
Alerting methods
Supported alerting methods include email and WebHook.
Note: DataWorks supports WebHook URLs for DingTalk groups, WeCom, and Lark. The feature that lets you push alert information to WeCom or Lark is available only in DataWorks Enterprise Edition and later versions.
Go to Fraud Detection management
Go to Data Security Guard.
Go to the DataStudio page.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.
Click the icon in the upper-left corner. Then, choose . On the page that appears, click Try Now to go to the Data Security Guard page.
Note: If your Alibaba Cloud account is granted the required permissions, you can directly access the homepage of Data Security Guard.
If your Alibaba Cloud account is not granted the required permissions, you are redirected to the authorization page of Data Security Guard. You can use the features of Data Security Guard only after your Alibaba Cloud account is granted the required permissions.
Go to the Fraud Detection management console.
In the navigation pane on the left of the Data Security Guard page, choose . On the page that appears, you can create and manage fraud detection rules.
The Fraud Detection management feature includes built-in rules for many common scenarios that you can use directly. You can also create custom fraud detection rules as needed. For more information, see Built-in fraud detection rules and Create a fraud detection rule.
Built-in fraud detection rules
The Fraud Detection management feature supports the built-in rules described in the following table.
Rule Name | Rule Type | Rule Level | Rule Configuration |
Querying large volumes of sensitive data outside of work hours | Data access risk | Low | This rule is hit if the data volume of a query exceeds 10,000 during the following time periods: |
Similar SQL queries | Data access risk | Low | This rule is hit if five or more similar SQL queries are run within 10 minutes. |
Batch querying of large volumes of sensitive data | Data access risk | Medium | This rule is hit if the data volume of a single query exceeds 10,000. |
Batch exporting of large volumes of sensitive data | Data export risk | High | This rule is hit if the data volume of a single export exceeds 10,000. |
Exporting large volumes of sensitive data outside of work hours | Data export risk | High | This rule is hit if the data volume of an export exceeds 10,000 during the following time periods: |
Create a fraud detection rule
Plan and prepare to create the rule.
You can detect risk data based on dimensions such as Data Location, Data Properties, User Information, and Operation Time to configure more fine-grained detection conditions for your scenario. If you use different subcategories of Data Properties and User Information to configure detection conditions, you must first complete the following preparations.
Risk detection dimension | Subcategory | Description |
Data Properties | Data security level | To detect risk data of a specific level, you must define data security levels in advance. For more information, see Configure sensitive data classification and security levels. |
Data Properties | Data category | To detect risk data of a specific category, you must define data categories in advance. For more information, see Configure data detection rules and run detection tasks. |
Data Properties | Sensitive field type | To detect risk data for a specific sensitive field type, you must define sensitive field types in advance. For more information, see Configure data detection rules and run detection tasks. |
User Information | User group | To detect risk data for a specific user group under the current logon account, you must configure user groups in advance. For more information, see Configure user groups. |
User Information | RAM role | To detect risk data for RAM users under the current logon account, you must add RAM users to your Alibaba Cloud account in advance. For more information, see Create a RAM user. |
Click + Fraud Detection Rule in the upper-right corner of the Fraud Detection Management page.
In the Create Fraud Detection Rule dialog box, you can configure the rule.
Note: Currently, you can create only Statistical Association Rules. A Statistical Association Rule aggregates events of a single type, counts them, and compares the result against a threshold. The rule is hit when the number of events exceeds the specified threshold. For example, you can set a rule that is hit if a user with low-level permissions accesses more than 10,000 sensitive data records outside of work hours.
Configure basic information for the rule.

Parameter
Description
(Required) Rule Name
The name of the new fraud detection rule. The name must be 1 to 30 characters in length and cannot contain special characters.
(Required) Rule Type
The type of the fraud detection rule. Valid values:
Data access risk: A potential risk exists when data is accessed.
Data export risk: A potential risk exists when data is exported.
Data deletion risk: A potential risk exists when data is deleted.
Data update risk: A potential risk exists when data is updated.
Database and table operation risk: A potential risk exists during database and table operations.
Data authorization risk: A potential risk exists when data permissions are granted.
(Required) Rule Level
The level of the fraud detection rule. Valid values are Low, Medium, and High. You can set the level to High for rules that detect important data.
(Optional) Description
The description of the fraud detection rule. The description can be 1 to 100 characters in length.
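The constraints in the table above can be checked before you open the dialog box, for example when rule definitions are kept in a spreadsheet or script. The following is a minimal sketch, not part of DataWorks: `validate_basic_info` is a hypothetical helper, and the definition of "special characters" is an assumption because this topic does not list them.

```python
import re

# Valid values copied from the parameter descriptions above.
RULE_TYPES = {
    "Data access risk",
    "Data export risk",
    "Data deletion risk",
    "Data update risk",
    "Database and table operation risk",
    "Data authorization risk",
}
RULE_LEVELS = {"Low", "Medium", "High"}

# Assumption: a "special character" is anything other than a letter, digit,
# underscore, space, or hyphen. The console may apply a different definition.
NAME_PATTERN = re.compile(r"[\w \-]{1,30}")


def validate_basic_info(name, rule_type, level, description=""):
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("Rule Name must be 1 to 30 characters with no special characters")
    if rule_type not in RULE_TYPES:
        problems.append(f"unknown Rule Type: {rule_type!r}")
    if level not in RULE_LEVELS:
        problems.append(f"unknown Rule Level: {level!r}")
    # Description is optional, so only its upper bound is checked when present.
    if description and len(description) > 100:
        problems.append("Description must be 1 to 100 characters")
    return problems
```

An empty result means the values fit the documented limits; the console remains the authority on what it accepts.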
Click Next.
Configure fraud detection conditions and thresholds.
Configure fraud detection conditions.
DataWorks lets you detect data risks across dimensions such as Data Location, Data Properties, User Information, and Operation Time. This helps you configure more fine-grained detection conditions for your scenario.
Note: You can add up to 10 conditions. To add more detection conditions for a selected dimension, click + Add Comparison. All conditions are evaluated using a logical AND.
Data Location
This dimension is used to set the location scope for detecting risky data.

Parameter | Description | Required |
Filter selected location | Specifies whether to exclude or include risk data at the selected location. Valid values: ≠ excludes the selected location, so the rule does not detect risk data there; = includes only the selected location, so the rule detects risk data only there. | Yes |
Data engine name | Select the engine scope for the detection rule. Note: Currently, only risk data in the MaxCompute engine can be detected. You can select only one engine for each comparison. To specify multiple engines, click + Add Comparison to configure multiple detection conditions. | Yes |
Project name | Select the target project for the detection rule. The project must belong to the selected engine. You can select a project from the drop-down list or search for it by name. Note: The drop-down list displays a maximum of 100 project names. The search supports fuzzy matching. You can enter a keyword to search for projects whose names contain the keyword. You can select only one project for each comparison. To specify multiple projects, click + Add Comparison to configure multiple detection conditions. | Yes |
Table name | Enter the target tables for the detection rule. You can enter one or more table names, separated by commas (,). A single table name cannot exceed 30 characters, and the total length of all table names cannot exceed 100 characters. You can use the wildcard character (*). For example, *name detects data in all tables whose names end with name. | No. If you do not configure this parameter, the rule detects risk data in all tables of the selected project by default. |
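The Table name limits and the `*` wildcard can be illustrated with a short sketch. This is an illustration only, not the matching code that DataWorks runs. The helper names are hypothetical, and two points are assumptions: the 100-character total counts the names without the commas, and matching is case-sensitive.

```python
import re

MAX_NAME_LEN = 30    # per-table limit stated for the Table name parameter
MAX_TOTAL_LEN = 100  # limit on the combined length of all table names


def parse_table_names(field):
    """Split the comma-separated Table name field and check the length limits."""
    names = [part.strip() for part in field.split(",") if part.strip()]
    for name in names:
        if len(name) > MAX_NAME_LEN:
            raise ValueError(f"table name exceeds {MAX_NAME_LEN} characters: {name}")
    # Assumption: separators are not counted toward the 100-character total.
    if sum(len(name) for name in names) > MAX_TOTAL_LEN:
        raise ValueError(f"table names exceed {MAX_TOTAL_LEN} characters in total")
    return names


def wildcard_to_regex(pattern):
    """Treat * as 'any run of characters' and every other character literally."""
    return re.compile(".*".join(re.escape(piece) for piece in pattern.split("*")))


def table_in_scope(table, patterns):
    """An empty pattern list means every table in the project is in scope."""
    if not patterns:
        return True
    return any(wildcard_to_regex(p).fullmatch(table) for p in patterns)
```

With the field value `*name, orders`, the tables `user_name` and `orders` are in scope, while `name_list` is not because `*name` only matches names that end with `name`.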
Data Properties
This dimension is used to filter the property scope for detecting risky data.

Parameter
Description
Property
Select the property category for detecting risk data based on your business needs. The following property categories are supported:
Data security level: Used to specify the security level of risk data to detect. You must define data security levels in advance. For more information, see Configure sensitive data classification and security levels.
Data category: Used to specify the category of risk data to detect. You must define data categories in advance. For more information, see Configure data detection rules and run detection tasks.
Sensitive field type: Used to specify the type of sensitive field to detect. You must define sensitive field types in advance. For more information, see Configure data detection rules and run detection tasks.
Filter selected property
Specifies whether to filter risk data with the selected property. Valid values:
≠: Excludes the selected property. The rule does not detect risk data that has the selected property.
=: Includes only the selected property. The rule detects only risk data that has the selected property.
User Information
This dimension is used to filter the user information scope for detecting risky data.

Parameter
Description
Information category
Select the user information category for detecting risk data. Valid values:
User group: The name of a user group under the current logon account. You must configure user groups in advance. For more information, see Configure user groups.
RAM role: A RAM user under the current logon account. You must add RAM users to your Alibaba Cloud account in advance. For more information, see Create a RAM user.
Username: The current logon user.
Filter selected user information
≠: Excludes the selected user information. The rule does not detect risk data for the selected user.
=: Includes only the selected user information. The rule detects risk data only for the selected user.
Operation Time
This dimension is used to filter the operation time range for detecting risky data.

Parameter
Description
Select time range
Click a day and hour to select the desired time range. You can select any time from Monday to Sunday, with precision to the hour. You can add multiple time ranges. The added time ranges are mutually exclusive. For example, if you select Monday in Condition 1, you cannot select Monday in Condition 2.
Filter selected time
≠: Excludes the selected operation time. The rule does not detect risk data during the selected time range.
=: Includes only the selected operation time. The rule detects risk data only during the selected time range.
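The way the = and ≠ operators combine under a logical AND can be sketched as follows. This is a simplified model for reasoning about a rule before you configure it, not the DataWorks implementation. The `Condition` class, the event fields (`project`, `weekday`, `hour`), and the sample project name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, object]


@dataclass
class Condition:
    description: str
    operator: str                      # "=" or "≠"
    selected: Callable[[Event], bool]  # True when the event is inside the selected scope

    def holds(self, event: Event) -> bool:
        if self.operator not in ("=", "≠"):
            raise ValueError(f"unsupported operator: {self.operator}")
        inside = self.selected(event)
        # "=" keeps events inside the selection, "≠" keeps events outside it.
        return inside if self.operator == "=" else not inside


def matches_all(event: Event, conditions: List[Condition]) -> bool:
    """All conditions are combined with a logical AND; at most 10 are allowed."""
    if len(conditions) > 10:
        raise ValueError("a rule accepts at most 10 conditions")
    return all(condition.holds(event) for condition in conditions)


# Hypothetical rule: project sales_prod, outside Monday-Friday 09:00-18:00.
conditions = [
    Condition("project is sales_prod", "=", lambda e: e["project"] == "sales_prod"),
    Condition("work hours", "≠", lambda e: e["weekday"] < 5 and 9 <= e["hour"] < 18),
]
```

In this model, an operation on `sales_prod` on Saturday at 23:00 satisfies both conditions, while the same operation on Monday at 10:00 fails the ≠ condition, so the rule would not apply to it.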
Configure thresholds.
DataWorks supports event aggregation, allowing you to detect potential risks by comparing the number of event occurrences within a time window against a threshold. Click + Add Threshold Comparison to configure multiple threshold conditions for fraud detection.

Parameter
Description
Threshold category
Single data volume: Detects risk data based on the volume of data in an operation. If the data volume of an operation exceeds the specified threshold, the operation is considered a risk. The data volume can be an integer from 1 to 10,000,000. The unit is records. The default value is 1.
Cumulative occurrences: Detects risk data based on the number of times a single event occurs within a specified time range. If the number of occurrences of a single event exceeds the specified threshold within the time range, it is considered a risk. The number of occurrences can be an integer from 1 to 10,000. The unit is times. The default value is 10.
Cumulative data volume: Detects risk data based on the total volume of data operated on within a specified time range. If the total data volume within the time range exceeds the specified threshold, it is considered a risk. The data volume can be an integer from 1 to 10,000,000. The unit is records. The default value is 1.
Note: DataWorks automatically categorizes and detects single events for you.
Time window
The time range for the number of event occurrences. The default value is 10 minutes. Valid values:
Minute: The value can range from 1 to 59.
Hour: The value can range from 1 to 23.
Day: The value can range from 1 to 7.
Note: This parameter is required only when Threshold category is set to Cumulative occurrences.
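The three threshold categories can be sketched with a small sliding-window counter. This is a minimal illustration of the idea, not the algorithm that DataWorks uses. The class and function names are hypothetical, events are assumed to arrive in time order, and an event exactly one window old is assumed to fall outside the window.

```python
from collections import deque
from datetime import datetime, timedelta


def exceeds_single_volume(records, threshold=1):
    """Single data volume: one operation touches more records than the threshold."""
    return records > threshold


class WindowedThreshold:
    """Cumulative occurrences or cumulative data volume within a time window."""

    def __init__(self, threshold, window, count_records=False):
        self.threshold = threshold
        self.window = window
        self.count_records = count_records  # False: count events, True: sum records
        self._events = deque()              # (timestamp, weight) pairs, oldest first
        self._total = 0

    def observe(self, timestamp, records=1):
        """Record one event and report whether the windowed total exceeds the threshold."""
        weight = records if self.count_records else 1
        self._events.append((timestamp, weight))
        self._total += weight
        # Drop events that are no longer inside the window.
        cutoff = timestamp - self.window
        while self._events and self._events[0][0] <= cutoff:
            _, old_weight = self._events.popleft()
            self._total -= old_weight
        return self._total > self.threshold


# "More than three times within 10 minutes": the fourth event is a hit.
start = datetime(2024, 1, 1, 22, 0)
rule = WindowedThreshold(threshold=3, window=timedelta(minutes=10))
hits = [rule.observe(start + timedelta(minutes=m)) for m in (0, 2, 4, 6)]
```

Here `hits` is `[False, False, False, True]`. Setting `count_records=True` switches the same window from counting events to summing record counts, which models a cumulative data volume threshold.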
Click Next.
Configure the alerting method.
When a data risk is detected, you can promptly receive alert information through a configured alerting method to address the risk. You can select Email and WebHook as the alerting methods.
Note: Before you select an alerting method, ensure that you have completed the email and WebHook configurations in System Settings.
Click Save to create the rule.
Custom rules are disabled by default after creation. On the Fraud Detection Rules page, click Re-enable for the rule that you want to enable.
Manage fraud detection rules
On the Fraud Detection management page, you can view a list of created rules and their details. You can also edit a rule. 
Area | Description |
1 | In this area, you can filter the rule list by conditions such as Risk Type, Risk Level, Is Built-in, and Fraud Rule Name. Note The search by name supports fuzzy matching. You can enter a keyword to search for fraud detection rules whose names contain the keyword. |
2 | In this area, you can perform the following operations: |
3 | In this area, you can perform batch operations on target rules. Currently, batch operations such as Batch Enable, Batch Disable, and Batch Delete are supported. Click the Note DataWorks does not support deleting built-in fraud detection rules. You can only delete custom rules that are in the Disabled state. |
What to do next
After creating and enabling a fraud detection rule, you can go to the Data Risks page to view the details of any risks the rule has hit and handle them promptly. For more information, see View data risks.