This topic describes how to create a data detection task and manually correct data that is inaccurately identified on the sensitive data detection page.
Manually corrected results are displayed and take effect the next day.
Create a detection task
Go to the sensitive data detection rule page. For more information, see Go to the sensitive data detection rule page.
Click the Detection Task tab to go to the detection task page.
Start a sensitive data detection task.
Configure the Sensitive Data Detection Task.
In the Enable Sensitive Data Detection Task dialog box, configure the task type, scan method, and scope. You can configure a real-time task, a scheduled task, or a one-time task.
Configure a real-time task.

The following table describes the parameters.
Parameter
Description
Detection Account
Configure data sampling and scanning using an Alibaba Cloud account or a RAM user. The selected account is used to sample and scan data. The scope of data that can be sampled varies based on the account's permissions.
Note: To use a RAM user for detection, first grant the RAM user permissions on the MaxCompute project.
Real-time Detection
Only ODPS supports real-time detection. When ODPS metadata changes, for example when a table or field is added or a field is modified, Data Security Guard automatically starts a sensitive data detection task for the changed metadata.
Data Security Guard obtains metadata change information in real time. If the change is due to a new table or field, the new table or field may not have content yet. In this case, only metadata is used for sensitive data detection.
Configure a scheduled task.
The following table describes the parameters.
Parameter
Description
Task Execution
You must manually enable task execution.
Subsequent Detection Task Scan and Update Policy
Two options are available:
Rescan and update results only for changed rules, data affected by the changed rules, and data with no results.
Rescan all data and overwrite all results.
You can select not to overwrite manually corrected results.
Detection Account
Configure data sampling and scanning using an Alibaba Cloud account or a RAM user. The selected account is used to sample and scan data. The scope of data that can be sampled and scanned varies based on the account's permissions.
Note: To use a RAM user for sampling and scanning, first grant the RAM user permissions on the MaxCompute project.
Content Detection
Configure whether the Content Detection and Metadata Detection rules are enabled. The corresponding rules take effect only after you select them.
Note: If you do not select Content Detection, Data Security Guard does not sample or scan data. Content detection rules do not take effect, but rules for field names and field comments still apply.
Sample Size
Set the sample size for content detection. A value greater than 100 is recommended.
This parameter is required when you select Content Detection.
Scan Frequency and Scan Time
Define the scan schedule for the scheduled task.
This parameter is required only when you set Task Type to Scheduled Task.
You can set the scan frequency to Once a week or Once a day. For weekly scans, you can select any day from Monday to Friday. The time range is from 0:00 to 23:59.
Scan Scope
Configure the data scope for the sensitive data detection task.
All: Scans all data under the authorized account of the current tenant.
Partial Data: Scans table data in specified projects.
Note: The default project scope includes all projects of all DPI engines.
You can scan data in specified tables of ODPS, EMR, and HOLO projects.
The total length of a table name can be 0 to 100 characters. All character types are supported. If you leave this field empty, all tables are scanned.
The .* wildcard character is supported. For example, .*name matches table names that end with name, and private.* matches table names that start with private.
Use commas (,) to separate multiple table names or field names.
If you select Partial Data, you can add multiple project or database scan scopes. The final scan scope is the union of all specified scopes.
You must manually select a project on the left side of the page.
After you select a project, the data tables within that project or database are displayed on the right. You can manually select tables or select all tables at once. By default, all data tables in the database are selected.
Keyword search is supported for projects, databases, and data tables. To search for a data table by keyword, first select a project and then perform the search within that project.
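The table-name filter and the union of scan scopes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the product's implementation: it assumes that .* is the only wildcard, that every other character is matched literally, and that a pattern must match the whole table name. The function names and the catalog structure are hypothetical.

```python
import re


def compile_table_pattern(pattern: str) -> "re.Pattern[str]":
    """Build a regex from one pattern, treating only '.*' as a wildcard (assumption)."""
    literal_parts = [re.escape(part) for part in pattern.split(".*")]
    return re.compile(".*".join(literal_parts))


def match_tables(filter_text: str, tables: list[str]) -> set[str]:
    """Return the tables selected by a comma-separated filter; an empty filter selects all."""
    patterns = [p.strip() for p in filter_text.split(",") if p.strip()]
    if not patterns:
        return set(tables)
    compiled = [compile_table_pattern(p) for p in patterns]
    return {t for t in tables if any(c.fullmatch(t) for c in compiled)}


def final_scan_scope(
    scopes: list[tuple[str, str]],
    catalog: dict[str, list[str]],
) -> set[tuple[str, str]]:
    """Union of (project, table) pairs across all configured scopes."""
    selected: set[tuple[str, str]] = set()
    for project, filter_text in scopes:
        for table in match_tables(filter_text, catalog.get(project, [])):
            selected.add((project, table))
    return selected


if __name__ == "__main__":
    # Hypothetical project catalog used only for this illustration.
    catalog = {"proj_a": ["private_orders", "orders"], "proj_b": ["username"]}
    scopes = [("proj_a", "private.*"), ("proj_b", "")]
    print(sorted(final_scan_scope(scopes, catalog)))
```

Because the result is a set, a table that is selected by more than one scope is still counted only once.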
Configure a one-time task.
The following table describes the parameters.
Parameter
Description
Detection Task Scan and Update Policy
Two options are available:
Rescan and update results only for changed rules, data affected by the changed rules, and data with no results.
Rescan all data and overwrite all results.
You can select not to overwrite manually corrected results.
Detection Account
Configure data sampling and scanning using an Alibaba Cloud account or a RAM user. The selected account is used to sample and scan data. The scope of data that can be sampled and scanned varies based on the account's permissions.
Note: To use a RAM user for sampling and scanning, first grant the RAM user permissions on the MaxCompute project.
Content Detection
Configure whether the Content Detection and Metadata Detection rules are enabled. The corresponding rules take effect only after you select them.
Note: If you do not select Content Detection, Data Security Guard does not sample or scan data. Content detection rules do not take effect, but rules for field names and field comments still apply.
Sample Size
Set the sample size for content detection. A value greater than 100 is recommended.
This parameter is required when you select Content Detection.
Scan Scope
Configure the data scope for the sensitive data detection task.
All: Scans all data under the authorized account of the current tenant.
Partial Data: Scans table data in specified projects.
Note: The default project scope includes all projects of all DPI engines.
You can scan data in specified tables of ODPS, EMR, and HOLO projects.
The total length of a table name can be 0 to 100 characters. All character types are supported. If you leave this field empty, all tables are scanned.
The .* wildcard character is supported. For example, .*name matches table names that end with name, and private.* matches table names that start with private.
Use commas (,) to separate multiple table names or field names.
If you select Partial Data, you can add multiple project or database scan scopes. The final scan scope is the union of all specified scopes.
You must manually select a project on the left side of the page.
After you select a project, the data tables within that project or database are displayed on the right. You can manually select tables or select all tables at once. By default, all data tables in the database are selected.
Keyword search is supported for projects, databases, and data tables. To search for a data table by keyword, first select a project and then perform the search within that project.
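The two scan and update policies, together with the option to keep manually corrected results, can be pictured with the following Python sketch. It is a simplified model under stated assumptions, not the product's logic: "data affected by the changed rules" is approximated as fields whose current result was produced by a changed rule, and the option to keep manual corrections is treated as an independent flag. The FieldResult class, the policy names, and both functions are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldResult:
    field: str
    rule_id: Optional[str]            # rule behind the current result; None if no result yet
    manually_corrected: bool = False  # True if a user corrected this result by hand


def fields_to_rescan(results, changed_rule_ids, policy):
    """Pick the fields to rescan under the 'incremental' or 'full' policy (hypothetical names)."""
    if policy == "full":
        # Rescan all data.
        return [r.field for r in results]
    # Incremental: fields with no result, or whose result came from a changed rule.
    return [
        r.field
        for r in results
        if r.rule_id is None or r.rule_id in changed_rule_ids
    ]


def may_overwrite(result, keep_manual_corrections):
    """A manually corrected result is protected when the keep option is selected."""
    return not (keep_manual_corrections and result.manually_corrected)


if __name__ == "__main__":
    results = [
        FieldResult("phone", "r1"),
        FieldResult("email", "r2", manually_corrected=True),
        FieldResult("note", None),
    ]
    print(fields_to_rescan(results, {"r1"}, "incremental"))
    print(may_overwrite(results[1], keep_manual_corrections=True))
```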
Click Enable to start the scan task.
After the task starts, the Task Status changes as follows:
Real-time task: The status changes to Enabling.
Scheduled task: The status changes to Enabling. When the configured scan time is reached, the platform performs sensitive data detection based on your configuration.
One-time task: The status changes to a progress bar. The task is complete when the progress reaches 100%. The progress is calculated using the following formula: (Number of tables scanned in the current task / Total number of tables to be scanned in the current task) × 100%.
Note: After a detection rule is modified, the new rule takes effect in the next scheduled task. To apply the changes immediately, you can create a one-time detection task.
After the scan task is complete, the Task Status is updated to No Task.
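As a worked example of the one-time task progress formula, the following Python sketch computes the percentage from the two table counts. The function name and the input validation are illustrative assumptions and are not part of the product.

```python
def scan_progress(tables_scanned: int, tables_total: int) -> float:
    """Return the one-time task progress as a percentage of tables scanned."""
    if tables_total <= 0:
        raise ValueError("tables_total must be positive")
    if not 0 <= tables_scanned <= tables_total:
        raise ValueError("tables_scanned must be between 0 and tables_total")
    return tables_scanned / tables_total * 100


if __name__ == "__main__":
    # 25 of 200 tables scanned.
    print(f"{scan_progress(25, 200)}%")
```

With 25 of 200 tables scanned, the progress is 12.5%. The task is complete when all 200 tables have been scanned and the value reaches 100%.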
Manually correct detection results
Go to the sensitive data detection rule page. For more information, see Go to the sensitive data detection rule page.
Click the Detection Results tab to go to the detection results page.
Manually correct inaccurate detection results.

Operation
Description
Filter by DPI engine type
In area ① of the preceding figure, you can select a DPI engine from the drop-down list.
Note: You can correct the detection results for sensitive fields in ODPS, EMR, CDH_HIVE, and HOLO engines.
Filter
In area ② of the preceding figure, you can filter the detection results.
You can filter by conditions such as Project, Table Name, and Field Name. You can also click Expand to view more filter conditions and further filter by Classification, Categorization, and Sensitive Field Type.
Classification: The classification information in the default classification and categorization template for the current tenant. For more information, see Configure sensitive data classification and categorization.
Categorization: The categorization information in the default classification and categorization template for the current tenant.
Correct a single data entry
Area ③ of the preceding figure displays a list of detection results. You can click Displayed Fields Settings and select the fields you want to view to refresh the list details. By default, the list displays Project, Table Name, Field Name, Classification, Categorization, Sensitive Field Type, Manually Corrected, and Last Updated.
For fields with an incorrect Sensitive Field Type, click the drop-down arrow in the Sensitive Field Type column. The list displays the published sensitive field types from the default classification and categorization template of the current tenant. Check if the existing sensitive field types meet your needs:
If they meet your needs: Select another existing sensitive field type. Then, click the icon on the right to go to the Data Detection Rule page. Modify the detection rules for both the original and the new sensitive field types to ensure future detection accuracy.
If they do not meet your needs: Click the icon on the right to go to the Data Detection Rule page. Alternatively, scroll to the bottom of the drop-down list and click Manage Sensitive Field Types. You are redirected to the Data Detection Rule page, and the Create Sensitive Field Type dialog box appears. Add a new sensitive field type and configure its detection rules. For more information, see Configure data detection rules and execute detection tasks.
Batch correct data
Select the fields that you want to correct in a batch and click Batch Correct in area ④ of the preceding figure. The Batch Correct Recognition Results dialog box appears. The Sensitive Field Type drop-down list displays the published sensitive field types from the default classification and categorization template of the current tenant. Select the correct sensitive field type and click Save to complete the batch correction of the detection results.
Export detection results
For data that has been identified by the system, you can click Export Detection Results to export the results that match the filter criteria to your local computer.
Export Detection Results: Click the icon to automatically export the detection results that match the current filter criteria.
Note: You can export up to 100,000 data entries.