DataWorks: Manually correct sensitive data identification results

Last Updated: Nov 28, 2024

This topic describes how to manually correct sensitive data identification results on the Manual Check tab.

Note: If you manually correct sensitive data identification results on the current day, the corrected results are displayed on the next day.

Manually correct sensitive data identification results

  1. Go to the Data Recognition Rules tab. For more information, see Go to the Data Identification Rules tab.
  2. Click the Manual Check tab.
  3. Perform the following operations to manually correct inaccurate sensitive data identification results.
    Figure: Manual Check tab
    Operation: Filter
    Description: In the section marked with 1 in the preceding figure, specify filter conditions to find the sensitive data identification results that you want to correct.
    The filter conditions include the compute engine type, workspace name, table name, and field name. You can also click Advanced Filter to show more filter conditions, such as data categorization, data sensitivity level, and data sensitivity status.
    • Data categorization: the categorization information that is specified in the default categorization template of the current tenant. For more information, see Specify the category and sensitivity level of sensitive data.
    • Data sensitivity level: the sensitivity level information that is specified in the default categorization template of the current tenant.
    • Data sensitivity status: the sensitivity status of fields. Valid values: Sensitive field and Non-sensitive field. If you select Non-sensitive field, the fields that you manually specified as non-sensitive are displayed.
    Note: You can manually correct sensitive data identification results for fields in MaxCompute projects, E-MapReduce (EMR) clusters, Cloudera's Distribution Including Apache Hadoop (CDH) clusters, and Hologres instances.

    Operation: Correct a single sensitive data identification result
    Description: In the section marked with 2 in the preceding figure, you can view sensitive data identification results. To choose which columns are shown, click Display field settings, select the fields that you want to view, and click Confirm. The table of sensitive data identification results is then refreshed. By default, the following columns are displayed: Project, Table Name, Field, and Sensitive field type. You can click Lineage in the Actions column to go to the Data lineage page and view field-level data lineage.
    If the sensitive field type of a field is inaccurate, click the drop-down list in the Sensitive field type column for that field. The drop-down list displays all published sensitive field types that are provided by the default categorization template of the current tenant. Check whether the existing types in the drop-down list meet your business requirements.
    • If the existing types meet your business requirements, select a type from the drop-down list. Then, click the View icon next to the drop-down list to go to the Data Recognition Rules tab and modify the sensitive data identification rules for the original and new sensitive field types. This improves the accuracy of subsequent sensitive data identification.
    • If the existing types do not meet your business requirements, click the View icon next to the drop-down list to go to the Data Recognition Rules tab. Alternatively, select Manage sensitive field types from the drop-down list. In the Sensitive field type wizard, create a sensitive field type and configure a data identification rule for the sensitive field type based on your business requirements. For more information, see Configure a sensitive data identification rule and run a sensitive data identification task.

    Operation: Correct multiple sensitive data identification results at the same time
    Description: Select the results that you want to correct and click Batch correction in the section marked with 3 in the preceding figure. In the Batch correction of recognition results dialog box, select the sensitive field type that you want to use from the Sensitive field type drop-down list. The drop-down list displays all published sensitive field types that are provided by the default categorization template of the current tenant. Click Save. All selected results are corrected at the same time.
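
The filter and batch correction operations above can be pictured with a small in-memory model. The following Python sketch is illustrative only: the IdentificationResult class, the filter_results and batch_correct functions, and all sample values are hypothetical names introduced here, not part of any DataWorks API. It assumes that a filter keeps only results that match every specified condition, and that a batch correction accepts only a published sensitive field type.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class IdentificationResult:
    """Hypothetical record for one row of the Manual Check tab."""
    engine: str                    # compute engine type, for example "MaxCompute"
    workspace: str                 # workspace name
    table_name: str                # table name
    field_name: str                # field name
    sensitive_type: Optional[str]  # None models a non-sensitive field


def filter_results(results: Iterable[IdentificationResult], *,
                   engine: Optional[str] = None,
                   workspace: Optional[str] = None,
                   table_name: Optional[str] = None,
                   field_name: Optional[str] = None) -> List[IdentificationResult]:
    """Keep the results that match every condition that was specified."""
    conditions = {"engine": engine, "workspace": workspace,
                  "table_name": table_name, "field_name": field_name}
    active = {key: value for key, value in conditions.items() if value is not None}
    return [r for r in results
            if all(getattr(r, key) == value for key, value in active.items())]


def batch_correct(results: Iterable[IdentificationResult],
                  selected: Iterable[IdentificationResult],
                  new_type: str,
                  published_types: Set[str]) -> List[IdentificationResult]:
    """Return a new list in which every selected result has new_type."""
    if new_type not in published_types:
        raise ValueError(f"{new_type!r} is not a published sensitive field type")
    chosen = set(selected)
    return [replace(r, sensitive_type=new_type) if r in chosen else r
            for r in results]


# Sample data (made up for illustration).
results = [
    IdentificationResult("MaxCompute", "demo_ws", "orders", "phone", "Email"),
    IdentificationResult("MaxCompute", "demo_ws", "orders", "mobile", "Email"),
    IdentificationResult("Hologres", "demo_ws", "users", "mail", "Email"),
]
published = {"Email", "Phone number"}

selected = filter_results(results, engine="MaxCompute", table_name="orders")
corrected = batch_correct(results, selected, "Phone number", published)
```

In this sketch, the two MaxCompute fields in the orders table are reassigned to the new type, and the Hologres field is left unchanged. The original list is not modified, which loosely mirrors the fact that corrected results are not displayed until the next day.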

Manage sensitive data identification results

You can click Add recognition result in the upper-right corner of the Manual Check tab to add results that are not identified by the system. You can also click Export recognition results in the upper-right corner to export the sensitive data identification results that are displayed based on the specified filter conditions.
  • Add sensitive data identification results: In the Add recognition result dialog box, set the Data Engine parameter to the compute engine to which the field belongs, and enter the GUID of the field in the Field name field. The field GUID must be in the Project.Table.Column format. Select a sensitive field type from the Sensitive field type drop-down list, which displays all published sensitive field types that are provided by the default categorization template of the current tenant. Then, click OK. The sensitive data identification result is added.
  • Export sensitive data identification results: After you click Export recognition results, the system exports the sensitive data identification results that are displayed based on the specified filter conditions.
    Note: You can export up to 100,000 sensitive data identification records at a time.
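
Before you add results manually or export a large result set, it can help to check your inputs offline. The following Python sketch validates a field GUID against the Project.Table.Column format and checks a record count against the 100,000-record export limit. The names FieldGuid, parse_field_guid, MAX_EXPORT_RECORDS, and fits_export_limit are hypothetical helpers introduced for illustration. The sketch assumes that each of the three GUID parts is non-empty and contains no periods, which the source does not state explicitly.

```python
from typing import NamedTuple

# Export limit stated in the note above.
MAX_EXPORT_RECORDS = 100_000


class FieldGuid(NamedTuple):
    """Hypothetical container for the three parts of a field GUID."""
    project: str
    table: str
    column: str


def parse_field_guid(guid: str) -> FieldGuid:
    """Split a GUID in the Project.Table.Column format into its parts.

    Assumption: the parts themselves contain no periods and are non-empty.
    """
    parts = guid.strip().split(".")
    if len(parts) != 3 or any(not part for part in parts):
        raise ValueError(f"expected Project.Table.Column, got {guid!r}")
    return FieldGuid(*parts)


def fits_export_limit(record_count: int) -> bool:
    """Return True if the filtered result set can be exported in one pass."""
    return 0 <= record_count <= MAX_EXPORT_RECORDS


# Sample values (made up for illustration).
guid = parse_field_guid("demo_project.orders.phone")
can_export = fits_export_limit(250_000)
```

In this sketch, a GUID such as demo_project.orders (two parts only) raises a ValueError, and a result set of 250,000 records does not fit the limit, so you would narrow the filter conditions before exporting.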