Add and manage identification results

Limits

Data source tables do not support automatic scanning based on rules or lineage inheritance. You can manually add or batch import identification results for these tables.

Permissions description

Security administrators and custom global roles with Classification Result-Management permissions can add and manage all identification results.
Project administrators can manage identification results for tables within their projects, including creating and editing identification results, enabling or disabling effective status, and locking identification rules.
Data table owners can manage identification results for their respective tables, including editing identification results, enabling or disabling effective status, and locking identification rules.

Identification methods description

Automatic Scanning: Generates identification results based on the scheduled and real-time scan settings.
Manual Addition: Lets you add identification results manually or import them into Dataphin in batches.
Automatic Inheritance Based on Lineage: Descendant fields automatically inherit identification results from direct upstream fields based on various inheritance scenarios and rules.

Manually add identification results

On the Dataphin home page, navigate to Administration > Data Security via the top menu bar.
In the navigation pane on the left, select Data Identification > Classification Result. On the Classification Result page, click the Manual Add button.

On the Manual Add page, configure the parameters.

Parameter

Description

Add Policy

Remove Duplicates Policy

Determines how duplicates between this upload and existing online identification records are handled. Three policies are available: Overwrite Existing Identification Results, Only Overwrite Existing Automatic Identification Results, and Retain Existing Identification Results Without Updating.

Overwrite Existing Identification Results: When an uploaded field matches an online field, the uploaded result replaces the online result and is marked as manually specified.
Only Overwrite Existing Automatic Identification Results: When an uploaded field matches an online field and the online identification result is not locked, the uploaded result replaces the online result and is marked as manually specified.
Retain Existing Identification Results Without Updating: When an uploaded field matches an online field, the online result is retained and the uploaded result does not take effect.

Added Records

Add By Table: Click the Add By Table button. In the Add By Table dialog box, configure the following parameters and click OK.
- Data Table: Select up to 200 data tables. Project administrators can select all data tables in the projects they are responsible for. Section architects can select all data tables in the sections they are responsible for. Table owners can select the data tables they are responsible for.
  
  Only the Intelligent R&D Edition supports filtering. You can click the Filter icon to filter data tables based on section/project/data source and table type.
- Table Fields: Select fields from the selected data tables. Up to 200 fields are supported.
- Configure Unified Classification: Disabled by default. When enabled, you can assign one data classification to all selected fields. You can still modify it later in the added records list.
Search: Quickly search for added data tables based on table name and description (only supported for data source tables).
Added Records List: Displays information on data tables, table fields, data classification, data grading, and desensitization effective status. You can modify data tables, table fields, data classification, and effective status. You can also perform Continue Configuring Field Identification Rules Under This Table and Delete operations under the Actions column.
- Effective Status: Takes effect immediately after configuration. When enabled, identification results will enter subsequent display, statistics, desensitization, and other usage processes. When disabled, the identification results for the current field will not take effect.
- Continue Configuring Field Identification Rules Under This Table: Add new fields under the current table and configure data classification.
- Delete: Delete the currently added data table.
Batch Operations: Supports the Change Data Classification, Modify Effective Status, and Delete operations in batches for added data tables.

After ensuring the information is correct, click Upload to finish adding identification results manually.
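
The three Remove Duplicates Policy options described above can be read as a small decision rule. The following Python sketch illustrates that rule only; it is not Dataphin code. The names `DuplicatePolicy`, `OnlineResult`, and `resolve_duplicate`, and the method labels `"automatic"` and `"manual"`, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class DuplicatePolicy(Enum):
    OVERWRITE_EXISTING = "Overwrite Existing Identification Results"
    OVERWRITE_AUTOMATIC_ONLY = "Only Overwrite Existing Automatic Identification Results"
    RETAIN_EXISTING = "Retain Existing Identification Results Without Updating"


@dataclass(frozen=True)
class OnlineResult:
    """Hypothetical view of an identification result that already exists online."""
    classification: str
    method: str   # illustrative labels, for example "automatic" or "manual"
    locked: bool


def resolve_duplicate(
    online: Optional[OnlineResult],
    uploaded_classification: str,
    policy: DuplicatePolicy,
) -> Tuple[str, str]:
    """Return the (classification, identification method) that ends up in effect."""
    uploaded = (uploaded_classification, "manual")
    if online is None:
        # No matching online field, so there is nothing to deduplicate.
        return uploaded
    if policy is DuplicatePolicy.OVERWRITE_EXISTING:
        return uploaded
    if policy is DuplicatePolicy.OVERWRITE_AUTOMATIC_ONLY and not online.locked:
        return uploaded
    # RETAIN_EXISTING, or a locked online result under OVERWRITE_AUTOMATIC_ONLY.
    return (online.classification, online.method)
```

For example, under Only Overwrite Existing Automatic Identification Results, an unlocked online result is replaced by the uploaded one, while a locked online result is kept unchanged.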

Manage identification results list

The identification results list displays all added identification results, including table name, field, asset source, data classification, data grading, desensitization effective status, and identification method.

Asset Source: Dataphin tables display project and section information. Data source tables show Database/Schema and data source details.
You can search for asset objects using different criteria. You can also find all identification results associated with a classification by using data classification keywords.
- Dataphin Table: Quickly search by table, field, and project/section keywords. Precise filtering is available based on data classification (or unspecified classification), data grading, data section, project, desensitization effective status, lock status, and identification method.
- Data Source Table: Quickly search by table, Database/Schema, and table description keywords. Precise filtering is available based on data classification (or unspecified classification), data grading, data source, desensitization effective status, lock status, and identification method.

You can perform the following operations on identification results.

Operation	Description
Enable/Disable Desensitization Effective Status	The desensitization effective status controls whether the current identification result is covered by the desensitization policy. Click the switch under the desensitization effective status column or click More-Desensitization Effective/Desensitization Ineffective at the bottom to toggle the status. The configuration takes effect immediately. When enabled, the system desensitizes fields based on desensitization rules and default desensitization policy. When disabled, even if the current identification result matches a desensitization rule, it will not be desensitized. However, corresponding identification records are still generated, and the permission approval process is arbitrated and assigned based on the matching degree.
Identification Result Recommendation Prompt	If an identification record for the current field has a higher matching degree than the current effective result, a Recommended tag is displayed. Click Recommended after the data classification name, or click View Identification Details in the Actions column, to open the field identification details dialog box. Review the recommended identification results and decide whether to apply them based on your business needs.
View Identification Details	Displays basic information, effective results, and identification records for field identification details.
- Basic Information: Displays the table name, field name, and sample data. The data sampling switch must be enabled to view sample data.
- Effective Results: Displays the current field's effective data classification and corresponding data grading, identification method, priority, actual matching degree, classification modification time, and update time. You can perform Specify Data Classification (for unconfigured data classification) and Edit Identification Results (for configured data classification) operations.
  - Data Grading: Displays the latest grading configuration. You can view the grading results at the arbitration moment to determine whether modification is needed.
  - Priority: Displays the latest priority configuration. You can view the priority results at the arbitration moment to determine whether modification is needed. Priority 1 is the highest level. For rules at the same level, the one with the newer update time takes effect.
  - Specify Data Classification: If the current effective result is an automatically inherited result and the inheritance policy inherits only grading (not classification), the effective result may lack a data classification. In this case, specify a data classification to ensure desensitization rules can be matched. In the Specify Data Classification dialog box, select a data classification. You can also use the system-recommended data classification.
    Note: The data grading of the specified classification must be the same as the current effective data grading. Otherwise, it cannot be directly specified. You can modify the data classification by editing the identification results.
  - Edit Identification Results: Supports modifying effective identification results. For operation details, see Edit Identification Results.
- Classification Record: Displays data classification, data grading, identification method, priority, actual matching degree, classification modification time, and update time. If an identification record for the current field has a higher matching degree than the current effective result, a recommended mark appears at the top left corner of the data classification name. You can click One-click Modification at the top right corner to set it as the effective identification result.
  - Identification Result Effective Priority Description: For automatically identified results, the scanning rules follow descending priority: data classification priority > data grading > update time > matching degree > data classification modification time. A prompt is displayed when a more suitable data classification is detected. For automatically inherited identification results, the highest data grading level takes priority. If multiple data gradings share the same level but have different data classifications, the priority follows: data classification priority > update time of identification records > classification modification time. A prompt is displayed when a more suitable data classification is detected.
  - Data Grading: Displays the latest grading configuration. You can view grading results at the arbitration moment to determine whether modification is needed.
  - Priority: Displays the latest priority configuration. You can view priority results at the arbitration moment to determine whether modification is needed. Priority 1 is the highest level. For rules at the same level, the one with the newer update time takes effect.
  - Specify As Effective Result: Specifying the data classification in the current identification record as the effective result changes the identification method to manually specified, which is not affected by subsequent automatic identification results.
Edit Identification Results	Click Edit under the Actions column or click Edit at the bottom to modify identification results. Both Automatic Identification/inheritance and Manual modes are supported.
- Automatic Identification/inheritance: Selecting this option deletes any manually specified identification results for the current field and replaces them with the higher matching degree automatic identification or automatic inheritance results. If higher matching degree results appear later, the identification results for the current field will change accordingly.
  Note: When batch modifying to automatic identification, since data source tables do not support automatic identification, the system will automatically skip without modification.
- Manual: Selecting this option locks the currently selected data classification, preventing it from being overwritten by automatic identification or inheritance results. You can also use the system-recommended data classification.
- Sync Modify To Desensitization Effective: When selected, the current identification result is specified as the effective result and the desensitization effective status is turned on.
Lock Current Identification Result	Click Lock under the Actions column or click Lock at the bottom to lock the identification result. Only automatic identification or automatic inheritance results with a specified classification can be locked. After locking, a manually specified identification record consistent with the current result is generated as the effective result, which is not affected by subsequent automatic identification or inheritance results.
Delete Identification Result	Click Delete under the Actions column or click More-Delete at the bottom to delete the identification result. All corresponding identification records are deleted synchronously. You can modify incorrect identification results or adjust identification rules to rescan and regenerate results.
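
The arbitration order for automatically identified results (data classification priority > data grading > update time > matching degree > data classification modification time) can be expressed as a tuple sort key. The following Python sketch is an illustration only, not Dataphin's implementation. The `IdentificationRecord` type and its field names are hypothetical. The source states that priority 1 is the highest and that a newer update time wins. Treating a larger grading number, a larger matching degree, and a newer classification modification time as "better" are assumptions made here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Tuple


@dataclass(frozen=True)
class IdentificationRecord:
    """Hypothetical identification record for one field."""
    classification: str
    priority: int                      # 1 is the highest priority
    grading: int                       # assumption: larger number = higher grading level
    update_time: datetime
    matching_degree: float             # assumption: larger = better match
    classification_modified: datetime  # assumption: newer wins


def arbitration_key(record: IdentificationRecord) -> Tuple:
    # Tuples compare element by element, so earlier criteria dominate later ones.
    # Priority is negated because a smaller number means a higher priority.
    return (
        -record.priority,
        record.grading,
        record.update_time,
        record.matching_degree,
        record.classification_modified,
    )


def pick_effective(records: Iterable[IdentificationRecord]) -> IdentificationRecord:
    """Return the record that wins arbitration; raises ValueError if there are none."""
    return max(records, key=arbitration_key)
```

With this key, a record with priority 1 always beats a record with priority 2, regardless of grading or matching degree. Matching degree only breaks ties among records that share the same priority, grading, and update time.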

Batch import identification results

On the Classification Result page, click the Batch Import button to open the Batch Import Identification Results dialog box.

In the Batch Import Identification Results dialog box, configure the parameters.

Parameter	Description
Asset Type	Select the asset type for importing identification results. Dataphin Table and Datasource Table are supported.
Template Download	If you do not have a template, click the file name to download the .xlsx template. The system downloads different templates based on the asset type. If you already have a template, you can upload the file and start validation directly.
Configuration File	Upload the template that corresponds to the selected asset type. For data source tables, the related assets must be collected first.
- Only one file can be uploaded at a time.
- Only .xlsx files are supported.
- A single Excel file cannot exceed 1000 rows, and the file size cannot exceed 10 MB.
- When filling out the template, refer to the template instructions.
  - Full Name of Dataphin Table: For physical tables, fill in project name.table name. For logical tables, fill in section name.table name.
  - Full Name of Data Source Table: Fill in db/schema.table name under the specified data source.
- You can add or modify only the identification results that you have permission to manage. Security administrators and custom global roles with Classification Result-Management permissions can upload all tables. Project administrators can only upload tables under their responsible projects. Table owners can only upload tables they are responsible for.
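
Before uploading, it can help to check a file against the stated limits locally. The following Python sketch is a hypothetical helper, not part of Dataphin. It assumes that the caller already knows the file size and row count (for example, from a spreadsheet reader), that 10 MB means 10 × 1024 × 1024 bytes, that the extension check is case-insensitive, and that a full table name is split at its first dot.

```python
from typing import List, Tuple

MAX_ROWS = 1000
MAX_BYTES = 10 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def precheck_upload(filename: str, size_bytes: int, row_count: int) -> List[str]:
    """Return a list of problems; an empty list means the stated limits are met."""
    problems = []
    if not filename.lower().endswith(".xlsx"):
        problems.append("Only .xlsx files are supported.")
    if size_bytes > MAX_BYTES:
        problems.append("File size exceeds 10 MB.")
    if row_count > MAX_ROWS:
        problems.append("File exceeds 1000 rows.")
    return problems


def split_full_name(full_name: str) -> Tuple[str, str]:
    """Split 'project name.table name' (or section / db / schema) at the first dot."""
    owner, separator, table = full_name.partition(".")
    if not separator or not owner or not table:
        raise ValueError(f"Expected '<owner>.<table name>', got: {full_name!r}")
    return owner, table
```

For example, `precheck_upload("results.xlsx", 1024, 200)` returns an empty list, and `split_full_name("sales_project.orders")` returns `("sales_project", "orders")`. The project and table names here are made up.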

After the file is uploaded, the system checks it against the file specifications. After this check passes, click Start Validation to run the following checks, which depend on the asset type.
- Dataphin Table: Checks that the full table name, field names, and classification directories/data classifications are not empty, and that the column order is correct.
- Data Source Table: Checks that the data source name, environment, full table name, field names, and classification directory/data classification are not empty, and that the column order is correct.
- Both asset types: Checks that the current operator has management permissions for the identification results.
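
The non-empty and column-order checks listed above can be approximated locally before uploading. The following Python sketch illustrates them under stated assumptions. The column headers in `DATAPHIN_COLUMNS` and `DATASOURCE_COLUMNS` are hypothetical placeholders; the real headers come from the downloaded template. The permission check is not modeled, because it depends on the operator's roles in Dataphin.

```python
from typing import List, Optional, Sequence

# Hypothetical headers; replace them with the headers from the downloaded template.
DATAPHIN_COLUMNS = ["Full Table Name", "Field Name", "Data Classification"]
DATASOURCE_COLUMNS = [
    "Data Source Name", "Environment", "Full Table Name",
    "Field Name", "Data Classification",
]


def validate_sheet(
    header: Sequence[str],
    rows: Sequence[Sequence[Optional[str]]],
    expected: Sequence[str],
) -> List[str]:
    """Check the column order, then report every empty required cell."""
    if list(header) != list(expected):
        return ["Column order does not match the template."]
    issues = []
    for number, row in enumerate(rows, start=2):  # row 1 is the header
        # Pad short rows so that missing trailing cells count as empty.
        cells = list(row) + [""] * (len(expected) - len(row))
        for column, value in zip(expected, cells):
            if value is None or not str(value).strip():
                issues.append(f"Row {number}: {column} is empty.")
    return issues
```

An empty return value means the sheet passes these two checks. Each message includes the row number so that the row can be corrected and the file imported again.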

After successful validation, the import results page will automatically open.

Parameter

Description

Compatible Policy

Policy for handling conflicts between this upload and existing online records. You can configure Duplicate Record Handling and Desensitization Effective Status as compatible policies.

Duplicate Record Handling: Determines how duplicates between uploaded field identification results and existing online fields are handled. You can overwrite all online identification results, overwrite all unlocked online identification results, or retain existing online identification results without updating.
- Overwrite All Online Identification Results: When an uploaded field matches an online field, the uploaded result overwrites the existing identification results (both automatically identified and manually specified) and is marked as manually specified.
- Overwrite All Unlocked Online Identification Results: When an uploaded field matches an online field and the online identification method is automatic identification, the uploaded result overwrites the existing identification results and is marked as manually specified. Identification results whose effective method is manually specified are not overwritten.
- Retain Existing Online Identification Results, Skip Without Updating: When an uploaded field matches an online field, the online result is retained and the uploaded result does not take effect.
Desensitization Effective Status: When the desensitization effective status of an uploaded result or an existing online result is ineffective, you can choose to retain the existing configuration for updated online results, set all new results to effective, or set all new and updated results to effective.

Note
Fields whose desensitization effective status is ineffective cannot be desensitized based on classification and grading. Corresponding identification records are still generated, and the permission approval process is arbitrated and assigned based on the matching degree.

Validation Results

Displays the records that passed validation, the records with validation exceptions, and the duplicate records found in the uploaded file.

Validation Passed: Displays the records that passed validation, including the row number in the original file, table, field, data classification, and data grading.
Validation Exception: Displays the records that failed validation and adds an Exception Prompt column. You can correct the records based on the exception prompt and import them again.
Duplicate Records: Displays the records in this upload that duplicate identification results already online, including the row number in the original file, table, field, data classification (import), data classification (online), and duplicate prompt.
You can click the Download Validation Records button to download the corresponding validation records as an Excel file.

Click Start Import to complete the import process.

After you click Start Import, you can close the file upload configuration dialog box and check the results in the list. Closing the dialog box does not stop the import task from running. To view import history, click Batch Operation Records in the identification results list.

View import history

On the Classification Result page, click the drop-down arrow next to Batch Import and select Batch Operation Records to open the Import History panel.
In the Import History panel, you can view the historical import records of identification results.