The system automatically generates identification results based on configured rules and lineage inheritance settings. You can also manually specify identification results or batch upload them via Excel. This topic explains how to add and manage identification results.
Limits
Data source tables do not support automatic scanning for generating identification results based on rules or lineage inheritance. You can manually add or batch import identification results for these tables.
Permissions description
Security administrators and custom global roles with Classification Result-Management permissions can add and manage all identification results.
Project administrators can manage identification results for tables within their projects, including creating and editing identification results, enabling or disabling effective status, and locking identification rules.
Data table owners can manage identification results for their respective tables, including editing identification results, enabling or disabling effective status, and locking identification rules.
Identification methods description
Automatic Scanning: Generates identification results based on the scheduled scan time and real-time scan settings in the configuration.
Manual Addition: Lets you manually add identification results or batch import them into Dataphin.
Automatic Inheritance Based on Lineage: Descendant fields automatically inherit identification results from direct upstream fields according to various inheritance scenarios and rules.
Manually add identification results
On the Dataphin home page, navigate to Administration > Data Security via the top menu bar.
In the left-side navigation pane, select Data Identification > Classification Result. On the Classification Result page, click the Manual Add button.
On the Manual Add page, configure the parameters.
Parameter
Description
Add Policy
Remove Duplicates Policy
Policy for handling duplicates between this upload and existing online identification records. Supports three policies: Overwrite Existing Identification Results, Only Overwrite Existing Automatic Identification Results, and Retain Existing Identification Results Without Updating.
Overwrite Existing Identification Results: When an uploaded field matches a field that already has an online identification result, the uploaded result replaces it and is marked as manually specified.
Only Overwrite Existing Automatic Identification Results: When an uploaded field matches an online field and the online identification result is not locked, the uploaded result replaces it and is marked as manually specified.
Retain Existing Identification Results Without Updating: When an uploaded field matches an online field, the online result is kept and the uploaded result does not take effect.
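The three policies above can be read as a small decision function. The following is only an illustrative sketch, not Dataphin's implementation: the names `DedupPolicy`, `OnlineResult`, and `resolve` are hypothetical, and the `locked` flag is an assumption about how an online record's lock status might be modeled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class DedupPolicy(Enum):
    # Values mirror the policy names shown on the Manual Add page.
    OVERWRITE_ALL = "Overwrite Existing Identification Results"
    OVERWRITE_AUTO_ONLY = "Only Overwrite Existing Automatic Identification Results"
    RETAIN_EXISTING = "Retain Existing Identification Results Without Updating"


@dataclass
class OnlineResult:
    classification: str
    locked: bool  # assumption: True when the online result is locked


def resolve(policy: DedupPolicy,
            online: Optional[OnlineResult],
            uploaded_classification: str) -> Tuple[str, str]:
    """Return (effective classification, outcome) for one uploaded field."""
    if online is None:
        # No duplicate online: the uploaded result is simply added.
        return uploaded_classification, "manually specified"
    if policy is DedupPolicy.OVERWRITE_ALL:
        return uploaded_classification, "manually specified"
    if policy is DedupPolicy.OVERWRITE_AUTO_ONLY and not online.locked:
        return uploaded_classification, "manually specified"
    # RETAIN_EXISTING, or a locked result under OVERWRITE_AUTO_ONLY.
    return online.classification, "online result kept"
```

For example, under `OVERWRITE_AUTO_ONLY`, a locked online result is left untouched while an unlocked one is replaced by the uploaded classification.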
Added Records
Add By Table: Click the Add By Table button. In the Add By Table dialog box, configure parameters and click OK to complete the addition.
Data Table: Supports selecting up to 200 data tables. Project administrators can select all data tables under their responsible projects. Section architects can select all data tables under their responsible sections. Table owners can select data tables they are responsible for.
Only the Intelligent R&D Edition supports filtering. You can click the Filter icon to filter data tables based on section/project/data source and table type.
Table Fields: Select fields based on data tables, supporting up to 200 fields.
Configure Unified Classification: Disabled by default. When enabled, you can add a unified data classification for the selected fields, which you can modify in the added records list.
Search: Quickly search for added data tables based on table name and description (only supported for data source tables).
Added Records List: Displays information on data tables, table fields, data classification, data grading, and desensitization effective status. You can modify data tables, table fields, data classification, and effective status. You can also perform Continue Configuring Field Identification Rules Under This Table and Delete operations under the Actions column.
Effective Status: Takes effect immediately after configuration. When enabled, identification results will enter subsequent display, statistics, desensitization, and other usage processes. When disabled, the identification results for the current field will not take effect.
Continue Configuring Field Identification Rules Under This Table: Add new fields under the current table and configure data classification.
Delete: Delete the currently added data table.
Batch Operations: Supports the batch operations Change Data Classification, Modify Effective Status, and Delete for added data tables.
After ensuring the information is correct, click Upload to finish adding identification results manually.
Manage identification results list
The identification results list shows added identification results, including table name, field, asset source, data classification, data grading, desensitization effective status, and identification method.
Asset Source: Dataphin tables display project and section information. Data source tables show Database/Schema and data source details.
You can search for various asset objects using different criteria. Additionally, you can find all identification results associated with a classification by using data classification keywords.
Dataphin Table: Quickly search by table, field, and project/section keywords. Precise filtering is available based on data classification (or unspecified classification), data grading, data section, project, desensitization effective status, lock status, and identification method.
Data Source Table: Quickly search by table, Database/Schema, and table description keywords. Precise filtering is available based on data classification (or unspecified classification), data grading, data source, desensitization effective status, lock status, and identification method.
You can perform various operations on the target identification results.
Operation
Description
Enable/Disable Desensitization Effective Status
The desensitization effective status controls whether the current identification result is covered by the desensitization policy. To enable or disable it, click the switch in the desensitization effective status column, or click More > Desensitization Effective or Desensitization Ineffective at the bottom. The change takes effect immediately. When the status is enabled, the system desensitizes the field based on the desensitization rules and the default desensitization policy. When it is disabled, the field is not desensitized even if the identification result matches a desensitization rule. However, the corresponding identification records are still generated, and the corresponding permission approval process is still arbitrated and assigned based on the matching degree.
Identification Result Recommendation Prompt
If the identification records of the current field contain a result with a higher matching degree than the currently effective result, a Recommended tag is displayed. Click the Recommended tag after the data classification name, or click View Identification Details in the Actions column, to open the field identification details dialog box. There you can review the identification results that the system recommends and decide, based on business needs, whether to use them.
View Identification Details
Displays the basic information, effective results, and identification records of field identification details.
Basic Information: Displays table name and field name information.
Effective Results: Displays the current field's effective data classification and corresponding data grading, identification method, priority, actual matching degree, classification modification time, and update time information. You can perform Specify Data Classification (supports unconfigured data classification) and Edit Identification Results (supports configured data classification) operations.
Data Grading: Displays the latest grading configuration. You can view the grading results at the arbitration moment to determine whether modification is needed.
Priority: Displays the latest priority configuration. You can view the priority results at the arbitration moment to determine whether modification is needed. Priority 1 is the highest level. For rules at the same level, the one with the newer update time takes effect.
Specify Data Classification: If the current effective result is an automatically inherited result and the inheritance policy is to inherit only grading and not classification, there may be cases where the effective result does not specify data classification. In this case, it is recommended to specify data classification. Otherwise, desensitization rules may not be hit. In the Specify Data Classification dialog box, select data classification. You can also directly use the system-recommended data classification.
Note: The data grading of the specified classification must be the same as the current effective data grading. Otherwise, the classification cannot be specified directly. You can modify the data classification by editing the identification results.
Edit Identification Results: Supports modifying effective identification results. For operation details, see Edit Identification Results.
Classification Record: Displays data classification, data grading, identification method, priority, actual matching degree, classification modification time, and update time information.
If there is an identification result with a higher matching degree than the current effective identification result in the identification records of the current field, a recommended mark will appear at the top left corner of the data classification name. You can click One-click Modification at the top right corner to specify it as the effective identification result.
Identification Result Effective Priority Description:
For automatically identified results, the effective result is chosen by the following priority, from high to low: data classification priority > data grading > update time > matching degree > data classification modification time. When a more suitable data classification is detected, a prompt is displayed.
For automatically inherited identification results, the result with the highest data grading level takes priority. If several results have the same data grading level but different data classifications, the priority is: data classification priority > update time of identification records > classification modification time. When a more suitable data classification is detected, a prompt is displayed.
Data Grading: Displays the latest grading configuration. You can view the grading results at the arbitration moment to determine whether modification is needed.
Priority: Displays the latest priority configuration. You can view the priority results at the arbitration moment to determine whether modification is needed. Priority 1 is the highest level. For rules at the same level, the one with the newer update time takes effect.
Specify As Effective Result: If the data classification in the current identification record is specified as the effective result, the identification method will be changed to manually specified and will not be affected by subsequent automatic identification results.
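The priority order described above for automatically identified results (data classification priority > data grading > update time > matching degree > data classification modification time) can be sketched as a sort key. This is a hedged illustration only: the `Record` type and its field names are hypothetical, and the direction of each comparison beyond "priority 1 is the highest" and "newer update time wins" is an assumption (a higher grading, a higher matching degree, and a newer classification modification time are assumed to win).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Record:
    classification: str
    priority: int                      # 1 is the highest priority
    grading: int                       # assumption: larger number = higher level
    updated: datetime                  # update time of the identification record
    matching_degree: float
    classification_modified: datetime


def auto_sort_key(r: Record):
    # min() picks the smallest tuple, so fields where "larger wins" are negated.
    return (
        r.priority,
        -r.grading,
        -r.updated.timestamp(),
        -r.matching_degree,
        -r.classification_modified.timestamp(),
    )


def pick_effective(records: List[Record]) -> Record:
    """Return the record that would take effect under the assumed ordering."""
    return min(records, key=auto_sort_key)
```

Because the key is a tuple, later fields are consulted only when all earlier fields tie, which matches the "high to low" reading of the priority chain.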
Edit Identification Results
Click Edit under the Actions column or click Edit at the bottom to modify identification results. Supports Automatic Identification/inheritance and Manual forms.
Automatic Identification/inheritance: When you select automatic identification/inheritance, any manually specified identification results on the current field are deleted, and the tagging result is changed to the automatic identification or automatic inheritance result with the highest matching degree. If a result with a higher matching degree appears later, the identification result of the current field changes accordingly.
Note: When you batch modify results to automatic identification, data source tables are skipped without modification because they do not support automatic identification.
Manual: When selecting manually specified, the currently selected data classification will be locked, and the list will automatically lock, preventing it from being overwritten by other automatic identification or automatic inheritance results. You can also directly use the system-recommended data classification.
Sync Modify To Desensitization Effective: When selected, the current identification result will be specified as the effective result, and the desensitization effective status will be turned on.
Lock Current Identification Result
Click Lock under the Actions column or click Lock at the bottom to lock the identification result. Locking is supported only when the currently effective result was produced by automatic identification or automatic inheritance and has a data classification specified. After locking, a manually specified identification record consistent with the current result is generated as the effective result and is not affected by subsequent automatic identification or automatic inheritance results.
Delete Identification Result
Click Delete under the Actions column or click More-Delete at the bottom to delete the identification result. After deletion, all identification records corresponding to the identification result will be deleted synchronously. You can modify incorrect identification results or modify identification rules to rescan and generate identification results.
Batch import identification results
On the Classification Result page, click the Batch Import button to open the Batch Import Identification Results dialog box.
In the Batch Import Identification Results dialog box, configure the parameters.
Parameter
Description
Asset Type
Select the asset type for which identification results need to be imported. Supports Dataphin Table and Datasource Table.
Template Download
If you do not have a template, click the file name to download the .xlsx file. The system provides a different template for each asset type. If you already have a template, you can upload the file directly and start validation.
Configuration File
Upload the template that corresponds to the selected asset type. For data source tables, the related assets must be collected first. Only one file can be uploaded at a time.
Only .xlsx files are supported. A single Excel upload should not exceed 1000 rows.
File size should not exceed 10 MB.
When filling out the template, please refer to the template instructions.
Full Name of Dataphin Table: For physical tables, fill in project name.table name. For logical tables, fill in section name.table name.
Full Name of Data Source Table: Fill in db/schema.table name under the specified data source.
You can add or modify only the identification results for which you have management permissions.
Security administrators and custom global roles with Classification Result-Management permissions can upload all tables. Project administrators can only upload tables under their responsible projects. Table owners can only upload tables they are responsible for.
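Before uploading, the file limits listed above (.xlsx only, at most 1000 rows, at most 10 MB) can be checked locally. The helper below is a minimal sketch, not part of Dataphin: `precheck_upload` is a hypothetical name, the caller is assumed to supply the row count, and 10 MB is assumed to mean 10 × 1024 × 1024 bytes. Whether the header row counts toward the 1000-row limit is not stated in this topic.

```python
from typing import List

MAX_ROWS = 1000                    # documented row limit per Excel upload
MAX_BYTES = 10 * 1024 * 1024       # assumption: 10 MB interpreted as 10 MiB


def precheck_upload(filename: str, size_bytes: int, row_count: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    if not filename.lower().endswith(".xlsx"):
        problems.append("only .xlsx files are supported")
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds 10 MB")
    if row_count > MAX_ROWS:
        problems.append("more than 1000 rows")
    return problems
```

A file that passes this local check can still fail the system's own validation, for example because of missing fields or insufficient permissions.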
After the file is uploaded, the system first checks it against the file specifications. If that check passes, click Start Validation to run the following checks on the imported file, which depend on the asset type.
Dataphin Table: Ensure that the full name of the imported table, field names, classification directories/data classifications are not empty, and verify the correctness of the column order.
Data Source Table: Verify the data source name, environment, complete table name, field names, classification directory/data classification for any missing entries, and ensure the column order is correct.
Verify that the current operator has the necessary management permissions for identification results.
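The non-empty checks described above can be approximated on your own data before uploading. The sketch below is illustrative only: the column keys in `REQUIRED` are hypothetical stand-ins for the template's actual column headers, and row numbering assumes that row 1 of the sheet is the header.

```python
from typing import Dict, List

# Hypothetical column keys; the real template headers may differ.
REQUIRED = {
    "dataphin_table": ["full_table_name", "field_name", "classification"],
    "datasource_table": [
        "datasource_name", "environment",
        "full_table_name", "field_name", "classification",
    ],
}


def find_missing(asset_type: str, rows: List[dict]) -> Dict[int, List[str]]:
    """Map sheet row number -> required keys that are empty in that row."""
    issues = {}
    for number, row in enumerate(rows, start=2):  # assumption: row 1 is the header
        missing = [
            key for key in REQUIRED[asset_type]
            if not str(row.get(key) or "").strip()
        ]
        if missing:
            issues[number] = missing
    return issues
```

Rows that come back in the result correspond to the kind of entries the system would be expected to report as validation exceptions for missing values.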
After successful validation, the import results page will automatically open.
Parameter
Description
Compatible Policy
Policies for handling conflicts between this upload and existing online records. Two compatible policies can be configured: Duplicate Record Handling and Desensitization Effective Status.
Duplicate Record Handling: Policy for handling duplicates between the identification results of the uploaded fields and existing online fields. Supports overwriting all online identification results, overwriting all unlocked online identification results, and retaining existing online identification results, skipping without updating.
Overwrite All Online Identification Results: When an uploaded field matches an online field, the uploaded result overwrites the existing identification results (both automatically identified and manually specified) and is marked as manually specified.
Overwrite All Unlocked Online Identification Results: When an uploaded field matches an online field and the online identification method is automatic identification, the uploaded result overwrites the existing identification results and is marked as manually specified. Results whose effective method is manually specified are not overwritten.
Retain Existing Online Identification Results, Skip Without Updating: When an uploaded field matches an online field, the online result is kept and the uploaded result does not take effect.
Desensitization Effective Status: When there is an invalid desensitization effective status for the uploaded identification results or existing online identification results, supports choosing to retain the existing configuration for online updated results, unify new results to be effective, or unify new and updated results to be effective.
Note: The system cannot desensitize fields with invalid status based on classification and grading. Corresponding identification records will still be generated, and the corresponding permission approval process will be arbitrated and assigned based on the matching degree.
Validation Results
Supports viewing information on validation passed, validation exceptions, and duplicate records during file upload.
Validation Passed: Displays records that passed file upload, including the corresponding row number, table, field, data classification, and data grading information in the original file.
Validation Exception: Displays records that failed file validation and automatically adds an Exception Prompt column. You can correct the records based on the exception prompt and import them again.
Duplicate Records: Displays records of duplicate content in this upload and successful identification results already online, including the corresponding row number, table, field, data classification (import), data classification (online), and duplicate prompt information in the original file.
You can click the Download Validation Records button to download the corresponding validation records as an Excel file.
Click Start Import to complete the import process.
After you start the import, you can close the file upload configuration dialog box. Closing it does not stop the upload task. To view the import history, click Batch Operation Records in the identification results list.
View import history
On the Classification Result page, click the drop-down arrow next to Batch Import and select Batch Operation Records to open the Import History panel.
In the Import History panel, you can view the historical import records of identification results.