All Products
Search
Document Center

Dataphin:Create and manage detection features

Last Updated:Jan 21, 2025

Detection features identify data characteristics within fields and metadata properties through operations such as belonging, regular expressions, inclusion, and exclusion. These features facilitate the intelligent recommendation of data classifications and standards. Dataphin includes predefined detection feature expressions for entities like phone and ID numbers, and also allows for custom feature creation. This topic describes the process for creating and managing detection features.

Prerequisites

  • Detection features are instrumental in the intelligent recommendation of data standard mappings and field classification tagging. The configuration of feature scans impacts both the mapping rules in the standard module and the detection rules in the security module. Configure these settings carefully to avoid semantic conflicts and resource wastage, considering the usage scenarios of both modules.

  • Intelligent generation is currently in public preview. Please contact Dataphin technical support if you need assistance.

Permission description

Super administrators, data standard administrators, security administrators, and custom global roles with Feature-Management permissions can create and manage detection features.

Create detection features

  1. Navigate to Administration > Data Security from the top menu bar on the Dataphin home page.

  2. In the left-side navigation pane, go to General Configuration > Feature. On the Feature page, click the Create Feature button.

  3. In the Add Feature dialog box, set the necessary parameters.

    Parameter

    Description

    Feature Name

    Please enter the name of the detection feature. The name must be unique and can contain up to 128 characters.

    Feature Condition

    Support selecting Scan By Content, Scan By Field Name, Scan By Field Description, Scan By Data Type.

    • Scan By Content: Identify and judge based on sampling and reading the data content of the target field.

      • Regular (case-insensitive): Enter a regular expression in the input box. For example, if you need to match all names containing test, the regular expression is defined as .*test.*, and the scan results are processed for case compatibility.

      • Regular Expression: Enter a regular expression in the input box. For example, if you need to match all names containing test, the regular expression is defined as .*test.*.

      • Detection Threshold: Only when the content match rate exceeds the detection threshold, the rule is considered a valid detection and enters the detection results of the field for comparison.

    • Scan By Field Name: Scan and judge based on the field name in the metadata.

      • Regular (case-insensitive): Enter a regular expression in the input box. For example, if you need to match all names containing test, the regular expression is defined as .*test.*, and the scan results are processed for case compatibility.

      • Regular Expression: Enter a regular expression in the input box. For example, if you need to match all names containing test, the regular expression is defined as .*test.*.

      • Include/exclude: Keyword matching. For example, to match a user information table, enter user_info.

    • Scan By Field Description: Scan and judge based on the field description in the metadata.

      • Regular (case-insensitive): Enter a regular expression in the input box. For example, if you need to match all names containing test, the regular expression is defined as .*test.*, and the scan results are processed for case compatibility.

      • Regular Expression: Enter a regular expression in the input box. For example, if you need to match all names containing test, the regular expression is defined as .*test.*.

      • Include/exclude: Keyword matching. For example, to match a user information table, enter user_info.

    • Scan By Data Type: Scan and judge based on the data type of the field in the metadata.

      • Belong To: Supported data types include tinyint, smallint, mediumint, int, bigint, decimal, bit, date, datetime, timestamp, varchar, text, json, string. If the required data type is not available, you can customize the input data type.

      • Regular (case-insensitive): Enter a regular expression in the input box. For example, if you need to match data types containing int, the regular expression is defined as .*int*, and the scan results are processed for case compatibility.

      • Regular Expression: Enter a regular expression in the input box. For example, if you need to match data types containing int, the regular expression is defined as .*int.*.

      • Include/exclude: Keyword matching. For example, to match numeric data types, enter int.

    Note
    • At least one rule must be configured. To add a rule, click the +add Rule button.

    • A maximum of 10 rules can be configured, with up to 2 levels of relationships.

    • The relationship between filter conditions can be configured as and or or.

    Description

    Please fill in the description of the usage scenarios related to the detection feature. No more than 1000 characters.

  4. Click Confirm to finalize the creation of the detection feature.

Manage detection features

  1. The Feature page displays details such as the name, description, type, last updated by, and last update time of each detection feature.

  2. (Optional) Search for a specific detection feature by name or filter by type as needed.

  3. The following actions are available for the selected detection feature:

    Operation

    Description

    View

    Support viewing the configuration information of the detection feature.

    Edit

    Support modifying the content of custom detection features. After modification, related detection tasks that reference the current detection feature will be updated synchronously. Please synchronize with relevant business personnel in a timely manner.

    Clone

    Support quickly copying the configuration information of an existing detection feature for creating a new detection feature.

    Delete

    Support deleting custom detection features. After deletion, the current detection feature will be automatically removed from the related detection tasks that have referenced it. Please proceed with caution.

What to do next

  • Once you have configured the detection feature, you can associate it with data classifications to facilitate intelligent recommendations for field classification and tagging. For more information, see Create and manage data classification.

  • The system automatically suggests classifications and grades during the detection rule scan and lineage automatic inheritance scan, based on predefined features. For more information, see Create and manage detection rules.

Intelligent generation

Dataphin supports intelligent generation using large model platforms like Bailian and Tongyi Qianwen. It automatically discerns meanings from the input detection feature name and generates regular expressions that accurately represent feature data content and potential field names. This streamlines the recommendation of detection feature definitions, reducing configuration efforts and enhancing precision. This feature is currently in invitational preview. If interested, please reach out to the product team.