This topic describes how to customize de-identification rules in Data Security Guard so that DataWorks can dynamically de-identify the results of ad hoc queries.

Prerequisites

DataWorks Professional Edition or a more advanced edition is activated.

Go to the Data Masking page

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region where your workspace resides, find the workspace, and then click Data Analytics in the Actions column.
  2. Click the icon in the upper-left corner and choose All Products > Data governance > Data Security Guard.
  3. Click Try now to go to the Data Security Guard homepage.
  4. In the left-side navigation pane, choose Rule Change > Data Masking.
    The Data Masking page has two tabs: Data Masking and Whitelist.

Customize de-identification rules in Data Security Guard

  1. On the Data Masking page, set the Masking Scene parameter to Global Config(_default_scene_code).
  2. Create a de-identification rule.
    1. On the Data Masking tab, click Create Rule in the upper-right corner.
    2. In the Create Rule dialog box, set the Masking Rule and Method parameters.

      You can select an existing data identification rule from the Masking Rule drop-down list. For more information about data identification rules, see Set data identification rules.

      You can set the Method parameter to Pseudonymisation, HASH, or Masking Out. The valid values that are displayed for the Method parameter vary based on the data identification rule that you select from the Masking Rule drop-down list.
      • Pseudonymisation
        This method replaces the text of a data record with an artificial pseudonym of the same data type. If you select this method, you must specify whether to enable Data watermark and select a security domain from the Domain drop-down list.
        • Data watermark: Watermarks allow you to track the source of the data. If your data leaks, you can use the watermark to trace the leak to its likely source.
        • Domain: De-identification policies vary with security domains. The same de-identification rule generates different results for the same data record in different security domains. If you do not have a security domain plan for the de-identification rule, select any security domain from the drop-down list.
      • HASH
        If you select HASH, you must specify whether to enable Data watermark and select a security domain from the Domain drop-down list.
        • Data watermark: Watermarks allow you to track the source of the data. If your data leaks, you can use the watermark to trace the leak to its likely source.
        • Domain: De-identification policies vary with security domains. The same de-identification rule generates different results for the same data record in different security domains. If you do not have a security domain plan for the de-identification rule, select any security domain from the drop-down list.
      • Masking Out
        This method uses asterisks (*) to mask specified parts of a data record. This is a commonly used method.
        • Recommended: You can select recommended policies to mask data of common types, such as ID card numbers and bank card numbers.
        • Custom: You can specify the number of characters to mask at the beginning, in the middle, or at the end of a data record.
    3. Click Save.
    4. On the Data Masking tab, set the status of the created de-identification rule to Active or Inactive as needed.
      You can click the Setup icon in the Actions column of the de-identification rule to test whether it works.
  3. Configure a whitelist.
    1. Click the Whitelist tab.
    2. On the Whitelist tab, click Add Account in the upper-right corner.
    3. In the Add Account dialog box, set the Rule, Account, and Effective From parameters.
      Note: If a user queries data outside the time range that is specified in the whitelist, the query results are de-identified.
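
The effect of the Masking Out and HASH methods described above can be illustrated with a minimal Python sketch. This is not the Data Security Guard implementation. The function names (mask_out, mask_first, mask_middle, mask_last, hash_in_domain) are hypothetical, and modelling a security domain as an HMAC key is an assumption made only to show how the same data record can yield different results in different security domains.

```python
import hashlib
import hmac


def mask_out(value: str, start: int, count: int, mask_char: str = "*") -> str:
    """Replace up to `count` characters of `value`, beginning at index `start`."""
    start = max(0, min(start, len(value)))
    end = min(len(value), start + count)
    return value[:start] + mask_char * (end - start) + value[end:]


def mask_first(value: str, count: int) -> str:
    """Mask the first `count` characters."""
    return mask_out(value, 0, count)


def mask_last(value: str, count: int) -> str:
    """Mask the last `count` characters."""
    return mask_out(value, max(0, len(value) - count), count)


def mask_middle(value: str, count: int) -> str:
    """Mask `count` characters centred in the value (left-biased for odd gaps)."""
    start = max(0, (len(value) - count) // 2)
    return mask_out(value, start, count)


def hash_in_domain(value: str, domain: str) -> str:
    """Keyed hash of `value`. Assumption: the domain name acts as the HMAC key."""
    return hmac.new(
        domain.encode("utf-8"), value.encode("utf-8"), hashlib.sha256
    ).hexdigest()


if __name__ == "__main__":
    record = "13812345678"  # made-up sample value
    print(mask_first(record, 3))
    print(mask_middle(record, 4))
    print(mask_last(record, 4))
    # The same record hashed in two hypothetical domains gives different digests.
    print(hash_in_domain(record, "domain_a") == hash_in_domain(record, "domain_b"))
```

In this sketch the masked value keeps its original length, which matches the description of Masking Out as replacing parts of a record with asterisks. The keyed hash is deterministic within one domain, so equal inputs still map to equal outputs there.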

Verify the de-identification result in DataWorks

After you create and configure de-identification rules, DataWorks dynamically de-identifies the results of queries in your workspace based on the rules.
Note: You must first turn on Mask Data in Page Query Results for your workspace in the DataWorks console. For more information, see Workspace settings.
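
When you verify the result, it helps to keep in mind how a rule and the whitelist interact: a whitelisted account sees unmasked values only while the query falls inside the configured time range. The following Python sketch models that decision under stated assumptions. The WhitelistEntry type, its field names, the optional effective_to bound, and the is_exempt and render_value helpers are all hypothetical and do not correspond to a DataWorks API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class WhitelistEntry:
    rule: str                                # de-identification rule the entry applies to
    account: str                             # account exempted from the rule
    effective_from: datetime                 # start of the exemption
    effective_to: Optional[datetime] = None  # assumption: None means no end time


def is_exempt(entries: Iterable[WhitelistEntry], rule: str,
              account: str, query_time: datetime) -> bool:
    """Return True if the account is whitelisted for the rule at query_time."""
    for entry in entries:
        if entry.rule != rule or entry.account != account:
            continue
        if query_time < entry.effective_from:
            continue
        if entry.effective_to is not None and query_time > entry.effective_to:
            continue
        return True
    return False


def render_value(value: str, rule: str, account: str, query_time: datetime,
                 entries: Iterable[WhitelistEntry],
                 mask: Callable[[str], str]) -> str:
    """Return the raw value for exempt accounts, the masked value otherwise."""
    if is_exempt(entries, rule, account, query_time):
        return value
    return mask(value)


if __name__ == "__main__":
    entries = [WhitelistEntry("phone_rule", "alice",
                              datetime(2024, 1, 1), datetime(2024, 6, 30))]
    hide_all = lambda v: "*" * len(v)
    # Inside the range the value is returned as is; outside it is masked.
    print(render_value("13812345678", "phone_rule", "alice",
                       datetime(2024, 3, 1), entries, hide_all))
    print(render_value("13812345678", "phone_rule", "alice",
                       datetime(2024, 9, 1), entries, hide_all))
```

To check a real configuration, run the same query once with a whitelisted account and once with another account, and compare the results.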