Ranger supports Hive data masking. You can configure a data masking policy to mask the return values of SELECT statements to hide sensitive information from users.

Background information

This feature applies only to HiveServer2 scenarios. For example, you can mask the return values of SELECT statements that are executed by using Beeline, JDBC, or Hue.

Procedure

Note: The web UI of Ranger varies based on the Ranger version. This example uses Ranger 2.1.0.
You can configure a data masking policy on the emr-hive tab of the Ranger web UI. Pay attention to the following points:
  • Multiple data masking methods are supported. For example, you can choose to show only the first or last four characters or use a hashing algorithm to process data.
  • Wildcards are not supported. For example, you cannot use an asterisk (*) to specify a table or column in a data masking policy.
  • Each data masking policy applies to only one column. If you want to mask data in multiple columns, configure multiple data masking policies.
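The masking methods mentioned above can be illustrated with a small Python sketch. This is a simplified model for intuition only, not the code that Ranger or Hive runs: the function names are hypothetical, the sample value is made up, and SHA-256 is assumed as a stand-in for the hashing method. The actual masking functions may also treat some character classes (for example, punctuation) differently.

```python
import hashlib


def show_first_n(value: str, n: int = 4, mask_char: str = "x") -> str:
    """Keep the first n characters and replace the rest with mask_char."""
    return value[:n] + mask_char * max(len(value) - n, 0)


def show_last_n(value: str, n: int = 4, mask_char: str = "x") -> str:
    """Keep the last n characters and replace the rest with mask_char."""
    hidden = max(len(value) - n, 0)
    return mask_char * hidden + value[hidden:]


def hash_value(value: str) -> str:
    """Replace the value with a hex digest (SHA-256 is an assumption here)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


sample = "13812345678"  # hypothetical sensitive value
print(show_first_n(sample))  # 1381xxxxxxx
print(show_last_n(sample))   # xxxxxxx5678
print(hash_value(sample))    # 64-character hex string
```

Values shorter than four characters are returned unchanged by the two partial-mask helpers in this sketch.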
  1. Integrate Hive with Ranger and configure related permissions. For more information, see Integrate Hive with Ranger.
  2. On the web UI of Ranger, click emr-hive.
  3. Create a masking policy.
    1. Click the Masking tab.
    2. Click Add New Policy in the upper-right corner.
    3. On the Create Policy page, configure the parameters. The following table describes the parameters.
      Parameter             | Description                                                     | Example
      ----------------------|-----------------------------------------------------------------|---------------------------
      Policy Name           | The name of a masking policy. You can customize a policy name. | test_mask
      Hive Database         | The name of a Hive database.                                    | testdb
      Hive Table            | The name of a Hive table.                                       | testtb1
      Hive Column           | The name of a column.                                           | a
      Select User           | The user to whom you want to attach the masking policy.         | test
      Access Types          | The permissions that you want to grant.                         | select
      Select Masking Option | The data masking method.                                        | Partial mask: show first 4
    4. Click Add.
  4. Optional: Test data masking.
    For example, if the test user executes the select a from testdb.testtb1; statement to query column a of the testdb.testtb1 table, only the first four characters of each value are displayed, and every character after them is masked as x.
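To confirm that the policy took effect, you can compare a value that an unrestricted user sees with the value that the test user sees. The following Python sketch shows one way to express that check. The helper name and the sample rows are hypothetical, and the check assumes the behavior described above: the first four characters are kept and every later character is replaced with x.

```python
def is_masked_show_first_4(original: str, returned: str, mask_char: str = "x") -> bool:
    """Return True if `returned` keeps the first four characters of `original`
    and replaces every later character with mask_char."""
    expected = original[:4] + mask_char * max(len(original) - 4, 0)
    return returned == expected


# Hypothetical rows: (value stored in column a, value the test user sees)
rows = [
    ("abcdefgh", "abcdxxxx"),  # masked as expected
    ("abcdefgh", "abcdefgh"),  # unmasked: the policy did not take effect
]
for stored, seen in rows:
    print(stored, seen, is_masked_show_first_4(stored, seen))
```

If the check fails for the test user, verify that the policy is enabled and that the database, table, and column names in the policy match the queried objects exactly, because wildcards are not supported.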