Ranger supports row-level filtering on Hive data. You can filter the results returned
by SELECT statements so that only the rows that meet the specified conditions are displayed.
This topic describes how to filter Hive data by row.
Prerequisites
- An E-MapReduce (EMR) cluster is created, and Ranger was selected from the optional
services during cluster creation. For more information, see Create a cluster.
- A table whose data can be filtered by row is created.
Procedure
Note: The web UI of Ranger varies based on the Ranger version. In this example, Ranger 2.1.0
is used.
- Integrate Hive with Ranger and configure related permissions. For more information,
see Integrate Hive with Ranger.
- On the web UI of Ranger, click emr-hive.
- Create a row-level filtering policy.
- Click the Row Level Filter tab.
- Click Add New Policy in the upper-right corner.
- On the Create Policy page, configure the parameters. The following table describes the parameters.
Parameter | Description | Example
Policy Name | The name of the row-level filtering policy. You can specify a custom name. | test-row-filter
Hive Database | The name of the Hive database. | default
Hive Table | The name of the Hive table. | test_row_filter
Select User | The user to whom you want to attach the row-level filtering policy. | testc
Access Types | The permissions that you want to grant. | select
Row Level Filter | The filter expression that is used to filter data. | id>=10
- Click Add.
- Optional: Test row-level filtering.
For example, if the testc user executes the following statement to query data in the
default.test_row_filter table, only the rows whose id is greater than or equal to 10 are displayed:
select * from default.test_row_filter;
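
The policy configured in the procedure and its effect on a query can be sketched locally in Python. This is an illustration only, not how Ranger or Hive applies the filter internally. The dictionary mirrors the example values from the parameter table. Its key names (policyType, rowFilterPolicyItems, rowFilterInfo, filterExpr) are assumed to follow the policy JSON model of Ranger and should be checked against your Ranger version. The select_all_as helper, the in-memory SQLite table, and the name column are hypothetical stand-ins for Hive and default.test_row_filter.

```python
import sqlite3

# Policy fields taken from the Example column of the parameter table.
# Key names are an assumption based on Ranger's policy JSON model; verify
# them against your Ranger version before using them with the REST API.
policy = {
    "service": "emr-hive",
    "name": "test-row-filter",
    "policyType": 2,  # assumed: 2 denotes a row-level filter policy
    "resources": {
        "database": {"values": ["default"]},
        "table": {"values": ["test_row_filter"]},
    },
    "rowFilterPolicyItems": [
        {
            "users": ["testc"],
            "accesses": [{"type": "select", "isAllowed": True}],
            "rowFilterInfo": {"filterExpr": "id>=10"},
        }
    ],
}


def select_all_as(user, conn, policy):
    """Hypothetical helper: run SELECT * the way `user` would see the table."""
    table = policy["resources"]["table"]["values"][0]
    query = f"SELECT * FROM {table}"
    for item in policy["rowFilterPolicyItems"]:
        if user in item["users"]:
            # The filter expression acts like an extra WHERE condition.
            query += f" WHERE {item['rowFilterInfo']['filterExpr']}"
            break
    return conn.execute(query + " ORDER BY id").fetchall()


# Local stand-in for default.test_row_filter; the `name` column is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_row_filter (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO test_row_filter VALUES (?, ?)",
    [(5, "a"), (9, "b"), (10, "c"), (42, "d")],
)

print(select_all_as("testc", conn, policy))  # [(10, 'c'), (42, 'd')]
# A user who is not named in the policy is not filtered by it in this sketch.
print(len(select_all_as("other", conn, policy)))  # 4
```

In this sketch, the statement issued by testc returns only the rows with id 10 and 42, which matches the behavior described in the optional test step. On a real cluster, whether other users can query the table at all depends on their own Ranger access policies.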