This topic describes how to integrate Hive with Ranger and how to configure related
permissions.
Prerequisites
An EMR cluster is created. For more information, see
Create a cluster.
Note You must select an EMR version later than V3.26.0 when you create the cluster.
Hive access modes
You can access Hive data in three modes: HiveServer2, Hive Client, and HDFS.
- HiveServer2
- Scenario: Access Hive data by using HiveServer2.
- Method: Use the Beeline client or Java Database Connectivity (JDBC) code to run Hive
scripts.
- Permission settings:
The built-in authorization mechanism of Hive can be used for access control in this
scenario. For more information about the authorization mechanism, see Hive.
You can also grant table- or column-level permissions in Ranger. If you access Hive
data by using Hive Client or HDFS, additional permissions are required.
- Hive Client
- Scenario: Access Hive data by using Hive Client.
- Method: Use Hive Client to access Hive data.
- Permission settings:
In this scenario, Hive Client sends DDL requests such as ALTER TABLE ADD COLUMNS
to Hive metastore. Hive Client also submits MapReduce jobs to read data from HDFS.
The built-in authorization mechanism of Hive can be used for access control in this
scenario. Hive checks whether you can perform DDL or DML operations, such as ALTER TABLE test ADD COLUMNS(b STRING),
based on your read or write permissions on the HDFS path of the table referenced in
the SQL statement. For more information about the authorization mechanism, see Hive.
You can configure permissions on HDFS paths of Hive tables in Ranger and configure
storage-based authorization for Hive metastore. In this way, you can achieve access
control in the scenario where Hive Client is used.
Note In this scenario, DDL operation permissions depend on HDFS permissions. If you have
the required HDFS permissions on the path of a table, you can perform DDL operations
on that table, such as DROP TABLE and ALTER TABLE.
- HDFS
- Scenario: Access Hive data by using HDFS.
- Method: Use an HDFS client or run HDFS code to access Hive data.
- Permission settings:
You must configure permissions on HDFS paths of Hive tables.
You can use Ranger to configure the permissions. For more information, see Example of permission configuration.
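For the Hive Client mode described above, storage-based authorization is enabled on the Hive metastore through two properties in hive-site.xml. The following is a minimal sketch, not the EMR procedure itself: the property names and class names are taken from the Apache Hive storage-based authorization documentation, and whether your EMR version already sets them is an assumption you should verify in the cluster configuration. The helper name render_hive_site is hypothetical.

```python
import xml.etree.ElementTree as ET

# Property names and classes as documented by Apache Hive for
# storage-based authorization in the metastore server.
# Assumption: your EMR cluster may already set these for you.
STORAGE_BASED_AUTH = {
    "hive.metastore.pre.event.listeners":
        "org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener",
    "hive.security.metastore.authorization.manager":
        "org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider",
}


def render_hive_site(properties):
    """Render a dict of properties as a hive-site.xml style document (hypothetical helper)."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_hive_site(STORAGE_BASED_AUTH))
```

With these properties in place, the metastore checks the HDFS permissions of the calling user on the table path before it executes a DDL request, which is why the Ranger HDFS policies control this scenario.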
Integrate Hive with Ranger
- Enable Hive in Ranger.
- Log on to the EMR console.
- In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
- Click the Cluster Management tab.
- On the Cluster Management page, find your cluster and click Details in the Actions column.
- In the left-side navigation pane, click Cluster Service and then RANGER.
- On the page that appears, choose Actions > EnabledHive in the upper-right corner.
- In the Cluster Activities dialog box that appears, set related parameters and click
OK.
Click History in the upper-right corner to view the task progress.
Note In the HiveServer2 scenario, after Hive is restarted, you must configure Hive permissions
in Ranger. In the Hive Client scenario, you must use HDFS permissions for access control.
For more information about how to configure HDFS permissions, see
HDFS.
- Add the Hive service on the web UI of Ranger.
- Log on to Ranger. For more information, see Overview.
- Add the Hive service.
- Configure relevant parameters.

| Parameter | Description |
| --- | --- |
| Service Name | Set the value to emr-hive. |
| Username | Set the value to hadoop. |
| Password | Enter a custom password. |
| jdbc.driverClassName | The class name of the JDBC driver. Default value: org.apache.hive.jdbc.HiveDriver. Use the default value. |
| jdbc.url | Standard cluster: enter jdbc:hive2://emr-header-1:10000/. High-security cluster: enter jdbc:hive2://${master1_fullhost}:10000/;principal=hive/${master1_fullhost}@EMR.$id.COM. Note ${master1_fullhost} indicates the long domain name of master 1. You can log on to master 1 and run the hostname command to obtain the value of ${master1_fullhost}. The number in ${master1_fullhost} is the value of $id. |
| Add New Configurations | Name: set the value to policy.download.auth.users. Value: set the value to hadoop for a standard cluster and hive for a high-security cluster. |
- Click Add.
- Restart Hive.
Restart Hive for the preceding settings to take effect.
- In the left-side navigation pane, choose .
- On the page that appears, choose Actions > Restart All Components in the upper-right corner.
- In the Cluster Activities dialog box that appears, set related parameters and click
OK.
Click History in the upper-right corner to view the task progress.
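The jdbc.url and policy.download.auth.users values in the parameter table above differ between standard and high-security clusters. The following sketch shows how the two variants are assembled, which can help you avoid typos when you fill in the form. The function names are hypothetical, and the host name emr-header-1.cluster-12345 and the ID 12345 in the usage lines are made-up placeholders: use the output of the hostname command on master 1 and the number it contains.

```python
def hive_jdbc_url(high_security=False, master_host="emr-header-1",
                  cluster_id=None, port=10000):
    """Build the jdbc.url value for the Ranger Hive service (hypothetical helper)."""
    base = f"jdbc:hive2://{master_host}:{port}/"
    if not high_security:
        return base
    if cluster_id is None:
        raise ValueError("cluster_id is required for a high-security cluster")
    # High-security clusters append the Kerberos principal of HiveServer2.
    return f"{base};principal=hive/{master_host}@EMR.{cluster_id}.COM"


def policy_download_user(high_security=False):
    """Return the value for policy.download.auth.users."""
    return "hive" if high_security else "hadoop"


if __name__ == "__main__":
    # Standard cluster.
    print(hive_jdbc_url(), policy_download_user())
    # High-security cluster; host name and ID below are placeholders.
    print(hive_jdbc_url(True, "emr-header-1.cluster-12345", "12345"),
          policy_download_user(True))
```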
Example of permission configuration
For example, you can perform the following steps to grant user foo the Select permission
on column a of the testdb.test table:
- Log on to Ranger. For more information, see Overview.
- Click emr-hive.
- Click Add New Policy in the upper-right corner.
- Configure permissions.

| Parameter | Description |
| --- | --- |
| Policy Name | The name of the policy. You can customize a name. |
| database | The name of the Hive database, such as testdb. |
| table | The name of the table, such as test. |
| Hive Column | The name of the column. You can set this parameter to an asterisk (*) to indicate all columns. |
| Select Group | The user group to which you want to add this policy. |
| Select User | The user to whom you want to add this policy. |
| Permissions | The permissions to be granted. |
- Click Add.
After the policy is added, authorization is complete. User foo can then query column a
of the testdb.test table.
Note After you add, remove, or modify a policy, it can take up to one minute for the configuration
to take effect.
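The policy configured in the steps above can also be written down as a JSON document, which is useful for reviewing or versioning policies. The following sketch builds such a document for the example (user foo, Select on column a of testdb.test in the emr-hive service). The field names follow the policy model of the Apache Ranger public v2 REST API as an assumption; check them against the Ranger version in your cluster. The function name and the generated policy name are hypothetical, and the code only prints the document and does not contact Ranger.

```python
import json


def build_select_policy(service, database, table, column, user):
    """Build a Ranger Hive policy document granting select on one column (hypothetical helper)."""
    return {
        "service": service,
        # Hypothetical naming scheme; any unique policy name works.
        "name": f"{database}.{table}.{column}-select-{user}",
        "isEnabled": True,
        "resources": {
            "database": {"values": [database]},
            "table": {"values": [table]},
            "column": {"values": [column]},
        },
        "policyItems": [
            {
                "users": [user],
                "groups": [],
                "accesses": [{"type": "select", "isAllowed": True}],
            }
        ],
    }


if __name__ == "__main__":
    policy = build_select_policy("emr-hive", "testdb", "test", "a", "foo")
    print(json.dumps(policy, indent=2))
```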