Hive provides two authorization mechanisms: storage-based authorization and SQL standards-based authorization.
Background information
If the users of an E-MapReduce (EMR) cluster access Hive data by using HDFS or Hive Client, you must perform access control on the Hive data in HDFS. This way, you can manage Hive SQL-related operation permissions. For more information, see the Hive documentation.
The two authorization mechanisms are compatible with each other. You can configure them at the same time.
The storage-based authorization mechanism applies to Hive metastore.
Go to the Configure tab for Hive
- Log on to the EMR console.
- In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
- Click the Cluster Management tab.
- On the Cluster Management page, find your cluster and click Details in the Actions column.
- In the left-side navigation pane, choose .
- Click the Configure tab.
Add authorization configuration parameters
Restart Hive metastore
HDFS access control
For an EMR cluster deployed in Kerberos security mode, HDFS permissions on a warehouse folder are configured for Hive.
For other clusters, perform the following steps to configure basic HDFS permissions for Hive:
- Enable HDFS access control.
- Configure permissions on a warehouse folder for Hive.
hadoop fs -chmod 1771 /user/hive/warehouse You can also run the following command. 1 in the first place of 1777 denotes a sticky bit, which indicates that the files or folders created by other users cannot be deleted. hadoop fs -chmod 1777 /user/hive/warehouse
After you configure the basic permissions, you can grant permissions on the warehouse folder to related users or user groups. For example, you can grant the permissions to create tables, read data from tables, and write data into tables in the warehouse folder.
sudo su has
# Grant user test the rwx permissions on the warehouse folder.
hadoop fs -setfacl -m user:test:rwx /user/hive/warehouse
# Grant group hivegrp the rwx permissions on the warehouse folder.
hadoo fs -setfacl -m group:hivegrp:rwx /user/hive/warehouse
Users can access only the data in Hive tables created by themselves in HDFS.
Verification
- Create a table named testtbl as user test.
hive> create table testtbl(a string); FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=test, access=WRITE, inode="/user/hive/warehouse/testtbl":hadoop:hadoop:drwxrwx--t at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:320) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
The preceding error occurs, which indicates that user test has no permissions.
# Switch from user root to user has. su has # Configure an ACL to grant user test the rwx permissions on the warehouse folder. hadoop fs -setfacl -m user:test:rwx /user/hive/warehouse
Create a database as user test.
hive> create table testtbl(a string); OK Time taken: 1.371 seconds #View the directory where the testtbl table is stored in HDFS. The data in a table created by user test can be read only by user test and the users in group hadoop. hadoop fs -ls /user/hive/warehouse drwxr-x--- - test hadoop 0 2017-11-25 14:51 /user/hive/warehouse/testtbl # Insert a piece of data. hive>insert into table testtbl select "hz"
- Access table testtbl as user foo.
#drop table hive> drop table testtbl; FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Permission denied: user=foo, access=READ, inode="/user/hive/warehouse/testtbl":test:hadoop:drwxr-x--- at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:320) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:219) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) #alter table hive> alter table testtbl add columns(b string); FAILED: SemanticException Unable to fetch table testtbl. java.security.AccessControlException: Permission denied: user=foo, access=READ, inode="/user/hive/warehouse/testtbl":test:hadoop:drwxr-x--- at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:320) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:219) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1720) #select hive> select * from testtbl; FAILED: SemanticException Unable to fetch table testtbl. java.security.AccessControlException: Permission denied: user=foo, access=READ, inode="/user/hive/warehouse/testtbl":test:hadoop:drwxr-x--- at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:320) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:219)
The preceding information shows that user foo cannot perform any operations on the table created by user test. If you want to grant operation permissions to user foo, HDFS authorization is required.
su has # Grant only the Read permission. You can also grant a write permission such as Alter based on your requirements. # Note: -R is used to allow the user to read the files in the testtbl folder. hadoop fs -setfacl -R -m user:foo:r-x /user/hive/warehouse/testtbl # The select operation is successful. hive> select * from testtbl; OK hz Time taken: 2.134 seconds, Fetched: 1 row(s)
Note In most cases, you can create a Hive user group based on your requirements. Then, you can grant permissions to this group and add users to this group. Users in this group have the same permissions on data access.
SQL standards-based authorization
- Scenario
If the users of an EMR cluster cannot use the HDFS or Hive Client to access Hive data, they can only use
HiveServer2 (Beeline client or JDBC code)
to run Hive scripts. In this case, the SQL standards-based authorization mechanism is required.For more information, see the Hive documentation.
- Add authorization configuration parameters
The parameter settings are provided for HiveServer2.
- In the Service Configuration section of the Configure tab, click hive-site.
- Click Custom Configuration in the upper-right corner and add the configuration parameters described in the following
table.
Key Value hive.security.authorization.enabled true hive.users.in.admin.role hive hive.security.authorization.createtable.owner.grants ALL
- Restart HiveServer2
- In the left-side navigation pane, choose . On the page that appears, choose in the upper-right corner.
- In the Cluster Activities dialog box that appears, set related parameters and click
OK.
Click History in the upper-right corner to view the task progress.
- Access control commands
For more information, see the Hive documentation.
- Verification
- Access the testtbl table created by user test over the Beeline client as user foo.
2: jdbc:hive2://emr-header-1.cluster-xxx:10> select * from testtbl; Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: Principal [name=foo, type=USER] does not have following privileges for operation QUERY [[SELECT] on Object [type=TABLE_OR_VIEW, name=default.testtbl]] (state=42000,code=40000)
- Grant permissions.
Switch to user test and run the grant command to grant user foo the select operation permission. hive> grant select on table testtbl to user foo; OK Time taken: 1.205 seconds
- User foo can perform the select operation.
0: jdbc:hive2://emr-header-1.cluster-xxxxx:10> select * from testtbl; INFO : OK +------------+--+ | testtbl.a | +------------+--+ | hz | +------------+--+ 1 row selected (0.787 seconds)
- Revoke permissions.
Switch back to user test and revoke the select operation permission granted to user foo. hive> revoke select from user foo; OK Time taken: 1.094 seconds
- User foo cannot perform the select operation on the data in the testtbl table.
0: jdbc:hive2://emr-header-1.cluster-xxxxx:10> select * from testtbl; Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: Principal [name=foo, type=USER] does not have following privileges for operation QUERY [[SELECT] on Object [type=TABLE_OR_VIEW, name=default.testtbl]] (state=42000,code=40000)
- Access the testtbl table created by user test over the Beeline client as user foo.