After HDFS authorization is enabled, only users who are granted the relevant permissions can access HDFS and perform operations such as reading data or creating folders.
Go to the Configure tab for HDFS
- Log on to the EMR console.
- In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
- Click the Cluster Management tab.
- On the Cluster Management page, find your cluster and click Details in the Actions column.
- In the left-side navigation pane, choose .
- Click the Configure tab.
Note
- For a cluster deployed in Kerberos security mode, HDFS permissions are configured by default (with umask set to 027). You do not need to configure HDFS authorization or restart HDFS.
- For other clusters, you must configure HDFS authorization and restart HDFS.
Configure HDFS authorization
Configure the following HDFS authorization parameters:
Specifies whether to enable permission checking. Even if this parameter is set to false, the system still checks the related permissions when you run the chmod, chgrp, chown, or setfacl command.
The permissions on the local directories in which data nodes store data. Default value: 755.
- Permission mask. This is the default setting applied when you create a file or folder.
- File creation: 0666 & ^umask, where ^umask is the bitwise complement of the umask.
- Folder creation: 0777 & ^umask.
- The default umask value is 022. With this value, files are created with permission 644 (0666 & ^022 = 644) and folders with permission 755 (0777 & ^022 = 755).
- The default umask value of EMR clusters deployed in Kerberos security mode is 027. In such clusters, files are created with permission 640 (0666 & ^027 = 640) and folders with permission 750 (0777 & ^027 = 750).
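The umask arithmetic above can be checked with plain shell arithmetic. The following is a minimal sketch, not part of the EMR tooling; the function name `effective_mode` is hypothetical, and `~` is the shell's spelling of the bitwise complement that this page writes as `^`.

```shell
#!/bin/sh
# Compute the permission that results from a base mode and a umask.
# Arguments: $1 = base mode in octal (666 for files, 777 for folders)
#            $2 = umask in octal (for example 022 or 027)
effective_mode() {
    # The leading 0 makes the shell read both values as octal;
    # "& ~" clears every bit that is set in the umask.
    printf '%03o\n' $(( 0$1 & ~0$2 ))
}

echo "umask 022: file=$(effective_mode 666 022) folder=$(effective_mode 777 022)"
echo "umask 027: file=$(effective_mode 666 027) folder=$(effective_mode 777 027)"
```

The two lines printed should match the values given above: 644 and 755 for umask 022, and 640 and 750 for umask 027.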
- Specifies whether to enable support for access control lists (ACLs). After you set this parameter to true, you can set permissions for specific users and groups in addition to the owner, the owning group, and other users.
- Commands for configuring an ACL:
    hadoop fs -getfacl [-R] <path>
    hadoop fs -setfacl [-R] [-b |-k -m |-x <acl_spec> <path>] |[--set <acl_spec> <path>]
Example:
    su test
    # Create a folder as user test.
    hadoop fs -mkdir /tmp/test
    # View permissions on the created folder.
    hadoop fs -ls /tmp
    drwxr-x---   - test hadoop          0 2017-11-26 21:18 /tmp/test
    # Configure an ACL and grant user foo rwx permissions.
    hadoop fs -setfacl -m user:foo:rwx /tmp/test
    # View permissions on files. The plus sign (+) indicates that an ACL is configured.
    hadoop fs -ls /tmp/
    drwxrwx---+  - test hadoop          0 2017-11-26 21:18 /tmp/test
    # View the ACL.
    hadoop fs -getfacl /tmp/test
    # file: /tmp/test
    # owner: test
    # group: hadoop
    user::rwx
    user:foo:rwx
    group::r-x
    mask::rwx
    other::---
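The plus sign check from the example above can be scripted. The following is a minimal sketch under stated assumptions: `has_acl` is a hypothetical helper, and it only inspects a permission string that has already been obtained from a listing, so it runs without a cluster.

```shell
#!/bin/sh
# Succeed if a permission string from a `hadoop fs -ls` listing
# ends with the plus sign that marks a configured ACL.
has_acl() {
    case "$1" in
        *+) return 0 ;;
        *)  return 1 ;;
    esac
}

# Permission strings copied from the example listing above.
for perms in 'drwxr-x---' 'drwxrwx---+'; do
    if has_acl "$perms"; then
        echo "$perms: ACL configured"
    else
        echo "$perms: no ACL"
    fi
done
```

On a cluster node, the permission string could be taken from the first field of `hadoop fs -ls -d /tmp/test`, for example with `awk '{print $1}'`.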
The name of the superuser group. All users in this group are superusers.
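The parameter names themselves do not appear in this section. As an assumption, the five descriptions above seem to correspond to the standard Hadoop properties listed in the sketch below; verify the names against the Configure tab of your cluster before relying on them. The sketch prints the effective value of each property with `hdfs getconf -confKey` and does nothing if the `hdfs` command is not available. The variable and function names are hypothetical.

```shell
#!/bin/sh
# ASSUMPTION: these standard Hadoop property names are inferred from the
# descriptions on this page; the page itself does not show the names.
HDFS_AUTH_KEYS="dfs.permissions.enabled
dfs.datanode.data.dir.perm
fs.permissions.umask-mode
dfs.namenode.acls.enabled
dfs.permissions.superusergroup"

show_hdfs_auth_config() {
    if ! command -v hdfs >/dev/null 2>&1; then
        echo "hdfs command not found; run this on a cluster node" >&2
        return 0
    fi
    for key in $HDFS_AUTH_KEYS; do
        # `hdfs getconf -confKey` prints the effective value of one property.
        printf '%s = %s\n' "$key" "$(hdfs getconf -confKey "$key" 2>/dev/null)"
    done
}

show_hdfs_auth_config
```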
- After you configure the HDFS authorization parameters, choose in the upper-right corner of the page.
- In the Cluster Activities dialog box that appears, set the related parameters and click OK. Click History in the upper-right corner to view the task progress.
- The umask value can be changed as needed.
- HDFS is a basic service that many other services, such as Hive and HBase, rely on. Therefore, you must configure HDFS authorization before you configure these upper-layer services.
- If HDFS authorization is configured, make sure that log paths are configured for related services, for example, /spark-history/ for Spark and /tmp/$user/ for YARN.
- Sticky bit
You can set a sticky bit for a folder to prevent users other than superusers, file owners, and directory owners from deleting files or sub-folders in the folder (even if the users have rwx permissions on the folder).
    # Prefix the mode with 1 to set the sticky bit.
    hadoop fs -chmod 1777 /tmp
    hadoop fs -chmod 1777 /spark-history
    hadoop fs -chmod 1777 /user/hive/warehouse
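The effect of the leading 1 can be seen without a cluster. The following sketch applies the same mode to a temporary directory on the local file system, which is only an analogy for the HDFS commands above and does not touch HDFS; the variable names are hypothetical.

```shell
#!/bin/sh
# Local-filesystem analogy: set mode 1777 on a scratch directory.
demo_dir=$(mktemp -d)
chmod 1777 "$demo_dir"

# In the mode string printed by `ls -ld`, the tenth character is the
# execute slot for other users; "t" there means the sticky bit is set.
sticky_flag=$(ls -ld "$demo_dir" | cut -c10)
echo "sticky flag: $sticky_flag"

rmdir "$demo_dir"
```

On a cluster node, `hadoop fs -ls -d /tmp` can be used in the same way to confirm that the mode string of the directory ends in `t` after the chmod.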