
HDFS authorization

Last Updated: Mar 23, 2018

When permission control is enabled for HDFS, users need valid permissions to access HDFS for normal operations, such as reading data and creating a folder.

1. Add configuration

Configurations related to HDFS permission are as follows:

  • dfs.permissions.enabled

    Enables permission checking. Even when the value is false, chmod, chgrp, chown, and setfacl still perform permission checks.

  • dfs.datanode.data.dir.perm

    Permissions on the local folders in which the DataNode stores its data. The default is 755 in this setup; note that stock Apache Hadoop defaults to 700.

  • fs.permissions.umask-mode

    Permission mask. It determines the default permissions of a newly created file or folder by clearing the masked bits (~ is the bitwise complement):

    File creation: 0666 & ~umask

    Folder creation: 0777 & ~umask

    The default umask value is 022, so new files get permission 644 (666 & ~022 = 644) and new folders get permission 755 (777 & ~022 = 755).

    For a Kerberos security cluster in EMR, the default umask is 027, so new files get permission 640 and new folders get permission 750.

  • dfs.namenode.acls.enabled

    Enables ACL control. In addition to the owner/group permissions, ACLs let you grant permissions to other specific users.

    Commands for setting ACL:

    hadoop fs -getfacl [-R] <path>
    hadoop fs -setfacl [-R] [-b |-k -m |-x <acl_spec> <path>] |[--set <acl_spec> <path>]

    For example:

    su test
    # The user test creates a folder
    hadoop fs -mkdir /tmp/test
    # View the permission of the created folder
    hadoop fs -ls /tmp
    drwxr-x--- - test hadoop 0 2017-11-26 21:18 /tmp/test
    # Set ACL and grant rwx permissions to user foo
    hadoop fs -setfacl -m user:foo:rwx /tmp/test
    # View the permission of the folder (+ means that an ACL is set)
    hadoop fs -ls /tmp/
    drwxrwx---+ - test hadoop 0 2017-11-26 21:18 /tmp/test
    # View ACL
    hadoop fs -getfacl /tmp/test
    # file: /tmp/test
    # owner: test
    # group: hadoop
    user::rwx
    user:foo:rwx
    group::r-x
    mask::rwx
    other::---

  • dfs.permissions.superusergroup

    The superuser group. Users in this group have superuser permissions.
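
The umask arithmetic described for fs.permissions.umask-mode can be reproduced with plain shell arithmetic. The following is a minimal sketch that does not touch HDFS; effective_mode is a hypothetical helper name used only for illustration and is not part of Hadoop.

```shell
# effective_mode BASE UMASK
# Hypothetical helper: prints BASE with the UMASK bits cleared, in octal.
effective_mode() {
  # The leading 0 makes the shell read both arguments as octal numbers.
  printf '%03o\n' $(( 0$1 & ~0$2 ))
}

effective_mode 666 022   # new file, umask 022
effective_mode 777 022   # new folder, umask 022
effective_mode 666 027   # new file, umask 027
effective_mode 777 027   # new folder, umask 027
```

The four calls print 644, 755, 640, and 750, which match the values given for the two umask settings.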

2. Restart HDFS service

For a Kerberos security cluster, HDFS permissions are already set by default (umask is set to 027), so no configuration change or service restart is needed.

For a non-Kerberos security cluster, the configuration must be added and the service must be restarted.
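
To check which values are in place, one option is the hdfs getconf -confKey command. The sketch below wraps it in a hypothetical helper, show_hdfs_permission_conf, and assumes the hdfs client is on the PATH. It reports what the local client configuration resolves, which may differ from what a running NameNode has loaded if the files were edited without a restart.

```shell
# show_hdfs_permission_conf is a hypothetical helper name, not a Hadoop command.
# It prints each permission-related key with the value resolved from the
# local client configuration.
show_hdfs_permission_conf() {
  for key in dfs.permissions.enabled dfs.namenode.acls.enabled \
             fs.permissions.umask-mode dfs.permissions.superusergroup; do
    printf '%s = %s\n' "$key" "$(hdfs getconf -confKey "$key")"
  done
}
```

Calling show_hdfs_permission_conf on a cluster node prints one "key = value" line per property.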

3. Other
  • The umask value can be modified as needed.

  • HDFS is a basic service that Hive and HBase are built on. Therefore, HDFS permission control must be configured before configuring these upper-layer services.

  • When permissions are enabled for HDFS, the directories that services rely on must be set up with suitable permissions (such as /spark-history for Spark, and /tmp/$user/ for YARN).

  • The sticky bit can be set on a folder so that only the superuser, the file owner, or the folder owner can delete files or folders inside it (even if other users have rwx permissions on the folder). For example:

    # Set the sticky bit by adding the numeral 1 as the first digit
    hadoop fs -chmod 1777 /tmp
    hadoop fs -chmod 1777 /spark-history
    hadoop fs -chmod 1777 /user/hive/warehouse
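
To make the role of the leading 1 concrete, the sketch below tests whether an octal mode includes the sticky bit (octal 1000). It is plain shell arithmetic and does not touch HDFS; has_sticky_bit is a hypothetical helper name used only for illustration.

```shell
# has_sticky_bit MODE
# Hypothetical helper: succeeds when the octal MODE includes the sticky bit.
has_sticky_bit() {
  # The leading 0 makes the shell read the argument as an octal number.
  [ $(( 0$1 & 01000 )) -ne 0 ]
}

for mode in 777 1777; do
  if has_sticky_bit "$mode"; then
    echo "$mode: sticky bit set"
  else
    echo "$mode: sticky bit not set"
  fi
done
```

On a cluster, running hadoop fs -ls -d /tmp after the chmod should show a mode string ending in t (for example drwxrwxrwt) when the sticky bit is set.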