After HDFS authorization is enabled, only users who are granted the relevant permissions can access HDFS and perform operations such as reading data or creating folders.

Go to the Configure tab for HDFS

  1. Log on to the Alibaba Cloud EMR console.
  2. In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
  3. Click the Cluster Management tab.
  4. On the Cluster Management page, find your cluster and click Details in the Actions column.
  5. In the left-side navigation pane, choose Cluster Service > HDFS.
  6. Click the Configure tab.
    Note
    • For a cluster deployed in Kerberos security mode, HDFS permissions are configured by default (with umask set to 027). You do not need to configure HDFS authorization and restart HDFS.
    • For other clusters, you must configure HDFS authorization and restart HDFS.

Configure HDFS authorization

Configure the following HDFS authorization parameters:

  • dfs.permissions.enabled

    Specifies whether to enable permission checks. Even if this parameter is set to false, the system still checks the related permissions when you run the chmod, chgrp, chown, or setfacl command.

  • dfs.datanode.data.dir.perm

    The permissions that DataNodes apply to their local data directories. Default value: 755.

  • fs.permissions.umask-mode
    • The permission mask that is applied by default when you create a file or folder.
    • File creation: 0666 & ~umask.
    • Folder creation: 0777 & ~umask.
    • The default umask value is 022. If the default value is used, the permission for file creation is 644 (0666 & ~022 = 644), and the permission for folder creation is 755 (0777 & ~022 = 755).
    • The default umask value of EMR clusters deployed in Kerberos security mode is 027. For such clusters, the permission for file creation is 640, and the permission for folder creation is 750.
  • dfs.namenode.acls.enabled
    • Specifies whether to enable support for access control lists (ACLs). If you set this parameter to true, you can grant permissions to specific users and groups in addition to the owner, the group, and other users.
    • Commands for configuring an ACL:
      hadoop fs -getfacl [-R] <path>
      hadoop fs -setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]
      Example:
      su test
      # Create a folder as user test.
      hadoop fs -mkdir /tmp/test
      # View permissions on the created folder.
      hadoop fs -ls /tmp
      drwxr-x---   - test   hadoop          0 2017-11-26 21:18 /tmp/test
      # Configure an ACL and grant user foo rwx permissions.
      hadoop fs -setfacl -m user:foo:rwx /tmp/test
      # View permissions on files. The plus sign (+) indicates that an ACL is configured.
      hadoop fs -ls /tmp/
      drwxrwx---+  - test   hadoop          0 2017-11-26 21:18 /tmp/test
      # View the ACL.
      hadoop fs -getfacl /tmp/test
      # file: /tmp/test
      # owner: test
      # group: hadoop
      user::rwx
      user:foo:rwx
      group::r-x
      mask::rwx
      other::---
  • dfs.permissions.superusergroup

    The name of the superuser group. All users in this group are superusers.
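You can verify the umask arithmetic described for fs.permissions.umask-mode with plain shell arithmetic. The following is a minimal local sketch; the creation_modes helper is our own name for illustration, not an HDFS command:

```shell
# Sketch: derive the effective file and folder creation modes for a given
# umask, mirroring how fs.permissions.umask-mode is applied
# (file: 0666 & ~umask, folder: 0777 & ~umask).
creation_modes() {
  local um="0$1"   # treat the argument as an octal umask
  printf 'files %03o, folders %03o\n' $(( 0666 & ~um )) $(( 0777 & ~um ))
}

creation_modes 022   # default umask -> files 644, folders 755
creation_modes 027   # Kerberos-mode default -> files 640, folders 750
```

This confirms the values stated above: 022 yields 644/755, and the Kerberos-mode default of 027 yields 640/750.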

Restart HDFS

  1. After you configure the HDFS authorization parameters, choose Actions > Restart All Components in the upper-right corner of the HDFS page.
  2. In the Cluster Activities dialog box, configure relevant parameters and click OK. In the Confirm message, click OK.
    Click History in the upper-right corner to view the task progress.

Remarks

  • The umask value can be changed as needed.
  • HDFS is a basic service. Many services such as Hive and HBase rely on HDFS. Therefore, before you configure these upper-layer services, you must configure HDFS authorization first.
  • If HDFS authorization is configured, make sure that appropriate permissions are set on the log paths of related services, for example, /spark-history/ for Spark and /tmp/$user/ for YARN.
  • sticky bit
    You can set the sticky bit on a folder to prevent users other than superusers, file owners, and folder owners from deleting files or sub-folders in that folder, even if those users have rwx permissions on the folder.
    # Prepend 1 to the permission bits to set the sticky bit.
    hadoop fs -chmod 1777 /tmp
    hadoop fs -chmod 1777 /spark-history
    hadoop fs -chmod 1777 /user/hive/warehouse
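    The sticky bit has the same semantics on a local POSIX filesystem, so you can observe what mode 1777 does without a cluster. A minimal local sketch (not an EMR or HDFS command):

```shell
# Local-filesystem sketch of the sticky-bit semantics that
# `hadoop fs -chmod 1777` applies in HDFS: mode 1777 is 0777 plus the
# sticky bit, which `ls -ld` shows as a trailing "t" in the mode string.
dir=$(mktemp -d)
chmod 1777 "$dir"
ls -ld "$dir" | cut -c1-10   # drwxrwxrwt
rmdir "$dir"
```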