
Authentication method compatible with MIT Kerberos

Last Updated: Mar 23, 2018

The Kerberos server in the EMR cluster runs on the master node (emr-header-1), and some management operations must be performed with the root account on that node.

The following procedure uses a user named test accessing the HDFS service as an example:

a) Execute hadoop fs -ls / on the Gateway

  • Configure krb5.conf

    Use the root account on the Gateway to copy the configuration from the cluster:

    scp root@emr-header-1:/etc/krb5.conf /etc/
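
    For reference, the copied krb5.conf typically contains the cluster realm and KDC settings along the following lines; the realm name shown here is a placeholder, and the actual values come from the file on emr-header-1:

    [libdefaults]
        default_realm = EMR.EXAMPLE.COM

    [realms]
        EMR.EXAMPLE.COM = {
            kdc = emr-header-1
            admin_server = emr-header-1
        }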
  • Add principal

    -> Log on to the emr-header-1 node of the cluster and switch to the root account

    -> Open the Kerberos admin tool

    sh /usr/lib/has-current/bin/hadmin-local.sh /etc/ecm/has-conf -k /etc/ecm/has-conf/admin.keytab
    HadminLocalTool.local: #Press Enter to list the available commands
    HadminLocalTool.local: addprinc #Enter a command name and press Enter to view its usage
    HadminLocalTool.local: addprinc -pw 123456 test #Add a principal for the user test with the password 123456
  • Export the keytab file

    The Kerberos admin tool can be used to export the keytab file corresponding to the principal:

    HadminLocalTool.local: ktadd -k /root/test.keytab test #Export the keytab file, which can be used subsequently
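
    To verify the export, the MIT klist tool (installed in a later step) can list the entries in the keytab file; the exact output depends on your realm:

    klist -kt /root/test.keytab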
  • Use kinit to obtain a ticket

    Perform the following steps on the client machine where HDFS commands are executed, such as the Gateway.

    -> Add Linux account test

    useradd test

    -> Install the MIT Kerberos client tools

    The MIT Kerberos tools can be used for the relevant operations (such as kinit and klist). For detailed usage, see the MIT Kerberos documentation.

    yum install krb5-libs krb5-workstation -y

    -> Switch to the test account and execute kinit

    su test
    #If the keytab file is not available, execute kinit and enter the password when prompted
    kinit
    Password for test: 123456 #Enter the password set earlier
    #If the keytab file is available (it must first be copied to the Gateway; see the note after these steps), execute
    kinit -kt test.keytab test
    #View the ticket
    klist
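
    Note that the keytab file was exported to /root/test.keytab on emr-header-1, so it has to be copied to the client machine before kinit -kt can use it. One way, reusing the scp pattern from earlier (the destination is an assumption; place the file wherever kinit -kt is run):

    scp root@emr-header-1:/root/test.keytab .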

    Note: For more practices with the MIT Kerberos tools, see the MIT Kerberos documentation.

  • Execute HDFS commands

    Once the ticket is obtained, HDFS commands can be executed normally:

    hadoop fs -ls /
    Found 5 items
    drwxr-xr-x - hadoop hadoop 0 2017-11-12 14:23 /apps
    drwx------ - hbase hadoop 0 2017-11-15 19:40 /hbase
    drwxrwx--t+ - hadoop hadoop 0 2017-11-15 17:51 /spark-history
    drwxrwxrwt - hadoop hadoop 0 2017-11-13 23:25 /tmp
    drwxr-x--t - hadoop hadoop 0 2017-11-13 16:12 /user

    Note: To run YARN jobs, the corresponding Linux accounts must be added to all nodes in the cluster in advance (for more information, see [Add test account to the EMR cluster] below). A sketch of one way to do this follows.
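
    A minimal sketch of adding the account on every node over SSH; the hostnames here are placeholders for your cluster's actual node list:

    for host in emr-header-1 emr-worker-1 emr-worker-2; do
        ssh root@"$host" 'useradd test'
    done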

b) Use Java code to access HDFS

  • Use local ticket cache

    Note: kinit must be executed in advance to obtain the ticket, and the application can no longer access HDFS once the ticket expires.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.security.UserGroupInformation;

    public class HdfsTicketCacheExample { //the class name is arbitrary
        public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();
            //Load the HDFS configuration copied from the EMR cluster
            conf.addResource(new Path("/etc/ecm/hadoop-conf/hdfs-site.xml"));
            conf.addResource(new Path("/etc/ecm/hadoop-conf/core-site.xml"));
            //kinit needs to be executed in advance with the Linux account of the application to obtain the ticket
            UserGroupInformation.setConfiguration(conf);
            UserGroupInformation.loginUserFromSubject(null);
            FileSystem fs = FileSystem.get(conf);
            FileStatus[] fsStatus = fs.listStatus(new Path("/"));
            for (int i = 0; i < fsStatus.length; i++) {
                System.out.println(fsStatus[i].getPath().toString());
            }
        }
    }
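
    A hypothetical way to run the example as the test user after kinit; the class and jar names are assumptions:

    su test
    kinit
    java -cp $(hadoop classpath):hdfs-example.jar HdfsTicketCacheExample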
  • Use keytab file (recommended)

    Note: The keytab file has long-term validity and is independent of the local ticket cache.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.security.UserGroupInformation;

    public class HdfsKeytabExample { //the class name is arbitrary
        public static void main(String[] args) throws IOException {
            String keytab = args[0];
            String principal = args[1];
            Configuration conf = new Configuration();
            //Load the HDFS configuration copied from the EMR cluster
            conf.addResource(new Path("/etc/ecm/hadoop-conf/hdfs-site.xml"));
            conf.addResource(new Path("/etc/ecm/hadoop-conf/core-site.xml"));
            //Directly use the keytab file exported on the emr-header-1 node of the EMR cluster (the commands are introduced earlier in this article)
            UserGroupInformation.setConfiguration(conf);
            UserGroupInformation.loginUserFromKeytab(principal, keytab);
            FileSystem fs = FileSystem.get(conf);
            FileStatus[] fsStatus = fs.listStatus(new Path("/"));
            for (int i = 0; i < fsStatus.length; i++) {
                System.out.println(fsStatus[i].getPath().toString());
            }
        }
    }
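
    A hypothetical invocation; the jar name is an assumption, and depending on the configuration the principal may need the full form with the realm (for example, test@EMR.EXAMPLE.COM):

    java -cp $(hadoop classpath):hdfs-example.jar HdfsKeytabExample /home/test/test.keytab test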

The POM dependencies are as follows:

  <dependencies>
      <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-common</artifactId>
          <version>2.7.2</version>
      </dependency>
      <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-hdfs</artifactId>
          <version>2.7.2</version>
      </dependency>
  </dependencies>