E-MapReduce: What do I do if the services of a cluster fail to start because I do not have access permissions on the root storage directory of the cluster?

Last Updated: Jul 21, 2023

This topic describes the causes of and solutions to startup failures of specific services of a cluster that occur when you do not have access permissions on the root storage directory of the cluster. The root storage directory of a cluster is specified by the fs.defaultFS configuration item of the Hadoop-Common service in the cluster.
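
You can confirm which root storage directory is configured by reading the value on any node of the cluster:

    # Print the configured root storage directory of the cluster
    hdfs getconf -confKey fs.defaultFS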

Problem description

In the Health Check Items section of the Status tab of the Hadoop-Common service in the E-MapReduce (EMR) console, the check item in the abnormal state shows the message [hadoop_fs_availability] DefaultFS is unable to access. When you move the pointer over the warning icon, the system displays the message fs.defaultFS is unable to access. Check the configuration items and select a storage address in which you have access permissions.

Scenarios in which this issue may occur:

  • When you create a cluster, you select the OSS-HDFS service, and select a bucket for which the OSS-HDFS service is activated as the root storage directory for the cluster. However, in the Basic Configuration step of the procedure for configuring the cluster, you use an Elastic Compute Service (ECS) application role that does not have access permissions on Object Storage Service (OSS). As a result, you cannot access the bucket when the cluster is running.

  • When you create a cluster, you pass the address of a bucket on which you do not have access permissions to the fs.defaultFS configuration item of the Hadoop-Common service by using the custom software configuration method.

  • You set the fs.defaultFS configuration item of the Hadoop-Common service of an existing cluster to a bucket on which you do not have access permissions.
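
In all of these scenarios, you can verify the problem from the master node. The following sketch reads the configured value and attempts to list the directory; if this issue applies, the list command fails with a permission error:

    # Read the configured root storage directory and try to list it
    ROOT_DIR=$(hdfs getconf -confKey fs.defaultFS)
    hadoop fs -ls "$ROOT_DIR"/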

Causes and solutions

The causes and solutions vary by service.

YARN

Cause

You do not have access permissions on the bucket that is configured as the root storage directory of the cluster. As a result, when the YARN service is started in the cluster, the ResourceManager component cannot create required data directories, such as the Node Labels directory, and the MRHistoryServer component cannot create directories such as the aggregated log directory.

Solution

  1. Log on to the master node of the cluster. For more information, see Log on to a cluster.

  2. Run the following commands to create data directories and grant users the required permissions on the directories:

    # Switch to the hadoop user
    sudo su hadoop
    # Node Labels store directory used by ResourceManager
    STORE_DIR=$(hdfs getconf -confKey yarn.node-labels.fs-store.root-dir)
    hadoop fs -mkdir -p $STORE_DIR
    hadoop fs -chmod 775 $STORE_DIR
    hadoop fs -chown hadoop:hadoop $STORE_DIR
    # Staging directory for MapReduce ApplicationMasters
    STAGING_DIR=$(hdfs getconf -confKey yarn.app.mapreduce.am.staging-dir)
    hadoop fs -mkdir -p $STAGING_DIR
    hadoop fs -chmod 777 $STAGING_DIR
    hadoop fs -chown hadoop:hadoop $STAGING_DIR
    # History directory used by MRHistoryServer
    hadoop fs -mkdir -p $STAGING_DIR/history
    hadoop fs -chmod 775 $STAGING_DIR/history
    hadoop fs -chown hadoop:hadoop $STAGING_DIR/history
    # Remote directory for aggregated application logs
    LOG_DIR=$(hdfs getconf -confKey yarn.nodemanager.remote-app-log-dir)
    hadoop fs -mkdir -p $LOG_DIR
    hadoop fs -chmod 1777 $LOG_DIR
    hadoop fs -chown hadoop:hadoop $LOG_DIR
  3. Restart the YARN service. For more information, see Restart a service.

    In the Components section of the Status tab of the YARN service in the EMR console, you can check whether the YARN service starts as expected.
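
    For a quick command-line check, you can also list the NodeManagers that are registered with ResourceManager:

    # All nodes should be reported in the RUNNING state
    yarn node -list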

Hive

Cause

You do not have access permissions on the bucket that is configured as the root storage directory of the cluster. As a result, when the Hive service is started in the cluster, the HiveServer component cannot create required data directories, such as the Hive warehouse directory.

Solution

  1. Log on to the master node of the cluster. For more information, see Log on to a cluster.

  2. Run the following commands to create data directories and grant users the required permissions on the directories:

    # Create the Hive warehouse directory and grant the hive user
    # the required permissions
    hadoop fs -mkdir -p /user/hive/warehouse
    hadoop fs -chown hive /user/hive
    hadoop fs -chown hive /user/hive/warehouse
    hadoop fs -chmod 751 /user/hive
    hadoop fs -chmod 1771 /user/hive/warehouse

    In the Components section of the Status tab of the Hive service in the EMR console, you can check whether the HiveServer component starts as expected.
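
    You can also run a quick smoke test against HiveServer2. The following sketch assumes that HiveServer2 listens on its default port 10000 on the current node; adjust the connection string for your cluster:

    # List databases through HiveServer2 (default port 10000 assumed)
    beeline -u jdbc:hive2://localhost:10000 -e "show databases;"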

Spark

Cause

You do not have access permissions on the bucket that is configured as the root storage directory of the cluster. As a result, when the Spark service is started in the cluster, the Spark History directory cannot be properly created.

Solution

  1. Log on to the master node of the cluster. For more information, see Log on to a cluster.

  2. Run the following command to create the Spark History directory:

    # Create the Spark History directory
    hadoop fs -mkdir /spark-history

    In the Components section of the Status tab of the Spark service in the EMR console, you can check whether the SparkHistoryServer and SparkThriftServer components start as expected.
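
    To confirm that the directory was created, you can run a quick test:

    # Exit code 0 indicates that the directory exists
    hadoop fs -test -d /spark-history && echo "/spark-history exists"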

Tez

Cause

You do not have access permissions on the bucket that is configured as the root storage directory of the cluster. As a result, when the Tez service is started in the cluster, the library files that Tez requires cannot be uploaded to the corresponding storage directories.

Solution

  1. Log on to the master node of the cluster. For more information, see Log on to a cluster.

  2. Run the following commands to repackage the Tez libraries and upload them to the cluster storage:

    # Resolve the Tez installation directory and version
    tez_dir=$(readlink $TEZ_HOME)
    tez_version=$(basename $tez_dir)

    # Package the Tez JAR files into a tarball
    cd /tmp
    mkdir -p $tez_version/lib
    cp $TEZ_HOME/*.jar $tez_version
    cp $TEZ_HOME/lib/*.jar $tez_version/lib
    tar czf $tez_version.tar.gz $tez_version

    # Upload the tarball to the cluster storage
    hadoop fs -mkdir -p /apps/$tez_version
    hadoop fs -rm -f /apps/$tez_version/$tez_version.tar.gz
    hadoop fs -put $tez_version.tar.gz /apps/$tez_version/

    # Clean up the local temporary files
    rm -fr $tez_version*

    In the Health Check Items section of the Status tab of the Tez service in the EMR console, you can check whether the status of the tez_env_status check item is normal.
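
    You can also confirm that the tarball was uploaded. The following check assumes that it runs in the same shell as the previous step, so that the tez_version variable is still set:

    # The uploaded tarball should appear in the listing
    hadoop fs -ls /apps/$tez_version/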

Flink

Cause

You do not have access permissions on the bucket that is configured as the root storage directory of the cluster. As a result, when the Flink service is started in the cluster, the Flink History directory cannot be properly created and Flink jobs that are started based on the default settings may not be able to write checkpoints or savepoints to external storage systems.

Solution

  1. Log on to the master node of the cluster. For more information, see Log on to a cluster.

  2. Run the following commands to create directories required by Flink and grant users the required permissions on the directories:

    # Create the checkpoint, job, and savepoint directories
    hdfs dfs -mkdir -p /flink/flink-checkpoints /flink/flink-jobs /flink/flink-savepoints
    # Make the directories writable by the users that run Flink jobs
    hdfs dfs -chmod -R 777 /flink
  3. Restart the Flink service. For more information, see Restart a service.

    In the Components section of the Status tab of the Flink service in the EMR console, you can check whether the FlinkHistoryServer component starts as expected. You can also start a sample Flink job to check whether checkpoints of the Flink job are written to external storage systems.
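
    A minimal sketch of such a test, assuming a standard Flink example JAR under $FLINK_HOME (the JAR path varies by Flink version) and that checkpointing is enabled for the job:

    # Submit a built-in example job in detached mode (the JAR path is
    # an assumption; adjust it for your Flink version)
    flink run -d $FLINK_HOME/examples/streaming/TopSpeedWindowing.jar
    # If checkpointing is enabled, checkpoint data should appear here
    hdfs dfs -ls /flink/flink-checkpoints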

HBase

Cause

You do not have access permissions on the bucket that is configured as the root storage directory of the cluster. As a result, when the HBase service is started in the cluster, the HBase data storage directory cannot be properly created.

Solution

  1. Log on to the master node of the cluster. For more information, see Log on to a cluster.

  2. Run the following commands to create a data storage directory and grant users the required permissions on the directory:

    # Create the HBase root directory and grant the hbase user
    # the required permissions
    hadoop fs -mkdir -p /hbase
    hadoop fs -chown hbase:hadoop /hbase
    hadoop fs -chmod 755 /hbase
  3. Restart the HBase service. For more information, see Restart a service.

    In the Components section of the Status tab of the HBase service in the EMR console, you can check whether the HBase service starts as expected.
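
    You can also check the cluster status from the command line:

    # The HBase master and region servers should be reported as online
    echo "status" | hbase shell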