This topic describes how to view environment variables and how to start and stop the service processes of E-MapReduce (EMR) clusters. You can maintain your clusters based on the instructions in this topic.

Prerequisites

An EMR cluster is created. For more information, see Create a cluster.

View environment variables

  1. Log on to your cluster in SSH mode. For more information, see Log on to a cluster.
  2. Run the env command.
    The environment variables and their values are displayed. Example:
    PRESTO_HOME=/usr/lib/presto-current
    TEZ_CONF_DIR=/etc/ecm/tez-conf
    HUDI_HOME=/usr/lib/hudi-current
    XDG_SESSION_ID=35918
    SPARK_HOME=/usr/lib/spark-current
    HOSTNAME=emr-header-1.cluster-23****
    HADOOP_LOG_DIR=/var/log/hadoop-hdfs
    SMARTDATA_CONF_DIR=/usr/lib/b2smartdata-current//conf
    ECM_AGENT_STACK_CACHE_DIR=/usr/lib/emr/ecm-agent/cache/ecm
    TERM=xterm
    SHELL=/bin/bash
    HUE_CONF_DIR=/etc/ecm/hue-conf
    HADOOP_HOME=/usr/lib/hadoop-current
    FLOW_AGENT_CONF_DIR=/etc/ecm/flow-agent-conf
    HISTSIZE=1000
    YARN_PID_DIR=/usr/lib/hadoop-current/pids
    ECM_AGENT_CACHE_DIR=/usr/lib/emr/ecm-agent/cache
    SSH_CLIENT=1.80.115.185 26289 22
    HADOOP_PID_DIR=/usr/lib/hadoop-current/pids
    EMR_HOME_DIR=/usr/lib/emr
    HADOOP_MAPRED_PID_DIR=/usr/lib/hadoop-current/pids
    SQOOP_CONF_DIR=/etc/ecm/sqoop-conf
    SQOOP_HOME=/usr/lib/sqoop-current
    BIGBOOT_MONITOR_HOME=/usr/lib/b2monitor-current/
    HCAT_HOME=/usr/lib/hive-current/hcatalog
    DATA_FACTORY_CONF_PATH=/etc/ecm/datafactory-conf
    HIVE_HOME=/usr/lib/hive-current
    PWD=/root
    JAVA_HOME=/usr/lib/jvm/java-1.8.0
    EMR_DATA_DIR=/usr/lib/emr/data
    B2MONITOR_CONF_DIR=/usr/lib/b2monitor-current//conf
    HISTCONTROL=ignoredups
    SPARK_PID_DIR=/usr/lib/spark-current/pids
    SHLVL=1
    HOME=/root
    HADOOP_MAPRED_LOG_DIR=/var/log/hadoop-mapred
    ALLUXIO_CONF_DIR=/etc/ecm/alluxio-conf
    ECM_AGENT_LOG_DIR=/usr/lib/emr/ecm-agent/log
    TEZ_HOME=/usr/lib/tez-current
    DATA_FACTORY_HOME=/usr/lib/datafactory-current
    LOGNAME=root
    EMR_LOG_DIR=/usr/lib/emr/log
    EMR_TMP_DIR=/usr/lib/emr/tmp
    XDG_RUNTIME_DIR=/run/user/0
    ECM_AGENT_HOME_DIR=/usr/lib/emr/ecm-agent
    B2SDK_CONF_DIR=/usr/lib/b2smartdata-current/conf
    HIVE_CONF_DIR=/etc/ecm/hive-conf
    _=/usr/bin/env
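
If you need only a specific variable, you can filter the output instead of reading the full list. For example, run the following commands to list the Hadoop-related variables and print the value of a single variable:

    env | grep -i hadoop    # List only Hadoop-related variables.
    echo "$HADOOP_HOME"     # Print the value of a single variable.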

Log on to the built-in MySQL

  1. Log on to your cluster in SSH mode. For more information, see Log on to a cluster.
  2. Run the following command to log on to the built-in MySQL:
    mysql -uroot -pEMRroot1234
    Note The username that is used to log on to the built-in MySQL is root and the password is EMRroot1234.
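
To verify that the logon works, you can also run a query directly from the shell. The following example lists the databases and assumes that the default password has not been changed:

    mysql -uroot -pEMRroot1234 -e "SHOW DATABASES;"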

Start and stop a single service process in the EMR console

You can start, stop, or restart a service process in the EMR console. The operations are similar for all service processes. This section describes how to start, restart, and stop the HDFS service process DataNode on the emr-worker-1 node of your cluster.

  1. Go to the Cluster Overview page.
    1. Log on to the Alibaba Cloud EMR console.
    2. In the top navigation bar, select the region in which your cluster resides and select a resource group based on your business requirements.
    3. Click the Cluster Management tab.
    4. On the Cluster Management page, find your cluster and click Details in the Actions column.
  2. In the left-side navigation pane, choose Cluster Service > HDFS.
  3. Click the Component Deployment tab.
    The HDFS service processes of the cluster are displayed.
  4. Manage the DataNode process on the emr-worker-1 node.
    1. Start the process.
      1. Find the process and click Start in the Actions column.
      2. In the Cluster Activities dialog box, specify Description and click OK.
      3. In the Confirm message, click OK.
    2. Restart the process.
      1. Find the process and click Restart in the Actions column.
      2. In the Cluster Activities dialog box, specify Description and click OK.
      3. In the Confirm message, click OK.
    3. Stop the process.
      1. Find the process and click Stop in the Actions column.
      2. In the Cluster Activities dialog box, specify Description and click OK.
      3. In the Confirm message, click OK.
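
To confirm the result on the node itself, you can log on to emr-worker-1 and check whether the DataNode process is running. A quick check:

    ps aux | grep "[D]ataNode"    # The [D] pattern excludes the grep command itself.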

Manage multiple service processes at the same time in the EMR console

This section describes how to restart the DataNode processes of HDFS on all the nodes of your cluster at the same time.

  1. Go to the Cluster Overview page.
    1. Log on to the Alibaba Cloud EMR console.
    2. In the top navigation bar, select the region in which your cluster resides and select a resource group based on your business requirements.
    3. Click the Cluster Management tab.
    4. On the Cluster Management page, find your cluster and click Details in the Actions column.
  2. In the left-side navigation pane, choose Cluster Service > HDFS.
  3. Click the Component Deployment tab.
    The HDFS service processes of the cluster are displayed.
  4. Choose Actions > Restart DataNode in the upper-right corner.
    1. In the Cluster Activities dialog box, specify Description and click OK.
    2. In the Confirm message, click OK.
    Notice After you perform a rolling restart on service processes, you cannot perform a common restart on the same processes. Otherwise, an error is reported.
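
After the restart is complete, you can check the status of the DataNodes from the master node. The following command is part of standard HDFS and shows the live DataNodes and their capacity (run it as the hdfs account if permission is denied):

    hdfs dfsadmin -report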

Start and stop a single service process by using the CLI

  • YARN
    Account: hadoop
    • ResourceManager (master node)
      • Start ResourceManager
        /usr/lib/hadoop-current/sbin/yarn-daemon.sh start resourcemanager
      • Stop ResourceManager
        /usr/lib/hadoop-current/sbin/yarn-daemon.sh stop resourcemanager
    • NodeManager (core node)
      • Start NodeManager
        /usr/lib/hadoop-current/sbin/yarn-daemon.sh start nodemanager
      • Stop NodeManager
        /usr/lib/hadoop-current/sbin/yarn-daemon.sh stop nodemanager
    • JobHistory Server (master node)
      • Start JobHistory Server
        /usr/lib/hadoop-current/sbin/mr-jobhistory-daemon.sh start historyserver
      • Stop JobHistory Server
        /usr/lib/hadoop-current/sbin/mr-jobhistory-daemon.sh stop historyserver
    • WebProxyServer (master node)
      • Start WebProxyServer
        /usr/lib/hadoop-current/sbin/yarn-daemon.sh start proxyserver
      • Stop WebProxyServer
        /usr/lib/hadoop-current/sbin/yarn-daemon.sh stop proxyserver
  • HDFS
    Account: hdfs
    • NameNode (master node)
      • Start NameNode
        /usr/lib/hadoop-current/sbin/hadoop-daemon.sh start namenode
      • Stop NameNode
        /usr/lib/hadoop-current/sbin/hadoop-daemon.sh stop namenode
    • DataNode (core node)
      • Start DataNode
        /usr/lib/hadoop-current/sbin/hadoop-daemon.sh start datanode
      • Stop DataNode
        /usr/lib/hadoop-current/sbin/hadoop-daemon.sh stop datanode
  • Hive
    Account: hadoop
    • MetaStore (master node)
      # Start MetaStore. You can set the memory size to a larger value based on your business requirements.
      HADOOP_HEAPSIZE=512 /usr/lib/hive-current/bin/hive --service metastore >/var/log/hive/metastore.log 2>&1 &
    • HiveServer2 (master node)
      # Start HiveServer2.
      HADOOP_HEAPSIZE=512 /usr/lib/hive-current/bin/hive --service hiveserver2 >/var/log/hive/hiveserver2.log 2>&1 &
  • HBase
    Account: hdfs
    Notice You can perform the following operations only if the HBase service is deployed in your cluster. Otherwise, an error is reported.
    • HMaster (master node)
      • Start HMaster
        /usr/lib/hbase-current/bin/hbase-daemon.sh start master
      • Restart HMaster
        /usr/lib/hbase-current/bin/hbase-daemon.sh restart master
      • Stop HMaster
        /usr/lib/hbase-current/bin/hbase-daemon.sh stop master
    • HRegionServer (core node)
      • Start HRegionServer
        /usr/lib/hbase-current/bin/hbase-daemon.sh start regionserver
      • Restart HRegionServer
        /usr/lib/hbase-current/bin/hbase-daemon.sh restart regionserver
      • Stop HRegionServer
        /usr/lib/hbase-current/bin/hbase-daemon.sh stop regionserver
    • Thrift Server (master node)
      • Start Thrift Server
        /usr/lib/hbase-current/bin/hbase-daemon.sh start thrift -p 9099 >/var/log/hive/thriftserver.log 2>&1 &
      • Stop Thrift Server
        /usr/lib/hbase-current/bin/hbase-daemon.sh stop thrift
  • Hue

    Account: hadoop

    • Start Hue
      su -l root -c "${HUE_HOME}/build/env/bin/supervisor >/dev/null 2>&1 &"
    • Stop Hue
      ps aux | grep hue     # Find the Hue process and note its PID.
      kill -9 <hue_pid>     # Replace <hue_pid> with the PID found above.
  • Zeppelin

    Account: hadoop

    • Start Zeppelin
      # You can set the memory size to a larger value based on your business requirements.
      su -l root -c "ZEPPELIN_MEM=\"-Xmx512m -Xms512m\" ${ZEPPELIN_HOME}/bin/zeppelin-daemon.sh start"
    • Stop Zeppelin
      su -l root -c "${ZEPPELIN_HOME}/bin/zeppelin-daemon.sh stop"
  • Presto

    Account: hdfs

    • PrestoServer (master node)
      • Start PrestoServer
        /usr/lib/presto-current/bin/launcher --config=/usr/lib/presto-current/etc/coordinator-config.properties start
      • Stop PrestoServer
        /usr/lib/presto-current/bin/launcher --config=/usr/lib/presto-current/etc/coordinator-config.properties stop
    • PrestoServer (core node)
      • Start PrestoServer
        /usr/lib/presto-current/bin/launcher --config=/usr/lib/presto-current/etc/worker-config.properties start
      • Stop PrestoServer
        /usr/lib/presto-current/bin/launcher --config=/usr/lib/presto-current/etc/worker-config.properties stop
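
The Account line for each service above indicates the account that the process must run as. If you are logged on as root, switch to that account before you run a command. A minimal sketch that starts DataNode as the hdfs account:

    su -l hdfs -c "/usr/lib/hadoop-current/sbin/hadoop-daemon.sh start datanode"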

Manage multiple service processes at the same time by using the CLI

You can manage the service processes on all the core nodes of a cluster at the same time by running script commands. In an EMR cluster, the master node can log on to all worker nodes over SSH without a password by using the hadoop or hdfs account.

For example, you can run the following command to stop the NodeManager process on all the core nodes. In this example, the number of core nodes is 10.
for i in $(seq 1 10); do ssh emr-worker-$i /usr/lib/hadoop-current/sbin/yarn-daemon.sh stop nodemanager; done
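
You can adapt the same loop to other operations. For example, the following commands start the NodeManager processes again and then verify that they are running on each node. This sketch assumes the same emr-worker-$i naming scheme and 10 core nodes:

    # Start NodeManager on all core nodes.
    for i in $(seq 1 10); do ssh emr-worker-$i /usr/lib/hadoop-current/sbin/yarn-daemon.sh start nodemanager; done
    # Check that the process is running on each node.
    for i in $(seq 1 10); do ssh emr-worker-$i "ps aux | grep [N]odeManager"; done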