If jobs and execution plans alone do not meet your application's requirements, you can log on to the master node of the E-MapReduce cluster directly. The cluster details page shows the public network IP address of the master node; use it to log on to the master node through SSH and inspect settings and states.

Common variables

Some relevant environment variables have been set on the machine, including the following common ones:
  • JAVA_HOME

  • HADOOP_HOME

  • HADOOP_CONF_DIR

  • HADOOP_LOG_DIR

  • YARN_LOG_DIR

  • HIVE_HOME

  • HIVE_CONF_DIR

  • PIG_HOME

  • PIG_CONF_DIR

You can reference these variables in scripts. However, to avoid unexpected E-MapReduce errors, we recommend that you do not change their values.
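For example, a job script can reference these variables instead of hard-coding paths. The following is a minimal sketch; the fallback values after `:-` are illustrative assumptions only, not guaranteed E-MapReduce defaults.

```shell
#!/usr/bin/env bash
# Sketch: reference the preset E-MapReduce variables instead of hard-coding
# paths. The fallback values after :- are illustrative assumptions, not
# guaranteed E-MapReduce defaults.
java_bin="${JAVA_HOME:-/usr/lib/jvm/java}/bin/java"
conf_dir="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
echo "Java binary:   $java_bin"
echo "Hadoop config: $conf_dir"
```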

Connect to the master node

  1. Run the following SSH command to log on to the master node. You can obtain the public network IP address of the master node from the host information column on the cluster details page.
    ssh root@ip.of.master
  2. Enter the password that was set when the cluster was created.

Connect to a cluster using SSH without a password

To manage or operate a cluster, you first need to log on to it. You can log on to the master node through SSH without a password; by default, the master node exposes a public network IP address. The procedure is as follows:
  1. Log on to the master node as the root user with the password.
  2. Switch to the hadoop or hdfs user.

Connect to the master node using SSH on Linux

  1. Copy the private key to the local machine.
    sz ~/.ssh/id_rsa
  2. Return to your local machine and attempt to connect to the master again.
    ssh -i private_key_path/id_rsa hadoop@server_ip_address

    If this is your only private key, you can place it in ~/.ssh/; SSH uses ~/.ssh/id_rsa by default, so the -i option can then be omitted.
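To avoid specifying -i and the IP address on every connection, you can also add an entry to your local ~/.ssh/config. The alias emr-master and the paths below are illustrative assumptions; substitute your own values.

```
# ~/.ssh/config on the local machine (illustrative entry)
Host emr-master
    # Public IP address from the cluster details page
    HostName ip.of.master
    User hadoop
    # The private key copied from the master node
    IdentityFile ~/.ssh/id_rsa
```

With this entry in place, running ssh emr-master connects directly.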

Connect to the master node using SSH on Windows

On Windows, there are two ways to connect to the master node through SSH without a password:
  • Using PuTTY
    1. Open the PuTTY download page.
    2. Download PuTTYgen.
    3. Open PuTTYgen and load your private key.
      Notice Keep the private key safe. In the event of accidental disclosure, generate a new private key immediately.
    4. Use the default configuration and save the private key to obtain a PuTTY key file with the .ppk suffix.
    5. Open PuTTY and select Session on the configuration page.
    6. Enter the public network IP address of the target machine and add the user name (for example, hadoop@MasterNodeIP).
    7. Select Connection > SSH > Auth on the configuration page.
    8. Select the generated PPK file.
    9. Click Open to log on to the master node.
  • Using Cygwin or MinGW

    These tools simulate a Linux environment on Windows.

    To use this method, follow the Linux SSH procedure described above.

    The MinGW method is recommended because it is the most compact. If its official website cannot be opened, download a Git client instead; the bundled Git Bash provides the same environment.

View the web UI of Hadoop, Spark, Ganglia, and other systems

Note You must have finished the passwordless SSH logon before performing this step.
For security reasons, the web UI ports of the Hadoop, Spark, Ganglia, and other monitoring systems in the E-MapReduce cluster are not open to the public. To visit these web UIs, you must build an SSH tunnel for port forwarding. The following two methods are available:
Notice The following operations should be completed on your local machine, and not on the machine in the cluster.
  • Dynamic port forwarding
    Create an SSH tunnel that dynamically forwards local port connections to the master node of the E-MapReduce cluster.
    ssh -i /path/id_xxx -ND 8157 hadoop@masterNodeIP

    Here, 8157 is any unused port on the local machine; you can choose a different one.

    After dynamic forwarding, you can view the following:
    • Recommended methods

      We recommend that you use Chrome as your browser. View the web UI as follows:
      chrome --proxy-server="socks5://localhost:8157" --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" --user-data-dir=/tmp/

      On Windows, write the user data directory as a path such as d:/tmppath. On Linux or OS X, /tmp/ can be used directly.

      The location of Chrome depends on the operating system:
      • Mac OS X: /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
      • Linux: /usr/bin/google-chrome
      • Windows: C:\Program Files (x86)\Google\Chrome\Application\chrome.exe
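The locations listed above can be wired into a small helper. This sketch only prints the proxy-enabled launch command for the detected operating system rather than executing it; the function name chrome_path is my own, and port 8157 must match the tunnel port opened earlier.

```shell
#!/usr/bin/env bash
# Sketch: pick the Chrome binary for the current OS from the locations listed
# above, and print the launch command instead of executing it.
chrome_path() {
  case "$1" in
    Darwin) echo "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" ;;
    Linux)  echo "/usr/bin/google-chrome" ;;
    *)      echo "C:/Program Files (x86)/Google/Chrome/Application/chrome.exe" ;;
  esac
}
chrome="$(chrome_path "$(uname -s)")"
echo "\"$chrome\" --proxy-server=\"socks5://localhost:8157\" --host-resolver-rules=\"MAP * 0.0.0.0 , EXCLUDE localhost\" --user-data-dir=/tmp/"
```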
    • Plug-in method

      After your local machine has built an SSH tunnel to the master node of the E-MapReduce cluster, configure a local proxy to view the web UI of Hadoop, Spark, or Ganglia in your browser. Complete the following steps:
      1. For Chrome or Firefox, download and install the FoxyProxy Standard extension.
      2. After installing it and restarting your browser, open a text editor and create a file with the following content:
        <?xml version="1.0" encoding="UTF-8"?>
        <foxyproxy>
        <proxies>
        <proxy name="aliyun-emr-socks-proxy" id="2322596116" notes="" fromSubscription="false" enabled="true" mode="manual" selectedTabIndex="2" lastresort="false" animatedIcons="true" includeInCycle="true" color="#0055E5" proxyDNS="true" noInternalIPs="false" autoconfMode="pac" clearCacheBeforeUse="false" disableCache="false" clearCookiesBeforeUse="false" rejectCookies="false">
        <matches>
        <match enabled="true" name="120.*" pattern="http://120.*" isRegEx="false" isBlackList="false" isMultiLine="false" caseSensitive="false" fromSubscription="false" ></match>
        </matches>
        <manualconf host="localhost" port="8157" socksversion="5" isSocks="true" username="" password="" domain="" ></manualconf>
        </proxy>
        </proxies>
        </foxyproxy>
        Specifically:
        • Port 8157 is the local port that your machine uses to build the SSH tunnel with the cluster's master node. It must be the same port used in the ssh command executed in the terminal.
        • 120.* matches the public IP address of the master node. Adjust the pattern according to your master node's IP address.
      3. In a browser, click the FoxyProxy button and then select Options.
      4. Select Import/Export.
      5. Select the XML file you edited in Step 2 and click Open.
      6. In the Import FoxyProxy Setting dialog box, click Add.
      7. In a browser, click the FoxyProxy button and select Use Proxy aliyun-emr-socks-proxy for all URLs.
      8. Enter http://masterNodeIP:8088 in your browser to open the Hadoop interface. The address must match the 120.* pattern configured in Step 2.
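Before configuring the browser, you can also check the tunnel from the command line. This sketch only prints the check command; it assumes the ssh -ND 8157 command above is still running, that curl is installed locally, and that masterNodeIP is replaced with your master node's public IP address.

```shell
# Sketch: a command-line check of the SOCKS tunnel before configuring the
# browser. masterNodeIP is a placeholder for the master node's public IP.
check_cmd='curl --socks5-hostname localhost:8157 http://masterNodeIP:8088/'
echo "Run: $check_cmd"
```

If the tunnel is working, the command returns the HTML of the ResourceManager page.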
  • Local port forwarding
    Notice One disadvantage of local port forwarding is that only the outermost page can be viewed; opening detailed job information results in an error.
    ssh -i /path/id_rsa -N -L 8157:masterNodeIP:8088 hadoop@masterNodeIP
    The parameters are described as follows:
    • path: Private key storage path.
    • masterNodeIP: IP address of the master node to be connected.
    • 8088: Access port number of ResourceManager on the master node.
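Putting the parameters together, the whole flow looks like the following sketch. The key path, IP address, and local port are placeholders to replace; the script only prints the command instead of executing it.

```shell
# Sketch: assemble the local-forwarding command from the parameters above.
KEY=/path/id_rsa      # private key storage path (placeholder)
MASTER=masterNodeIP   # master node IP address (placeholder)
LOCAL_PORT=8157       # any unused local port
echo "ssh -i $KEY -N -L $LOCAL_PORT:$MASTER:8088 hadoop@$MASTER"
echo "Then open http://localhost:$LOCAL_PORT in a local browser."
```

Once the tunnel is established, browsing http://localhost:8157 on the local machine reaches port 8088 (the ResourceManager UI) on the master node.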