Connect to the cluster using SSH

Last Updated: Mar 13, 2018

If the jobs and execution plans available on the console page cannot meet your more complex application requirements, you can log on to the master node of the E-MapReduce cluster. The public IP address of the master node is shown on the cluster details page. You can log on to the master node through SSH to view various settings and states.

Relevant environment variables have been set on the machine, including the following common ones:

  • JAVA_HOME

  • HADOOP_HOME

  • HADOOP_CONF_DIR

  • HADOOP_LOG_DIR

  • YARN_LOG_DIR

  • HIVE_HOME

  • HIVE_CONF_DIR

  • PIG_HOME

  • PIG_CONF_DIR

You can reference these variables in your scripts. However, we recommend that you do not change them, to avoid unexpected E-MapReduce errors.
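As a minimal sketch of reading these variables from a script without changing them, the following POSIX shell function prints the current value of each variable listed above. The function name show_emr_env is made up for this illustration and is not part of E-MapReduce.

```shell
#!/bin/sh
# Print each variable from the list above without modifying it.
# Variables that are unset or empty are reported as "<not set>".
# show_emr_env is a hypothetical helper name used only in this sketch.
show_emr_env() {
  for name in JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR HADOOP_LOG_DIR \
              YARN_LOG_DIR HIVE_HOME HIVE_CONF_DIR PIG_HOME PIG_CONF_DIR; do
    # Read the variable whose name is stored in $name (read-only access).
    eval "value=\${$name-}"
    printf '%s=%s\n' "$name" "${value:-<not set>}"
  done
}

show_emr_env
```

Running this on the master node gives a quick check that the environment is as expected before a script relies on it.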

Connect to the Master

  1. Log on to the master node through SSH with the following command. You can obtain the public IP address of the master node in the hardware information column on the cluster details page.

    ssh root@ip.of.master
  2. Enter the password that you set when you created the cluster.

Connect to a cluster using SSH without a password

You often need to connect to the cluster for management and operations. To make this easier, you can set up password-free SSH logon to the master node (by default, the master node has a public IP address). The procedure is as follows:

  1. Connect to the master node as root with the password, as described previously.

  2. Switch to the hadoop or hdfs user.

    su hadoop

Connect to the Master Node using SSH on Linux, Unix, and Mac OS X

  1. Copy the private key to the local machine.

    sz ~/.ssh/id_rsa
  2. Return to your local machine and connect to the master node again.

    ssh -i private_key_path/id_rsa hadoop@server_ip_address

    If you have only one private key, you can put it in your ~/.ssh/ directory, where it is used by default without specifying -i.
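OpenSSH clients refuse a private key file that other users can read, so the copied key generally needs owner-only permissions before the ssh -i command works. The following is a minimal sketch of that check. The helper name secure_key is an assumption made for this example.

```shell
#!/bin/sh
# Restrict a private key file to its owner so that ssh accepts it.
# secure_key is a hypothetical helper name used only in this sketch.
# Usage: secure_key /path/to/id_rsa
secure_key() {
  key="$1"
  if [ ! -f "$key" ]; then
    echo "key not found: $key" >&2
    return 1
  fi
  # 600 = read and write for the owner, no access for anyone else.
  chmod 600 "$key"
}

# Example with the same placeholders as the command above:
#   secure_key private_key_path/id_rsa && ssh -i private_key_path/id_rsa hadoop@server_ip_address
```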

Connect to the Master Node using SSH on Windows

On Windows, you can connect to the master node through SSH without a password in several ways.

Method I: PuTTY

  1. Download PuTTY.

  2. Download PuTTYgen from the same location.

  3. Open PuTTYgen and load your private key.

    Note: Keep the private key safe. If it is accidentally disclosed, generate a new private key immediately to replace it.

  4. Use the default configuration and save the private key. This produces a PuTTY private key file with the .ppk extension.

  5. Run PuTTY and select Session on the configuration page.

  6. Enter the public IP address of the target machine that you want to connect to, prefixed with the user name (for example, hadoop@MasterNodeIP).

  7. On the configuration page, expand Connection, expand SSH, and then select Auth.

  8. Select the generated ppk file.

  9. Click Open to log on to the master node automatically.

Method II: Cygwin | MinGW

Cygwin and MinGW are convenient tools for emulating a Linux environment on Windows.

For this method, follow the preceding SSH procedure for Linux.

MinGW is recommended because it is the most compact. If the official website cannot be opened, download a Git client and use the Git Bash that comes with it.

View the web UIs of Hadoop, Spark, Ganglia, and other systems

Note: Make sure that you have completed the preceding password-free SSH logon setup before this step.

For security, the web UI ports of Hadoop, Spark, Ganglia, and other systems in the E-MapReduce cluster are not open to the public network. To visit these web UIs, you need to create an SSH tunnel and forward a port. The following two methods are available:

Note: The following operations are performed on your local machine, not on a machine in the cluster.

Method I: Dynamic port forwarding

Create an SSH tunnel that dynamically forwards connections from a port on the local machine to the master node in the E-MapReduce cluster.

    ssh -i /path/id_xxx -ND 8157 hadoop@masterNodeIP

8157 can be replaced with any port that is not in use on the local machine.

After the tunnel is established, you can view the web UIs as follows:

  • Recommended method

    We recommend that you use the Chrome browser. Visit the web UIs as follows:

    The Chrome location varies with operating systems. See the following table.

    Operating System    Chrome Location
    Mac OS X            /Applications/Google Chrome.app/Contents/MacOS/Google Chrome
    Linux               /usr/bin/google-chrome
    Windows             C:\Program Files (x86)\Google\Chrome\Application\chrome.exe

    chrome --proxy-server="socks5://localhost:8157" --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" --user-data-dir=/tmppath/

    Replace chrome with the actual location from the preceding table.

    On Windows, tmppath can be written as a path such as d:/tmppath. On Linux or OS X, /tmp/ can be used directly.
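The lookup in the table above can be sketched as a small POSIX shell helper that picks the Chrome location from the output of uname -s and then starts Chrome through the tunnel. The function names chrome_path and open_webui_browser are assumptions made for this example, and only Mac OS X and Linux are covered. On Windows, use the path from the table directly.

```shell
#!/bin/sh
# Map an operating system name (as printed by `uname -s`) to the Chrome
# location listed in the table above. Returns non-zero for other systems.
# chrome_path and open_webui_browser are hypothetical helper names.
chrome_path() {
  case "$1" in
    Darwin) printf '%s\n' "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" ;;
    Linux)  printf '%s\n' "/usr/bin/google-chrome" ;;
    *)      return 1 ;;
  esac
}

# Start Chrome through the SOCKS proxy opened by `ssh -ND <port>`.
# The first argument is the local port; it defaults to 8157 as in the text.
open_webui_browser() {
  port="${1:-8157}"
  chrome="$(chrome_path "$(uname -s)")" || {
    echo "unsupported operating system for this sketch" >&2
    return 1
  }
  "$chrome" --proxy-server="socks5://localhost:$port" \
    --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" \
    --user-data-dir=/tmp/
}
```

With the dynamic forwarding command already running in another terminal, calling open_webui_browser 8157 would open a Chrome window whose traffic goes through the tunnel.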

Method II: Local port forwarding

Note: A disadvantage of local port forwarding is that only the top-level page of the web UI can be viewed. Attempting to view detailed job information results in an error.

    ssh -i /path/id_rsa -N -L 8157:masterNodeIP:8088 hadoop@masterNodeIP

Parameter description:

  • path: Private key storage path.

  • masterNodeIP: IP address of the master node to be connected.

  • 8088: Access port number of ResourceManager on the master node.
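The -L argument has the form local_port:remote_host:remote_port. As a rough sketch of how the parameters above fit together, the following POSIX shell functions build and print the command without running it. The function names forward_spec and forward_command are made up for this illustration.

```shell
#!/bin/sh
# Build the argument for ssh -L in the form local_port:remote_host:remote_port.
# forward_spec and forward_command are hypothetical helper names.
forward_spec() {
  printf '%s:%s:%s\n' "$1" "$2" "$3"
}

# Print (without running) the ssh command that forwards one port.
# Arguments: private key path, local port, master node IP, remote port.
forward_command() {
  printf 'ssh -i %s -N -L %s hadoop@%s\n' "$1" "$(forward_spec "$2" "$3" "$4")" "$3"
}

forward_command /path/id_rsa 8157 masterNodeIP 8088
# prints: ssh -i /path/id_rsa -N -L 8157:masterNodeIP:8088 hadoop@masterNodeIP
```

While such a tunnel is open, the ResourceManager page should be reachable in a local browser at http://localhost:8157, because the local port 8157 is forwarded to port 8088 on the master node.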