This topic describes how to connect to the master node of an E-MapReduce (EMR) cluster in SSH mode.

Prerequisites

  • A cluster is created. For more information, see Create a cluster.
  • Port 22 is enabled for the security group where your cluster resides. You can turn on Remote Logon during the creation of a cluster or manually add security group rules after a cluster is created. For more information, see Add security group rules.
    Note When you add inbound security group rules, you must set Authorization Type to IPv4 CIDR Block and Port Range to 22/22.
  • Your local machine can reach the master node of the cluster over the network. You can turn on Assign Public IP Address during the creation of a cluster to associate an EIP address with your cluster. Alternatively, you can assign a fixed public IP address or an EIP address to the master node of your cluster in the ECS console after a cluster is created. For more information, see Bind an ENI.
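Before you try to log on, it can help to confirm that port 22 on the master node is reachable from your local machine. The following is a minimal sketch, assuming that the nc (netcat) utility is installed locally. The function name check_ssh_port is hypothetical and is not part of EMR.

```shell
# Hypothetical helper: report whether a TCP port on a host accepts connections.
# Usage: check_ssh_port <public-ip> [port]
check_ssh_port() {
    host=${1:-}
    port=${2:-22}
    if [ -z "$host" ]; then
        echo "usage: check_ssh_port <public-ip> [port]" >&2
        return 2
    fi
    # -z: only probe the port and send no data; -w 5: give up after 5 seconds.
    if nc -z -w 5 "$host" "$port"; then
        echo "port $port on $host is reachable"
    else
        echo "port $port on $host is not reachable" >&2
        return 1
    fi
}
```

Call the function with the public IP address of the master node. A failure usually points to a missing security group rule for port 22 or to a master node without a public IP address.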

Background information

After your local machine is connected to the master node of your cluster in SSH mode, you can run Linux commands to monitor the cluster and communicate with the cluster. You can also create an SSH tunnel to view the web UIs of open source components. For more information, see Create an SSH tunnel to access web UIs of open source components.
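As a rough illustration of the SSH tunnel mentioned above, the following sketch opens a SOCKS proxy through the master node by using dynamic port forwarding. The function name open_emr_tunnel and the default local port 8157 are assumptions made for this example. The referenced topic remains the authoritative procedure.

```shell
# Hypothetical helper: open a SOCKS proxy through the master node.
# Usage: open_emr_tunnel <private-key-file> <public-ip> [local-port]
open_emr_tunnel() {
    key_file=${1:-}
    host=${2:-}
    local_port=${3:-8157}   # assumed default; any free local port works
    if [ -z "$key_file" ] || [ -z "$host" ]; then
        echo "usage: open_emr_tunnel <private-key-file> <public-ip> [local-port]" >&2
        return 2
    fi
    case $local_port in
        *[!0-9]*)
            echo "local port must be a number: $local_port" >&2
            return 2
            ;;
    esac
    # -N: do not run a remote command; -D: dynamic (SOCKS) port forwarding.
    ssh -i "$key_file" -N -D "$local_port" "root@$host"
}
```

The command stays in the foreground while the tunnel is open. Press Ctrl+C to close the tunnel.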

Obtain the public IP address of the master node

  1. Log on to the Alibaba Cloud EMR console.
  2. In the top navigation bar, select the region where your cluster resides. Select the resource group as required. By default, all resources of the account appear.
  3. Click the Cluster Management tab.
  4. On the Cluster Management page that appears, find the target cluster and click Details in the Actions column.
  5. In the Instance Info section of the Cluster Overview page, view the public IP address of the master node.

Connect to the master node by using an SSH key pair

Note For information about how to obtain the public IP address of the master node, see Obtain the public IP address of the master node.
Procedure:
  • Connect from your local machine that runs a Linux operating system
    The following steps use the private key file ecs.pem as an example:
    1. Obtain the path where the ecs.pem file is stored on your local machine.
    2. Run the following command to restrict the permissions of the private key file so that only the file owner can read it:
      chmod 400 ~/.ssh/ecs.pem

      ~/.ssh/ecs.pem is the path where the ecs.pem file is stored on your local machine.

    3. Run the following command to connect to the master node:
      ssh -i ~/.ssh/ecs.pem root@10.10.xx.xx

      10.10.xx.xx indicates the public IP address of the master node.

  • Connect from your local machine that runs a Windows operating system
    Perform the following steps to log on to the master node:
    1. Download PuTTY and PuTTYgen.
    2. Convert the format of the private key file from .pem to .ppk.
      1. Run PuTTYgen. In this example, PuTTYgen 0.73 is used.
      2. In the Actions section, click Load to import the private key file that is saved when you create a cluster.

        Make sure that the file type filter is set to All files (*.*) so that the .pem file is listed.

      3. Select the specific .pem file and click Open.
      4. Click Save private key.
      5. In the dialog box that appears, click Yes. Specify a name for the .ppk file and click Save.

        The .ppk file is saved to your local machine. In this example, the file is named kp-123.ppk.

    3. Run PuTTY.
    4. Choose Connection > SSH > Auth in the left-side navigation pane. Click Browse below Private key file for authentication to select the .ppk file.
    5. Click Session. Enter the logon account and the public IP address of the master node in the Host Name (or IP address) field.

      The format is root@[Public IP address of the master node], for example, root@10.10.xx.xx.

    6. Click Open.
      If a shell prompt of the master node appears, the logon succeeds.
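For the Linux procedure in this section, the permission change and the logon command can be combined into one small helper. This is a sketch, not an official tool. The function name connect_with_key is hypothetical, and the root user matches the commands shown above.

```shell
# Hypothetical helper: fix key permissions, then log on to the master node.
# Usage: connect_with_key <private-key-file> <public-ip>
connect_with_key() {
    key_file=${1:-}
    host=${2:-}
    if [ -z "$key_file" ] || [ -z "$host" ]; then
        echo "usage: connect_with_key <private-key-file> <public-ip>" >&2
        return 2
    fi
    if [ ! -f "$key_file" ]; then
        echo "private key file not found: $key_file" >&2
        return 1
    fi
    # ssh rejects private keys that other users can read,
    # so limit the file to owner read-only first.
    chmod 400 "$key_file" || return 1
    ssh -i "$key_file" "root@$host"
}
```

For example, pass ~/.ssh/ecs.pem as the first argument and the public IP address of the master node as the second argument.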

Connect to the master node in SSH password mode

Note The following operations use the root user and the password that you specified when you created the cluster. For information about how to obtain the public IP address of the master node, see Obtain the public IP address of the master node.
Procedure:
  • Connect from your local machine that runs a Linux operating system

    Run the following command in the command-line interface (CLI) of your local machine to connect to the master node:

    ssh root@[Public IP address of the master node]
  • Connect from your local machine that runs a Windows operating system
    1. Download and install PuTTY.

      Download link: PuTTY.

    2. Start PuTTY.
    3. Configure the parameters required to connect to a Linux master node.
      • Host Name (or IP address): Specify the fixed public IP address of the master node or the EIP address associated with the master node.
      • Port: Enter port number 22.
      • Connection type: Select SSH.
      • Saved Sessions: Optional. Enter a name that is easy to identify and click Save. A saved session stores connection details such as the public IP address, so you do not need to enter them again the next time you log on to the master node.
    4. Click Open.
    5. Specify the username (root by default) and press Enter.

      Then enter the password and press Enter. The characters of the password are not displayed as you type.
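On Linux, if your local machine has SSH keys loaded, the ssh client may offer them before it prompts for a password. The sketch below forces password authentication by using standard OpenSSH client options. The function name connect_with_password is hypothetical.

```shell
# Hypothetical helper: log on as root and force password authentication.
# Usage: connect_with_password <public-ip>
connect_with_password() {
    host=${1:-}
    if [ -z "$host" ]; then
        echo "usage: connect_with_password <public-ip>" >&2
        return 2
    fi
    # PreferredAuthentications=password: try the password method only.
    # PubkeyAuthentication=no: do not offer any local key pairs.
    ssh -o PreferredAuthentications=password \
        -o PubkeyAuthentication=no \
        "root@$host"
}
```

The ssh client prompts for the root password that you specified when you created the cluster.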

Appendix: Environment variables of a cluster

Notice We recommend that you do not change the values of these variables. Otherwise, unexpected errors may occur.
Environment variables:
  • JAVA_HOME
  • HADOOP_HOME
  • HADOOP_CONF_DIR
  • HADOOP_LOG_DIR
  • YARN_LOG_DIR
  • HIVE_HOME
  • HIVE_CONF_DIR
  • PIG_HOME
  • PIG_CONF_DIR
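After you log on to the master node, you can inspect these variables without changing them. The following read-only sketch prints the value of each variable name that you pass to it. The function name show_env_vars is hypothetical, and which variables are actually set depends on the components installed in your cluster.

```shell
# Hypothetical helper: print the value of each named environment variable.
# Usage: show_env_vars NAME [NAME ...]
show_env_vars() {
    for name in "$@"; do
        # Accept only valid variable names, because the name is used in eval.
        case $name in
            ''|[!A-Za-z_]*|*[!A-Za-z0-9_]*)
                echo "invalid variable name: $name" >&2
                return 1
                ;;
        esac
        # Look up the variable by name; an unset variable yields an empty value.
        eval "value=\${$name-}"
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is not set\n' "$name"
        fi
    done
}
```

For example, running show_env_vars JAVA_HOME HADOOP_HOME HIVE_HOME prints one line per variable.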