A gateway is an ECS server in the same intranet as an E-MapReduce cluster. You can use gateways for load balancing and security isolation or to submit jobs to E-MapReduce clusters.

You can create a gateway in the following two ways:

Create a gateway on the E-MapReduce console

E-MapReduce gateways support only Hadoop clusters, so before you create a gateway you must create a Hadoop cluster. To create an E-MapReduce gateway, complete the following steps:
  1. Log on to the E-MapReduce console.
  2. Click Create Gateway.
  3. Enter the required information on the Create Gateway page.
    • Billing Method:
      • Subscription: You pay up front for a specified period of time. This method is more cost-effective than Pay-As-You-Go; the discount grows with the subscription length and is largest when you pay for three years at a time.
      • Pay-As-You-Go: Charges by the hour. This method calculates fees based on the number of hours that you use the product.
    • Cluster: Select the cluster to which the gateway submits jobs. The gateway is automatically configured with a Hadoop environment consistent with that cluster.
    • Configuration: The available ECS instance specifications in the zone.
    • System Disk Type: The system disk type of the gateway node. There are two types of system disk: SSD cloud disk and efficient cloud disk. The displayed type varies according to the server model and region. By default, the system disk is released when the cluster is released.
    • System Disk Size: The minimum size is 40 GB and the maximum is 500 GB. The default value is 300 GB.
    • Data Disk Type: The data disk type of the gateway node. There are two types of data disk: SSD cloud disk and efficient cloud disk. The displayed type varies according to the server model and region. By default, the data disk is released when the cluster is released.
    • Data Disk Size: The minimum size is 200 GB and the maximum is 4000 GB. The default value is 300 GB.
    • Quantity: The number of data disks. The minimum is 1 and the maximum is 10.
    • Cluster Name: The name of the gateway. It can contain 1 to 64 characters. Only Chinese characters, letters, numbers, hyphens (-), and underscores (_) are allowed.
    • Password/Key Pair:
      • Password Mode: Enter the password for logging on to the gateway in the text box.
      • Key Pair Mode: Select the name of the key pair for logging on to the gateway from the drop-down menu. If no key pair has been created yet, click Create Key Pair on the right to go to the ECS console. Keep the .pem private key file that corresponds to the key pair confidential. After the gateway is created, the public key of the key pair is automatically bound to the ECS instance where the gateway runs. When you log on to the gateway through SSH, authenticate with the private key file.
  4. Click Create to save the configurations.

    If the creation is successful, the newly created gateway is displayed in the cluster list and the status in the Status column becomes Idle.

Create a gateway manually

  • Network environment

    Make sure that the gateway machine is in the security group of the corresponding E-MapReduce cluster. This allows the gateway nodes to have easy access to the E-MapReduce cluster. For more information about setting the security group of a machine, see Create a security group.

  • Software environment
    • System environment: CentOS 7.2+ is recommended.
    • Java environment: JDK 1.7 or later must be installed. OpenJDK 1.8.0 is recommended.
  • Procedure
    • E-MapReduce 2.7 or later, 3.2 or later

      To create a gateway in these versions, we recommend that you use the E-MapReduce console.

      If you want to set up a gateway manually, copy the following script to the gateway host and run it: sh deploy.sh master_ip master_password_file.
      • deploy.sh is the script name.
      • master_ip is the IP address of the cluster's master node, which must be reachable from the gateway.
      • master_password_file is a file that contains the root password of the master node.
      #!/usr/bin/bash
      if [ $# != 2 ]
      then
         echo "Usage: $0 master_ip master_password_file"
         exit 1;
      fi
      masterip=$1
      masterpwdfile=$2
      if ! type sshpass >/dev/null 2>&1; then
         yum install -y sshpass
      fi
      if ! type java >/dev/null 2>&1; then
         yum install -y java-1.8.0-openjdk
      fi
      mkdir -p /opt/apps
      mkdir -p /etc/ecm
      echo "Start to copy package from $masterip to local gateway(/opt/apps)"
      echo " -copying hadoop-2.7.2"
      sshpass -f $masterpwdfile scp -r -o 'StrictHostKeyChecking no' root@$masterip:/usr/lib/hadoop-current /opt/apps/
      echo " -copying hive-2.0.1"
      sshpass -f $masterpwdfile scp -r root@$masterip:/usr/lib/hive-current /opt/apps/
      echo " -copying spark-2.1.1"
      sshpass -f $masterpwdfile scp -r root@$masterip:/usr/lib/spark-current /opt/apps/
      echo "Start to link /usr/lib/\${app}-current to /opt/apps/\${app}"
      if [ -L /usr/lib/hadoop-current ]
      then
         unlink /usr/lib/hadoop-current
      fi
      ln -s /opt/apps/hadoop-current  /usr/lib/hadoop-current
      if [ -L /usr/lib/hive-current ]
      then
         unlink /usr/lib/hive-current
      fi
      ln -s /opt/apps/hive-current  /usr/lib/hive-current
      if [ -L /usr/lib/spark-current ]
      then
         unlink /usr/lib/spark-current
      fi
      ln -s /opt/apps/spark-current /usr/lib/spark-current
      echo "Start to copy conf from $masterip to local gateway(/etc/ecm)"
      sshpass -f $masterpwdfile scp -r root@$masterip:/etc/ecm/hadoop-conf  /etc/ecm/hadoop-conf
      sshpass -f $masterpwdfile scp -r root@$masterip:/etc/ecm/hive-conf /etc/ecm/hive-conf
      sshpass -f $masterpwdfile scp -r root@$masterip:/etc/ecm/spark-conf /etc/ecm/spark-conf
      echo "Start to copy environment from $masterip to local gateway(/etc/profile.d)"
      sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/hdfs.sh /etc/profile.d/
      sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/yarn.sh /etc/profile.d/
      sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/hive.sh /etc/profile.d/
      sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/spark.sh /etc/profile.d/
      if [ -L /usr/lib/jvm/java ]
      then
         unlink /usr/lib/jvm/java
      fi
      echo "" >>/etc/profile.d/hdfs.sh
      echo export JAVA_HOME=/usr/lib/jvm/jre-1.8.0 >>/etc/profile.d/hdfs.sh
      echo "Start to copy host info from $masterip to local gateway(/etc/hosts)"
      sshpass -f $masterpwdfile scp root@$masterip:/etc/hosts /etc/hosts_bak
      cat /etc/hosts_bak | grep emr | grep cluster >>/etc/hosts
      if ! id hadoop >& /dev/null
      then
         useradd hadoop
      fi
    • E-MapReduce 2.7 or earlier, 3.2 or earlier
      Copy the following script to the gateway host and run it: sh deploy.sh master_ip master_password_file.
      • deploy.sh is the script name.
      • master_ip is the IP address of the cluster's master node, which must be reachable from the gateway.
      • master_password_file is a file that contains the root password of the master node.
      #!/usr/bin/bash
      if [ $# != 2 ]
      then
         echo "Usage: $0 master_ip master_password_file"
         exit 1;
      fi
      masterip=$1
      masterpwdfile=$2
      if ! type sshpass >/dev/null 2>&1; then
         yum install -y sshpass
      fi
      if ! type java >/dev/null 2>&1; then
         yum install -y java-1.8.0-openjdk
      fi
      mkdir -p /opt/apps
      mkdir -p /etc/emr
      echo "Start to copy package from $masterip to local gateway(/opt/apps)"
      echo " -copying hadoop-2.7.2"
      sshpass -f $masterpwdfile scp -r -o 'StrictHostKeyChecking no' root@$masterip:/usr/lib/hadoop-current /opt/apps/
      echo " -copying hive-2.0.1"
      sshpass -f $masterpwdfile scp -r root@$masterip:/usr/lib/hive-current /opt/apps/
      echo " -copying spark-2.1.1"
      sshpass -f $masterpwdfile scp -r root@$masterip:/usr/lib/spark-current /opt/apps/
      echo "Start to link /usr/lib/\${app}-current to /opt/apps/\${app}"
      if [ -L /usr/lib/hadoop-current ]
      then
         unlink /usr/lib/hadoop-current
      fi
      ln -s /opt/apps/hadoop-current  /usr/lib/hadoop-current
      if [ -L /usr/lib/hive-current ]
      then
         unlink /usr/lib/hive-current
      fi
      ln -s /opt/apps/hive-current  /usr/lib/hive-current
      if [ -L /usr/lib/spark-current ]
      then
         unlink /usr/lib/spark-current
      fi
      ln -s /opt/apps/spark-current /usr/lib/spark-current
      echo "Start to copy conf from $masterip to local gateway(/etc/emr)"
      sshpass -f $masterpwdfile scp -r root@$masterip:/etc/emr/hadoop-conf  /etc/emr/hadoop-conf
      sshpass -f $masterpwdfile scp -r root@$masterip:/etc/emr/hive-conf /etc/emr/hive-conf
      sshpass -f $masterpwdfile scp -r root@$masterip:/etc/emr/spark-conf /etc/emr/spark-conf
      echo "Start to copy environment from $masterip to local gateway(/etc/profile.d)"
      sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/hadoop.sh /etc/profile.d/
      if [ -L /usr/lib/jvm/java ]
      then
         unlink /usr/lib/jvm/java
      fi
      ln -s /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-3.b12.el7_3.x86_64/jre /usr/lib/jvm/java
      echo "Start to copy host info from $masterip to local gateway(/etc/hosts)"
      sshpass -f $masterpwdfile scp root@$masterip:/etc/hosts /etc/hosts_bak
      cat /etc/hosts_bak | grep emr | grep cluster >>/etc/hosts
      if ! id hadoop >& /dev/null
      then
         useradd hadoop
      fi
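The final copy step of both scripts merges the cluster's internal host entries into the gateway's /etc/hosts. The filtering it performs can be sketched with fabricated sample entries (the host names and IPs below are illustrative, not from a real cluster):

```shell
# Build a sample hosts backup resembling the one copied from the master node.
cat > /tmp/hosts_bak <<'EOF'
127.0.0.1 localhost
192.168.0.1 emr-header-1.cluster-1234 emr-header-1
192.168.0.2 emr-worker-1.cluster-1234 emr-worker-1
EOF
# Keep only the EMR cluster entries, as the scripts do before appending to /etc/hosts.
grep emr /tmp/hosts_bak | grep cluster
```

Only lines containing both "emr" and "cluster" survive, so local entries such as localhost are not duplicated into the gateway's hosts file.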
  • Test
    • Hive
      [hadoop@iZ23bc05hrvZ ~]$ hive
      hive> show databases;
      OK
      default
      Time taken: 1.124 seconds, Fetched: 1 row(s)
      hive> create database school;
      OK
      Time taken: 0.362 seconds
      hive>
    • Run the Hadoop job
      [hadoop@iZ23bc05hrvZ ~]$ hadoop  jar /usr/lib/hadoop-current/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 10 10
      Number of Maps  = 10
      Samples per Map = 10
      Wrote input for Map #0
      Wrote input for Map #1
      Wrote input for Map #2
      Wrote input for Map #3
      Wrote input for Map #4
      Wrote input for Map #5
      Wrote input for Map #6
      Wrote input for Map #7
      Wrote input for Map #8
      Wrote input for Map #9
        File Input Format Counters 
            Bytes Read=1180
        File Output Format Counters 
            Bytes Written=97
      Job Finished in 29.798 seconds
      Estimated value of Pi is 3.20000000000000000000
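If either test fails, a common cause is that the deploy script did not finish copying. As a minimal, hedged check of the paths created by the E-MapReduce 2.7+/3.2+ script above (for earlier versions, substitute /etc/emr for /etc/ecm):

```shell
# check: report whether each expected path exists on the gateway.
# The paths passed below are the ones the 2.7+/3.2+ deploy script creates.
check() {
    for f in "$@"; do
        if [ -e "$f" ]; then
            echo "OK: $f"
        else
            echo "MISSING: $f"
        fi
    done
}
check /opt/apps/hadoop-current /etc/ecm/hadoop-conf /etc/profile.d/hdfs.sh
```

Any MISSING line points at a copy step to re-run from the script above.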