You can use a gateway cluster to balance loads and isolate clusters in a secure manner. You can also use the gateway cluster to submit jobs to an E-MapReduce (EMR) cluster. This topic describes how to create a gateway cluster.

Prerequisites

A Hadoop or Kafka cluster is created in EMR. For more information, see Create a cluster.

Limits

A gateway cluster can be associated with only a Hadoop or Kafka cluster in EMR.

Procedure

  1. Go to the Cluster Management page.
    1. Log on to the Alibaba Cloud EMR console.
    2. In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
    3. Click the Cluster Management tab.
  2. In the upper-right corner of the Cluster Management page, click Create Gateway.
  3. On the Create Gateway page, configure the parameters.
    Section Parameter Description
    Basic Information Cluster Name The name of the gateway cluster. The name must be 1 to 64 characters in length and can contain only letters, digits, hyphens (-), and underscores (_).
    Assign Public IP Address Specifies whether to assign an elastic IP address to the gateway cluster.
    Password and Key Pair
    • Password: the password that is used to log on to the gateway cluster. The password must be 8 to 30 characters in length and must contain uppercase letters, lowercase letters, digits, and special characters.

      The password can contain the following special characters:

      ! @ # $ % ^ & *

    • Key Pair: the name of the key pair that is used to log on to the gateway cluster. If no key pair is created, click Create Key Pair next to this field to go to the SSH Key Pairs page of the Elastic Compute Service (ECS) console and create a key pair.

      Keep the .pem private key file secure. After you create a gateway cluster, the public key is automatically bound to the ECS instance. When you log on to the gateway cluster by using SSH, you must provide the private key that is stored in the private key file.

    Billing Method
    • Subscription: You are charged only once for each subscription period. The unit price of a subscription cluster is lower than the unit price of a pay-as-you-go cluster of the same specifications. The amount of savings increases with the subscription period.
    • Pay-As-You-Go: You are charged for the hours during which a cluster is running.
    Cluster Configuration Associated Cluster The cluster that you want to associate with the gateway cluster. The gateway cluster submits jobs to this cluster.
    Zone The zone where the associated cluster resides.
    Network Type The network type of the associated cluster.
    VPC The virtual private cloud (VPC) to which the associated cluster belongs.
    VSwitch The vSwitch that you want the gateway cluster to use. Select a vSwitch in the zone and the VPC in which the cluster resides.
    Security Group Name The name of the security group to which the associated cluster belongs.
    Instance Gateway Instance The available instance types in the current region. For more information, see Instance families.
    • System Disk Type: the type of the system disk that you want the gateway cluster to use. System disks are classified into ultra disks, standard SSDs, and ESSDs. The types of system disks that you can use to create a gateway cluster vary based on the region and instance type that you select. By default, system disks are released after the relevant cluster is released.
    • Disk Size: the size of the system disk. Unit: GB. Valid values: 40 to 500. Default value: 300.
    • Data Disk Type: the type of the data disks that you want the gateway cluster to use. Data disks are classified into ultra disks, standard SSDs, and ESSDs. The types of data disks that you can use to create a gateway cluster vary based on the region and instance type that you select. By default, data disks are released after the relevant cluster is released.
    • Disk Size: the size of a data disk. Unit: GB. Valid values: 200 to 4000. Default value: 300.
    • Count: the number of data disks. Valid values: 1 to 10.
    Advanced Settings Permission Settings The RAM roles that allow applications running in a cluster to access other Alibaba Cloud services. You can use the default RAM roles.
    • EMR Role: This parameter has a fixed value of AliyunEMRDefaultRole and cannot be modified. This RAM role allows a cluster to access other Alibaba Cloud services, such as ECS and Object Storage Service (OSS).
    • ECS Role: the application role that you can assign to a cluster. After you assign the role, EMR requests a temporary AccessKey pair when applications running on the compute nodes of the cluster access other Alibaba Cloud services, such as OSS. This way, you do not need to manually enter an AccessKey pair. You can grant the application role access permissions on specific Alibaba Cloud services based on your business requirements.
    Bootstrap Actions Optional. You can configure bootstrap actions to run custom scripts before a cluster starts. For more information, see Bootstrap actions.
    Data Disk Encryption This feature is disabled by default.
    If you turn on Enable Encryption, data on all cloud disks that are used as data disks by the ECS instances in the cluster is encrypted. By default, a service-managed key is used to encrypt your data. You can also use a user-managed key to encrypt your data.
    Notice: You cannot encrypt data on local disks.
  4. Read the terms of service, select E-MapReduce Service Terms, and then click Create.
    The gateway cluster that you created appears in the cluster list and its state changes from Initializing to Idle a few minutes later.
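
Before you fill in the Create Gateway page, it can help to check your planned values against the limits listed in the parameter table. The following Python snippet is a minimal sketch of such a check, not part of EMR or any Alibaba Cloud SDK. The function names are hypothetical, and two points are assumptions: that "letters" means ASCII letters, and that a password may contain only letters, digits, and the listed special characters. The console performs its own validation and remains the authority.

```python
import re
import string

# Assumption: "letters" in the cluster name rule means ASCII letters.
_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")

# Special characters listed for the logon password.
_SPECIALS = "!@#$%^&*"


def check_cluster_name(name):
    """Return True if the name is 1 to 64 letters, digits, hyphens, or underscores."""
    return _NAME_RE.fullmatch(name) is not None


def check_password(password):
    """Return True if the password is 8 to 30 characters long and contains
    uppercase letters, lowercase letters, digits, and special characters."""
    if not 8 <= len(password) <= 30:
        return False
    classes = (
        string.ascii_uppercase,
        string.ascii_lowercase,
        string.digits,
        _SPECIALS,
    )
    # Assumption: characters outside these four classes are not allowed.
    allowed = set("".join(classes))
    if any(ch not in allowed for ch in password):
        return False
    # Every class must be represented at least once.
    return all(any(ch in cls for ch in password) for cls in classes)


def check_disks(system_gb, data_gb, data_count):
    """Return a list of problems with the disk settings (empty if none)."""
    problems = []
    if not 40 <= system_gb <= 500:
        problems.append("system disk size must be 40 to 500 GB")
    if not 200 <= data_gb <= 4000:
        problems.append("data disk size must be 200 to 4000 GB")
    if not 1 <= data_count <= 10:
        problems.append("data disk count must be 1 to 10")
    return problems


if __name__ == "__main__":
    # Hypothetical planned values for a new gateway cluster.
    print(check_cluster_name("gateway-prod_01"))
    print(check_password("Example1!"))
    print(check_disks(system_gb=300, data_gb=300, data_count=1))
```

Each function mirrors one row of the parameter table, so you can adjust a limit in one place if the console shows different values for your region or instance type.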