E-MapReduce: Create a gateway cluster

Last Updated: Jan 17, 2024

You can use a gateway cluster to balance loads and isolate clusters for data security. You can also use the gateway cluster to submit jobs to an E-MapReduce (EMR) cluster. This topic describes how to create a gateway cluster in the EMR console.

Note

If you create an EMR cluster for the first time after 17:00 (UTC+8) on December 19, 2022, you cannot create a Hadoop or Kafka cluster.

Prerequisites

A Hadoop or Kafka cluster is created in the EMR console. For more information, see Create a cluster.

Limits

This topic applies only to Hadoop and Kafka clusters. For more information about how to deploy a gateway for a DataLake, OLAP, or Dataflow cluster, see Use EMR-CLI to deploy a gateway.

Procedure

  1. Log on to the EMR on ECS console.
  2. On the EMR on ECS page, find the desired cluster and click the name of the cluster in the Cluster ID/Name column.

  3. In the upper-right corner of the Basic Information tab, choose All Operations > Create Gateway.

  4. On the page that appears, configure the parameters.

    Section

    Parameter

    Description

    Associated Settings

    Region

    The geographical location of the gateway cluster.

    Resource Group

    The resource group to which the gateway cluster belongs.

    To create a resource group, click Create Resource Group. For more information, see Create a resource group.

    Associated Cluster

    The EMR cluster to be associated with the gateway cluster. Select an EMR cluster in the region that you specified. The associated cluster must meet the following requirements:

    • The status of the cluster is Running.

    • The cluster is a Hadoop or Kafka cluster.

    Note

    By default, after you select an associated cluster, the virtual private cloud (VPC) of the associated cluster is selected for the gateway cluster. You can associate EMR clusters created in both the old and new versions of the EMR console with the gateway cluster.

    Basic Settings

    Billing Method

    • Subscription: You can use a cluster only after you pay for the cluster.

    • Pay-as-you-go: You pay for a cluster after you use it. The cluster is billed on an hourly basis based on its uptime. We recommend that you use pay-as-you-go clusters for short-term test jobs or dynamically scheduled jobs.

    Zone

    The zone where the associated cluster resides.

    vSwitch

    The vSwitch used by the cluster in the zone that you specified.

    Default Security Group

    The security group to which the associated cluster belongs.

    Assign Public Network IP

    Specifies whether to assign an elastic IP address (EIP) to the gateway cluster.

    Node Group

    • Instance Type: The available instance types in the current region. For more information, see Instance families.

    • System Disk: The type of system disk that you want the gateway cluster to use. System disks are classified into ultra disks, standard SSDs, and ESSDs. The types of system disks that you can use to create a gateway cluster vary based on the region and instance type that you select. By default, a system disk is released together with the cluster that uses it.

      Specify the size of the system disk based on your business requirements. Valid values: 60 to 500. Unit: GiB.

    • Data Disk: The type of data disk that you want the gateway cluster to use. Data disks are classified into ultra disks, standard SSDs, and ESSDs. The types of data disks that you can use to create a gateway cluster vary based on the region and instance type that you select. By default, data disks are released together with the cluster that uses them.

      Specify the size of each data disk based on your business requirements. Valid values: 40 to 32768. Unit: GiB.

    • Instances: Specify the number of Elastic Compute Service (ECS) instances that you want to create. Default value: 1.

    Cluster Name

    The name of the gateway cluster. The name must be 1 to 64 characters in length, and can contain only letters, digits, hyphens (-), and underscores (_).

    Identity Credentials

    The identity credentials that are used to log on to all nodes in the gateway cluster.

    • Password: Specify a password that is used to log on to the gateway cluster. The password must be 8 to 30 characters in length, and can contain uppercase letters, lowercase letters, digits, and special characters.

      The password can contain the following special characters:

      !@#$%^&*

    • Key Pair: Select a key pair that is used to log on to the gateway cluster. If no key pair exists, click Create Key Pair to go to the SSH Key Pairs page of the ECS console and create a key pair.

      Keep the .pem private key file secure. After you create a gateway cluster, the public key is automatically bound to the ECS instances of the cluster. When you log on to the gateway cluster by using SSH, you must provide the private key stored in the .pem file.

    Advanced Settings

    ECS Application Role

    The Resource Access Management (RAM) role that allows applications running in a cluster to access other Alibaba Cloud services. Use the default RAM role. Default value: AliyunECSInstanceForEMRRole.

    Bootstrap Actions

    Optional. You can configure bootstrap actions to run custom scripts before a cluster starts. For more information, see Manage bootstrap actions and Manually run scripts.

    Tags

    Optional. You can add tag pairs when you create a cluster or after the cluster is created. For more information, see Manage and use tags.

    Data Disk Encryption

    Optional. You can turn on Data Disk Encryption only when you create a cluster. For more information, see Enable data disk encryption.

  5. Read the terms of service, select I have read and agree to E-MapReduce Terms of Service, and then click Create.

    After the gateway cluster is created, the Status column of the gateway cluster displays Idle.
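
    The value constraints listed in step 4 (cluster name, password, and disk sizes) can be checked locally before you fill in the console form. The following Python sketch is only an illustration: the `GatewayDraft` class and the `validate` function are hypothetical helpers, not part of EMR or any Alibaba Cloud SDK, and the sketch assumes that "letters" means ASCII letters. Because this topic states only which characters a password can contain, the sketch checks the length and the character set and does not require any particular mix of character types.

```python
import re
import string
from dataclasses import dataclass

# Constraints copied from the parameter descriptions in step 4.
# Assumption: "letters" is interpreted as ASCII letters only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")
PASSWORD_ALLOWED = set(string.ascii_letters + string.digits + "!@#$%^&*")
SYSTEM_DISK_GIB = range(60, 501)    # 60 to 500 GiB, inclusive
DATA_DISK_GIB = range(40, 32769)    # 40 to 32768 GiB, inclusive


@dataclass
class GatewayDraft:
    """Hypothetical container for values you plan to enter in the console."""
    cluster_name: str
    password: str
    system_disk_gib: int
    data_disk_gib: int


def validate(draft: GatewayDraft) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not NAME_PATTERN.fullmatch(draft.cluster_name):
        problems.append(
            "cluster name: use 1 to 64 letters, digits, hyphens, or underscores"
        )
    if not 8 <= len(draft.password) <= 30:
        problems.append("password: length must be 8 to 30 characters")
    if not set(draft.password) <= PASSWORD_ALLOWED:
        problems.append("password: contains a character outside the allowed set")
    if draft.system_disk_gib not in SYSTEM_DISK_GIB:
        problems.append("system disk: size must be 60 to 500 GiB")
    if draft.data_disk_gib not in DATA_DISK_GIB:
        problems.append("data disk: size must be 40 to 32768 GiB")
    return problems


# Example values are made up for illustration.
draft = GatewayDraft("emr-gateway_01", "Passw0rd!", 80, 200)
for problem in validate(draft):
    print(problem)
```

    A check like this catches only the documented format rules. The console still decides which disk types and instance types are available for the selected region.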