A gateway cluster provides load balancing and security isolation. You can also use a gateway cluster to submit jobs to an E-MapReduce (EMR) cluster.
Prerequisites
Before you begin, ensure that you have:
An existing Hadoop or Kafka cluster in EMR with a Running status. For setup instructions, see Create a cluster.
Hadoop and Kafka clusters are available only if your Alibaba Cloud account created such clusters before 17:00 (UTC+8) on December 19, 2022. Accounts that did not create these cluster types before this deadline can no longer create them.
Limitations
This procedure applies only to Hadoop and Kafka clusters. For gateway environment deployment on DataLake, OLAP, DataFlow, and Custom clusters, see Gateway deployment modes and selection guide.
Create a gateway cluster
Log on to the EMR console.
On the EMR on ECS page, click the name of the target cluster.
In the upper-right corner of the Basic Information page, choose All Operations > Create Gateway.
On the Create Gateway page, configure the parameters described in the following table.
Associated settings
Parameter: Description
Region: The physical location of the gateway cluster.
Resource Group: The resource group for the gateway cluster. To create a new resource group, click Create Resource Group. For more information, see Create a resource group.
Associated Cluster: The compute cluster to associate with the gateway cluster, filtered by the selected region. The cluster must be in Running status and must be a Hadoop or Kafka cluster. After you select a cluster, the gateway cluster's VPC defaults to the VPC of the associated cluster. Clusters created in both the new and old console versions are supported.
Basic settings
Parameter: Description
Billing Method: The billing method for the gateway cluster.
  Subscription: Pay before use.
  Pay-as-you-go: Charged hourly based on actual usage. Suitable for short-term tests or dynamic workloads.
Zone: The zone where the associated cluster is located.
vSwitch: The vSwitch in the corresponding VPC and zone.
Default Security Group: The security group of the associated cluster.
Assign Public Network IP: Whether to attach an Elastic IP Address (EIP) to the gateway.
Node Group: The compute resources for gateway nodes. Configure the following sub-parameters:
  Instance Type: The ECS instance types available in the selected region. For more information, see Instance families.
  System Disk: The system disk type. Supported types: ultra disk, enterprise SSD (ESSD), and standard SSD. Available types vary by instance type and region. Size range: 60 GiB to 500 GiB. Released with the cluster by default.
  Data Disk: The data disk type. Supported types: ultra disk, ESSD, and standard SSD. Available types vary by instance type and region. Size range: 40 GiB to 32,768 GiB. Released with the cluster by default.
  Instances: The number of gateway nodes. Default: 1.
Cluster Name: The name of the gateway cluster. Must be 1–64 characters and can contain Chinese characters, letters, digits, hyphens (-), and underscores (_).
Identity Credentials: The credentials used to log on to all nodes in the gateway cluster.
  Password: Must be 8–30 characters and include uppercase letters, lowercase letters, digits, and special characters (!@#$%^&*).
  Key Pair: Select a key pair for SSH access. If you have not created one, click Create Key Pair to go to the ECS console. Keep the private key file (.pem) secure, because it is required each time you log on via SSH.
Advanced settings
Parameter: Description
ECS Application Role: The Resource Access Management (RAM) role that grants applications on the cluster permission to call other Alibaba Cloud services. Default: AliyunECSInstanceForEMRRole.
Bootstrap Actions: Optional. Custom scripts that run before the cluster starts. For more information, see Run scripts using bootstrap actions and Manually run scripts.
Tags: Optional. Tags to attach to the cluster. Tags can be added at creation time or afterward. For more information, see Manage tags.
Data Disk Encryption: Optional. Encrypts the data disk. This setting can be enabled only at cluster creation time. For more information, see Enable data disk encryption.
Click Create and Pay. The cluster Status changes from Creating to Running when the gateway cluster is ready.
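The naming, password, and disk-size constraints listed under Basic settings can be checked before you fill in the form. The following Python sketch is one possible way to do that and is not part of the EMR console or any Alibaba Cloud SDK. All function names are hypothetical. It assumes that "Chinese characters" means the CJK Unified Ideographs block (U+4E00 to U+9FFF) and that a password must contain at least one character from each of the four listed classes.

```python
import re
import string

# Limits transcribed from the Basic settings parameters.
# Assumption: "Chinese characters" is approximated by U+4E00..U+9FFF.
NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")
PASSWORD_SPECIALS = "!@#$%^&*"
DISK_SIZE_GIB = {"system": (60, 500), "data": (40, 32768)}


def is_valid_cluster_name(name: str) -> bool:
    """Return True if the name is 1-64 allowed characters."""
    return NAME_PATTERN.fullmatch(name) is not None


def is_valid_password(password: str) -> bool:
    """Return True if the password is 8-30 characters long and contains
    at least one character from each class (assumed interpretation)."""
    if not 8 <= len(password) <= 30:
        return False
    required_classes = (
        string.ascii_uppercase,
        string.ascii_lowercase,
        string.digits,
        PASSWORD_SPECIALS,
    )
    return all(any(ch in cls for ch in password) for cls in required_classes)


def is_valid_disk_size(size_gib: int, kind: str) -> bool:
    """Return True if size_gib is within the range for 'system' or 'data'."""
    low, high = DISK_SIZE_GIB[kind]
    return low <= size_gib <= high
```

The password check only verifies length and the presence of each character class. It does not reject characters outside those classes, because the source does not say whether they are allowed.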
What's next
After the gateway cluster status changes to Running, connect to it via SSH using the credentials you configured during creation. Once connected, use the cluster's client tools to submit jobs to the associated Hadoop or Kafka cluster.
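As a rough illustration of the SSH step, the Python sketch below assembles an ssh command line that logs on with a key pair and runs a job submission command on the gateway node. Everything specific in it is an assumption: the root login user, the IP address, the key file name, the JAR name, and the main class are placeholders, and the helper build_ssh_command is hypothetical. The sketch only builds the argument list and does not open a connection.

```python
import shlex
from typing import List


def build_ssh_command(host: str, key_file: str, remote_argv: List[str],
                      user: str = "root") -> List[str]:
    """Build an argv list for ssh that runs remote_argv on the gateway node.

    The default user "root" is an assumption; use the account that matches
    the credentials configured when the gateway cluster was created.
    """
    # Quote the remote command so the remote shell sees the same arguments.
    remote_command = shlex.join(remote_argv)
    return ["ssh", "-i", key_file, f"{user}@{host}", remote_command]


# Placeholder values for illustration only.
example_command = build_ssh_command(
    host="203.0.113.10",
    key_file="my-key.pem",
    remote_argv=["hadoop", "jar", "my-job.jar", "com.example.Main",
                 "/input", "/output"],
)
```

The resulting list can be passed to subprocess.run to execute the command, or joined with shlex.join to print a line you can paste into a terminal. If the gateway cluster uses password credentials instead of a key pair, drop the -i option and its argument.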