
E-MapReduce: Gateway nodes

Last Updated: Apr 10, 2024

Gateway nodes play an important role in Alibaba Cloud E-MapReduce (EMR). Gateway nodes can be associated with an existing EMR cluster and serve as separate job submission points. This topic describes gateway clusters and gateway node groups and provides references on how to create a gateway cluster and a gateway node group based on an existing EMR cluster.

A gateway cluster or gateway node group is an independent cluster or node group that consists of multiple gateway nodes with identical configurations. Clients for services such as Hadoop Distributed File System (HDFS), YARN, Hive, Spark 2, Spark 3, JindoSDK, Flink, Sqoop, Impala, Presto, Hudi, Iceberg, Tez, and Delta Lake are deployed on the gateway cluster or gateway node group. If no gateway cluster or gateway node group is created, jobs for an EMR cluster, such as a Hadoop cluster, are submitted from the master node or a core node of that cluster, which consumes the resources of the cluster. After you create a gateway cluster, you can submit jobs to the associated cluster from the gateway cluster instead. Job submission then no longer occupies the resources of the associated cluster, which improves the stability of its core nodes and especially its master node.
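As an illustration of using a gateway node as a job submission point, the following minimal Python sketch builds a `spark-submit` command for YARN. It assumes that the Spark client on the gateway node is already configured to reach the associated cluster. The function name `build_spark_submit`, the JAR path, the class name, and the input path are hypothetical placeholders and do not come from the EMR documentation.

```python
import shlex


def build_spark_submit(app_jar, main_class, app_args=(), deploy_mode="cluster"):
    """Return a spark-submit argument list intended to be run on a gateway node.

    Hypothetical helper for illustration only. It assumes the Spark client on
    the gateway node already points at the YARN service of the associated
    EMR cluster.
    """
    if deploy_mode not in ("cluster", "client"):
        raise ValueError("deploy_mode must be 'cluster' or 'client'")

    cmd = [
        "spark-submit",
        "--master", "yarn",            # submit to YARN on the associated cluster
        "--deploy-mode", deploy_mode,  # where the Spark driver runs
        "--class", main_class,
        app_jar,
    ]
    cmd.extend(app_args)               # arguments passed through to the application
    return cmd


if __name__ == "__main__":
    # Placeholder values; replace them with a real JAR, class, and arguments.
    command = build_spark_submit(
        "/path/to/app.jar",
        "com.example.WordCount",
        ["/path/to/input"],
    )
    # Print the command for review; run it on the gateway node,
    # for example with subprocess.run(command, check=True).
    print(shlex.join(command))
```

In `cluster` deploy mode, the Spark driver runs inside YARN on the associated cluster, so the gateway node hosts only the submitting client. In `client` deploy mode, the driver runs on the gateway node itself. Either way, the submission process stays off the master node of the associated cluster.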

Each gateway cluster or gateway node group can have its own independent configuration environment. For example, if one EMR cluster is shared by multiple departments, you can create multiple gateway clusters or gateway node groups for that cluster to meet the different business requirements of the departments. For more information about how to create a gateway cluster and a gateway node group, see the following references.