This topic describes how to add, modify, and delete node groups for DataLake, DataFlow, OLAP, DataServing, and Custom clusters.
Background
A node group is a core unit for managing cluster nodes in Alibaba Cloud E-MapReduce. A node group usually consists of Elastic Compute Service (ECS) instances that have the same instance type. You can perform operations on a node group to manage the nodes within it in batches. You can also create node groups with different instance types based on your business needs. For example, you can use memory-optimized instances (vCore:vMem = 1 vCPU:8 GiB) for offline big data processing and compute-optimized instances (vCore:vMem = 1 vCPU:2 GiB) for model training.
For information about managing node groups in Hadoop, Data Science, and EMR Studio clusters, see Manage node groups (Hadoop, Data Science, and EMR Studio clusters).
Limitations
The operations in this topic apply only to DataLake, DataFlow, OLAP, DataServing, and Custom clusters.
Task node groups that use the pay-as-you-go or preemptible instance billing method do not support configuration upgrades.
For more information about configuration upgrades, see Upgrade node configurations.
Add a node group
Go to the Nodes tab.
Log on to the EMR console. In the left-side navigation pane, click EMR on ECS.
In the top navigation bar, select the region in which your cluster resides and select a resource group based on your business requirements.
On the EMR on ECS page, find the cluster that you want to manage and click Nodes in the Actions column.
On the Nodes page, click Add Node Group.
In the Add Node Group pane, you can configure the following parameters.
Parameter
Description
Zone
The zone where the cluster is located is displayed by default. Click View Zones to select another zone in the region.
Only task node groups can be added in other zones.
After you add a node group in another zone, enable the YARN Node Label feature to divide the cluster into partitions. This limits the effect of unpredictable cross-zone network bandwidth on task efficiency, especially during the shuffle process. For more information, see Use Node Labels for node partitioning.
Node Group Type
You can add the following four types of node groups:
Core: A core node group. This type is suitable for scenarios with small data volumes, such as log analysis and website traffic statistics.
Task: A task node group. This type is suitable for scenarios where you need to temporarily add computing resources, such as batch processing and data cleaning.
Gateway: A gateway node group. This type is supported only by DataLake and DataFlow clusters of EMR V5.10.1 or later. It is suitable for scenarios where tasks are frequently submitted, such as model training by data scientists and data processing by data engineers.
Master-Extend: A load extension group. This type is supported only by high availability clusters of EMR V3.51.1 or later and EMR V5.17.1 or later.
If the master node of a cluster is heavily loaded, you can add a Master-Extend node group and deploy services to it, which relieves the pressure on the master node. This type is suitable for large-scale clusters with heavily loaded master nodes.
Note: After a service is added to the cluster, it is not deployed to the Master-Extend node group by default. To deploy a service to this node group, select the service when you add the Master-Extend node group.
Billing Method
The billing method for the node group. The supported billing methods are pay-as-you-go, spot instance (also called preemptible instance in this topic), and subscription.
Note: Only task node groups support spot instances.
Node Group Name
The node group name must be unique.
Components
Only Master-Extend node groups support custom service deployment.
The following services can be deployed:
Hive: HiveMetaStore, HiveServer
Kyuubi: KyuubiServer
Spark: SparkHistoryServer, SparkThriftServer
Assign Public Network IP
Select whether to enable Internet access for the node group. If you enable this feature, all nodes in the node group are connected to the Internet.
vSwitch
Select a vSwitch in the same VPC when you create the node group. The vSwitch cannot be changed after the node group is created.
Note: You cannot select a vSwitch that belongs to the VPC but is not in the same zone as the cluster.
Additional Security Group
(Optional) Associate up to four additional security groups with the node group.
Instance Type
Select instance types as needed.
If the billing method is subscription, you can select only one instance type.
If the billing method is pay-as-you-go or spot instance and the node group type is Task, you can select up to 10 instance types with the same number of vCPUs and memory size as backup options.
Storage Configuration
System Disk: Select an enterprise SSD (ESSD) or an ultra disk as needed. The system disk size can range from 60 GiB to 500 GiB. A size of at least 120 GiB is recommended.
Data Disk: Select an ESSD or an ultra disk as needed. The data disk size can range from 40 GiB to 32,768 GiB. A size of at least 80 GiB is recommended.
Note: If you select ESSDs, you can specify a performance level (PL) for each disk based on the disk capacity to meet different cluster performance requirements. The default performance level is PL1. For the system disk, you can select PL0, PL1, or PL2. For data disks, you can select PL0, PL1, PL2, or PL3. For more information, see Disks.
Resource Reservation Policy
Note: This parameter is available only when Node Group Type is set to TASK (task node group) and Billing Method is set to Pay-as-you-go.
The resource reservation policy lets you associate your private ECS pools. You can go to the ECS console to reserve resources. For more information, see Resource Butler overview.
Public Pool Only (Default): Uses resources directly from public resource pools.
Private Pool First: Select this option if you have created private pools in the ECS console and want to use these resources that are pre-allocated to specific projects or teams. The system first tries to obtain ECS instances from your specified private pool. If the private pool does not have enough available resources, the system automatically turns to public resource pools to fulfill the request.
Specified Private Pool: Specify an ECS private pool for the current EMR cluster.
Automatic Compensation
Note: This parameter is available only when Node Group Type is set to TASK (task node group).
If you enable automatic compensation, EMR automatically monitors the running status of nodes in the current node group. If an abnormal node is detected, EMR automatically releases the node and scales out the same number of new nodes. For more information, see Node compensation.
Scaling Policy
Note: This parameter is available only when Billing Method is set to Preemptible Instance.
Priority-based Policy (Default)
When a node is created, the system attempts to purchase an instance starting from the first instance type until the creation is successful. The final purchased instance type may vary based on inventory.
Cost Optimization Policy
During a scale-out, Auto Scaling tries to create ECS instances in ascending order of vCPU unit price. During a scale-in, Auto Scaling tries to remove ECS instances in descending order of vCPU unit price. If the billing method in the scaling configuration is set to spot instance, spot instances are created with priority. If spot instances of the specified instance types cannot be created due to reasons such as insufficient inventory, the system automatically tries to create pay-as-you-go instances.
For more information, see Cost optimization mode.
Graceful Shutdown
Note: This parameter is available only for clusters where the YARN service is deployed.
If you enable graceful shutdown, the system waits for tasks on a node to complete or for the specified timeout period to elapse before scaling in the node. You can go to the YARN service page and configure the yarn.resourcemanager.nodemanager-graceful-decommission-timeout-secs parameter to modify the graceful shutdown timeout period.
Click OK.
After the node group is added, it appears on the Nodes page.
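Several of the parameter rules above can be checked before you open the console. The following is a minimal sketch of such a check, not an EMR API: the NodeGroupDraft class, its field names, and the instance type names are hypothetical, and the "at least one instance type" rule is an assumption. The other limits are taken from the parameter descriptions above.

```python
from dataclasses import dataclass, field


# Hypothetical draft of the Add Node Group form; names are illustrative only.
@dataclass
class NodeGroupDraft:
    group_type: str        # "CORE", "TASK", "GATEWAY", or "MASTER-EXTEND"
    billing: str           # "subscription", "pay-as-you-go", or "spot"
    instance_types: list   # (name, vcpus, memory_gib) tuples, in priority order
    system_disk_gib: int
    data_disk_gib: int
    extra_security_groups: list = field(default_factory=list)


def validate(draft):
    """Return the list of violated rules; an empty list means the draft passes."""
    problems = []
    if draft.billing == "spot" and draft.group_type != "TASK":
        problems.append("only task node groups support spot instances")

    count = len(draft.instance_types)
    if count == 0:
        # Assumption: the form requires at least one instance type.
        problems.append("select at least one instance type")
    elif draft.billing == "subscription" and count != 1:
        problems.append("subscription allows exactly one instance type")
    elif draft.billing != "subscription" and draft.group_type == "TASK":
        if count > 10:
            problems.append("select at most 10 instance types")
        shapes = {(vcpus, mem) for _, vcpus, mem in draft.instance_types}
        if len(shapes) > 1:
            problems.append("backup instance types must share vCPU count and memory size")

    if not 60 <= draft.system_disk_gib <= 500:
        problems.append("system disk must be 60 to 500 GiB")
    if not 40 <= draft.data_disk_gib <= 32768:
        problems.append("data disk must be 40 to 32,768 GiB")
    if len(draft.extra_security_groups) > 4:
        problems.append("at most four additional security groups")
    return problems


# Usage: a spot task node group with two same-sized (hypothetical) instance types.
draft = NodeGroupDraft("TASK", "spot", [("type-a", 4, 16), ("type-b", 4, 16)], 120, 80)
print(validate(draft))
```

The source does not state an instance type limit for pay-as-you-go node groups other than Task, so the sketch leaves that case unchecked.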
Modify a node group
On the Nodes page, click the Node Group Name of the target group.
In the Node Group Attributes dialog box, modify the node group information and click Save.
For Master, Core, Gateway, and Master-Extend node groups, you can modify the node group name and additional security groups.
For Task node groups, you can modify the node group name, node specifications, and additional security groups. You can also configure the settings in the Advanced Information section.
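The editable attributes per node group type can be summarized as a lookup table. This is a minimal sketch for illustration only: the attribute keys and the rejected_changes helper are hypothetical names, not part of the console or any EMR API.

```python
# Attributes that can be modified, keyed by node group type (from the rules above).
COMMON = {"name", "additional_security_groups"}
EDITABLE = {
    "MASTER": COMMON,
    "CORE": COMMON,
    "GATEWAY": COMMON,
    "MASTER-EXTEND": COMMON,
    "TASK": COMMON | {"node_specifications", "advanced_information"},
}


def rejected_changes(group_type, changes):
    """Return the requested attributes that this node group type cannot modify."""
    return sorted(set(changes) - EDITABLE[group_type])


# Usage: a Core node group cannot change node specifications.
print(rejected_changes("CORE", {"name": "core-2", "node_specifications": "larger"}))
```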
Delete a node group
Before you delete a Task or Core node group, make sure that its Operation Status is Running and its Number of Nodes is 0.
On the Nodes page, find the desired node group and click Delete Node Group in the Actions column.
In the dialog box that appears, click Delete.
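The delete precondition for Task and Core node groups can be expressed as a small check. This is a sketch under stated assumptions: the function name and the way the status and node count are passed in are hypothetical, and the check does not call any EMR API.

```python
def delete_blockers(operation_status, node_count):
    """Return the reasons a Task or Core node group cannot be deleted yet."""
    blockers = []
    if operation_status != "Running":
        blockers.append("Operation Status must be Running")
    if node_count != 0:
        blockers.append("Number of Nodes must be 0; scale in the node group first")
    return blockers


# Usage: a running node group that still has three nodes.
print(delete_blockers("Running", 3))
```

An empty list means the precondition described above is met.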
Cost optimization mode
This mode is available only when you add a Task node group and set the billing method to Preemptible Instance.
In this mode, you can create more detailed cost control policies to balance cost and stability.
| Parameter | Description |
| --- | --- |
| Minimum Pay-As-You-Go Nodes in Auto Scaling Group | The minimum number of pay-as-you-go instances in the scaling group. If the number of pay-as-you-go instances is less than this value, the system prioritizes the creation of pay-as-you-go instances. |
| Percentage of Pay-As-You-Go Nodes | The percentage of pay-as-you-go instances to create after the minimum number of pay-as-you-go nodes in the scaling group is met. |
| Lowest-Cost Instance Types | The number of lowest-cost instance types to use. When spot instances are created, they are evenly distributed among the specified number of instance types. The maximum value is 3. |
| Preemptible Instance Compensation | Specifies whether to enable spot instance compensation. If you enable this feature, the system proactively replaces a spot instance approximately five minutes before it is reclaimed. |
| Use Pay-as-you-go Instances When Preemptible Instances Are Insufficient | Specifies whether to supplement spot instances with pay-as-you-go instances. If the required spot instance capacity cannot be met because of price or inventory issues, the system can create pay-as-you-go instances to meet the capacity requirement. |
If you do not specify the Minimum Pay-As-You-Go Nodes in Auto Scaling Group, Percentage of Pay-As-You-Go Nodes, or Lowest-Cost Instance Types parameter, the node group is a general cost optimization scaling group. If you specify these parameters, the node group is a mixed-instance cost optimization scaling group. The two types are fully compatible in terms of interfaces and features, so a mixed-instance group can reproduce the behavior of a general group:
- To match a general cost optimization scaling group in which only pay-as-you-go instances are created, set Minimum Pay-As-You-Go Nodes to 0, Percentage of Pay-As-You-Go Nodes to 100, and Lowest-Cost Instance Types to 1.
- To match a general cost optimization scaling group in which preemptible instances are preferentially created, set Minimum Pay-As-You-Go Nodes to 0, Percentage of Pay-As-You-Go Nodes to 0, and Lowest-Cost Instance Types to 1.
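To see how the three mixed-instance parameters interact, the following sketch estimates how a target capacity splits into pay-as-you-go and spot instances and how the spot instances spread across the cheapest instance types. It is an illustration, not the Auto Scaling implementation: the function names, the instance type names, and the prices are hypothetical, and two behaviors are assumptions because the source does not state them (the percentage is rounded up, and leftover spot instances go to the cheapest types first).

```python
def split_capacity(total, min_payg=0, payg_percent=0):
    """Return (pay_as_you_go, spot) counts for a target number of nodes."""
    base = min(total, min_payg)                        # minimum is filled first
    above = total - base                               # capacity beyond the minimum
    extra_payg = (above * payg_percent + 99) // 100    # assumption: round up
    payg = base + extra_payg
    return payg, total - payg


def spread_spot(spot_count, unit_prices, lowest_cost_types=1):
    """Distribute spot instances evenly over the N cheapest instance types."""
    if not 1 <= lowest_cost_types <= 3:
        raise ValueError("Lowest-Cost Instance Types must be between 1 and 3")
    cheapest = sorted(unit_prices, key=unit_prices.get)[:lowest_cost_types]
    share, remainder = divmod(spot_count, len(cheapest))
    # Assumption: any remainder is assigned to the cheapest types first.
    return {t: share + (1 if i < remainder else 0) for i, t in enumerate(cheapest)}


# Usage: 10 nodes, at least 2 pay-as-you-go, then 25% pay-as-you-go above that.
payg, spot = split_capacity(10, min_payg=2, payg_percent=25)
prices = {"type-a": 0.30, "type-b": 0.10, "type-c": 0.20}  # hypothetical unit prices
print(payg, spot, spread_spot(spot, prices, lowest_cost_types=2))
```

With the settings from the two equivalence cases, split_capacity(n, 0, 100) yields only pay-as-you-go instances and split_capacity(n, 0, 0) yields only spot instances.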
References
To scale out a node group, see Scale out an EMR cluster.
To scale in a node group, see Scale in a cluster.
To scale out disks, see Expand a disk.
To configure an Auto Scaling rule, see Configure custom auto scaling rules.
To view Auto Scaling records, see View auto scaling activities.