Managed auto scaling monitors your YARN cluster load and automatically adjusts the number of task nodes to match your workloads—scaling out when demand rises and scaling in when it drops. This topic describes how to configure managed auto scaling rules in the E-MapReduce (EMR) console.
Prerequisites
Before you begin, make sure that you have:
-
A DataLake, Dataflow, online analytical processing (OLAP), DataServing, or custom cluster. For more information, see Create a cluster.
-
A task node group with pay-as-you-go instances or preemptible instances in the cluster. For more information, see Create a node group.
Limitations
-
Only clusters with the YARN service deployed support managed auto scaling rules.
-
If Trino, Presto, StarRocks, Impala, or ClickHouse is deployed in the cluster, managed auto scaling rules may become ineffective.
-
To avoid auto scaling failures caused by insufficient Elastic Compute Service (ECS) instances, select multiple instance types when creating a node group—up to 10 types. During scale-out, EMR tries each instance type in list order, starting with the first. The actual instance types used depend on inventory availability.
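The list-order fallback described above can be pictured with a minimal Python sketch. It is only an illustration, not EMR's actual implementation: the function name `pick_instance_type`, the `inventory` mapping, and the instance type names are all hypothetical.

```python
from typing import Optional

MAX_INSTANCE_TYPES = 10  # a node group accepts up to 10 instance types


def pick_instance_type(preferred: list[str], inventory: dict[str, int]) -> Optional[str]:
    """Return the first preferred instance type that has stock, or None.

    `preferred` is the ordered list configured on the node group.
    `inventory` is a hypothetical map of instance type -> available count.
    """
    if len(preferred) > MAX_INSTANCE_TYPES:
        raise ValueError("a node group accepts at most 10 instance types")
    for instance_type in preferred:  # try each type in list order
        if inventory.get(instance_type, 0) > 0:
            return instance_type
    return None  # no configured type is in stock, so scale-out cannot proceed


# Hypothetical names: the first type is out of stock, so the second is used.
choice = pick_instance_type(
    ["type-a", "type-b", "type-c"],
    {"type-a": 0, "type-b": 3, "type-c": 5},
)
print(choice)
```

Listing several instance types lowers the chance that the function returns `None`, which corresponds to a scale-out failure caused by insufficient ECS inventory.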
Configure managed auto scaling for an existing cluster
After you enable managed auto scaling, EMR continuously monitors the YARN cluster load, calculates the peak load over a recent period, and adjusts the number of task nodes accordingly.
-
Log in to the EMR console. In the left-side navigation pane, click EMR on ECS.
-
In the top navigation bar, select a region and a resource group.
-
On the EMR on ECS page, click the name of your cluster in the Cluster ID/Name column.
-
Click the Auto Scaling tab.
-
On the Configure Auto Scaling subtab, go to the Configure Auto Scaling Rule section and click Managed Auto Scaling Rule.
-
In the dialog box, click Reconfigure and set the following parameters.
| Parameter | Description |
| --- | --- |
| Minimum number of task nodes | The minimum number of task nodes that the cluster retains when the managed scale-in rule is triggered. |
| Maximum number of task nodes | The maximum number of task nodes that the cluster can reach when the managed scale-out rule is triggered. |
| Maximum number of pay-as-you-go task nodes | The maximum number of pay-as-you-go nodes that can be added during scale-out. This parameter controls the proportions of pay-as-you-go nodes and preemptible instances. The default value equals the maximum number of task nodes. If your cluster also has a preemptible instance task node group, set this to a lower value so that the remaining capacity is filled by preemptible instances. |
Example: If Minimum number of task nodes is 0, Maximum number of task nodes is 20, and Maximum number of pay-as-you-go task nodes is 15, EMR first adds up to 15 pay-as-you-go nodes during scale-out. If more nodes are needed, the remaining capacity is filled by preemptible instances.
-
Click Save and Apply.
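To make the interaction between the three parameters concrete, the following Python sketch computes how a desired task node count would be split between pay-as-you-go and preemptible nodes. It is a simplified model under stated assumptions, not EMR's actual algorithm: the function name `split_task_nodes` and its arguments are hypothetical, and the sketch assumes that the desired count is clamped to the configured minimum and maximum.

```python
def split_task_nodes(
    desired_nodes: int,
    min_nodes: int,
    max_nodes: int,
    max_payg_nodes: int,
) -> tuple[int, int]:
    """Return (pay_as_you_go, preemptible) counts for a desired task node total.

    Assumption: the total is clamped to [min_nodes, max_nodes], pay-as-you-go
    nodes are used first up to max_payg_nodes, and preemptible instances
    fill whatever capacity remains.
    """
    total = max(min_nodes, min(desired_nodes, max_nodes))
    pay_as_you_go = min(total, max_payg_nodes)
    preemptible = total - pay_as_you_go
    return pay_as_you_go, preemptible


# Values from the example: minimum 0, maximum 20, pay-as-you-go maximum 15.
print(split_task_nodes(12, 0, 20, 15))  # (12, 0)
print(split_task_nodes(18, 0, 20, 15))  # (15, 3)
print(split_task_nodes(30, 0, 20, 15))  # (15, 5)
```

In the last call the demand exceeds the maximum of 20 task nodes, so the total is capped at 20: 15 pay-as-you-go nodes plus 5 preemptible instances.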
Configure managed auto scaling when creating a cluster
-
Log in to the EMR console. In the left-side navigation pane, click EMR on ECS.
-
In the top navigation bar, select a region and a resource group.
-
Click Create Cluster. For more information about cluster parameters, see Create a cluster.
-
In the auto scaling section, select Managed Auto Scaling Rule.
-
Click Edit next to Managed Auto Scaling Rule and set the following parameters.
| Parameter | Description |
| --- | --- |
| Minimum number of task nodes | The minimum number of task nodes that the cluster retains when the managed scale-in rule is triggered. |
| Maximum number of task nodes | The maximum number of task nodes that the cluster can reach when the managed scale-out rule is triggered. |
| Maximum number of pay-as-you-go task nodes | The maximum number of pay-as-you-go nodes that can be added during scale-out. This parameter controls the proportions of pay-as-you-go nodes and preemptible instances. The default value equals the maximum number of task nodes. If your cluster also has a preemptible instance task node group, set this to a lower value so that the remaining capacity is filled by preemptible instances. |
Example: If Minimum number of task nodes is 0, Maximum number of task nodes is 20, and Maximum number of pay-as-you-go task nodes is 15, EMR first adds up to 15 pay-as-you-go nodes during scale-out. If more nodes are needed, the remaining capacity is filled by preemptible instances.
-
Click Save and Apply.
-
Confirm the order.
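Before entering values in the console, it can help to sanity-check them. The Python sketch below models the three parameters as a small data class. The class name `ManagedScalingLimits`, its field names, and the checks themselves are assumptions for illustration; this topic does not describe the validation that the console actually performs. The only behavior taken from the parameter descriptions is that the pay-as-you-go maximum defaults to the maximum number of task nodes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ManagedScalingLimits:
    """Hypothetical container for the three managed auto scaling parameters."""

    min_task_nodes: int
    max_task_nodes: int
    # None means "use the default", which equals max_task_nodes.
    max_payg_task_nodes: Optional[int] = None

    def __post_init__(self) -> None:
        if self.max_payg_task_nodes is None:
            # The class is frozen, so set the default through object.__setattr__.
            object.__setattr__(self, "max_payg_task_nodes", self.max_task_nodes)
        # Assumed sanity checks, not documented console behavior.
        if self.min_task_nodes < 0:
            raise ValueError("minimum number of task nodes must not be negative")
        if self.min_task_nodes > self.max_task_nodes:
            raise ValueError("minimum must not exceed maximum")
        if not 0 <= self.max_payg_task_nodes <= self.max_task_nodes:
            raise ValueError("pay-as-you-go maximum must be between 0 and the maximum")


limits = ManagedScalingLimits(min_task_nodes=0, max_task_nodes=20, max_payg_task_nodes=15)
print(limits)
```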