If the computing workloads of a Hadoop cluster fluctuate on a regular basis, you can add and remove a specific number of task nodes at fixed points in time every day, every week, or every month to supplement the computing power. This helps jobs complete on time at a low cost.

Prerequisites

An auto scaling machine group is created. For more information, see Create an auto scaling machine group.

Configure auto scaling rules by time

For more information about how to configure basic information and cost optimization policies, see Manage auto scaling.

If you want to configure auto scaling rules by time in E-MapReduce (EMR), you can configure the relevant parameters based on the descriptions in the following table.

Auto scaling rules are divided into scale-out rules and scale-in rules. This topic uses a scale-out rule as an example. If you disable auto scaling for a cluster, all auto scaling rules are cleared. If you enable auto scaling for the cluster again, you must reconfigure auto scaling rules.

Expansion by time
Parameter: Description
Rule Name: The name of the auto scaling rule. The name must be unique in a cluster.
Rule execution cycle:
  • Run Periodically: Auto scaling is performed at a specific point in time every day, every week, or every month.
  • Run Once: Auto scaling is performed only once at a specific point in time.
Retry Interval(Seconds): The period during which the system retries auto scaling. Auto scaling may not be performed at the specified point in time due to various reasons. If this parameter is set, the system tries to perform auto scaling every 30 seconds during the specified period until auto scaling succeeds. Valid values: 0 to 21600.

For example, auto scaling operation A needs to be performed within a specified period of time. If auto scaling operation B is still in progress or is in the cooldown state during this period, operation A cannot be performed. In this case, the system tries to perform operation A every 30 seconds within the retry interval that you specified. As soon as the required conditions are met, the cluster performs operation A.

Scale Out(Nodes): The number of task nodes to add to the cluster each time the auto scaling rule is triggered.
Cooldown(Seconds): The interval between two auto scaling activities. Auto scaling is not performed during the cooldown.
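
The retry behavior described for Retry Interval(Seconds) can be illustrated with a minimal Python sketch. This is not EMR code: the ScaleOutRule class, the first_successful_attempt function, and the is_blocked callback are hypothetical names introduced for illustration, and treating the end of the retry period as inclusive is an assumption. Only the 30-second attempt spacing and the 0 to 21600 range come from the table above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional

RETRY_STEP = timedelta(seconds=30)  # attempt spacing stated in the table
MAX_RETRY_WINDOW_S = 21600          # upper bound of Retry Interval(Seconds)


@dataclass
class ScaleOutRule:
    """Hypothetical in-memory model of a time-based scale-out rule."""
    name: str
    trigger_at: datetime
    retry_window_s: int   # Retry Interval(Seconds)
    scale_out_nodes: int  # Scale Out(Nodes)
    cooldown_s: int       # Cooldown(Seconds)

    def __post_init__(self) -> None:
        if not 0 <= self.retry_window_s <= MAX_RETRY_WINDOW_S:
            raise ValueError("retry interval must be between 0 and 21600 seconds")


def first_successful_attempt(
    rule: ScaleOutRule,
    is_blocked: Callable[[datetime], bool],
) -> Optional[datetime]:
    """Return the time at which the rule would run, or None if every attempt is blocked.

    is_blocked(t) stands in for "another scaling activity is in progress
    or in the cooldown state at time t".
    """
    # Assumption: an attempt that falls exactly on the deadline is still allowed.
    deadline = rule.trigger_at + timedelta(seconds=rule.retry_window_s)
    attempt = rule.trigger_at
    while attempt <= deadline:
        if not is_blocked(attempt):
            return attempt
        attempt += RETRY_STEP
    return None


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 2, 0, 0)
    busy_until = start + timedelta(seconds=70)  # operation B ends 70 s after the trigger
    rule = ScaleOutRule("nightly-scale-out", start, 300, 2, 600)
    print(first_successful_attempt(rule, lambda t: t < busy_until))
```

In this sketch, attempts are made 0, 30, 60, and 90 seconds after the trigger time, and the attempt at 90 seconds is the first one that is no longer blocked. With a retry interval of 0, only the attempt at the trigger time is made.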

Configure the specifications of nodes

You can specify the hardware specifications of the nodes that are used to scale in or scale out a cluster. You can configure the specifications only before you enable auto scaling, and you cannot modify them afterward. If you must change the specifications, disable auto scaling, modify the specifications, and then enable auto scaling again.

  • The system automatically searches for the instance types that match the vCPU and memory specifications you specified and displays the instance types in the Instance Type section. You must select one or more instance types in the Instance Type section so that the cluster can be scaled based on the selected instance types.
  • You can select a maximum of three instance types. Selecting more than one instance type helps prevent auto scaling failures due to insufficient Elastic Compute Service (ECS) resources.
  • Regardless of whether you select an ultra disk or a standard SSD, the minimum size of a data disk is 40 GB.
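
The constraints in the preceding list can be sketched as a small Python check. The NodeSpec class and the pick_instance_type function are hypothetical and are not part of any EMR SDK. The fallback order, in which the first selected instance type that is in stock is used, is an assumption made for illustration; this topic does not describe how EMR chooses among the selected instance types. The instance type names are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

MAX_INSTANCE_TYPES = 3  # "a maximum of three instance types"
MIN_DATA_DISK_GB = 40   # minimum data disk size for ultra disks and standard SSDs


@dataclass
class NodeSpec:
    """Hypothetical model of the node specifications for an auto scaling group."""
    instance_types: List[str]  # assumption: listed in order of preference
    data_disk_gb: int

    def validate(self) -> None:
        if not 1 <= len(self.instance_types) <= MAX_INSTANCE_TYPES:
            raise ValueError("select between one and three instance types")
        if self.data_disk_gb < MIN_DATA_DISK_GB:
            raise ValueError("a data disk must be at least 40 GB")


def pick_instance_type(spec: NodeSpec, in_stock: Set[str]) -> Optional[str]:
    """Return the first selected instance type that is in stock, or None."""
    spec.validate()
    for instance_type in spec.instance_types:
        if instance_type in in_stock:
            return instance_type
    return None  # every selected type is out of stock, so scale-out would fail


if __name__ == "__main__":
    spec = NodeSpec(["type-a", "type-b", "type-c"], data_disk_gb=80)
    # "type-a" is out of stock in this example, so the sketch falls back to "type-b".
    print(pick_instance_type(spec, in_stock={"type-b", "type-c"}))
```

The sketch shows why selecting several instance types helps: if the first type is out of stock, another selected type can still be used for the scale-out.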