If your workload fluctuates, we recommend that you enable auto scaling for your E-MapReduce (EMR) cluster and configure auto scaling rules that add or remove task nodes as demand changes. Auto scaling ensures sufficient computing resources for your jobs and reduces costs.

Prerequisites

An auto scaling machine group is created. For more information, see Create an auto scaling machine group.

Configure auto scaling

  1. Go to the Cluster Overview page.
    1. Log on to the Alibaba Cloud EMR console.
    2. In the top navigation bar, select the region where your cluster resides and select a resource group based on your business requirements.
    3. Click the Cluster Management tab.
    4. On the Cluster Management page, find your cluster and click Details in the Actions column.
  2. In the left-side navigation pane, choose Auto Scaling > Auto Scaling Settings.
  3. On the Auto Scaling Settings page, find your auto scaling machine group and click Configure Rules in the Actions column.
  4. In the Auto Scaling Settings pane, configure the parameters.
    1. In the Basic Information section, configure the parameters described in the following table.
      Parameter: Description
      Max Instances: The maximum number of task nodes in the auto scaling machine group. If a scale-out rule is met but this upper limit has been reached, the system does not add nodes. Maximum value: 500.
      Minimum Instances: The minimum number of task nodes in the auto scaling machine group. If the number of task nodes set in a scale-out or scale-in rule is less than the value of this parameter, the cluster is scaled based on the value of this parameter when the rule is triggered for the first time.

      For example, if this parameter is set to 3 and a scale-out rule adds one node at 00:00 every day, the system adds three nodes at 00:00 on the first day so that the minimum number of nodes is met.

      Graceful Deprecation: You can enable graceful deprecation and set a timeout period for deprecating task nodes that run YARN jobs. The system deprecates a task node when a YARN job on the node has run longer than the timeout period or when no YARN job is running on the node. The maximum value of Timeout is 3600 seconds.
      Note: To enable graceful deprecation, you must first change the value of the yarn.resourcemanager.nodes.exclude-path parameter on the YARN service page to /etc/ecm/hadoop-conf/yarn-exclude.xml.
    2. In the Cost Optimization Policy section, select Single Billing Method or Cost Optimization Mode.
      • Single Billing Method
        The system automatically searches for the instance types that match the vCPU and memory specifications you specified and displays the instance types in the Instance Type section. You must select one or more instance types in the Instance Type section so that the cluster can be scaled based on the selected instance types. Single Billing Method supports the following billing methods:
        • Pay-As-You-Go
          The order in which you select instance types determines the priorities of the instances that are used. The hourly price of each instance is displayed below the disk specifications in the Instance Type section. The price is the sum of the EMR service price and ECS instance price.
        • Spot Instance
          Notice: If your jobs have strict service level agreement (SLA) requirements, do not select this option. Preemptible instances may be released due to a failed bid or other reasons.
          The order in which you select instance types determines the priorities of the instances that are used. The hourly price of each instance based on the pay-as-you-go billing method is displayed below the disk specifications in the Instance Type section. You can also set an upper limit for the hourly price of each instance. The instance is displayed if its price does not exceed the upper limit. For more information about preemptible instances, see Overview.
      • Cost Optimization Mode
        In this mode, you can develop a detailed cost optimization policy to achieve a balance between cost and stability.
        Parameter: Description
        Minimum Pay-As-You-Go Nodes: The minimum number of pay-as-you-go instances required by the auto scaling machine group. If the number of pay-as-you-go instances in the auto scaling machine group is less than this value, pay-as-you-go instances are preferentially created.
        Percentage of Pay-As-You-Go Nodes: The proportion of pay-as-you-go instances among the instances that are created after the number of pay-as-you-go instances reaches the value of Minimum Pay-As-You-Go Nodes.
        Lowest-Cost Instance Types: The number of lowest-priced instance types to use. If preemptible instances are required, the system creates them evenly across the instance types that have the lowest prices. The maximum value is 3.
        Supplement Preemptible Instances: Specifies whether to enable preemptible instance supplementation. If this feature is enabled, the system automatically replaces an existing preemptible instance with a new preemptible instance about five minutes before the existing instance is reclaimed.

        If you do not specify the Minimum Pay-As-You-Go Nodes, Percentage of Pay-As-You-Go Nodes, or Lowest-Cost Instance Types parameter, the machine group is a general cost optimization scaling group. If you specify the parameters, the machine group is a mixed-instance cost optimization scaling group. The two types of cost optimization scaling groups are fully compatible with each other in terms of interfaces and features.

        You can use a mixed-instance cost optimization scaling group to achieve the same effect as a specific general cost optimization scaling group by configuring appropriate mixed-instance policies. Examples:
        • In a general cost optimization scaling group, only pay-as-you-go instances are created.

          In your mixed-instance cost optimization scaling group, set Minimum Pay-As-You-Go Nodes to 0, Percentage of Pay-As-You-Go Nodes to 100, and Lowest-Cost Instance Types to 1.

        • In a general cost optimization scaling group, preemptible instances are preferentially created.

          In your mixed-instance cost optimization scaling group, set Minimum Pay-As-You-Go Nodes to 0, Percentage of Pay-As-You-Go Nodes to 0, and Lowest-Cost Instance Types to 1.

    3. In the Trigger Rules section, specify Trigger Mode.

      Scale By Time: For information about this mode, see Configure auto scaling rules by time.

    4. Click OK.
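
To make the interaction between Max Instances and Minimum Instances concrete, the following Python sketch models how many task nodes a scale-out rule would add. It is an illustration only: plan_scale_out is a hypothetical helper, not an EMR API, and trimming a request so that it stops at the upper limit is an assumption that the parameter descriptions do not confirm.

```python
def plan_scale_out(current: int, rule_delta: int,
                   min_instances: int, max_instances: int) -> int:
    """Return how many task nodes to add when a scale-out rule fires.

    Hypothetical helper that mirrors the documented behavior of
    Max Instances and Minimum Instances; it is not an EMR API.
    """
    if current >= max_instances:
        # Upper limit already reached: the rule is met but nothing is added.
        return 0
    # If the rule alone would leave the group below the minimum,
    # scale to the minimum instead.
    target = max(current + rule_delta, min_instances)
    # Assumption: a request that would overshoot stops at the upper limit.
    target = min(target, max_instances)
    return target - current


if __name__ == "__main__":
    # Minimum Instances = 3, rule adds one node per day.
    print(plan_scale_out(0, 1, 3, 500))    # first trigger
    print(plan_scale_out(3, 1, 3, 500))    # later trigger
    print(plan_scale_out(500, 1, 3, 500))  # upper limit reached
```

With Minimum Instances set to 3 and a rule that adds one node, the sketch adds three nodes on the first trigger and one node on later triggers, which matches the example in the Basic Information section.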

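The Cost Optimization Mode parameters can be read as a simple split of the target node count. The Python sketch below is one possible interpretation, not the algorithm that EMR runs: split_instances is a hypothetical helper, rounding the pay-as-you-go share up is an assumption, and so is giving leftover preemptible instances to the first instance types in the list.

```python
from typing import List, Tuple


def split_instances(total: int, min_payg: int, payg_percent: int,
                    lowest_cost_types: int) -> Tuple[int, List[int]]:
    """Split `total` nodes into pay-as-you-go and preemptible counts.

    Returns (pay_as_you_go_count, preemptible_count_per_instance_type).
    Hypothetical helper for illustration; it is not an EMR API.
    """
    if not 1 <= lowest_cost_types <= 3:
        raise ValueError("Lowest-Cost Instance Types must be between 1 and 3")
    if not 0 <= payg_percent <= 100:
        raise ValueError("Percentage of Pay-As-You-Go Nodes must be 0 to 100")

    # Pay-as-you-go instances are created first, up to the minimum.
    base = min(total, min_payg)
    above = total - base
    # Assumption: the percentage applies to nodes above the minimum
    # and is rounded up (integer ceiling division).
    payg_above = (above * payg_percent + 99) // 100
    payg = base + payg_above
    preemptible = total - payg

    # Spread preemptible instances evenly across the lowest-priced types.
    per_type, extra = divmod(preemptible, lowest_cost_types)
    per_type_counts = [per_type + (1 if i < extra else 0)
                       for i in range(lowest_cost_types)]
    return payg, per_type_counts


if __name__ == "__main__":
    print(split_instances(10, 0, 100, 1))  # only pay-as-you-go instances
    print(split_instances(10, 0, 0, 1))    # only preemptible instances
    print(split_instances(10, 2, 50, 3))   # mixed policy
```

The first two calls correspond to the two mixed-instance examples in the Cost Optimization Mode description: a percentage of 100 yields only pay-as-you-go instances, and a percentage of 0 yields only preemptible instances.
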
Enable auto scaling

After you configure auto scaling, find your auto scaling machine group on the Auto Scaling Settings page and turn on the switch in the Auto Scaling Status column to enable auto scaling.
If you modify the basic information or trigger rules after you enable auto scaling, you must click Use Latest Configuration in the Actions column on the Auto Scaling Settings page to make the modifications take effect.

Disable auto scaling

Notice: You can disable auto scaling for an auto scaling machine group only if the number of instances in the group is 0. To disable auto scaling for an auto scaling machine group that contains instances, you must first configure a scale-in rule for the machine group or set the maximum number of instances to 0. After all instances in the machine group are removed, you can disable auto scaling for it.

Find your auto scaling machine group on the Auto Scaling Settings page and turn off the switch in the Auto Scaling Status column to disable auto scaling.

If you want to modify instance configurations or your business traffic becomes stable, you can disable auto scaling.