Alibaba Cloud Container Service for Kubernetes (ACK) provides an auto scaling component that automatically scales nodes in and out. Regular instances, GPU-accelerated instances, and preemptible instances can be automatically added to or removed from an ACK cluster based on your requirements. Auto scaling supports instances deployed across multiple zones, diverse instance types, and different scaling modes.

How it works

The auto scaling component scales nodes based on the resource allocation in the ACK cluster, which is calculated from the resource requests of pods. When pods cannot be scheduled because no node has enough unallocated resources, they become pending. The auto scaling component then calculates the number of nodes that must be added, based on the resource specifications and constraints configured for the scaling group. If the requirements can be met, nodes from the scaling group are added to the ACK cluster. If the ratio of requested resources on a node in the scaling group drops below the scale-in threshold, the component automatically removes the node from the cluster. Appropriate resource request settings are therefore a prerequisite for auto scaling.
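The scale-out calculation described above can be sketched as follows. This is a simplified illustration, not the actual cluster-autoscaler implementation; the instance specification and pod requests are made-up numbers:

```python
import math

# Hypothetical resource specification of the scaling group's instance type
NODE_CPU = 4.0      # vCPUs per node
NODE_MEM = 8.0      # GiB of memory per node

def nodes_needed(pending_pods):
    """Estimate how many nodes must be added to schedule all pending pods.

    pending_pods: list of (cpu_request, mem_request) tuples taken from the
    pods' resource requests -- which is why accurate resource requests are
    a prerequisite for auto scaling.
    """
    # A single pod larger than one node can never be scheduled.
    if any(cpu > NODE_CPU or mem > NODE_MEM for cpu, mem in pending_pods):
        raise ValueError("pod request exceeds the instance specification")
    total_cpu = sum(cpu for cpu, _ in pending_pods)
    total_mem = sum(mem for _, mem in pending_pods)
    # Enough nodes to cover both the CPU and the memory demand.
    return max(math.ceil(total_cpu / NODE_CPU),
               math.ceil(total_mem / NODE_MEM))

# Three pending pods requesting (vCPU, memory in GiB)
print(nodes_needed([(2.0, 3.0), (2.0, 2.0), (1.0, 1.0)]))  # -> 2
```

The real component additionally simulates scheduling against labels, taints, and bin-packing constraints, so the actual node count can differ from this naive estimate.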

Notes

  • For each account, the default CPU quota of pay-as-you-go instances in each region is 50 vCPUs. You can create up to 48 custom route entries in each route table in a Virtual Private Cloud (VPC) network. To increase the quota, submit a ticket.
  • The stock of a specific Elastic Compute Service (ECS) instance type fluctuates greatly. We recommend that you specify multiple instance types for a scaling group. This improves the success rate of node scaling.
  • In swift mode, when a node is shut down and reclaimed, the node stops running and remains in the NotReady state. When a scale-out event is triggered, the node status changes to Ready.
  • When a node is shut down and reclaimed in swift mode, you are charged only for the storage cost of its disks. An exception is nodes that use local disks, such as ecs.d1ne.2xlarge instances, for which computing costs are also charged. If the stock of node resources is sufficient, nodes can be launched immediately.
  • If you have bound elastic IP addresses (EIPs) to pods, we recommend that you do not delete the scaling group or the ECS nodes that it scaled out in the ECS console. Otherwise, the EIPs cannot be automatically released.

Perform auto scaling

  1. Log on to the Container Service console.
  2. In the left-side navigation pane, choose Clusters > Clusters. The Clusters page appears.
  3. Find the target cluster and choose More > Auto Scaling on the right side of the page.


Activate Auto Scaling (ESS)
  1. In the dialog box that appears, click Auto Scaling (ESS) to log on to the Auto Scaling console.
  2. Click Activate Auto Scaling to go to the Enable Service page.
  3. Select the I agree with Auto Scaling Service Agreement of Service check box and click Enable Now.
  4. On the Activated page, click Console to return to the Auto Scaling console.
  5. Click Authorize to go to the Cloud Resource Access Authorization page, and then authorize Auto Scaling to access other cloud resources.
  6. Click Confirm Authorization Policy.

Expected result

If the authorization is successful, you are redirected to the Auto Scaling console. Return to the dialog box described in step 1 and continue with Grant permissions to RAM roles.

Grant permissions to RAM roles

  1. Click the second hyperlink in the dialog box to go to the RAM Roles page.
    Note You must log on to the console with an Alibaba Cloud account.
  2. Click the Permissions tab. Click the target policy to go to the details page.
  3. Click Modify Policy Document. The Modify Policy Document page appears on the right side of the page.
  4. Add the following content to the Action field of the Policy Document:
    Note Add a comma (,) to the end of the bottom line in the Action field before you add the following content.
  5. On the RAM Roles page, click Trust Policy Management. On the Edit Trust Policy tab, add to the field as shown in the following figure.
  6. Click OK.

Configure auto scaling

  1. On the Auto Scaling page, set the following parameters and click Submit.
    • Cluster: The name of the target ACK cluster.
    • Scale-in Threshold: In a scaling group managed by cluster-autoscaler, the system automatically calculates the ratio of the requested resources to the resource capacity of each node. If the ratio is lower than the threshold, the node is removed from the ACK cluster.
      Note: Scale-out events are automatically triggered based on the scheduling of pods. Therefore, you only need to configure scale-in rules.
    • Defer Scale-in For: The amount of time that the cluster must wait before it scales in. Unit: minutes. Default value: 10.
    • Cooldown: The cooldown period after a scale-in event is triggered. No scale-in event is triggered again during this period. Unit: minutes. Default value: 10.
  2. Select an instance type based on your workload requirements, such as a regular instance, an ECS instance with GPU capabilities, or a preemptible instance. Then, click Create.
  3. On the Auto Scaling Group Configuration page, set the following parameters to create a scaling group:
    • Region: The region where the scaling group is deployed. The region where the cluster is deployed is selected and cannot be changed.
    • VPC: The VPC network where the cluster is deployed is automatically selected.
    • VSwitch: Select VSwitches based on the zones where they are deployed and the CIDR blocks of the pods.
  4. Configure worker nodes.
    • Node Type: The type of nodes in the scaling group. The node type must be the same as that of the nodes in the cluster.
    • Instance Type: The types of instances in the scaling group.
    • Selected Types: The instance types that you have selected. You can select up to 10 instance types.
    • System Disk: The system disk of the scaling group.
    • Mount Data Disk: Specifies whether to mount data disks to the scaling group. By default, no data disk is mounted.
    • Instances: The number of instances in the scaling group.
      • Existing instances in the cluster are not taken into account.
      • By default, the minimum number of instances is 0. If you specify one or more instances, the system adds them to the scaling group. When a scale-out event is triggered, instances in the scaling group are added to the cluster to which the scaling group is bound.
    • Key Pair: The key pair used to log on to the nodes after they are added to the ACK cluster. You can create key pairs in the ECS console.
      Note: Currently, only key pair logon is supported.
    • Scaling Mode: Select Standard or Swift mode.
    • RDS Whitelist: The Relational Database Service (RDS) instances that the nodes in the scaling group can access after a scale-out event is triggered.
    • Node Label: Labels that are automatically attached to nodes added to the ACK cluster after a scale-out event is triggered.
    • Taints: After you add taints to a node, ACK does not schedule pods to the node.
  5. Click OK to create the scaling group.
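The Scale-in Threshold set in the previous steps is evaluated per node. The following sketch (with made-up numbers, and simplified relative to the real cluster-autoscaler logic) shows how the ratio of requested resources to node capacity is compared against the threshold:

```python
def node_utilization(req_cpu, alloc_cpu, req_mem, alloc_mem):
    """Ratio of requested resources to node capacity.

    Simplified: take the higher of the CPU and memory ratios so a node is
    only a scale-in candidate when BOTH dimensions are under-used.
    """
    return max(req_cpu / alloc_cpu, req_mem / alloc_mem)

# Hypothetical threshold configured on the Auto Scaling page
SCALE_IN_THRESHOLD = 0.5

# A 4-vCPU / 8-GiB node whose pods request 1 vCPU and 2 GiB in total
util = node_utilization(1.0, 4.0, 2.0, 8.0)
print(util)                        # -> 0.25
print(util < SCALE_IN_THRESHOLD)   # -> True: node is a scale-in candidate
```

A candidate node is still only removed after the Defer Scale-in For period elapses, and no further scale-in is triggered during the Cooldown period.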

Expected result

  1. On the Auto Scaling page, you can find a newly created scaling group in the regular instance section.
  2. Select the target cluster and the kube-system namespace. You can find the cluster-autoscaler component, which indicates that the scaling group is created.

FAQ

  • Why does the auto scaling component fail to add nodes after a scale-out event is triggered?
    Check for the following issues:
    • Whether the instance types configured for the scaling group meet the resource requests of the pods, and whether the requested resource specifications of the pods are within the specifications of the configured instance types.
    • Whether you have authorized the Resource Access Management (RAM) roles to manage the cluster. Whether the authorization process is performed on each cluster involved in the scale-out event.
    • Whether the cluster can access the Internet. Nodes in a scaling group require Internet access. This is because the auto scaling component needs to call Alibaba Cloud APIs over the Internet.
  • Why does the auto scaling component fail to remove nodes after a scale-in event is triggered?
    Check for the following issues:
    • Whether the ratio of the requested resources to the resource capacity of the node is higher than the scale-in threshold.
    • Whether the pods in the kube-system namespace are running on the nodes.
    • Whether the pods are configured with scheduling policies that prevent them from being scheduled to other nodes.
    • Whether the pods on the nodes are configured with a PodDisruptionBudget and the number of pods has reached the specified minimum requirement.

    For more frequently asked questions about the auto scaling component, visit the open source community.
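The scale-in checks above can be combined into a simple eligibility test. This is an illustrative sketch only; the field names are invented for the example, and the real cluster-autoscaler evaluates these and additional conditions:

```python
def can_scale_in(node, threshold=0.5):
    """Return (eligible, reason) for one node.

    `node` is a plain dict describing the node's state; all of its keys
    are illustrative, not a real Kubernetes or Alibaba Cloud API.
    """
    if node["utilization"] >= threshold:
        return False, "utilization is above the scale-in threshold"
    if node["has_kube_system_pods"]:
        return False, "kube-system pods are running on the node"
    if node["has_unmovable_pods"]:
        return False, "pods cannot be rescheduled to other nodes"
    if node["violates_pdb"]:
        return False, "eviction would violate a PodDisruptionBudget"
    return True, "ok"

node = {"utilization": 0.2, "has_kube_system_pods": False,
        "has_unmovable_pods": False, "violates_pdb": False}
print(can_scale_in(node))  # -> (True, 'ok')
```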

  • How does the system select from multiple scaling groups for a scale-out event?

    When pods cannot be scheduled to existing nodes, the auto scaling component simulates pod scheduling against the configuration of each scaling group, including its labels, taints, and instance specifications. If the pods can be scheduled in the simulation, the scaling group is a candidate for the scale-out event. If more than one scaling group passes the simulation, the system selects the one that would have the fewest idle resources remaining after the simulation.
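This least-waste selection can be sketched as follows. The group names and instance specifications are made-up, and the scheduling simulation is reduced to a capacity check (the real component also evaluates labels and taints):

```python
def pick_scaling_group(pod_cpu, pod_mem, groups):
    """Choose a scaling group for a pending pod using a least-waste rule.

    groups: {name: (cpu, mem)} mapping each scaling group to the vCPU and
    memory (GiB) specification of its instance type. Among the groups whose
    instances can fit the pod, pick the one leaving the fewest idle
    resources; return None if no group fits.
    """
    candidates = []
    for name, (cpu, mem) in groups.items():
        if cpu >= pod_cpu and mem >= pod_mem:   # simplified scheduling simulation
            idle = (cpu - pod_cpu) + (mem - pod_mem)
            candidates.append((idle, name))
    return min(candidates)[1] if candidates else None

groups = {
    "general":  (8.0, 16.0),    # 8 vCPU, 16 GiB
    "small":    (4.0, 8.0),
    "gpu-pool": (16.0, 64.0),
}
print(pick_scaling_group(2.0, 4.0, groups))  # -> 'small'
```

Here "small" wins because it leaves only 2 idle vCPUs and 4 idle GiB, the least waste among the three candidate groups.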