Container Service for Kubernetes: Enable auto scaling for node pools

Last Updated: Dec 01, 2025

If your cluster's allocated resources are insufficient to schedule application pods due to high demand, you can enable the auto scaling feature in ACK One registered clusters to automatically scale out nodes and increase available resources for scheduling. Two elasticity solutions are available: node auto scaling and node instant scaling. The latter offers faster scaling, higher delivery efficiency, and lower operational complexity.

Prerequisites

Step 1: Configure RAM permissions

  1. Create a RAM user and grant the following custom policy to the user. For more information, see Use RAM to authorize access to clusters and cloud resources.

    Custom policy document:

    {
      "Version": "1",
      "Statement": [
        {
          "Action": [
            "ess:DescribeScalingGroups",
            "ess:DescribeScalingInstances",
            "ess:DescribeScalingActivities",
            "ess:DescribeScalingConfigurations",
            "ess:DescribeScalingRules",
            "ess:DescribeScheduledTasks",
            "ess:DescribeLifecycleHooks",
            "ess:DescribeNotificationConfigurations",
            "ess:DescribeNotificationTypes",
            "ess:DescribeRegions",
            "ess:CreateScalingRule",
            "ess:ModifyScalingGroup",
            "ess:RemoveInstances",
            "ess:ExecuteScalingRule",
            "ess:ModifyScalingRule",
            "ess:DeleteScalingRule",
            "ecs:DescribeInstanceTypes",
            "ess:DetachInstances",
            "ess:CompleteLifecycleAction",
            "ess:ScaleWithAdjustment",
            "ess:DescribePatternTypes",
            "vpc:DescribeVSwitches",
            "cs:DeleteClusterNodes",
            "cs:DescribeClusterNodes",
            "cs:DescribeClusterNodePools",
            "cs:DescribeClusterNodePoolDetail",
            "cs:DescribeTaskInfo",
            "cs:ScaleClusterNodePool",
            "cs:RemoveNodePoolNodes",
            "ecs:DescribeAvailableResource",
            "ecs:DescribeInstanceTypeFamilies",
            "ecs:DescribeInstances",
            "cs:GetClusterAddonInstance",
            "cs:DescribeClusterDetail",
            "ecs:DescribeCapacityReservations",
            "ecs:DescribeElasticityAssurances",
            "ecs:DescribeImages"
          ],
          "Resource": [
            "*"
          ],
          "Effect": "Allow"
        }
      ]
    }
  2. Log on to the ACK console. In the left navigation pane, click Clusters.

  3. On the Clusters page, click the name of the cluster that you want to manage. In the left navigation pane, choose Configurations > Secrets.

  4. On the Secrets page, click Create from YAML. Paste the following sample code to create a Secret named alibaba-addon-secret.

    Note

    Components access cloud services by using the AccessKey ID and AccessKey secret stored in this Secret. Skip this step if a Secret named alibaba-addon-secret already exists.

    apiVersion: v1
    kind: Secret
    metadata:
      name: alibaba-addon-secret
      namespace: kube-system
    type: Opaque
    stringData:
      access-key-id: <AccessKeyID of the RAM user>
      access-key-secret: <AccessKeySecret of the RAM user>
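
The Secret above uses stringData, which accepts plain-text values. As a minimal sketch, the following Python code builds an equivalent manifest that uses the base64-encoded data field instead, which is how Kubernetes stores Secret values. The function name build_addon_secret and the placeholder credentials are assumptions made for illustration; they are not part of ACK.

```python
import base64
import json


def build_addon_secret(access_key_id: str, access_key_secret: str) -> dict:
    """Return a Secret manifest equivalent to the stringData sample.

    Values under `data` must be base64-encoded; `stringData` is a
    convenience field that the API server encodes on write.
    """
    def b64(value: str) -> str:
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": "alibaba-addon-secret", "namespace": "kube-system"},
        "type": "Opaque",
        "data": {
            "access-key-id": b64(access_key_id),
            "access-key-secret": b64(access_key_secret),
        },
    }


if __name__ == "__main__":
    # Placeholder credentials for illustration only; never hard-code real keys.
    manifest = build_addon_secret("example-id", "example-secret")
    print(json.dumps(manifest, indent=2))
```

Because JSON is a subset of YAML, the printed manifest can be saved to a file and applied in the same way as the YAML sample.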

Step 2: Configure a node scaling solution

Enable node auto scaling

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, find the cluster to manage and click its name. In the left navigation pane, choose Nodes > Node Pools.

  3. On the Node Pools page, click Enable next to Node Scaling.

  4. If this is the first time you use the node auto scaling feature, follow the prompted instructions to activate the service and complete authorization. Otherwise, skip this step.

    • For an ACK managed cluster, authorize ACK to use the AliyunCSManagedAutoScalerRole for accessing your cloud resources.

    • For an ACK dedicated cluster, authorize ACK to use the KubernetesWorkerRole and AliyunCSManagedAutoScalerRolePolicy for scaling management. The following figure shows the console page on which you can make the authorization when you enable Node Scaling.

      image

  5. In the Node Scaling Configuration panel, set Node Scaling Method to Auto Scaling, configure scaling parameters, and click OK.

Enable node instant scaling

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, find the cluster to manage and click its name. In the left navigation pane, choose Nodes > Node Pools.

  3. On the Node Pools page, click Enable next to Node Scaling.

  4. If this is the first time you use the node instant scaling feature, follow the on-screen instructions to activate the Auto Scaling service and grant the required permissions. If you have already activated the service and granted the permissions, skip this step.

    • ACK managed clusters: Grant permissions to the AliyunCSManagedAutoScalerRole role.

    • ACK dedicated clusters: Grant permissions to the KubernetesWorkerRole role and the AliyunCSManagedAutoScalerRolePolicy system policy. The following figure shows the entries.

      image

  5. On the Node Scaling Configuration page, set Node Scaling Method to Instant Scaling, configure the scaling parameters, and then click OK.

    The scaling component automatically triggers scale-out activities based on the scheduling status of pods.

    You can switch the node scaling solution after you select it. To switch, change the solution to node auto scaling, read the on-screen messages carefully, and follow the instructions. This feature is available only to users on the whitelist. To use this feature, submit a ticket.

    | Configuration item | Description |
    | --- | --- |
    | Scale-in Threshold | The ratio of the requested resources of a single node to the resource capacity of the node in a node pool for which node autoscaling is enabled. A node can be scaled in only when the ratio is lower than the configured threshold. This means the CPU and memory resource utilization of the node is lower than the Scale-in Threshold. |
    | GPU Scale-in Threshold | The scale-in threshold for GPU-accelerated instances. A GPU-accelerated node can be scaled in only when the ratio is lower than the configured threshold. This means the CPU, memory, and GPU resource utilization of the node is lower than the GPU Scale-in Threshold. |
    | Defer Scale-in For | The interval between when a scale-in is required and when the scale-in is performed. Unit: minutes. Default value: 10 minutes. |

    Important

    The scaling component can perform a scale-in only after the conditions specified by Scale-in Threshold and Defer Scale-in For are met.

    Advanced configuration items

    | Configuration item | Description |
    | --- | --- |
    | Pod Termination Timeout | The maximum amount of time to wait for pods on a node to be terminated during a node scale-in. Unit: seconds. |
    | Minimum Number of Replicated Pods | The minimum number of pods allowed in each ReplicaSet before a node scale-in. If the actual number of replicas in the ReplicaSet to which a pod belongs is smaller than this value, the node is not scaled in. |
    | Evict DaemonSet Pods | If you enable this feature, DaemonSet pods on a node are evicted when the node is scaled in. |
    | Skip Nodes Hosting Kube-system Pods | If you enable this feature, the system can ignore nodes that run pods in the kube-system namespace during an automatic node scale-in. This ensures that these nodes are not affected by the scale-in. Note: This feature does not apply to DaemonSet pods and mirror pods. |
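
To make the interplay of these settings concrete, the following Python sketch models the scale-in decision for a single node. It is a simplified illustration that assumes only the rules described above; the actual scaling component may evaluate additional conditions. The class names, field names, and the choice to treat the defer interval as satisfied once the elapsed time reaches it are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pod:
    namespace: str
    replicaset_replicas: Optional[int] = None  # None if no ReplicaSet owns the pod
    is_daemonset: bool = False
    is_mirror: bool = False


@dataclass
class Node:
    cpu_requested: float   # vCPUs requested by pods on the node
    cpu_capacity: float    # vCPUs the node provides
    mem_requested: float   # GiB requested by pods on the node
    mem_capacity: float    # GiB the node provides
    minutes_below_threshold: float = 0.0
    pods: List[Pod] = field(default_factory=list)


def below_threshold(node: Node, threshold: float) -> bool:
    """True when both the CPU and memory request ratios are under the threshold."""
    cpu_ratio = node.cpu_requested / node.cpu_capacity
    mem_ratio = node.mem_requested / node.mem_capacity
    return cpu_ratio < threshold and mem_ratio < threshold


def blocked_by_pods(node: Node, min_replicas: int, skip_kube_system: bool) -> bool:
    """True when a pod on the node prevents the scale-in."""
    for pod in node.pods:
        # Minimum Number of Replicated Pods: too few replicas blocks the scale-in.
        if pod.replicaset_replicas is not None and pod.replicaset_replicas < min_replicas:
            return True
        # Skip Nodes Hosting Kube-system Pods: DaemonSet and mirror pods do not count.
        if (skip_kube_system and pod.namespace == "kube-system"
                and not pod.is_daemonset and not pod.is_mirror):
            return True
    return False


def can_scale_in(node: Node, threshold: float, defer_minutes: float = 10.0,
                 min_replicas: int = 0, skip_kube_system: bool = False) -> bool:
    """Combine Scale-in Threshold, Defer Scale-in For, and the pod-based checks."""
    return (below_threshold(node, threshold)
            and node.minutes_below_threshold >= defer_minutes
            and not blocked_by_pods(node, min_replicas, skip_kube_system))
```

For example, a node that requests 1 of 4 vCPUs and 2 of 16 GiB has ratios of 0.25 and 0.125. With a threshold of 0.5, it becomes a scale-in candidate only after it has stayed below the threshold for the full defer interval.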

Step 3: Configure a node pool with auto scaling enabled

You can either modify existing node pools by switching their Scaling Mode to Auto, or create new node pools with auto scaling enabled. Key configurations are as follows:

Parameter

Description

Scaling Mode

Manual and Auto scaling modes are supported. Computing resources are automatically adjusted based on demand and the configured policies to reduce cluster costs.

  • Manual: ACK adjusts the number of nodes in the node pool based on the value of the Expected Nodes parameter. The number of nodes is always the same as the value of the Expected Nodes parameter. For more information, see Manually scale a node pool.

  • Auto: When the capacity planning of the cluster cannot meet the requirements of pod scheduling, ACK automatically scales out nodes based on the configured minimum and maximum number of instances. By default, node instant scaling is enabled for clusters running Kubernetes 1.24 and later, and node autoscaling is enabled for clusters running Kubernetes versions earlier than 1.24. For more information, see Node scaling.

Instances

The Min. Instances and Max. Instances defined for a node pool exclude your existing instances.

Note
  • If you set Min. Instances above zero, the scaling group will automatically create the specified number of ECS instances when changes are applied.

  • Configure Max. Instances to be no lower than the current number of nodes in the node pool. Otherwise, a scale-in is triggered immediately once auto scaling takes effect.
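
The two notes above can be expressed as a pre-flight check before you apply the settings. The following Python sketch is an illustration only; the function name check_instance_bounds and the warning texts are assumptions, not ACK behavior or API output.

```python
from typing import List


def check_instance_bounds(min_instances: int, max_instances: int,
                          current_nodes: int) -> List[str]:
    """Return warnings about the side effects of the chosen Min./Max. Instances."""
    if min_instances < 0 or max_instances < min_instances:
        raise ValueError("require 0 <= Min. Instances <= Max. Instances")

    warnings = []
    if min_instances > 0:
        # A positive minimum causes instances to be created when changes are applied.
        warnings.append(
            f"{min_instances} ECS instance(s) will be created when the change is applied"
        )
    if max_instances < current_nodes:
        # A maximum below the current node count triggers an immediate scale-in.
        warnings.append(
            "Max. Instances is lower than the current node count; "
            "a scale-in starts once auto scaling takes effect"
        )
    return warnings
```

An empty list means that neither side effect applies to the chosen values.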

Instance-related parameters

Select the ECS instances used by the worker node pool based on instance types or attributes. You can filter instance families by attributes such as vCPU, memory, instance family, and architecture. For more information about the instance specifications not supported by ACK and how to configure nodes, see ECS instance type recommendations.

When the node pool is scaled out, ECS instances of the selected instance types are created. The scaling policy of the node pool determines which instance types are used to create new nodes during scale-out activities. Select multiple instance types to improve the success rate of node pool scale-out operations.

These are the instance types of the nodes in the node pool. If you select only one instance type, fluctuations in ECS instance stock affect the scaling success rate. We recommend that you select multiple instance types to increase the scaling success rate.

Operating System

When you enable auto scaling, you can select an image based on Alibaba Cloud Linux, Windows, or Windows Core.

If you select an image based on Windows or Windows Core, the system automatically adds the { effect: 'NoSchedule', key: 'os', value: 'windows' } taint to nodes in the node pool.

Node Labels

Node labels are automatically added to nodes that are added to the cluster by scale-out activities.

Important

Auto scaling can recognize node labels and taints only after the node labels and taints are mapped to node pool tags. A node pool can have only a limited number of tags. Therefore, you must limit the total number of ECS tags, taints, and node labels of a node pool that has auto scaling enabled to less than 12.
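
As a quick arithmetic check of this limit, the following Python sketch counts the ECS tags, taints, and node labels of a node pool and compares the total against 12. The function name and the sample keys are hypothetical.

```python
from typing import Dict, List

TAG_LIMIT = 12  # the combined total must stay below this value


def within_tag_budget(ecs_tags: Dict[str, str], node_labels: Dict[str, str],
                      taints: List[str]) -> bool:
    """True when ECS tags + node labels + taints total fewer than TAG_LIMIT."""
    total = len(ecs_tags) + len(node_labels) + len(taints)
    return total < TAG_LIMIT


if __name__ == "__main__":
    # Hypothetical values: 2 ECS tags + 3 node labels + 1 taint = 6 items.
    tags = {"team": "data", "env": "prod"}
    labels = {"workload": "batch", "tier": "backend", "pool": "auto"}
    taints = ["os=windows:NoSchedule"]
    print(within_tag_budget(tags, labels, taints))
```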

Scaling Policy

  • Priority: The system scales the node pool based on the priorities of the vSwitches that you select for the node pool. The selected vSwitches are displayed in descending order of priority. If Auto Scaling fails to create ECS instances in the zone of the vSwitch with the highest priority, Auto Scaling attempts to create ECS instances in the zone of the vSwitch with the next highest priority.

  • Cost Optimization: The system creates instances based on the vCPU unit prices in ascending order.

    If the Billing Method of the node pool is set to Spot Instance, spot instances are created preferentially. You can also set the Percentage Of Pay-as-you-go Instances parameter. If spot instances cannot be created due to reasons such as insufficient stock, pay-as-you-go instances are automatically created as a supplement.

  • Distribution Balancing: The even distribution policy takes effect only when you select multiple vSwitches. This policy ensures that ECS instances are evenly distributed among the zones (the vSwitches) of the scaling group. If they are unevenly distributed due to reasons such as insufficient stock, you can perform a rebalancing operation.
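
The three policies differ mainly in how they order the candidates. The following Python sketch illustrates that ordering under simplified assumptions. It does not reproduce the Auto Scaling implementation, and the vSwitch IDs, instance type names, prices, and function names are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple


def pick_vswitch_by_priority(vswitches: Sequence[str],
                             can_create: Callable[[str], bool]) -> Optional[str]:
    """Priority: try vSwitches in their listed order and fall back on failure."""
    for vswitch in vswitches:
        if can_create(vswitch):
            return vswitch
    return None  # no zone could provide an instance


def order_by_vcpu_price(types: Dict[str, Tuple[float, int]]) -> List[str]:
    """Cost Optimization: sort instance types by price per vCPU, cheapest first.

    `types` maps a type name to (hourly price, number of vCPUs).
    """
    return sorted(types, key=lambda name: types[name][0] / types[name][1])


def pick_zone_balanced(instances_per_zone: Dict[str, int]) -> str:
    """Distribution Balancing: place the next instance in the emptiest zone."""
    return min(instances_per_zone, key=instances_per_zone.get)


if __name__ == "__main__":
    in_stock = {"vsw-b"}  # pretend only the second zone has stock
    print(pick_vswitch_by_priority(["vsw-a", "vsw-b"], lambda v: v in in_stock))
    print(order_by_vcpu_price({"type-a": (0.4, 4), "type-b": (0.3, 2), "type-c": (0.2, 4)}))
    print(pick_zone_balanced({"zone-a": 3, "zone-b": 1, "zone-c": 2}))
```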

Use Pay-as-you-go Instances When Spot Instances Are Insufficient

You must set the Billing Method parameter to Spot Instance.

After this feature is enabled, if enough spot instances cannot be created due to price or inventory constraints, ACK automatically creates pay-as-you-go instances to meet the required number of ECS instances.

Enable Supplemental Spot Instances

You must set the Billing Method parameter to Spot Instance.

After this feature is enabled, when the system receives a message that spot instances will be reclaimed (5 minutes before reclamation), ACK attempts to scale out new instances as compensation.

If compensation succeeds, ACK will drain and remove the old nodes from the cluster. If compensation fails, ACK will not drain the old nodes. Active release of spot instances may cause service interruptions. After compensation failure, when inventory becomes available or price conditions are met, ACK will automatically purchase instances to maintain the expected node count. For details, see Best practices for preemptible instance-based node pools.

To improve compensation success rates, we recommend enabling Use Pay-as-you-go Instances When Spot Instances Are Insufficient at the same time.

Scaling Mode

You must enable Node Scaling on the Node Pools page and set the Scaling Mode of the node pool to Auto.

  • Standard: Auto scaling is implemented by creating and releasing ECS instances.

  • Swift: Auto scaling is implemented by creating, stopping, and starting ECS instances. Instances in the stopped state can be restarted directly to accelerate scaling.

    When an ECS instance is stopped, only disk fees are charged. Computing fees are not charged. This rule does not apply to instance families that use local disks, such as big data and local SSD instance families. For more information about the billing rules and limits of the economical mode, see Economical mode.

Taints

After you add taints to a node, ACK no longer schedules pods to it unless the pods tolerate the taints.
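
Pods that declare a matching toleration can still be scheduled to a tainted node. The following Python sketch mirrors the standard Kubernetes matching rules (effect, key, then operator) and uses the Windows taint from the Operating System parameter as an example. The class and function names are assumptions for illustration, and the sketch treats only NoSchedule and NoExecute taints as blocking.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key with Exists matches every key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect matches every effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if the toleration matches the taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.key and toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    return toleration.operator == "Equal" and toleration.value == taint.value


def can_schedule(tolerations: Iterable[Toleration], taints: Iterable[Taint]) -> bool:
    """A pod fits only if every blocking taint is tolerated."""
    tolerations = list(tolerations)
    blocking = [t for t in taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in blocking)


if __name__ == "__main__":
    windows_taint = Taint(key="os", value="windows", effect="NoSchedule")
    windows_toleration = Toleration(key="os", value="windows", effect="NoSchedule")
    print(can_schedule([], [windows_taint]))
    print(can_schedule([windows_toleration], [windows_taint]))
```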

Step 4: (Optional) Verify the result

After you complete the preceding operations, you can use the node auto scaling feature. The node pool shows that auto scaling has started and the cluster-autoscaler component is automatically installed in the cluster.

Auto scaling is enabled for the node pool

On the Node Pools page, the node pools that have auto scaling enabled are displayed in the node pool list.

The cluster-autoscaler component is installed

  1. In the left-side navigation pane of the details page, choose Workloads > Deployments.

  2. Select the kube-system namespace to view the cluster-autoscaler component.
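
The same check can be made from a terminal instead of the console. The following Python sketch builds the corresponding kubectl command and runs it only when kubectl is available. It assumes that the Deployment is named cluster-autoscaler, matching the component name, and that your kubeconfig points at the cluster.

```python
import shutil
import subprocess
from typing import List


def autoscaler_check_command(namespace: str = "kube-system",
                             deployment: str = "cluster-autoscaler") -> List[str]:
    """Build the kubectl command that looks up the autoscaler Deployment."""
    return ["kubectl", "get", "deployment", deployment, "-n", namespace]


if __name__ == "__main__":
    cmd = autoscaler_check_command()
    if shutil.which("kubectl") is None:
        # kubectl is not installed here; show the command to run elsewhere.
        print("kubectl not found; run manually:", " ".join(cmd))
    else:
        # check=False: a missing Deployment is reported by kubectl itself.
        subprocess.run(cmd, check=False)
```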
