Container Service for Kubernetes (ACK) provides the auto scaling component (cluster-autoscaler) to automatically scale nodes. Regular instances, GPU-accelerated instances, and preemptible instances can be automatically added to or removed from an ACK cluster to meet your business requirements. The component supports multiple scaling modes and various instance types, works with instances that are deployed across zones, and is applicable to diverse scenarios.
How auto scaling works
The auto scaling model of Kubernetes is different from the traditional scaling model that is based on resource usage thresholds. Developers must understand the differences between the two models before they migrate workloads from traditional data centers or other orchestration systems to Kubernetes, for example, when they migrate workloads from Swarm clusters to ACK clusters.
The traditional scaling model is based on resource usage. For example, if a cluster contains three nodes and the CPU utilization or memory usage of the nodes exceeds the scaling threshold, new nodes are automatically added to the cluster. However, you must consider the following issues when you use the traditional scaling model:
In a cluster, hot nodes may have high resource usage while other nodes have low resource usage. If the average resource usage is used as the scaling threshold, scale-out activities may not be triggered in time because the hot nodes are averaged out. If the lowest node resource usage is used as the threshold, the newly added nodes may remain idle, which wastes resources.
In Kubernetes, a pod is the smallest unit in which an application runs on the nodes of a cluster. When auto scaling is triggered for a cluster or a node in the cluster, pods with high resource usage are neither replicated nor given higher resource limits. As a result, the application load cannot be balanced to the newly added nodes.
If scale-in activities are triggered based on resource usage, pods that request large amounts of resources but have low resource usage may be evicted. If a Kubernetes cluster contains many such pods, the allocatable resources of the remaining nodes may be exhausted and some pods may fail to be scheduled.
How does the auto scaling model of Kubernetes fix these issues? Kubernetes provides a two-layer scaling model that decouples pod scheduling from resource scaling.
At the pod layer, pods are scaled based on resource usage. At the resource layer, a scale-out activity is triggered when pods enter the Pending state because cluster resources are insufficient. After new nodes are added to the cluster, the pending pods are automatically scheduled to the newly added nodes, and the application load is balanced again. The following section describes the auto scaling model of Kubernetes in detail:
cluster-autoscaler is used to trigger auto scaling by detecting pending pods. When pods enter the Pending state due to insufficient resources, cluster-autoscaler simulates pod scheduling to decide the scaling group that can provide new nodes to accept the pending pods. If a scaling group meets the requirement, nodes from this scaling group are added to the cluster.
A scaling group is treated as a virtual node during the simulation. The instance type of the scaling group specifies the CPU, memory, and GPU resources of this node, and the labels and taints of the scaling group are also applied to it. cluster-autoscaler then simulates scheduling the pending pods to this node. If the pending pods can be scheduled to the node, cluster-autoscaler calculates the number of nodes that must be added from the scaling group.
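As a minimal sketch of this trigger, the following Deployment requests more CPU than the existing nodes can provide, so its pods stay in the Pending state and cluster-autoscaler starts the scheduling simulation described above. All names, the image, and the request values are illustrative only.

```yaml
# Example only: a Deployment whose pods request more CPU than the existing
# nodes can provide. The pods stay in the Pending state, which causes
# cluster-autoscaler to simulate scheduling against the configured scaling
# groups and add nodes from a group whose instance type can fit the requests.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scale-out-demo
spec:
  replicas: 4
  selector:
    matchLabels:
      app: scale-out-demo
  template:
    metadata:
      labels:
        app: scale-out-demo
    spec:
      containers:
      - name: app
        image: nginx:1.25
        resources:
          requests:
            cpu: "2"        # requests, not actual usage, drive the simulation
            memory: 4Gi
```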
Only nodes added by scale-out activities can be removed in scale-in activities. Static nodes cannot be managed by cluster-autoscaler. Each node is separately evaluated to determine whether the node needs to be removed. If the resource usage of a node drops below the scale-in threshold, a scale-in activity is triggered for the node. In this case, cluster-autoscaler simulates the eviction of all workloads on the node to determine whether the node can be completely drained. cluster-autoscaler does not drain the nodes that contain specific pods, such as non-DaemonSet pods in the kube-system namespace and pods that are controlled by PodDisruptionBudgets (PDBs). A node is drained before it is removed. After pods on the node are evicted to other nodes, the node can be removed.
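The following is a minimal sketch of the PodDisruptionBudget case mentioned above: if evicting one of the selected pods would drop the number of available pods below minAvailable, cluster-autoscaler does not drain the node that runs them, even if the node's resource usage is below the scale-in threshold. The name, selector, and minAvailable value are placeholders.

```yaml
# Example only: a PodDisruptionBudget that can block scale-in.
# If the pods selected here run on an underutilized node and evicting one of
# them would drop the number of available pods below minAvailable,
# cluster-autoscaler does not remove that node.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
```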
Each scaling group is regarded as an abstract node. cluster-autoscaler selects a scaling group based on a policy that is similar to the scheduling policy. Scaling groups are first filtered by the scheduling policy, and among the remaining groups, those that conform to policies such as affinity settings are selected. If no scheduling policy or affinity settings are configured, cluster-autoscaler selects a scaling group based on the least-waste policy, which selects the scaling group that has the fewest idle resources after the simulation. For example, if the pending pods request a total of 6 vCPUs, a scaling group whose instance type provides 8 vCPUs is preferred over one that provides 16 vCPUs. If a scaling group of regular nodes and a scaling group of GPU-accelerated nodes both meet the requirements, the scaling group of regular nodes is selected by default.
The result of auto scaling is dependent on the following factors:
- Whether the scheduling policy is met
After you configure a scaling group, make sure that you know which pod scheduling policies the scaling group can satisfy. If you are not sure, you can simulate a scaling activity by comparing the node selectors of the pending pods with the labels of the scaling group (a sketch is provided after this list).
- Whether resources are sufficient
After the scaling simulation is complete, a scaling group is selected. However, the scaling activity fails if the specified types of Elastic Compute Service (ECS) instances in the scaling group are out of stock. Therefore, you can configure multiple instance types and multiple zones for the scaling group to improve the success rate of auto scaling.
To accelerate auto scaling, you can use the following methods:
- Method 1: Enable the swift mode to accelerate auto scaling. After a scaling group experiences a scale-out activity and a scale-in activity, the swift mode is enabled for this scaling group.
- Method 2: Use custom images that are created from the base image of Alibaba Cloud Linux 2 (formerly known as Aliyun Linux 2). This ensures that the resources of Infrastructure as a Service (IaaS) are delivered 50% faster.
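The following sketch shows how the node selector of a pending pod must match a label of the scaling group for the scheduling simulation to succeed. The label key and value are assumed to be configured on the target scaling group and are purely illustrative.

```yaml
# Example only: the nodeSelector below matches a label that is assumed to be
# configured on the target scaling group. During the scheduling simulation,
# cluster-autoscaler applies the scaling group's labels to a virtual node,
# so only scaling groups that carry this label are considered for scale-out.
apiVersion: v1
kind: Pod
metadata:
  name: selector-demo
spec:
  nodeSelector:
    workload-type: batch     # hypothetical label configured on the scaling group
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        cpu: "1"
        memory: 2Gi
```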
Considerations
- For each account, the default CPU quota for pay-as-you-go instances in each region is 50 vCPUs. You can add at most 48 custom route entries to each route table of a virtual private cloud (VPC). To increase a quota, submit a ticket.
- The stock of ECS instances may be insufficient for auto scaling if you specify only one ECS instance type for a scaling group. We recommend that you specify multiple ECS instance types with the same specification for a scaling group. This increases the success rate of auto scaling.
- In swift mode, when a node is shut down and reclaimed, the node stops running and enters the NotReady state. When a scale-out activity is triggered, the state of the node changes to Ready.
- If a node is shut down and reclaimed in swift mode, you are charged only for the disks. This rule does not apply to nodes that use local disks, such as instances of the ecs.d1ne.2xlarge type, for which you are also charged a computing fee. If the stock of instances is sufficient, nodes can be launched within a short period of time.
- If elastic IP addresses (EIPs) are bound to pods, we recommend that you do not delete the scaling group or the ECS nodes that are added from the scaling group in the ECS console. Otherwise, these EIPs cannot be automatically released.
Step 1: Configure auto scaling
Step 2: Perform authorization
You must perform authorization in the following scenarios:
The cluster has limited permissions on nodes
- Activate Auto Scaling.
  - In the dialog box that appears, click the first hyperlink to log on to the Auto Scaling console.
  - Click Activate Auto Scaling to go to the Enable Service page.
  - Select the I agree with Auto Scaling Agreement of Service check box and click Enable Now.
  - On the Activated page, click Console to log on to the Auto Scaling console.
  - Click Go to Authorize to go to the Cloud Resource Access Authorization page. Then, authorize Auto Scaling to access other cloud resources.
  - Click Confirm Authorization Policy.
- Assign a RAM role to ACK.
  - In the dialog box that appears, click the first hyperlink to log on to the Auto Scaling console.
The cluster has unlimited permissions on nodes
The cluster needs to associate an auto scaling node pool with an EIP
If you want to associate a scaling group with an EIP, perform the following steps to grant permissions:
Step 3: Configure auto scaling
Expected result
FAQ
- Why does the auto scaling component fail to add nodes after a scale-out activity is triggered?
Check whether the following situations exist:
- The instance types in the scaling group cannot fulfill the resource requests of the pods. By default, system components are installed on each node. Therefore, the requested pod resources must be less than the resource capacity of the instance type (see the sketch after this list).
- The RAM role does not have the permissions to manage the Kubernetes cluster. You must configure RAM roles for each Kubernetes cluster that is involved in the scale-out activity.
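As an illustration of the first point, assume a scaling group whose instance type provides 4 vCPUs and 8 GiB of memory. Because system components reserve part of each node's capacity, a pod that requests the full capacity can never be scheduled even after a new node is added; the requests must leave headroom. All values below are assumptions.

```yaml
# Example only, assuming a scaling group whose instance type has 4 vCPUs / 8 GiB.
# The commented-out request can never fit because system components reserve part
# of the node's capacity; the smaller request below leaves room for them.
apiVersion: v1
kind: Pod
metadata:
  name: capacity-demo
spec:
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        # cpu: "4"         # never schedulable on a 4-vCPU node
        cpu: "3"           # leaves headroom for system components
        memory: 4Gi
```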
- Why does the auto scaling component fail to remove nodes after a scale-in activity is triggered?
Check whether the following situations exist:
- The ratio of resources requested by the pods on the node is higher than the configured scale-in threshold.
- Pods that belong to the kube-system namespace are running on the node.
- A scheduling policy forces the pods to run on the current node, so the pods cannot be scheduled to other nodes (see the sketch below this list).
- A PodDisruptionBudget (PDB) is configured for pods on the node, and evicting a pod would violate the minimum value specified by the PDB.
For more information, see the FAQ of the open source cluster-autoscaler component.
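The following sketch illustrates the scheduling-policy case above: a required node affinity that binds a pod to one specific node prevents the pod from being evicted to another node, so cluster-autoscaler cannot drain that node. The node name is a placeholder.

```yaml
# Example only: a pod with a required node affinity bound to one specific node.
# Because the pod cannot be rescheduled to any other node, cluster-autoscaler
# cannot drain this node, and the scale-in activity skips it.
apiVersion: v1
kind: Pod
metadata:
  name: pinned-pod-demo
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - cn-hangzhou.10.0.0.1    # placeholder node name
  containers:
  - name: app
    image: nginx:1.25
```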
- How does the system choose a scaling group for a scaling activity?
When pods cannot be scheduled to nodes, the auto scaling component simulates the scheduling of the pods based on the configurations of the scaling groups, including labels, taints, and instance specifications. If a scaling group meets the requirements, it is selected for the scale-out activity. If more than one scaling group meets the requirements, the system selects the scaling group that has the fewest idle resources after the simulation.