You can use node auto scaling to automatically scale nodes when the resources in the current Container Service for Kubernetes (ACK) cluster cannot fulfill pod scheduling requests. The node auto scaling feature applies to scenarios with limited scaling requirements, such as clusters with fewer than 20 node pools that have auto scaling enabled, or node pools with fewer than 100 nodes each. Node auto scaling works best for workloads with stable traffic patterns, periodic or predictable resource demands, and operations where single-batch scaling meets business requirements.
Before you start
To better work with the node auto scaling feature, we recommend that you read the Node scaling topic and pay attention to the following items:
How node auto scaling works and its features
Use scenarios of node auto scaling
Usage notes for node auto scaling
Prerequisite
Ensure that Auto Scaling is activated.
This feature involves the following workflow:
Step 1: Enable node auto scaling for the cluster
The node pool auto scaling mode only takes effect after auto scaling is enabled for the cluster.
Step 2: Configure a node pool with auto scaling enabled
The node auto scaling feature only applies to node pools with auto scaling enabled. You must explicitly set the Scaling Mode to Auto for target node pools.
Step 1: Enable node auto scaling for the cluster
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster to manage and click its name. In the left-side navigation pane, choose Nodes > Node Pools.
On the Node Pools page, click Enable next to Node Scaling.
If this is the first time you use the node auto scaling feature, follow the on-screen instructions to activate the service and complete the authorization. Otherwise, skip this step.
For an ACK managed cluster, authorize ACK to use the AliyunCSManagedAutoScalerRole role to access your cloud resources.
For an ACK dedicated cluster, authorize ACK to use the KubernetesWorkerRole role and the AliyunCSManagedAutoScalerRolePolicy policy for scaling management.
In the Node Scaling Configuration panel, set Node Scaling Method to Auto Scaling, configure scaling parameters, and click OK.
You can later change the node scaling method to Instant Scaling by modifying the Node Scaling Configuration and completing the configuration workflow as prompted.
| Parameter | Description |
| --- | --- |
| Node Pools Scale-out Policy | Specifies which node pool is scaled out when multiple scalable node pools exist. Random Policy: randomly scales out one of the node pools. Default Policy: scales out the node pool that wastes the least resources. Priority-based Policy: scales out node pools based on their scale-out priorities, which are defined through the Node Pool Scale-out Priority parameter. |
| Node Pool Scale-out Priority | Specifies the scaling order during a scale-out operation. This parameter takes effect only when Node Pools Scale-out Policy is set to Priority-based Policy. Valid values: integers from 1 to 100. A larger number indicates a higher priority. To configure the priority, click + Add next to the parameter, select a node pool with auto scaling enabled, and set a priority value. If no node pools with auto scaling enabled are available, skip this parameter for now and configure it after you complete Step 2: Configure a node pool with auto scaling enabled. |
| Scan Interval | Specifies the interval at which the cluster is evaluated for scaling. Default value: 60s. The autoscaler triggers scale-out activities based on the actual scheduling status, that is, when pods cannot be scheduled onto existing nodes (see the example Deployment after this table). Important: For Elastic Compute Service (ECS) nodes, the autoscaler performs scale-in activities only when the Scale-in Threshold, Defer Scale-in For, and Cooldown conditions are met. For GPU-accelerated nodes, the autoscaler performs scale-in activities only when the GPU Scale-in Threshold, Defer Scale-in For, and Cooldown conditions are met. |
| Allow Scale-in | Specifies whether to allow scale-in activities. The scale-in configuration does not take effect when this switch is turned off. Proceed with caution. |
| Scale-in Threshold | Specifies the maximum ratio of requested resources to resource capacity for a node in a node pool that has auto scaling enabled. A scale-in activity is performed only when both the CPU and memory utilization of a node are lower than the Scale-in Threshold. For example, with a threshold of 50%, a node with 4 vCPUs whose pods request a total of 1.5 vCPUs (37.5%) is eligible for scale-in, provided that its memory ratio is also below 50%. |
| GPU Scale-in Threshold | The scale-in threshold for GPU-accelerated nodes. A scale-in activity is performed only when the CPU, memory, and GPU utilization of a node are all lower than the GPU Scale-in Threshold. |
| Defer Scale-in For | The interval between the time when a node falls below the scale-in threshold and the time when the scale-in activity (node removal) starts. Unit: minutes. Default value: 10. Important: The autoscaler performs scale-in activities only when Scale-in Threshold is configured and the Defer Scale-in For condition is met. |
| Cooldown | After the autoscaler performs a scale-out activity, it waits for a cooldown period before it can perform scale-in activities. During the cooldown period, the autoscaler does not remove nodes, but it still checks whether nodes meet the scale-in conditions. After the cooldown period ends, a node that meets the scale-in conditions is removed once the waiting period specified by Defer Scale-in For has also elapsed. For example, if Cooldown is set to 10 minutes and Defer Scale-in For is set to 5 minutes, nodes that meet the scale-in conditions are removed 5 minutes after the 10-minute cooldown period ends. |
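To see the scale-out trigger in action, you can deploy a workload whose pods cannot be scheduled onto the existing nodes. The following Deployment is a minimal sketch rather than part of the ACK configuration: the name scale-out-test, the pause image, and the request sizes are hypothetical values chosen so that some replicas stay Pending and the autoscaler scales out a node pool that has auto scaling enabled.

```yaml
# Minimal sketch of a workload that triggers a scale-out.
# The name, image, and request sizes are hypothetical; choose requests that
# exceed the free capacity of your existing nodes so that pods stay Pending.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scale-out-test
spec:
  replicas: 5
  selector:
    matchLabels:
      app: scale-out-test
  template:
    metadata:
      labels:
        app: scale-out-test
    spec:
      containers:
      - name: pause
        # Placeholder container that only sleeps; it consumes no real resources
        # beyond the declared requests.
        image: registry.k8s.io/pause:3.9
        resources:
          requests:
            cpu: "2"
            memory: 4Gi
```

After the new nodes are added and the test workload is deleted, the same nodes become scale-in candidates once the Scale-in Threshold, Defer Scale-in For, and Cooldown conditions described above are met.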
Step 2: Configure a node pool with auto scaling enabled
You can either modify existing node pools by switching their Scaling Mode to Auto, or create new node pools with auto scaling enabled. For detailed steps, see Create and manage a node pool.
Key configurations include:
| Parameter | Description |
| --- | --- |
| Scaling Mode | Manual and Auto modes are supported. To use node auto scaling, enable Node Scaling on the Node Pools page and set the Scaling Mode of the node pool to Auto. Computing resources are then automatically adjusted based on your business requirements and policies to reduce cluster costs. |
| Instances | The Min. Instances and Max. Instances defined for a node pool do not include your existing instances. |
| Instance-related parameters | Select the ECS instances used by the worker node pool based on instance types or attributes. You can filter instance families by attributes such as vCPU, memory, instance family, and architecture. For more information about the instance specifications not supported by ACK and how to configure nodes, see ECS specification recommendations for ACK clusters. When the node pool is scaled out, ECS instances of the selected instance types are created, and the scaling policy of the node pool determines which of these instance types are used to create new nodes. If you select only one instance type, fluctuations in ECS instance stock affect the scaling success rate. We recommend that you select multiple instance types to increase the scaling success rate. If you select only GPU-accelerated instances, you can select Enable GPU Sharing on demand. For more information, see cGPU overview. |
| Operating System | When you enable auto scaling, you can select an image based on Alibaba Cloud Linux, Windows, or Windows Core. If you select an image based on Windows or Windows Core, the system automatically adds the corresponding Windows taint to the nodes. |
| Node Labels | Node labels are automatically added to nodes that are added to the cluster by scale-out activities. Important: Auto scaling can recognize node labels and taints only after they are mapped to node pool tags. A node pool can have only a limited number of tags. Therefore, limit the total number of ECS tags, taints, and node labels of a node pool that has auto scaling enabled to fewer than 12. |
| Scaling Policy | |
| Use Pay-as-you-go Instances When Preemptible Instances Are Insufficient | This option is available only when Billing Method is set to Preemptible Instance. After this feature is enabled, if enough preemptible instances cannot be created due to price or inventory constraints, ACK automatically creates pay-as-you-go instances to meet the required number of ECS instances. |
| Enable Supplemental Preemptible Instances | This option is available only when Billing Method is set to Preemptible Instance. After this feature is enabled, when the system receives a message that preemptible instances are to be reclaimed, the node pool with auto scaling enabled attempts to create new instances to replace the reclaimed ones. |
| Taints | After you add taints to the node pool, ACK no longer schedules pods onto its nodes unless the pods tolerate the taints (see the example pod after this table). |
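If you configure node labels or taints for the node pool, workloads must reference them so that they are scheduled onto the nodes created by that node pool. The following pod is a minimal sketch under assumed values: the node label workload-type: batch and the taint dedicated=batch:NoSchedule are hypothetical placeholders for whatever you configured on the node pool.

```yaml
# Minimal sketch of a pod that targets an auto scaling node pool.
# The node label and taint below are hypothetical; replace them with the
# Node Labels and Taints configured for your node pool.
apiVersion: v1
kind: Pod
metadata:
  name: batch-task
spec:
  nodeSelector:
    workload-type: batch      # hypothetical node label on the node pool
  tolerations:
  - key: dedicated            # hypothetical taint key on the node pool
    operator: Equal
    value: batch
    effect: NoSchedule
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
    resources:
      requests:               # requests drive both scale-out and the scale-in threshold
        cpu: 500m
        memory: 512Mi
```

Because the autoscaler simulates scheduling against the labels and taints mapped to node pool tags, a pod that does not tolerate the node pool's taints cannot trigger a scale-out of that node pool, which is why the mapping limit described in the Node Labels row matters.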
Step 3: (Optional) Verify node auto scaling
After you complete the preceding configuration, you can start using the node auto scaling feature. The node pool shows that auto scaling is enabled, and the cluster-autoscaler component is installed in the cluster.
Auto scaling is enabled for the node pool
The Node Pools page displays node pools with auto scaling enabled.
cluster-autoscaler is installed
In the left-side navigation pane of the details page, choose Workloads > Deployments.
Select the kube-system namespace. The cluster-autoscaler component is displayed.
FAQs
The FAQs cover the following categories:
- Scaling behavior of node auto scaling
- Custom scaling behavior
- Questions related to cluster-autoscaler