If your cluster has insufficient capacity to schedule application Pods, you can use node auto scaling to automatically scale nodes in and out. Node auto scaling is ideal for scenarios with smaller scaling requirements, such as having fewer than 20 auto-scaling node pools or fewer than 100 nodes in those pools. It is also suitable for workloads with stable traffic, predictable resource demands that a single scaling operation can meet, and periodic or foreseeable resource needs.
Before you begin
To make the most of node auto scaling, read Node scaling and understand the following concepts:
- How node auto scaling works and its features
- Use cases where node auto scaling can meet your business requirements
- Important considerations before using node auto scaling
During a scale-in, subscription instances are removed but not released. To avoid extra costs, use pay-as-you-go instances when you enable this feature.
Prerequisites
- Make sure you have activated Auto Scaling.
- See Usage notes to understand the quotas and limitations of node scaling.
- Node auto scaling has known limitations with certain scheduling policies, which may lead to unexpected scaling results. If your workloads or components use an unsupported scheduling policy, consider one of the following solutions:
  - Solution 1: Switch to node instant scaling.
  - Solution 2: Deploy the affected workloads or components to a node pool without node scaling enabled.
    For example, to deploy the ack-node-local-dns-admission-controller component, place it in a node pool without node scaling enabled and declare the following node affinity requirement in the component's configuration:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: "k8s.aliyun.com"
                operator: "NotIn"
                values: ["true"]
- The cluster-autoscaler component requires node resources for updates or deployments. Insufficient resources may cause these operations to fail and lead to scaling issues. Ensure that your nodes have adequate resources.
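The node affinity requirement under Solution 2 relies on the `NotIn` operator. The following is a minimal Python sketch of how such a requirement is evaluated against a node's labels, following standard Kubernetes label-selector semantics. The function names `match_expression` and `node_matches` are hypothetical, and the reading that the `k8s.aliyun.com: "true"` label marks nodes created by auto scaling is an assumption inferred from the example, not something this page states.

```python
from typing import Dict, List


def match_expression(labels: Dict[str, str], key: str, operator: str,
                     values: List[str]) -> bool:
    """Evaluate one node selector requirement against a node's labels."""
    if operator == "In":
        return key in labels and labels[key] in values
    if operator == "NotIn":
        # NotIn also matches nodes that do not carry the label at all.
        return labels.get(key) not in values
    if operator == "Exists":
        return key in labels
    if operator == "DoesNotExist":
        return key not in labels
    raise ValueError(f"unsupported operator: {operator}")


def node_matches(labels: Dict[str, str], expressions: List[dict]) -> bool:
    """All expressions inside one nodeSelectorTerm must match (logical AND)."""
    return all(
        match_expression(labels, e["key"], e["operator"], e.get("values", []))
        for e in expressions
    )


# The requirement from the example above.
REQUIREMENT = [{"key": "k8s.aliyun.com", "operator": "NotIn", "values": ["true"]}]

if __name__ == "__main__":
    # Node carrying the label with value "true": excluded.
    print(node_matches({"k8s.aliyun.com": "true"}, REQUIREMENT))
    # Node without the label: eligible.
    print(node_matches({}, REQUIREMENT))
```

Under these semantics, the component can only be scheduled on nodes that either lack the `k8s.aliyun.com` label or carry it with a value other than `"true"`.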
This feature involves the following steps:
- Step 1: Enable node auto scaling for the cluster: You must first enable node auto scaling at the cluster level before the scaling policies of your node pools can take effect.
- Step 2: Configure a node pool for auto scaling: The node auto scaling feature only affects node pools that are configured for auto scaling. Therefore, you must set the scaling mode of the target node pools to Auto.
Step 1: Enable node auto scaling
- Log on to the ACK console. In the left navigation pane, click Clusters.
- On the Clusters page, click the name of your cluster. In the left navigation pane, click .
- On the Node Pools page, click Enable next to Node Scaling.
- If you are using node auto scaling for the first time, follow the on-screen instructions to activate the Auto Scaling service and grant the required permissions. You can skip this step if you have already done this.
  - ACK managed cluster: Authorize the AliyunCSManagedAutoScalerRole role.
  - ACK dedicated cluster: Authorize the KubernetesWorkerRole role and attach the AliyunCSManagedAutoScalerRolePolicy.
    In the Node Scaling Configuration dialog box, after the precheck passes, click the RAM role link (such as KubernetesWorkerRole-xxxx) to complete authorization in the RAM console.
- In the Node Scaling Configuration dialog box, set Node Scaling Plan to Auto Scaling, configure the scaling parameters, and then click OK.
  You can switch the node scaling method after the initial configuration. To do this, change the selection here to node instant scaling. Carefully read and follow the on-screen instructions to complete the process.
Parameter
Description
Node Pool Scale-out Policy
- Random Policy: If multiple node pools are available for scale-out, one is chosen at random.
- Default Policy: If multiple node pools are available for scale-out, the one that results in the least resource waste is chosen.
- Priority-based Policy: If multiple node pools are available for scale-out, the one with the highest priority is chosen.
  Node pool priority is defined by the Node Pool Scale-out Priority parameter.
Node Pool Scale-out Priority
Sets the scale-out priority for node pools. This parameter takes effect only when Node Pool Scale-out Policy is set to Priority-based Policy.
The value can be an integer from 1 to 100. A larger value indicates a higher priority.
Click Add next to the parameter, select a node pool with auto scaling enabled, and set its priority.
If no node pools with auto scaling enabled are available, you can ignore this parameter for now and configure it after you complete Step 2: Configure a node pool for auto scaling.
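The three scale-out policies can be compared with a small Python sketch. This is an illustration only, not the cluster-autoscaler implementation: the `NodePool` class, the policy strings, and the CPU-only waste model (capacity added minus capacity requested) are all assumptions made for the example.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodePool:
    name: str
    node_cpu: float    # CPU cores provided by one new node (hypothetical field)
    priority: int = 1  # Node Pool Scale-out Priority: 1 to 100, larger is higher

    def __post_init__(self) -> None:
        if not 1 <= self.priority <= 100:
            raise ValueError("priority must be an integer from 1 to 100")
        if self.node_cpu <= 0:
            raise ValueError("node_cpu must be positive")


def wasted_cpu(pool: NodePool, pending_cpu: float) -> float:
    """CPU left unused after adding just enough nodes for the pending Pods."""
    nodes_needed = math.ceil(pending_cpu / pool.node_cpu)
    return nodes_needed * pool.node_cpu - pending_cpu


def choose_pool(pools: List[NodePool], policy: str, pending_cpu: float,
                rng: Optional[random.Random] = None) -> NodePool:
    """Pick the node pool to scale out under the given policy."""
    if not pools:
        raise ValueError("no node pool is available for scale-out")
    if policy == "random":
        return (rng or random).choice(pools)
    if policy == "default":
        # Least resource waste, modeled here on CPU only.
        return min(pools, key=lambda p: wasted_cpu(p, pending_cpu))
    if policy == "priority":
        return max(pools, key=lambda p: p.priority)
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    pools = [NodePool("pool-a", node_cpu=4, priority=10),
             NodePool("pool-b", node_cpu=16, priority=50)]
    print(choose_pool(pools, "default", pending_cpu=6).name)
    print(choose_pool(pools, "priority", pending_cpu=6).name)
```

With 6 pending CPU cores, the least-waste model prefers `pool-a` (two 4-core nodes leave 2 cores unused, while one 16-core node leaves 10 unused), and the priority-based policy prefers `pool-b` because 50 is higher than 10.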
Scaling Sensitivity
The interval at which the system checks for scaling conditions. The default value is 60s.
During auto scaling, the scaling component automatically triggers scale-out based on scheduling conditions.
Important
- ECS nodes: A node scale-in can occur only when all three conditions are met: Scale-in Threshold, Scale-in Trigger Delay, and Cooldown Period.
- GPU nodes: A GPU node scale-in can occur only when all three conditions are met: GPU Scale-in Threshold, Scale-in Trigger Delay, and Cooldown Period.
Allow Scale-in
Specifies whether to allow node scale-in operations. If disabled, scale-in settings do not take effect. Use this setting with caution.
Scale-in Threshold
The ratio of total resource requests to the total capacity of a single node in a node pool where node auto scaling is enabled.
A node is considered for scale-in only when this ratio is below the configured threshold, meaning both its CPU and memory utilization are below the Scale-in Threshold.
GPU Scale-in Threshold
The scale-in threshold for GPU instances.
A GPU node is considered for scale-in only when its CPU, memory, and GPU utilization are all below the GPU Scale-in Threshold.
Scale-in Trigger Delay
The time to wait from when a scale-in condition is detected to when the scale-in operation is actually performed. Unit: minutes. Default value: 10 minutes.
Important: The scaling component performs a node scale-in only after the Scale-in Threshold is met and the Scale-in Trigger Delay period has passed.
Cooldown Period
The period after a scale-out event during which the scaling component will not perform a scale-in.
During the cooldown period, the scaler does not perform scale-ins but continues to evaluate nodes against the scale-in conditions. After the cooldown ends, if a node has met the scale-in threshold for longer than the scale-in delay, the scaler removes it. For example, if the cooldown is 10 minutes and the scale-in delay is 5 minutes, the scaler will not scale in any nodes for 10 minutes after the last scale-out. However, during these 10 minutes, it still checks for nodes that are eligible for scale-in. Once the 10-minute cooldown ends, if a node has met the scale-in threshold for more than the 5-minute delay, it is scaled in.
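The interaction between Scale-in Threshold, Scale-in Trigger Delay, and Cooldown Period can be sketched in Python as follows. This is a simplified model of the rules described above, not the scaler's actual code. The names `ScaleInSettings`, `below_threshold`, and `can_scale_in` are hypothetical, and the exact handling of boundary values (strict versus non-strict comparison) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScaleInSettings:
    threshold: float          # Scale-in Threshold as a ratio, e.g. 0.5 for 50%
    trigger_delay_min: float  # Scale-in Trigger Delay, in minutes
    cooldown_min: float       # Cooldown Period, in minutes
    allow_scale_in: bool = True


def below_threshold(cpu_requested: float, cpu_capacity: float,
                    mem_requested: float, mem_capacity: float,
                    threshold: float) -> bool:
    """True when both the CPU and memory request ratios are below the threshold."""
    return (cpu_requested / cpu_capacity < threshold
            and mem_requested / mem_capacity < threshold)


def can_scale_in(settings: ScaleInSettings,
                 minutes_since_last_scale_out: float,
                 minutes_below_threshold: Optional[float]) -> bool:
    """Decide whether a node may be removed right now.

    minutes_below_threshold is how long the node has continuously met the
    threshold condition, or None if it currently does not meet it.
    """
    if not settings.allow_scale_in:
        return False
    if minutes_below_threshold is None:
        return False
    # No scale-in during the cooldown, although evaluation continues.
    if minutes_since_last_scale_out < settings.cooldown_min:
        return False
    return minutes_below_threshold > settings.trigger_delay_min


if __name__ == "__main__":
    s = ScaleInSettings(threshold=0.5, trigger_delay_min=5, cooldown_min=10)
    print(can_scale_in(s, minutes_since_last_scale_out=3, minutes_below_threshold=8))
    print(can_scale_in(s, minutes_since_last_scale_out=10, minutes_below_threshold=6))
```

Using the example values from the text (10-minute cooldown, 5-minute delay), a node that has been below the threshold for 8 minutes is still kept 3 minutes after a scale-out, but becomes removable once the cooldown has ended.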
Step 2: Configure a node pool
You can either configure an existing node pool by changing its Scaling Mode to Auto, or create a new node pool with auto scaling enabled.
For more information, see Create and manage node pools. The key parameters are described below:
| Parameter | Description |
|---|---|
| Scaling Mode | |
| Instances | The scalable range of nodes in the node pool, defined by Min. Instances and Max. Instances. This does not include your existing instances. Note |
| Instance-related configurations | When scaling out, nodes are allocated from the configured ECS instance families. To improve scale-out success rates, select multiple instance types across multiple zones to avoid unavailability or insufficient inventory. The specific instance type used for scaling is determined by the configured Scaling Policy. To ensure business stability and accurate resource scheduling, do not mix GPU and non-GPU instance types in the same node pool. Configure instance types for scaling in one of two ways: Refer to the console's elasticity strength recommendations for configuration, or view node pool elasticity strength after creation. For ACK-unsupported instance types and node configuration recommendations, see ECS instance type configuration recommendations. Cloud resource and billing information: |
| Operating System | When auto scaling is enabled, you can select Alibaba Cloud Linux, Windows, or Windows Core images. When the selected image is a Windows image or a Windows Core image, the system automatically configures the taint |
| Node Labels | Node labels added in the cluster are automatically applied to nodes created by auto scaling. Important: Auto scaling can recognize node labels and taints only after they are mapped to node pool tags. A node pool has a limit on the number of tags it can have. Therefore, the total number of ECS tags, taints, and node labels for a node pool with auto scaling enabled must be 12 or fewer. |
| Scaling Policy | Configure how the node pool selects instances during scaling. |
| Use Pay-as-you-go Instances When Spot Instances Are Insufficient | Requires selecting spot instances as the billing method. When enabled, if sufficient spot instances cannot be created due to price or inventory reasons, ACK automatically attempts to create pay-as-you-go instances as a supplement. Cloud resource and billing information: |
| Enable Supplemental Spot Instance | Requires selecting spot instances as the billing method. When enabled, upon receiving a system notification that a spot instance will be reclaimed (5 minutes before reclamation), ACK attempts to scale out new instances for compensation. Active release of spot instances may cause business disruptions. To improve compensation success rates, we recommend also enabling Use Pay-as-you-go Instances When Spot Instances Are Insufficient. Cloud resource and billing information: |
| Scaling Mode | Requires enabling Auto Scaling for the node pool and setting Scaling Mode to Auto. |
| Taints | After you add a taint to a node, the cluster no longer schedules Pods to it. |
Step 3: (Optional) Verify the results
After you complete the steps, node auto scaling is active. The node pool's status will indicate that auto scaling has started, and the cluster-autoscaler component will be automatically installed.
Node pool has auto scaling enabled
On the Node Pools page, the list shows the node pools that have auto scaling enabled.
Installed cluster-autoscaler component
In the left navigation pane of the cluster management page, choose .
Select the kube-system namespace. The cluster-autoscaler component is displayed.
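The same check can be scripted instead of using the console. The sketch below parses the JSON output of `kubectl get deployments -n kube-system -o json` and looks for the component. It assumes that the component runs as a Deployment named `cluster-autoscaler`, which this page does not state explicitly, and the helper names `fetch_deployments_json` and `find_component` are hypothetical.

```python
import json
import subprocess
from typing import Optional


def fetch_deployments_json(namespace: str = "kube-system") -> str:
    """Return the Deployments of a namespace as JSON text (requires kubectl access)."""
    result = subprocess.run(
        ["kubectl", "get", "deployments", "-n", namespace, "-o", "json"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def find_component(deployments_json: str,
                   name: str = "cluster-autoscaler") -> Optional[dict]:
    """Find a Deployment by name and summarize its replica status."""
    for item in json.loads(deployments_json).get("items", []):
        if item.get("metadata", {}).get("name") == name:
            return {
                "name": name,
                "desired": item.get("spec", {}).get("replicas", 0),
                # readyReplicas is omitted from the status when it is zero.
                "ready": item.get("status", {}).get("readyReplicas", 0),
            }
    return None


if __name__ == "__main__":
    summary = find_component(fetch_deployments_json())
    if summary is None:
        print("cluster-autoscaler not found in kube-system")
    else:
        print(f"{summary['name']}: {summary['ready']}/{summary['desired']} ready")
```

A result of `None` suggests that the component is not installed under the assumed name; it does not by itself prove that node auto scaling is disabled.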
FAQ
Category | Subcategory | Link |
Scaling behavior of node auto scaling | ||
| ||
Does the cluster-autoscaler support CustomResourceDefinitions (CRDs)? | ||
Custom scaling behavior | ||
cluster-autoscaler component | ||