If your cluster's allocated resources are insufficient to schedule application pods due to high demand, you can enable the auto scaling feature in ACK One registered clusters to automatically scale out nodes and increase available resources for scheduling. Two elasticity solutions are available: node auto scaling and node instant scaling. The latter offers faster scaling, higher delivery efficiency, and lower operational complexity.
Prerequisites
You have created a node pool.
You have read node scaling to understand its working principles and features.
Step 1: Configure RAM permissions
Create a RAM user and grant the following custom policy to the user. For more information, see Use RAM to authorize access to clusters and cloud resources.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, click the name of the cluster that you want to manage. In the left-side navigation pane, open the Secrets page.
On the Secrets page, click Create From YAML, enter the following YAML template, and create a Secret named alibaba-addon-secret.
Note: The component uses the AccessKey ID and AccessKey secret stored in the Secret to access cloud services. If alibaba-addon-secret already exists, skip this step.
apiVersion: v1
kind: Secret
metadata:
  name: alibaba-addon-secret
  namespace: kube-system
type: Opaque
stringData:
  access-key-id: <AccessKey ID of the RAM user>
  access-key-secret: <AccessKey secret of the RAM user>
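If you would rather generate the manifest than paste it by hand, the following minimal sketch builds the same Secret as a Python dictionary and prints it as JSON, a format that `kubectl apply -f` also accepts. The function name `build_addon_secret` and the placeholder credentials are assumptions made for illustration. Only the Secret name, namespace, type, and key names come from the template above.

```python
import json


def build_addon_secret(access_key_id: str, access_key_secret: str) -> dict:
    """Return the alibaba-addon-secret manifest as a plain dictionary."""
    if not access_key_id or not access_key_secret:
        raise ValueError("both the AccessKey ID and the AccessKey secret are required")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": "alibaba-addon-secret", "namespace": "kube-system"},
        "type": "Opaque",
        # stringData takes plain-text values; Kubernetes stores them base64-encoded.
        "stringData": {
            "access-key-id": access_key_id,
            "access-key-secret": access_key_secret,
        },
    }


if __name__ == "__main__":
    # Placeholder values only (hypothetical); never commit real credentials.
    manifest = build_addon_secret("<AccessKey ID>", "<AccessKey secret>")
    print(json.dumps(manifest, indent=2))
```

You could save the printed JSON to a file and create the Secret from that file instead of using the console.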
Step 2: Configure node scaling solution
Enable node auto scaling
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster to manage and click its name. In the left-side navigation pane, open the Node Pools page.
On the Node Pools page, click Node Scaling and then click Configure.
If you are using the node auto scaling feature for the first time, follow the on-screen prompts to activate the ESS service.
On the Node Scaling Configuration page, set Node Scaling Solution to Auto Scaling, configure the scaling parameters, and then click OK.
You can switch to a different node scaling solution later. To switch to node instant scaling, select it on this page, read the on-screen instructions carefully, and follow the guidance to complete the operation.
Parameter | Description |
Node Pool Scale-out Order Policy | Random Policy: When multiple node pools are available for scale-out, a node pool is selected at random. Default Policy: When multiple node pools are available for scale-out, the node pool that wastes the least resources is selected. Priority Policy: When multiple node pools are available for scale-out, the node pool with the highest priority is selected. The priority of a node pool is defined by the Node Pool Scale-out Priority parameter. |
Node Pool Scale-out Priority | The scale-out priority of each node pool. This parameter takes effect only when Node Pool Scale-out Order Policy is set to Priority Policy. Valid values: integers from 1 to 100. A larger value indicates a higher priority. To set a priority, click Add to the right of the parameter, select a node pool that has auto scaling enabled, and set its priority. If no node pool with auto scaling enabled is available yet, skip this parameter for now and set the priority after you complete Step 3: Configure a node pool with auto scaling enabled. |
Scaling Sensitivity | The interval at which the cluster is evaluated for scaling. Default value: 60s. The autoscaler triggers scale-out activities based on the actual scheduling status. Important: For ECS nodes, the autoscaler performs scale-in activities only when the Scale-in Threshold, Defer Scale-in For, and Cooldown conditions are all met. For GPU nodes, the autoscaler performs scale-in activities only when the GPU Scale-in Threshold, Defer Scale-in For, and Cooldown conditions are all met. |
Allow Scale-in | Specifies whether to allow scale-in activities. When this switch is turned off, the scale-in configuration does not take effect. Proceed with caution. |
Scale-in Threshold | The ratio of the requested resources to the capacity of a single node in a node pool that has node auto scaling enabled. A node can be removed only when this ratio is lower than the configured threshold, which means that both the CPU and memory utilization of the node are lower than the Scale-in Threshold. |
GPU Scale-in Threshold | The scale-in threshold for GPU-accelerated nodes. A GPU node can be removed only when this ratio is lower than the configured threshold, which means that the CPU, memory, and GPU utilization of the node are all lower than the GPU Scale-in Threshold. |
Defer Scale-in For | The period of time between when a scale-in condition is met and when the scale-in activity is performed. Unit: minutes. Default value: 10. Important: The autoscaler performs scale-in activities only when the Scale-in Threshold condition is met and the waiting period specified in Defer Scale-in For has ended. |
Cooldown | The period after a scale-out activity during which the autoscaler cannot perform scale-in activities. During the cooldown period, the autoscaler still checks whether nodes meet the scale-in conditions. After the cooldown period ends, a node is removed if it meets the scale-in conditions and the waiting period specified in Defer Scale-in For has ended. For example, if Cooldown is set to 10 minutes and Defer Scale-in For is set to 5 minutes, the autoscaler cannot perform scale-in activities within 10 minutes after a scale-out activity, but it still checks whether nodes meet the scale-in conditions during that time. When the cooldown period ends, the nodes that meet the scale-in conditions are removed after 5 minutes. |
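The three Node Pool Scale-out Order Policy options can be illustrated with a short sketch. This is not the autoscaler's implementation: the `NodePool` class, the `pick_node_pool` function, and the `wasted_vcpus` field (a stand-in for whatever waste metric the autoscaler actually computes) are all hypothetical names chosen for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class NodePool:
    name: str
    priority: int        # Node Pool Scale-out Priority, 1 to 100
    wasted_vcpus: float  # hypothetical waste metric: vCPUs left idle after scale-out


def pick_node_pool(pools, policy="default", rng=None):
    """Choose which node pool to scale out under the given order policy."""
    if not pools:
        raise ValueError("no node pool with auto scaling enabled")
    if policy == "random":
        return (rng or random).choice(pools)
    if policy == "priority":
        # A larger value indicates a higher priority.
        return max(pools, key=lambda p: p.priority)
    if policy == "default":
        # Select the pool that would waste the least resources.
        return min(pools, key=lambda p: p.wasted_vcpus)
    raise ValueError(f"unknown policy: {policy}")


pools = [
    NodePool("pool-a", priority=10, wasted_vcpus=3.0),
    NodePool("pool-b", priority=80, wasted_vcpus=1.5),
    NodePool("pool-c", priority=50, wasted_vcpus=0.5),
]
print(pick_node_pool(pools, "priority").name)  # pool-b
print(pick_node_pool(pools, "default").name)   # pool-c
```

With these sample values, Priority Policy selects the pool with the largest priority value, while Default Policy selects the pool with the smallest waste, so the two policies can choose different pools for the same pending workload.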
Enable node instant scaling
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster to manage and click its name. In the left-side navigation pane, open the Node Pools page.
On the Node Pools page, click Node Scaling and then click Configure.
If you are using the node instant scaling feature for the first time, follow the on-screen prompts to activate the ESS service.
In the Node Scaling Configuration panel, set Node Scaling Method to Instant Scaling, configure the scaling parameters, and then click OK.
Scale-out activities are automatically triggered based on the actual scheduling status.
You can change Node Scaling Method after the configuration. To switch to Auto Scaling, read the instructions on this page and complete the operations. This feature is available only to whitelisted users. To use it, submit a ticket.
Parameter | Description |
Scale-in Threshold | The ratio of the resources requested on a node to the resource capacity of the node, for node pools that have node instant scaling enabled. A scale-in activity is performed only when both the CPU and memory utilization of a node are lower than the Scale-in Threshold. |
GPU Scale-in Threshold | The scale-in threshold for GPU-accelerated nodes. A scale-in activity is performed only when the CPU, memory, and GPU utilization of a node are all lower than the GPU Scale-in Threshold. |
Defer Scale-in For | The interval between the time when the scale-in threshold is reached and the time when the scale-in activity (removing nodes) starts. Unit: minutes. Default value: 10. Important: The autoscaler performs scale-in activities only when Scale-in Threshold is configured and the Defer Scale-in For condition is met. |
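The scale-in conditions in the table above can be sketched as a small check. This is an illustration under stated assumptions, not the component's actual logic: `NodeUsage`, `below_threshold`, and `can_scale_in` are hypothetical names, the threshold values in the examples are arbitrary, and only the 10-minute default for Defer Scale-in For comes from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeUsage:
    cpu_ratio: float                   # requested CPU / CPU capacity
    memory_ratio: float                # requested memory / memory capacity
    gpu_ratio: Optional[float] = None  # None for nodes without GPUs


def below_threshold(node, threshold, gpu_threshold):
    """Return True if every tracked ratio is below the applicable threshold."""
    if node.gpu_ratio is None:
        return node.cpu_ratio < threshold and node.memory_ratio < threshold
    ratios = (node.cpu_ratio, node.memory_ratio, node.gpu_ratio)
    return all(r < gpu_threshold for r in ratios)


def can_scale_in(node, minutes_below, threshold, gpu_threshold, defer_minutes=10):
    """A node qualifies only after staying below the threshold for the defer period."""
    return (below_threshold(node, threshold, gpu_threshold)
            and minutes_below >= defer_minutes)


# Example thresholds (arbitrary values chosen for the illustration).
idle = NodeUsage(cpu_ratio=0.2, memory_ratio=0.3)
busy_memory = NodeUsage(cpu_ratio=0.2, memory_ratio=0.7)
gpu_node = NodeUsage(cpu_ratio=0.1, memory_ratio=0.1, gpu_ratio=0.4)

print(can_scale_in(idle, 12, threshold=0.5, gpu_threshold=0.3))         # True
print(can_scale_in(idle, 4, threshold=0.5, gpu_threshold=0.3))          # False
print(can_scale_in(busy_memory, 12, threshold=0.5, gpu_threshold=0.3))  # False
print(can_scale_in(gpu_node, 12, threshold=0.5, gpu_threshold=0.3))     # False
```

The second call fails because the defer period has not elapsed, the third because memory alone exceeds the threshold, and the fourth because the GPU ratio exceeds the GPU threshold even though CPU and memory are low.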
Step 3: Configure a node pool with auto scaling enabled
You can either modify existing node pools by switching their Scaling Mode to Auto, or create new node pools with auto scaling enabled. Key configurations are as follows:
Parameter | Description |
Scaling Mode | Manual and Auto scaling modes are supported. In Auto mode, computing resources are automatically adjusted based on your business requirements and policies to reduce cluster costs. |
Instances | The Min. Instances and Max. Instances defined for a node pool exclude your existing instances. |
Instance-related parameters | Select the ECS instances used by the worker node pool based on instance types or attributes. You can filter instance families by attributes such as vCPU, memory, instance family, and architecture. For more information about the instance specifications not supported by ACK and how to configure nodes, see ECS specification recommendations for ACK clusters. When the node pool is scaled out, ECS instances of the selected instance types are created, and the scaling policy of the node pool determines which instance types are used. If you select only one instance type, fluctuations in ECS instance stock affect the scaling success rate, so we recommend that you select multiple instance types. If you select only GPU-accelerated instances, you can select Enable GPU Sharing on demand. For more information, see cGPU overview. |
Operating System | When you enable auto scaling, you can select an image based on Alibaba Cloud Linux, Windows, or Windows Core. If you select an image based on Windows or Windows Core, the system automatically adds the |
Node Labels | Node labels are automatically added to nodes that are added to the cluster by scale-out activities. Important: Auto scaling can recognize node labels and taints only after they are mapped to node pool tags. A node pool can have only a limited number of tags. Therefore, you must keep the total number of ECS tags, taints, and node labels of a node pool that has auto scaling enabled below 12. |
Scaling Policy | |
Use Pay-as-you-go Instances When Preemptible Instances Are Insufficient | You must set the Billing Method parameter to Preemptible Instance. After this feature is enabled, if enough preemptible instances cannot be created due to price or inventory constraints, ACK automatically creates pay-as-you-go instances to meet the required number of ECS instances. |
Enable Supplemental Preemptible Instances | You must set the Billing Method parameter to Preemptible Instance. After this feature is enabled, when a system receives a message that preemptible instances are reclaimed, the node pool with auto scaling enabled attempts to create new instances to replace the reclaimed preemptible ones. |
Scaling Mode | You must enable Node Scaling on the Node Pools page and set the Scaling Mode of the node pool to Auto. |
Taints | After you add taints to a node, ACK no longer schedules pods to it unless the pods tolerate the taints. |
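Because ECS tags, taints, and node labels share one limit on a node pool with auto scaling enabled, it can help to count them before you save the configuration. The sketch below is a simple pre-check: the function name `tag_budget_ok` and the sample tags, taints, and labels are hypothetical, and only the limit of fewer than 12 in total comes from the table above.

```python
MAX_TOTAL = 12  # the combined count must stay below this value


def tag_budget_ok(ecs_tags: dict, taints: list, node_labels: dict) -> bool:
    """Return True if ECS tags + taints + node labels total fewer than 12."""
    total = len(ecs_tags) + len(taints) + len(node_labels)
    return total < MAX_TOTAL


# Hypothetical node pool settings used only to exercise the check.
ecs_tags = {"team": "data", "env": "prod"}
taints = ["dedicated=batch:NoSchedule"]
node_labels = {"workload": "batch", "zone": "a"}

print(tag_budget_ok(ecs_tags, taints, node_labels))  # True (5 items in total)

too_many_labels = {f"label-{i}": "x" for i in range(9)}
print(tag_budget_ok(ecs_tags, taints, too_many_labels))  # False (12 items in total)
```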
Step 4: (Optional) Verify the result
After you complete the preceding operations, you can use the node auto scaling feature. The node pool shows that auto scaling has started and the cluster-autoscaler component is automatically installed in the cluster.
Auto scaling is enabled for the node pool
On the Node Pools page, the node pools that have auto scaling enabled are displayed in the node pool list.
The cluster-autoscaler component is installed
In the left-side navigation pane of the details page, choose .
Select the kube-system namespace to view the cluster-autoscaler component.