Container Service for Kubernetes (ACK) provides node pools for you to manage nodes in groups. A node pool is a group of nodes that have the same configurations, such as instance specifications, operating system, labels, and taints. You can create one or more node pools of different types and configurations in an ACK cluster. After you create a node pool, you can manage nodes in the node pool in a centralized manner.
Before you create a node pool, we recommend that you read the Node pools topic to familiarize yourself with the basic information, use scenarios, related features, and billing rules of node pools.
Console operations
On the Node Pools page of the cluster that you want to manage in the ACK console, you can create, edit, or delete a node pool. You can also view the details of a node pool.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, find the cluster that you want to manage and click its name. In the left-side navigation pane, choose Nodes > Node Pools.
Create a node pool
When you create a node pool in the ACK console, you can complete the basic, network, and storage configurations. Some node pool parameters, especially those related to node pool availability and networking, cannot be modified after the node pool is created. The following tables describe these parameters. Creating a node pool in a cluster does not affect the nodes and applications deployed in other node pools of the cluster.
You can also create a node pool by calling the ACK API or by using Terraform. For more information, see CreateClusterNodePool or Use Terraform to create a node pool that has auto scaling enabled.
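For reference, the following is a minimal sketch of a CreateClusterNodePool call made with the Alibaba Cloud CLI (aliyun). It assumes that the CLI is installed and configured and that your CLI version supports ROA-style calls for the cs product. The cluster ID, vSwitch ID, and instance type are placeholders, and the exact field names can vary by API version, so check the CreateClusterNodePool reference before use.
# Create a pay-as-you-go node pool with two expected nodes.
# <CLUSTER_ID>, vsw-xxxxxxxx, and ecs.g7.xlarge are placeholders.
aliyun cs POST /clusters/<CLUSTER_ID>/nodepools \
  --header "Content-Type=application/json" \
  --body '{
    "nodepool_info": { "name": "example-pool" },
    "scaling_group": {
      "instance_types": ["ecs.g7.xlarge"],
      "vswitch_ids": ["vsw-xxxxxxxx"],
      "system_disk_category": "cloud_essd",
      "system_disk_size": 120,
      "instance_charge_type": "PostPaid",
      "desired_size": 2
    }
  }'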
On the Node Pools page, click Create Node Pool. In the Create Node Pool dialog box, configure the node pool parameters.
After you create the node pool, you can modify the node pool parameters on the Edit Node Pool page. The Modifiable column in the following tables indicates whether the corresponding parameter can be modified after the node pool is created.
No indicates that the parameter cannot be modified.
Yes indicates that the parameter can be modified.
Basic configurations
Parameter
Description
Modifiable
Node Pool Name
Specify a node pool name.
Region
By default, the region in which the cluster resides is selected. You cannot change the region.
Confidential Computing
Note: To use confidential computing, submit a ticket to apply to be added to the whitelist.
This parameter is available only when you select containerd for the Container Runtime parameter.
Specify whether to enable confidential computing. ACK provides an all-in-one cloud-native confidential computing solution based on hardware encryption technologies. Confidential computing ensures data security, integrity, and confidentiality. It simplifies the development and delivery of trusted or confidential applications to reduce costs. For more information, see TEE-based confidential computing.
Container Runtime
Specify the container runtime based on the Kubernetes version. For more information about how to select a container runtime, see Comparison among Docker, containerd, and Sandboxed-Container.
containerd: containerd is recommended for all Kubernetes versions.
Sandboxed-Container: supports Kubernetes 1.31 and earlier.
Docker (deprecated): supports Kubernetes 1.22 and earlier.
Scaling Mode
Manual and auto scaling are supported. With auto scaling, computing resources are automatically adjusted based on your business requirements and policies to reduce cluster costs.
Manual: ACK adjusts the number of nodes in the node pool based on the value of the Expected Nodes parameter. The number of nodes is always the same as the value of the Expected Nodes parameter. For more information, see Manually scale a node pool.
Auto: When the capacity planning of the cluster cannot meet the requirements of pod scheduling, ACK automatically scales out nodes based on the configured minimum and maximum number of instances. By default, node instant scaling is enabled for clusters running Kubernetes 1.24 and later, and node auto scaling is enabled for clusters running Kubernetes versions earlier than 1.24. For more information, see Node scaling.
Automated O&M configurations
ACK provides the following options for managed node pool configurations. The options provide different levels of Automated O&M capabilities.
Auto Mode: After you enable this option, ACK dynamically scales based on the resource requirements of the workloads in the node pool. In this case, ACK takes over O&M responsibilities, such as OS upgrades, software upgrades, and vulnerability patching.
This option is available only for clusters with auto mode enabled. This feature is in canary release. To use it, submit a ticket.
Managed Node Pool: This option allows you to configure the following parameters for automated O&M capabilities. You can also configure a maintenance window for running automated O&M tasks.
Disable: disables automated O&M capabilities. If you select this option, you must manually maintain nodes and node pools.
For more information about the differences and usage notes of different managed node pool configurations, see Comparison of managed node pool configurations.
Network configurations
Parameter
Description
Modifiable
Network Settings
VPC
By default, the virtual private cloud (VPC) in which the cluster resides is selected. You cannot change the VPC.
vSwitch
When the node pool is being scaled out, new nodes are created in the zones of the selected vSwitches based on the policy that you select for the Scaling Policy parameter. You can select vSwitches in the zones that you want to use.
If no vSwitch is available, click Create vSwitch to create one. For more information, see Create and manage a vSwitch.
Instance and Image
Parameter
Description
Modifiable
Billing Method
The default billing method of ECS instances that are added to the node pool during scale-out activities. You can select Pay-As-You-Go, Subscription, or Preemptible Instance.
If you select the Subscription billing method, you must configure the Duration parameter and choose whether to enable Auto Renewal.
Preemptible Instance: ACK supports only Preemptible Instance with a protection period. You must also configure the Upper Price Limit of Current Instance Spec parameter.
If the real-time market price of an instance type that you select is lower than the value of this parameter, a preemptible instance of this instance type is created. After the protection period (1 hour) ends, the system checks the spot price and resource availability of the instance type every 5 minutes. If the real-time market price exceeds your bid price or if the resource inventory is insufficient, the preemptible instance is released. For more information, see Best practices for preemptible instance-based node pools.
To ensure that all nodes in a node pool use the same billing method, ACK does not allow you to switch the billing method of a node pool between pay-as-you-go or subscription and preemptible instances.
Important: If you change the billing method of a node pool, the change takes effect only on newly added nodes. The existing nodes in the node pool still use the original billing method. For more information about how to change the billing method of existing nodes in a node pool, see Change the billing method of an instance from pay-as-you-go to subscription.
Instance-related parameters
Select the ECS instances used by the worker node pool based on instance types or attributes. You can filter instance families by attributes such as vCPU, memory, instance family, and architecture. For more information about how to configure nodes, see ECS specification recommendations for ACK clusters.
When the node pool is scaled out, ECS instances of the selected instance types are created. The scaling policy of the node pool determines which instance types are used to create new nodes during scale-out activities. Select multiple instance types to improve the success rate of node pool scale-out operations.
If the node pool fails to be scaled out because the instance types are unavailable or the instances are out of stock, you can specify more instance types for the node pool. The ACK console automatically evaluates the scalability of the node pool. You can check the scalability of the node pool when you create the node pool or after you create the node pool.
If you select only GPU-accelerated instances, you can select Enable GPU Sharing on demand. For more information, see cGPU overview.
Operating System
Alibaba Cloud Marketplace images are in canary release.
Public Image: Public images of operating systems provided by Container Service for Kubernetes, such as Alibaba Cloud Linux 3 ACK-optimized, ContainerOS, Alibaba Cloud Linux 3, Ubuntu, and Windows. For more information, see OS images.
Custom Image: Use a custom operating system image. For more information, see How do I create a custom image based on an existing ECS instance and use it to create nodes?
Note: After you change the OS image of the node pool, the change takes effect only on newly added nodes. The existing nodes in the node pool still use the original OS image. For more information about how to upgrade or change the operating system, see Change the operating system.
To ensure that all nodes in the node pool use the same OS image, ACK allows you to only update the node OS image to the latest version. ACK does not allow you to change the type of OS image.
Security Hardening
Specify whether to enable security hardening for the nodes. You cannot modify this parameter after the node pool is created.
Disable: disables security hardening for ECS instances.
MLPS Security Hardening: Alibaba Cloud provides baselines and the baseline check feature to help you check the compliance of Alibaba Cloud Linux 2 images and Alibaba Cloud Linux 3 images with the level 3 standards of Multi-Level Protection Scheme (MLPS) 2.0. MLPS Security Hardening enhances the security of OS images to meet the requirements of GB/T 22239-2019 Information Security Technology - Baseline for Classified Protection of Cybersecurity without compromising the compatibility and performance of the OS images. For more information, see ACK security hardening based on MLPS.
Important: After you enable MLPS Security Hardening, remote logons through SSH are prohibited for root users. You can use Virtual Network Computing (VNC) to log on to the OS from the ECS console and create regular users that are allowed to log on through SSH. For more information, see Connect to an instance by using VNC.
OS Security Hardening: You can enable Alibaba Cloud Linux Security Hardening only when the system image is an Alibaba Cloud Linux 2 or Alibaba Cloud Linux 3 image.
Logon Type
Valid values: Key Pair, Password, and Later.
If you select MLPS Security Hardening, only the Password option is supported. If Operating System is set to ContainerOS, only Key Pair and Later are supported.
Configure the logon type when you create the node pool:
Key Pair: Alibaba Cloud SSH key pairs provide a secure and convenient method to log on to ECS instances. An SSH key pair consists of a public key and a private key. SSH key pairs support only Linux instances.
Configure the Username (select root or ecs-user as the username) and the Key Pair parameters.
Password: The password must be 8 to 30 characters in length, and can contain letters, digits, and special characters.
Configure the Username (select root or ecs-user as the username) and the Password parameters.
Later: Bind a key pair or reset the password after the instance is created. For more information, see Bind an SSH key pair to an instance and Reset the logon password of an instance.
Username
If you select Key Pair or Password for Logon Type, you must select root or ecs-user as the username.
Storage configurations
Parameter
Description
Modifiable
System Disk
ESSD AutoPL, Enterprise SSD (ESSD), ESSD Entry, Standard SSD, and Ultra Disk are supported. The types of system disks that you can select vary based on the instance families that you select. Disk types that are not displayed in the drop-down list are not supported by the instance types that you select.
You can select More System Disk Types and specify additional disk types in the System Disk section to improve the success rate of system disk creation. The system attempts to create the system disk from the specified disk types in the order in which they are listed.
Data Disk
ESSD AutoPL, Enterprise SSD (ESSD), ESSD Entry, SSD, and Ultra Disk are supported. The data disk types that you can select vary based on the instance families that you select. Disk types that are not displayed in the drop-down list are not supported by the instance types that you select.
You can select Encryption for all disk types when you specify the type of data disk. By default, the default service customer master key (CMK) is used to encrypt the data disk. You can also use an existing CMK that is generated by using the bring-your-own-key (BYOK) feature in Key Management Service (KMS).
You can also use snapshots to create data disks in scenarios where container image acceleration and fast loading of large language models (LLMs) are required. This improves the system response speed and enhances the processing capability.
Make sure that a data disk is mounted to /var/lib/container on each node, and that /var/lib/kubelet and /var/lib/containerd are mounted to the /var/lib/container directory. For other data disks on the node, you can perform the initialization operation and customize their mount directories. For more information, see Can I mount a data disk to a custom directory in an ACK node pool?
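To verify this layout, you can log on to a node and check the mount points; a minimal sketch using standard Linux tools:
# Run these commands on the node (for example, over SSH or VNC).
# Show the file systems that back /var/lib/container and the paths that should reside on it.
df -h /var/lib/container /var/lib/kubelet /var/lib/containerd

# List all block devices and their mount points to locate additional data disks.
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT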
Note: Up to 64 data disks can be attached to an ECS instance. The number of disks that can be attached to an ECS instance varies based on the instance type. To query the maximum number of data disks supported by each instance type, call the DescribeInstanceTypes operation and query the DiskQuantity parameter in the response.
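For example, assuming that the Alibaba Cloud CLI (aliyun) is installed and configured, the following sketch queries a single instance type; the instance type and region are placeholders, and the flag names follow the DescribeInstanceTypes API parameters.
# Query one instance type and read DiskQuantity (the maximum number of data disks) from the JSON response.
aliyun ecs DescribeInstanceTypes --RegionId cn-hangzhou --InstanceTypes.1 ecs.g7.xlarge | grep -i DiskQuantity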
Instance quantity
Parameter
Description
Modifiable
Expected Number of Nodes
The expected number of nodes in the node pool. We recommend that you configure at least two nodes to ensure that cluster components run as expected. You can configure the Expected Nodes parameter to adjust the number of nodes in the node pool. For more information, see Scale a node pool.
If you do not want to create nodes in the node pool, set this parameter to 0. You can manually modify this parameter to add nodes later.
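To check how many nodes a node pool currently contains from the command line, you can filter nodes by the node pool ID label. The following sketch assumes that your nodes carry the alibabacloud.com/nodepool-id label, which ACK typically applies; verify the label keys on your nodes with kubectl get nodes --show-labels, and replace the placeholder ID with the value shown on the node pool details page.
# List the nodes that belong to a specific node pool.
kubectl get nodes -l alibabacloud.com/nodepool-id=<NODE_POOL_ID>

# Count them (the --no-headers flag omits the header row).
kubectl get nodes -l alibabacloud.com/nodepool-id=<NODE_POOL_ID> --no-headers | wc -l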
Advanced configurations
Click Advanced Options (Optional) and configure the node scaling policy, resource group, ECS tags, and taints.
(Optional) On the Create Node Pool page, click Generate API Request Parameters in the top-left corner to generate Terraform or SDK sample parameters that match your node pool configuration.
Click Confirm. Then, check the status of the node pool in the node pool list:
The Initializing status indicates that the node pool is being created.
The Active status indicates that the node pool is created.
Modify a node pool
After you create a node pool, you can modify the configurations of the node pool in the ACK console. For example, you can change the billing method, vSwitches, instance specifications, and system disks that are used by the node pool. You can also enable or disable auto scaling for the node pool. For more information about the modifiable parameters, see Create a node pool.
Modifying a node pool does not affect the nodes and applications deployed in other node pools of the cluster.
In most scenarios, after you modify a node pool, the modified configurations apply only to newly added nodes. In specific scenarios, such as when you update the ECS tags or labels and taints of existing nodes, the modified configurations also apply to existing nodes in the node pool.
After you update the configurations of a node pool, nodes that are subsequently added to the node pool use the modified configurations.
To modify the node pool configuration, perform the following steps. If you have changed the nodes through other methods, those changes will be overwritten when the node pool is updated.
On the Node Pools page, find the node pool that you want to modify and click Edit in the Actions column.
In the dialog box that appears, modify the parameters of the node pool based on the on-screen instructions.
On the Node Pools page, if the Status column of the node pool displays Updating, the node pool is being modified. After the node pool is updated, the Status column displays Active.
View a node pool
You can view the basic information, monitoring data, node information, and scaling events of a node pool in the ACK console.
Click the name of the node pool that you want to manage to view the following information on the details page of the node pool:
Click the Overview tab to view the cluster information, node pool information, and node configurations. If the cluster has auto scaling enabled, you can also view the auto scaling configurations.
Click the Monitor tab to view the node monitoring information provided by Managed Service for Prometheus. The monitoring information includes the resource usage of the node pool, such as CPU usage, memory usage, disk usage, and average CPU or memory utilization per node.
Click the Nodes tab to view the list of nodes in the node pool. You can drain a node (a kubectl sketch is shown after this list), configure the scheduling settings of a node, or perform O&M operations on a node. You can also remove a node from the node pool. You can click Export to export the details of the nodes to a comma-separated values (CSV) file.
Click the Scaling Activities tab to view the latest scaling events of the node pool. Each event record provides a description of the scaling activity and the number of ECS instances after the scaling activity is performed. You can also view the reasons for scaling failures. For more information about the common error codes for scaling failures, see Manually scale a node pool.
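If you prefer the command line over the console, a node can also be drained with kubectl; a minimal sketch, with the node name as a placeholder:
# Cordon the node so that no new pods are scheduled onto it.
kubectl cordon <NODE_NAME>

# Evict the pods on the node. DaemonSet pods are skipped, and data in emptyDir volumes is deleted,
# which is why the two flags below are required.
kubectl drain <NODE_NAME> --ignore-daemonsets --delete-emptydir-data

# After maintenance, allow scheduling on the node again.
kubectl uncordon <NODE_NAME>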
Delete a node pool
The release rules of an ECS instance vary based on the billing method of the instance. When you remove a node from a node pool, we recommend that you perform the operations described in the following table. Before you delete a node pool, check whether the Expected Nodes parameter is configured for the node pool. This parameter may affect the node release process.
Node pool | Release rule |
Node pool that is configured with the Expected Nodes parameter | |
Node pool that is not configured with the Expected Nodes parameter | |
Optional: Click the name of the node pool that you want to manage. On the Overview tab, you can check whether the Expected Nodes parameter is configured. If a hyphen (-) is displayed, the Expected Nodes parameter is not configured.
Find the node pool that you want to delete and choose > Delete in the Actions column. Read and confirm the information in the dialog box and click OK.
What to do next
After the node pool is created, you can perform the following operations.
Action | Description | References |
Sync Node Pool | If the node information is abnormal, you can synchronize the node pool. | None |
Details | View the details of the node pool. | None |
Scale | Manual and auto scaling are supported. Computing resources are automatically adjusted based on your business requirements and policies to reduce cluster costs. | |
Edit | Modify the configurations of the node pool. For example, you can modify the vSwitches, managed node pool settings, billing method, and instance specifications. You can also enable or disable auto scaling for the node pool. | |
Monitor | View the node monitoring information provided by Managed Service for Prometheus. The monitoring information includes the resource usage of the node pool, such as CPU usage, memory usage, disk usage, and average CPU or memory utilization per node. | |
Logon Mode | Configure the logon type of nodes. You can specify a key pair or password. | |
Configure Managed Node Pool | Automate node O&M for the node pool. After you enable the managed node pool feature, O&M tasks such as node repair, kubelet updates, runtime updates, and OS Common Vulnerabilities and Exposures (CVE) patching are automatically performed. | |
Add Existing Node | Add an existing ECS instance to the cluster as a worker node. You can perform this operation to add a worker node that you have previously removed from the cluster. Specific limits and usage notes apply to this operation. Refer to the ACK documentation for details. | |
Clone | Clone a node pool that contains the expected number of nodes based on the current node pool configurations. | None |
Node Repair | When exceptions occur on a node in a managed node pool, ACK automatically repairs the node. However, you may still need to manually fix some complex node exceptions. For more information about the check items of node status and repair solutions for node exceptions, refer to the relevant topics in the ACK documentation. | |
CVE Patching (OS) | Patch CVE vulnerabilities in the node pool in batches to improve the stability, security, and compliance of the cluster. ACK may need to restart nodes to patch specific vulnerabilities. For more information about CVE patching and the usage notes for CVE patching, refer to the relevant topics in the ACK documentation. | |
Kubelet Configuration | Customize the kubelet parameters for all nodes in the node pool to manage the behavior of the nodes. For example, you can customize the kubelet parameters if you need to modify resource reservations to adjust the resource usage. We recommend that you do not use the CLI to customize kubelet parameters that are unavailable in the ACK console. | |
Containerd Configuration | Customize the containerd parameters for all nodes in the node pool. | |
OS Configuration | Customize the OS parameters of all nodes in the node pool to improve OS performance. We recommend that you do not use the CLI to customize OS parameters that are unavailable in the ACK console. | |
Change Operating System | Change or update the operating system of the node pool. | None |
Kubelet Update | Update the kubelet and containerd for all nodes in the node pool. | |
Delete | Delete the node pool if the node pool is no longer in use to save costs. The release rules of nodes in a node pool vary based on the billing method of the nodes and whether the Expected Nodes parameter is configured for the node pool. |
Comparison of managed node pool configurations
Configuration | Disabled | Managed node pool | Auto mode | |
Node pool | Instance type | Manual configuration | Manual configuration | Configurable. ACK provides intelligent recommendations for instance types. |
Billing method | Manual configuration | Manual configuration | Only pay-as-you-go is supported. | |
OS | Manual configuration | Manual configuration | Only ContainerOS is supported. | |
System disk | Manual configuration | Manual configuration | The recommended configuration (a 20 GiB system disk) is applied automatically. | |
Data disk | Manual configuration | Manual configuration | Configurable. A data disk is used for temporary storage of the ContainerOS operating system. | |
Auto scaling | Optional for manual configuration | Optional for manual configuration | The node instant scaling feature is enabled by default. Manual configuration is supported. | |
Automatic responses to ECS system events | Not supported | Enabled by default | Enabled by default | |
Node auto repair | Not supported | Optional for manual configuration | Enabled by default | |
Automatic upgrade of the kubelet and runtime versions | Not supported | Optional for manual configuration | Enabled by default | |
OS CVE vulnerability auto repair | Not supported | Optional for manual configuration | Enabled by default |
After you enable auto mode for the node pool, it dynamically scales nodes based on your workload requirements, with a default maximum capacity of 50 nodes. You can modify the maximum number of instances by using the scaling feature of the node pool.
After you enable auto mode for the node pool, ACK takes over O&M responsibilities, such as OS upgrades, software upgrades, and vulnerability patching. These responsibilities include tasks like software version upgrades, software configuration modifications, restarts, and drain evictions. Avoid performing manual operations on the ECS nodes within the node pool, such as restarting, mounting data disks, or modifying configurations by logging into the nodes, to prevent conflicts with auto mode policies. We recommend that you set reasonable replica counts for your workloads, implement PreStop graceful shutdown strategies, and establish PodDisruptionBudget policies to ensure that nodes can be drained for maintenance without interrupting your business.
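As an illustration of the last recommendation, the following sketch creates a PodDisruptionBudget and adds a preStop hook for a hypothetical Deployment named web; the names, label selector, and sleep duration are placeholders and not part of any ACK configuration.
# Keep at least two web pods available while nodes are drained for maintenance.
kubectl apply -f - <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
EOF

# Add a preStop hook (strategic merge patch) so that the container has a short grace period
# to finish in-flight requests before it is terminated during a drain.
kubectl patch deployment web -p '{
  "spec": {
    "template": {
      "spec": {
        "containers": [
          {
            "name": "web",
            "lifecycle": { "preStop": { "exec": { "command": ["sh", "-c", "sleep 15"] } } }
          }
        ]
      }
    }
  }
}'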
After you enable auto mode for the node pool, ACK enhances node security based on the ContainerOS operating system, which uses an immutable root file system. We recommend that you use persistent volume claims (PVCs) for persistent storage instead of node system paths (such as hostPath volumes).
After you enable auto mode for the node pool, ARM, GPU, on-premises disk, and other instance types are not supported. ACK has recommended default instance types that can meet application needs in most scenarios, and you can adjust them in the console based on your actual business requirements. We recommend that you set a sufficient number of instance types to enhance the resilience of the node pool and avoid scaling failures.
The auto mode aims to provide automated and intelligent operation and maintenance functions for Kubernetes clusters. However, in certain scenarios, you still need to fulfill some obligations. For more information, see Shared responsibility model.
FAQ
How do I create a custom image from an ECS instance and use the image to create a node?
After you create an ECS instance, you can customize the instance by performing operations such as installing software and deploying application environments. Then, you can create a custom image from the instance. Instances created from the custom image contain all of the customized items, which eliminates the need to configure these items for each new instance.
Log on to the ECS instance and run the following command to delete the specified files. For more information about how to log on to an ECS instance, see Use Workbench to connect to a Linux instance over SSH.
# Remove the ACK node configuration files and the reconfiguration service.
chattr -i /etc/acknode/nodeconfig-*
rm -rf /etc/acknode
systemctl disable ack-reconfig
rm -rf /etc/systemd/system/ack-reconfig.service
rm -rf /usr/local/bin/reconfig.sh
# Clear the cloud-init state so that new instances are initialized from scratch.
rm -rf /var/lib/cloud
# Stop and remove the kubelet service files left over from the cluster.
systemctl stop kubelet
systemctl disable kubelet
rm -rf /etc/systemd/system/kubelet.service
rm -rf /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Create a custom image from the ECS instance. For more information about the procedure and usage notes, see Create a custom image from an instance.
Create a node pool. When you create the node pool, select Custom Image for the Operating System parameter. Configure other parameters based on the description previously mentioned in this topic.
Create a custom image based on the operating system supported by the ACK cluster. For more information, see OS images.
Do not build custom images from ECS instances that are running in an ACK cluster. To create a custom image from such an instance, you must first remove it from the cluster. For more information, see Remove a node.
The predefined behavior logic in a custom image may affect operations such as cluster node initialization, container launching, node updates, and automatic recovery of nodes in a managed node pool. Before you use it in a production environment, ensure that the custom image has been tested and validated.
References
If a node is no longer in use, you can remove the node. For more information, see Remove a node.
ACK reserves a certain amount of node resources to run Kubernetes components and system processes. For more information, see Resource reservation policy.
When the resource capacity of the cluster cannot meet the requirements for pod scheduling, you can enable node scaling. For more information, see Node scaling.
The maximum number of pods on a worker node varies based on the network plug-in and cannot be adjusted in most cases. To increase the maximum number of pods in a cluster, you can scale out the node pools in the cluster, upgrade the instance specifications used by the cluster, and reset the pod CIDR block. For more information, see Increase the maximum number of pods in a cluster.