ACK uses node pools to group and manage nodes. A node pool is a logical group of nodes that share the same properties, such as instance type, operating system, labels, and taints. You can create multiple node pools with different configurations in a single cluster to streamline node management.
Before creating a node pool, read Node pools to learn about the basics, use cases, related features, and billing.
Procedure
You can create, edit, delete, and view node pools on the node pool page of the target cluster.
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of your cluster. In the left navigation pane, open the node pool page.
Create a node pool
You can configure a node pool on the console, including its basic, network, and storage configurations. Note that some settings, especially those related to node pool availability and network configuration, cannot be changed after the node pool is created. Creating a node pool does not affect the nodes and services in other node pools.
Besides the console, ACK also lets you create node pools using the API and Terraform. See Create a node pool and Create an auto-scaling node pool with Terraform.
On the Node Pool page, click Create Node Pool. In the Create Node Pool dialog box, configure the settings.
After creating a node pool, you can modify its configuration items on the Edit Node Pool page. The Modifiable column in the following tables indicates whether an item can be modified after creation: ✗ indicates that an item cannot be modified, and ✓ indicates that it can.
Basic configuration
Parameter
Description
Modifiable
Node Pool Name
Enter a custom node pool name.
✓
Confidential computing
Note: Only allowlisted users can configure confidential computing. To apply, submit a ticket.
This parameter is available only when the container runtime is set to containerd.
Specifies whether to enable confidential computing. Confidential computing is a cloud-native, one-stop container platform that provides hardware-based encryption for users with stringent security requirements. It ensures the security, integrity, and confidentiality of data in use (during computation) and simplifies the development, delivery, and management of trusted or confidential applications. For more information, see ACK-TEE Confidential Computing.
✗
Container Runtime
For selection guidance, see Compare containerd, sandboxed container, and Docker runtimes.
containerd (recommended): Community standard, supported for Kubernetes 1.20 and later.
Sandboxed container: Provides a strongly isolated environment based on lightweight virtualization technology. For procedures and limitations, see Create and manage sandboxed container node pools.
Docker (deprecated): Supported only for Kubernetes 1.22 and earlier. Creation is no longer supported.
✗
Scaling Mode
Manual: ACK keeps the number of nodes in the node pool at the configured Expected Number of Nodes. For details, see Manually scale node pools.
Auto: When cluster capacity planning cannot meet application pod scheduling demands, ACK automatically scales node resources based on configured minimum and maximum instance counts. Clusters running Kubernetes 1.24 or later default to node instant scaling; clusters running earlier versions default to node autoscaling. For details, see Node scaling.
✓
Managed configuration
ACK offers three managed configurations with varying levels of automation for node pools.
Intelligent hosting: When intelligent hosting is enabled, ACK automatically scales the node pool based on workload demands. ACK also manages operational tasks, including operating system and software version upgrades, and fixing security vulnerabilities.
Important: For information about the capacity limits, operational boundaries, and storage specifications for intelligent hosting node pools, see Usage notes.
This feature is available only in ACK Managed Cluster Pro clusters that run Kubernetes 1.30 or later.
Managed node pool: Select your desired automation capabilities and specify a cluster maintenance window.
Disabled: Disables automation capabilities. You must manually manage the nodes and the node pool.
To compare managed configuration capabilities and considerations, see Comparison of managed configuration capabilities and considerations.
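The version thresholds stated in the table above can be expressed as a small lookup. The following is a minimal sketch, not ACK's implementation: the function names and the cluster type string comparison are assumptions made for illustration, and only the thresholds (1.20 for containerd, 1.24 for node instant scaling, 1.30 plus ACK Managed Cluster Pro for intelligent hosting) come from the text.

```python
from typing import Tuple


def parse_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '1.24.6-aliyun.1' or 'v1.30'."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def containerd_supported(version: str) -> bool:
    """containerd is listed as supported for Kubernetes 1.20 and later."""
    return parse_minor(version) >= (1, 20)


def default_auto_scaling(version: str) -> str:
    """Default used when Scaling Mode is Auto, as described in the Scaling Mode row."""
    if parse_minor(version) >= (1, 24):
        return "node instant scaling"
    return "node autoscaling"


def intelligent_hosting_available(version: str, cluster_type: str) -> bool:
    """Intelligent hosting requires an ACK Managed Cluster Pro cluster on 1.30 or later.

    The cluster_type string is a hypothetical label used only in this sketch.
    """
    return cluster_type == "ACK Managed Cluster Pro" and parse_minor(version) >= (1, 30)
```

A check like this can be useful when validating node pool settings in scripts before submitting them, but the console remains the authority on which options are actually offered.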
Network configuration
Parameter
Description
Modifiable
Network configuration
VPC
The cluster's VPC is selected by default and cannot be changed after the node pool is created.
Cloud resources and billing: VPC
✗
vSwitch
During scaling, nodes scale in or out in the availability zones of the selected vSwitches based on the scaling policy. For high availability, select vSwitches in two or more different availability zones.
To create a vSwitch, see Create and manage vSwitches.
✓
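The high-availability recommendation for vSwitches (two or more different availability zones) is easy to verify before you submit a configuration. The sketch below assumes you already have a mapping from vSwitch ID to zone ID; the IDs shown in the usage note are hypothetical placeholders, and the function name is illustrative.

```python
from typing import Dict


def spans_multiple_zones(vswitch_zones: Dict[str, str], minimum: int = 2) -> bool:
    """Return True if the selected vSwitches cover at least `minimum` distinct zones.

    vswitch_zones maps a vSwitch ID to the ID of the zone it belongs to.
    """
    return len(set(vswitch_zones.values())) >= minimum
```

For example, `spans_multiple_zones({"vsw-a": "zone-1", "vsw-b": "zone-1"})` is `False` because both vSwitches sit in the same zone, while adding a vSwitch in `"zone-2"` makes it `True`.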
Instance and image configuration
Parameter
Description
Modifiable
Billing Method
The default billing method used when scaling out nodes in the node pool.
Pay-As-You-Go: Can be enabled and released on demand.
Subscription: Requires configuring Duration and Auto Renewal.
Preemptible Instance: Currently, only spot instances with a protection period are supported. You must also configure the Instance Price Cap.
A spot instance is created successfully when the real-time price of the specified instance type is below the maximum bid price. After the protection period (1 hour), the system checks the real-time price and inventory every 5 minutes. If the market price exceeds the bid price or inventory is insufficient, the spot instance is released. For usage recommendations, see Spot instance node pool best practices.
To maintain node pool consistency, you cannot change a Pay-As-You-Go or Subscription node pool to a Preemptible Instance node pool, or vice versa.
Important: Changing the billing method of a node pool affects only new nodes added during scale-outs. This change does not affect the billing method of existing nodes. To change the billing method for existing nodes, see Convert a pay-as-you-go instance to a subscription instance.
✓
Instance configurations
When scaling out, nodes are allocated from the configured ECS instance families. To improve scale-out success rates, select multiple instance types across multiple zones to avoid unavailability or insufficient inventory. The specific instance type used for scaling is determined by the configured Scaling Policy.
To ensure business stability and accurate resource scheduling, do not mix GPU and non-GPU instance types in the same node pool.
Configure instance types for scaling in one of two ways:
Specific types: Specify exact instance types based on vCPU, memory, family, architecture, and other dimensions.
Generalized configuration: Select instance types to use or exclude based on attributes (vCPU, memory, etc.) to further improve scale-out success rates. For details, see Configure node pools using specified instance attributes.
Refer to the console's elasticity strength recommendations for configuration, or view node pool elasticity strength after creation.
For ACK-unsupported instance types and node configuration recommendations, see ECS instance type configuration recommendations.
Cloud resource and billing information: ECS instance, GPU instance
✓
Operating System
Marketplace Image is in phased release.
The default operating system image used when scaling out nodes in the node pool.
Public Image: Uses Alibaba Cloud Linux 3 container-optimized, ContainerOS, Alibaba Cloud Linux 3, Ubuntu, Windows, and other public OS images. For image details, cgroup versions, and usage limits, see Operating system.
Custom Image: Uses a custom OS image. For details, see How to create a custom image from an existing ECS instance and use it to create nodes.
To upgrade or change the operating system later, see Change operating system.
Alibaba Cloud Linux 2 and CentOS 7 are no longer maintained. Use supported operating systems. We recommend Alibaba Cloud Linux 3 container-optimized or ContainerOS.
✓
Security Hardening
When creating nodes, ACK applies the selected security baseline policy.
Disable: No security hardening is applied to ECS instances.
MLPS Security Hardening: Alibaba Cloud provides baseline check standards and scanning tools for Alibaba Cloud Linux MLPS 2.0 Level 3 images that comply with classified protection requirements. While ensuring native image compatibility and performance, these images are adapted for MLPS compliance to meet "GB/T22239-2019 Information Security Technology—Cybersecurity Classified Protection Basic Requirements." For details, see ACK MLPS hardening usage guide.
In this mode, the root user cannot log on remotely via SSH. Connect to the instance via VNC in the ECS console and create a regular user that supports SSH logon.
OS Security Hardening: Supported only for Alibaba Cloud Linux 2 or Alibaba Cloud Linux 3.
✗
Logon Type
When selecting MLPS Security Hardening, only Password is supported.
ContainerOS supports only Key Pair or Later. If using a key pair, you must start an administrative container after configuration to use it. For details, see Manage ContainerOS nodes.
When creating nodes, ACK pre-configures the specified key pair or password on the instance.
Set during creation:
Key Pair: Alibaba Cloud SSH key pairs provide a secure and convenient logon authentication method comprising a public key and a private key. Supported only for Linux instances.
Configure both the Username (root or ecs-user) and the required Key Pair.
Password: Configure the Username (root or ecs-user) and password.
Later: After instance creation, bind a key pair or reset the instance password yourself. For details, see Bind SSH key pairs and Reset instance logon password.
✓
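The release behavior described for the Preemptible Instance billing method (a 1-hour protection period followed by a check every 5 minutes) can be illustrated with a small simulation. This is a sketch under stated assumptions, not the ECS pricing engine: it assumes the first check happens exactly when the protection period ends, it treats a price equal to the cap as neither creatable nor a reason for release because the text only says "below" and "exceeds", and all names are hypothetical.

```python
from typing import List, Optional, Tuple

PROTECTION_PERIOD_MIN = 60  # protection period stated in the text (1 hour)
CHECK_INTERVAL_MIN = 5      # check interval stated in the text


def can_create(real_time_price: float, price_cap: float) -> bool:
    """The text says creation succeeds when the real-time price is below the cap."""
    return real_time_price < price_cap


def release_minute(price_cap: float,
                   checks: List[Tuple[float, bool]]) -> Optional[int]:
    """Return the minute (since creation) at which the instance is released.

    checks holds one (market_price, in_stock) sample per 5-minute check,
    starting at the end of the protection period (an assumption of this sketch).
    Returns None if no check triggers a release.
    """
    for index, (market_price, in_stock) in enumerate(checks):
        if market_price > price_cap or not in_stock:
            return PROTECTION_PERIOD_MIN + index * CHECK_INTERVAL_MIN
    return None
```

With a cap of 0.10, the samples `[(0.08, True), (0.09, True), (0.12, True)]` lead to a release at minute 70, because the third check is the first one where the market price exceeds the cap.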
Storage
Parameter
Description
Modifiable
System Disk
Select a cloud disk type based on your business needs, including ESSD AutoPL, ESSD, ESSD Entry, and previous-generation disks (SSD and ultra disk). Configure capacity, IOPS, and other parameters.
Available system disk types depend on the selected instance family. Disk types not displayed are unsupported.
Select More Disk Categories to configure disk types different from the primary System Disk, which improves scale-out success rates. When creating nodes, ACK selects the first matching disk type in the specified order.
Cloud resource and billing information: ECS block storage
✓
Data Disk
Select a cloud disk type based on your business needs, including ESSD AutoPL, ESSD, ESSD Entry, and previous-generation disks (SSD and ultra disk). Configure capacity, IOPS, and other parameters.
Available data disk types depend on the selected instance family. Disk types not displayed are unsupported.
When mounting data disks, all cloud disk types support Encrypted. By default, Alibaba Cloud uses the service key (Default Service CMK) for encryption. You can also select a custom key (BYOK) pre-created in KMS.
During node creation, the last data disk is automatically formatted, and /var/lib/container is mounted to this disk. /var/lib/kubelet and /var/lib/containerd are mounted to /var/lib/container. To customize mount directories, adjust the data disk initialization configuration. You can select only one data disk as the container runtime directory. For details, see Can I customize directory mounting for data disks in ACK node pools?
For scenarios requiring container image acceleration or rapid large model loading, use snapshots to create data disks, improving system response speed and processing capability.
Select Add Data Disk Type to configure disk types different from the primary Data Disk, improving scale-out success rates. When creating nodes, ACK selects the first matching disk type from the specified order.
An ECS instance can mount up to 64 data disks. The maximum number of disks supported varies by instance type. Query the disk quantity limit for an instance type using the DescribeInstanceTypes API (DiskQuantity).
Cloud resource and billing information: ECS block storage
✓
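The fallback rule for additional disk types (ACK selects the first matching disk type in the specified order) amounts to an ordered first-match search. The sketch below is illustrative only: the function name is hypothetical, and the disk type labels are the display names used in this section rather than API identifiers.

```python
from typing import Iterable, List, Optional


def pick_disk_type(preferred_order: List[str],
                   available: Iterable[str]) -> Optional[str]:
    """Return the first disk type in preferred_order that is available.

    preferred_order lists the primary disk type first, followed by the
    additional types in the order they were configured.
    Returns None when none of the configured types is available.
    """
    available_set = set(available)
    for disk_type in preferred_order:
        if disk_type in available_set:
            return disk_type
    return None
```

For example, with the order `["ESSD AutoPL", "ESSD", "SSD"]` and only `ESSD` and `SSD` available for the chosen instance type and zone, the function returns `"ESSD"`. Listing more than one type is what gives a scale-out a second chance when the primary type is out of stock.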
Instance count
Parameter
Description
Modifiable
Expected Number of Nodes
The total number of nodes the node pool should maintain. We recommend configuring at least two nodes to ensure normal operation of cluster components. Adjust the desired node count to scale the node pool in or out. For details, see Scale node pools.
If you do not need to create nodes, enter 0 and adjust manually later or add existing nodes.
✓
Advanced settings
Expand Advanced Options (Optional) to configure settings such as scaling policies, resource groups, ECS tags, and taints.
Click Confirm Configuration.
On the Confirm Configuration page, click Equivalent Code in the lower-left corner to generate sample Terraform or SDK parameters for the current node pool configuration.
In the node pool list, a node pool's Status is Initializing during creation and changes to Active upon successful creation.
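If you create node pools from a script rather than the console, you typically wait for the same transition from Initializing to Active. The following is a minimal polling sketch: `get_status` is a hypothetical callable that you would implement with whichever API or SDK you use, and the status strings are the console display values from this section, which may differ from the values an API returns.

```python
import time
from typing import Callable


def wait_until_active(get_status: Callable[[], str],
                      timeout_s: float = 600.0,
                      interval_s: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll get_status until it reports 'Active' or the timeout expires.

    Returns True when the node pool becomes Active and False on timeout.
    The sleep parameter exists so that callers and tests can inject a
    replacement for time.sleep.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == "Active":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

The same loop applies to edits, where the status shown during the change is Updating instead of Initializing.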
Edit a node pool
After a node pool is created, you can adjust its settings in the ACK console, such as vSwitches, billing method, instance types, system disks, and enabling or disabling auto scaling. For a list of editable settings, see the parameter descriptions in Create a node pool.
Editing a node pool does not affect its existing nodes or the workloads running on them. Configuration updates apply only to nodes added after the update. The settings of existing nodes remain unchanged, except when you use features such as Synchronize ECS tags of existing nodes and Synchronize labels and taints of existing nodes.
When you switch the Scaling Mode of a node pool:
From Manual to Auto: This enables auto scaling. You must also set the minimum and maximum number of instances.
From Auto to Manual: This disables auto scaling. The minimum number of instances is set to 0, and the maximum number of instances is set to 2000. The system automatically sets the desired number of nodes to the current number of nodes in the node pool.
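The effect of switching the Scaling Mode can be modeled as a pure function over the scaling settings. This is a sketch of the documented behavior only: the `ScalingConfig` type and function names are hypothetical, and what happens to the desired number of nodes when switching to Auto is not stated in the text, so the sketch leaves it unchanged.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ScalingConfig:
    mode: str                      # "Manual" or "Auto"
    min_instances: int
    max_instances: int
    desired_nodes: Optional[int]   # None when no desired number is set


def switch_to_auto(cfg: ScalingConfig, min_instances: int,
                   max_instances: int) -> ScalingConfig:
    """Manual to Auto: the minimum and maximum instance counts must be provided."""
    if not 0 <= min_instances <= max_instances:
        raise ValueError("expected 0 <= min_instances <= max_instances")
    # desired_nodes is left as-is; the text does not specify its new value.
    return replace(cfg, mode="Auto",
                   min_instances=min_instances, max_instances=max_instances)


def switch_to_manual(cfg: ScalingConfig, current_node_count: int) -> ScalingConfig:
    """Auto to Manual: min becomes 0, max becomes 2000, desired = current count."""
    return replace(cfg, mode="Manual", min_instances=0, max_instances=2000,
                   desired_nodes=current_node_count)
```

For a pool in Auto mode with four running nodes, `switch_to_manual(cfg, 4)` yields a Manual configuration with a minimum of 0, a maximum of 2000, and a desired count of 4, so the switch itself does not add or remove nodes.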
Use the steps in this section to update the node pool configuration. If you modify nodes in other ways, node pool upgrades will overwrite those changes.
On the node pools page, find the node pool to edit and click Edit in the Actions column.
On the Edit Node Pool page, modify the settings, and then follow the on-screen instructions to complete the configuration.
While the node pool is being updated, its Status on the node pools page is Updating. When the update is complete, the Status changes to Active.
View a node pool
You can view the basic information and monitoring data for a node pool, as well as details about the nodes and a history of scaling activities in the pool.
Click the name of the target node pool to view the following information:
The Basic Information tab displays information about the cluster, the node pool, and the node configuration. If auto scaling is enabled for the cluster, this tab also shows the auto scaling configuration.
The Monitoring tab is integrated with Alibaba Cloud Prometheus Service to display the resource utilization of the node pool, including metrics such as CPU, memory, and disk usage, as well as the average CPU and memory utilization of the nodes.
The Node Management tab lists all nodes in the node pool and allows you to remove nodes, perform O&M tasks, drain nodes, and manage scheduling. Click Export to save detailed node information as a CSV file.
The Scaling Activities tab displays a history of recent scaling activities for node instances, including the resulting instance count and a description for each activity. If a scaling activity fails, you can view the failure reason. For information about common error codes related to scale-out and scale-in failures, see Manually scale a node pool.
Delete a node pool
The release rules for nodes vary by billing method. Before you delete a node pool, check whether an expected number of nodes is set for the pool. This setting directly affects the node release behavior.
Node pool | Release rules |
Node pool with expected number of nodes enabled |
|
Node pool with expected number of nodes disabled |
|
(Optional) Click the target node pool name. On the Basic Information tab, check if Desired number of nodes is configured. If this setting is not enabled, the Desired number of nodes field displays a hyphen (-).
In the Actions column of the target node pool, click > Delete. Read the confirmation message carefully, and then click OK.
Important: A node's system disk and data disks share its lifecycle. When a node is released, its cloud disks are also released, and all data on them is permanently lost. To keep your data safe, use a PersistentVolume (PV) to decouple the data lifecycle from the node lifecycle.
Related operations
After a node pool becomes active, you can perform the following operations in the node pool list.
Action | Description | Related documentation |
Sync Node Pool | Synchronizes node pool data if node information is inconsistent. | None |
Details | Displays the configuration details of the node pool. | None |
Scaling |
| |
Edit | Lets you modify a node pool's configuration, such as its vSwitches, managed node pool settings, billing method, instance types, and whether auto scaling is enabled. | See Edit a node pool above. |
Monitoring | Integrates with Alibaba Cloud Prometheus Service to display the resource utilization of the node pool, including metrics such as CPU and memory usage, disk usage, and the average CPU and memory utilization of nodes. | See View a node pool above. |
Add Existing Nodes | Lets you add existing ECS instances to an ACK cluster as worker nodes or re-add worker nodes that have been removed from a node pool. This feature has important limitations and considerations. For more information, see the related documentation. | |
Configure Logon Method | Lets you set the logon method for nodes. You can choose between a key pair and a password. | See Instance and image configuration above. |
Managed Configuration | Enables automated operations for the node pool, such as node auto-healing, automatic upgrades for Kubelet and the container runtime, and automatic OS CVE fixes. | See Basic configuration above. |
Clone | Creates a new node pool by cloning the configuration of an existing one. | None |
Delete | Deletes a node pool that is no longer in use to reduce costs. The way nodes are released depends on whether a desired number of nodes is configured for the node pool and the billing method of the nodes. | See Delete a node pool above. |
Kubelet Configuration | Lets you customize Kubelet parameters for nodes at the node pool level to adjust their behavior, such as reserving resources to manage cluster-wide resource usage. | |
Containerd Configuration | Lets you customize Containerd parameters for nodes at the node pool level. For example, you can configure multiple mirrors for a specified image repository or skip TLS certificate verification for it. | |
OS Configuration | Lets you customize OS parameters for nodes at the node pool level to tune system performance. | |
Kubelet Upgrade | Upgrades the Kubelet and Containerd versions for nodes in the node pool. | |
Change OS | Changes the operating system type or upgrades the operating system version for nodes. | |
Fix CVE (OS) | Fixes OS CVEs in batches to improve cluster stability, security, and compliance. Some CVE fixes require a node restart. For more information about the feature and its considerations, see the related documentation. | |
Node Recovery | When a node in a managed node pool becomes unhealthy, ACK automatically attempts to recover it. However, complex failures may require manual intervention. For details about the checks and specific recovery actions that ACK provides, see the related documentation. | |
Managed configuration capability comparison
Managed configuration | Disabled | Managed node pool | Intelligent hosting | |
Node pool configuration | Instance type | Manual configuration | Manual configuration | Configurable, with intelligent instance type recommendations. |
Billing method | Manual configuration | Manual configuration | Pay-as-you-go only. | |
Operating system | Manual configuration | Manual configuration | Supports only the container-optimized operating system ContainerOS. | |
System disk | Manual configuration | Manual configuration | Recommended default: 20 GiB. | |
Data disk | Manual configuration | Manual configuration | ContainerOS uses one data disk for temporary storage. The size is configurable. | |
Auto scaling | Can be enabled and configured manually. | Can be enabled and configured manually. | Instant node elasticity is enabled by default. Manual configuration is supported. | |
Automatic response to ECS system events | Not supported | Enabled by default | Enabled by default | |
Node self-healing | Not supported | Can be enabled and configured manually. | Enabled by default | |
Automatic kubelet and containerd upgrades | Manually configured using the automatic cluster upgrade feature. | Enabled by default | ||
Automatic OS CVE vulnerability fixes | Not supported | Can be enabled and configured manually. | Enabled by default | |
FAQ
Use a custom image to create nodes
After you create an ECS instance, you can customize it by installing software or deploying application environments. You can then create a custom image from the instance. New instances created from this image include your custom configurations, saving you from configuring each instance repeatedly.
Log on to the ECS instance and run the following commands to delete the specified files. For instructions on logging on to an instance, see Log on to a Linux instance by using Workbench.
chattr -i /etc/acknode/nodeconfig-*
rm -rf /etc/acknode
systemctl disable ack-reconfig
rm -rf /etc/systemd/system/ack-reconfig.service
rm -rf /usr/local/bin/reconfig.sh
rm -rf /var/lib/cloud
systemctl stop kubelet
systemctl disable kubelet
rm -rf /etc/systemd/system/kubelet.service
rm -rf /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Create a custom image from the ECS instance. For notes and step-by-step instructions, see Create a custom image from an instance.
Configure the node pool, set Operating system type to Custom image, and then continue the process to create the node pool.
Create custom images based on operating systems supported by ACK clusters. For details, see Operating system.
Do not create custom images from ECS instances currently running in an ACK cluster. If needed, remove the instance from the cluster first. For details, see Remove nodes.
Predefined logic in custom images may affect node initialization, container runtime, OS upgrades, and node auto-recovery in managed node pools. Before using a custom image in production, test and validate it thoroughly.
Related documentation
You can remove a node when it is no longer needed. For more information, see Remove a node.
ACK reserves a portion of node resources to ensure that kube components and system processes can run. For more information, see Node resource reservation policy.
If your cluster has insufficient capacity to schedule application Pods, enable node scaling to automatically expand your node resources. For more information, see Node scaling.
The maximum number of Pods per worker node depends on the network plugin and is usually fixed. To increase the number of available Pods, you can scale out the node pool, upgrade the instance type, or rebuild the cluster with a new Pod CIDR block. For more information, see Adjust the number of Pods per node.