A node pool is a group of nodes that share the same configurations: instance type, operating system, labels, and taints. A cluster can have multiple node pools with different configurations. Creating or modifying one node pool does not affect nodes or workloads in other node pools.
Before creating a node pool, read Node pools to understand the available types, features, and billing rules.
Prerequisites
Before you begin, make sure you have:
An ACK cluster
Access to the ACK console
Create a node pool
You can create a node pool from the ACK console, via the API, or with Terraform. The console is the most common starting point; for API and Terraform, see CreateClusterNodePool and Use Terraform to create a node pool that has auto scaling enabled.
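If you prefer to start from the command line instead of the console, the CreateClusterNodePool operation can also be called through the Alibaba Cloud CLI. The sketch below is a minimal example under stated assumptions: the request-body field names are assumed to match the CreateClusterNodePool reference, and the node pool name, vSwitch ID, instance type, and cluster ID are placeholders. Verify the body against the API documentation before use.

```bash
# Minimal request body (placeholders; field names assumed to match CreateClusterNodePool).
cat > nodepool.json <<'EOF'
{
  "nodepool_info": { "name": "example-pool" },
  "scaling_group": {
    "vswitch_ids": ["vsw-REPLACE_ME"],
    "instance_types": ["ecs.g7.xlarge"],
    "system_disk_category": "cloud_essd",
    "system_disk_size": 120,
    "desired_size": 2
  }
}
EOF

# ROA-style call through the Alibaba Cloud CLI; <cluster_id> is a placeholder.
aliyun cs POST /clusters/<cluster_id>/nodepools \
  --header "Content-Type=application/json" \
  --body "$(cat nodepool.json)"
```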
Some parameters — particularly those related to network and security — cannot be changed after creation. Review the Modifiable column in each parameter table before proceeding.
In the parameter tables below, the Modifiable column indicates whether the parameter can be changed after the node pool is created.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, click the name of the target cluster. In the left-side navigation pane, choose Nodes > Node Pools.
On the Node Pools page, click Create Node Pool. In the dialog box, configure the parameters described in the sections below.
To generate Terraform or SDK sample code that matches your configuration, click Generate API Request Parameters in the top-left corner. Then click Confirm. After confirmation, the node pool list shows:
Initializing — creation in progress
Active — creation successful
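Once the node pool is Active, you can confirm from the command line that its nodes joined the cluster. This sketch assumes kubectl is configured for the cluster and that ACK labels nodes with the node pool ID under the alibabacloud.com/nodepool-id key; check the labels on an actual node first if in doubt.

```bash
# List the nodes that belong to a specific node pool (label key assumed).
kubectl get nodes -l alibabacloud.com/nodepool-id=<node-pool-id>

# Inspect one node to confirm its labels and taints match the node pool configuration.
kubectl describe node <node-name> | grep -A 5 -E "Labels|Taints"
```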
Basic configurations
| Parameter | Description | Modifiable |
|---|---|---|
| Node Pool Name | A name for the node pool. | |
| Confidential computing | Encrypts data in use to protect confidentiality and integrity. Requires whitelist approval — submit a ticket to apply. Available only with the containerd runtime. For details, see TEE-based confidential computing. | |
| Container runtime | The runtime for containers in the node pool. containerd is recommended for all Kubernetes versions. Sandboxed-Container supports Kubernetes 1.31 and earlier. Docker is deprecated and supports Kubernetes 1.22 and earlier. For a comparison, see Comparison among Docker, containerd, and Sandboxed-Container. | |
| Scaling mode | Manual: ACK maintains the node count at the Expected Nodes value. Auto: ACK scales nodes automatically when pod scheduling capacity is insufficient, based on configured minimum and maximum instance counts. Clusters running Kubernetes 1.24 and later use node instant scaling by default; earlier versions use node auto scaling. | |
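After nodes are created, you can confirm which runtime the node pool applied. A minimal check, assuming kubectl is configured for the cluster:

```bash
# The CONTAINER-RUNTIME column shows the runtime and version on each node,
# for example containerd://1.6.x for containerd-based node pools.
kubectl get nodes -o wide
```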
Automated O&M configurations
Select one of the following options to set the level of automated O&M for the node pool. For a full comparison, see Comparison of managed node pool configurations.
Auto mode: ACK takes full O&M responsibility — OS upgrades, software upgrades, vulnerability patching, and dynamic scaling based on workload demand. Available only for clusters with auto mode enabled.
Managed node pool: Configure the following automated O&M parameters. You can also set a maintenance window for scheduled tasks.
Disable: No automated O&M. All node maintenance must be performed manually.
Network configurations
| Parameter | Description | Modifiable |
|---|---|---|
| VPC | The virtual private cloud (VPC) of the cluster. Cannot be changed. | |
| vSwitch | New nodes are created in the zones of the selected vSwitches during scale-out. Select vSwitches in your target zones. If none are available, click Create vSwitch. For details, see Create and manage vSwitches. | |
Instance and image
| Parameter | Description | Modifiable |
|---|---|---|
| Billing method | Pay-As-You-Go, Subscription, or Preemptible Instance. Subscription requires a Duration and optionally Auto Renewal. Preemptible instances have a 1-hour protection period; afterward, the system checks spot price and availability every 5 minutes and releases the instance if the market price exceeds your bid or inventory is insufficient. ACK does not allow switching between pay-as-you-go/subscription and preemptible instances. Billing method changes apply only to newly added nodes. For preemptible instance best practices, see Best practices for preemptible instance-based node pools. | |
| Instance type | Select ECS instances for the node pool, filtering by vCPU, memory, instance family, and architecture. Select multiple instance types to improve scale-out reliability. For GPU-accelerated instances, you can enable GPU sharing. For unsupported specifications, see ECS specification recommendations for ACK clusters. | |
| Operating system | Public image: ACK-provided images including Alibaba Cloud Linux 3 ACK-optimized, ContainerOS, Alibaba Cloud Linux 3, Ubuntu, and Windows. Custom image: An image you create. For details, see OS images. OS changes apply only to newly added nodes. | |
| Security hardening | Disable: no hardening. MLPS Security Hardening: aligns with Multi-Level Protection Scheme (MLPS) 2.0 level-3 standards for Alibaba Cloud Linux 2/3 images; SSH root login is blocked — use Virtual Network Computing (VNC) to log in. OS Security Hardening: available for Alibaba Cloud Linux 2/3 images. Cannot be changed after creation. | |
| Logon type | Key Pair, Password, or Later. MLPS Security Hardening supports Password only. ContainerOS supports Key Pair and Later only. Password must be 8–30 characters, containing letters, digits, and special characters. | |
| Username | Select root or ecs-user when using Key Pair or Password logon (see the connection sketch after this table). | |
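If you selected Key Pair logon, you can connect to a node over SSH with the private key and the username chosen above. The key file name and node IP below are placeholders; for MLPS-hardened nodes, use VNC instead because SSH root logon is blocked.

```bash
# Restrict the key file permissions, then connect as the configured user (placeholders).
chmod 400 ~/.ssh/example-key.pem
ssh -i ~/.ssh/example-key.pem ecs-user@<node-ip>
```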
Storage configurations
| Parameter | Description | Modifiable |
|---|---|---|
| System disk | Supported types: ESSD AutoPL, Enterprise SSD (ESSD), ESSD Entry, Standard SSD, Ultra Disk. Available types depend on the instance family. For ESSD, you can set a custom performance level (PL): PL 2 requires more than 460 GiB; PL 3 requires more than 1,260 GiB. Encryption is available for Enterprise SSD (ESSD) using the default service CMK or a BYOK key from KMS. To improve creation success rate, add fallback disk types under More System Disk Types. | |
| Data disk | Supported types: ESSD AutoPL, Enterprise SSD (ESSD), ESSD Entry, SSD, Ultra Disk. Up to 64 data disks per ECS instance (varies by instance type; query via DescribeInstanceTypes → DiskQuantity). Mount the data disk to /var/lib/container; ACK mounts /var/lib/kubelet and /var/lib/containerd under /var/lib/container (see the verification sketch after this table). You can also use snapshots to create data disks for container image acceleration or fast LLM loading. | |
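If you mount a data disk to /var/lib/container, you can verify the resulting layout on a node. A minimal check, assuming you can log on to the node:

```bash
# Confirm that the data disk is mounted at /var/lib/container.
findmnt /var/lib/container

# kubelet and containerd data should live under the same disk.
findmnt /var/lib/kubelet
findmnt /var/lib/containerd

# Check remaining capacity on the container data disk.
df -h /var/lib/container
```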
Instance quantity
| Parameter | Description | Modifiable |
|---|---|---|
| Expected number of nodes | The target node count. Set to at least 2 for cluster components to run as expected. Set to 0 to create an empty node pool and add nodes later. | |
Advanced configurations
Click Advanced Options (Optional) to configure the following parameters.
| Parameter | Description | Modifiable |
|---|---|---|
| Resource group | The resource group to which the node pool belongs. Each resource can belong to only one resource group. | |
| Scaling mode (advanced) | Requires Auto scaling mode. Standard mode: scales by creating and releasing ECS instances. Swift mode: scales by creating, stopping, and starting ECS instances — stopped nodes incur disk fees only (no compute fees). Swift mode does not apply to local disk instance families such as big data and local SSD. | |
| Scaling policy | Priority: scales using vSwitches in the order listed (highest priority first). Cost Optimization: creates instances in ascending vCPU unit price order; preemptible instances are preferred. Distribution Balancing: distributes instances evenly across zones (requires multiple vSwitches). | |
| Use pay-as-you-go instances when preemptible instances are insufficient | Requires Preemptible Instance billing. When preemptible instances are reclaimed, the node pool creates pay-as-you-go instances as replacements. | |
| Enable supplemental preemptible instances | Requires Preemptible Instance billing. When preemptible instances are reclaimed, the node pool attempts to create replacement preemptible instances. | |
| ECS tags | Tags added to ECS instances during auto scaling. ACK and Auto Scaling automatically add 3 tags (ack.aliyun.com:<Cluster ID>, ack.alibabacloud.com/nodepool-id:<Node pool ID>, acs:autoscaling:scalingGroupId:<Scaling group ID>), leaving room for at most 17 custom tags per instance. | |
| Taints | Taints control pod scheduling. Set taints at the node pool level rather than on individual nodes — this way, you manage all nodes by updating the node pool once instead of updating each node individually. A taint has a key, value, and effect. Key: 1–63 characters (letters, digits, -, _, .); must start and end with a letter or digit. Value: up to 63 characters; same character set; can be left blank. Effect: NoSchedule (prevents scheduling), NoExecute (evicts existing pods without toleration), or PreferNoSchedule (prefers to avoid scheduling). | |
| Node labels | Labels are key-value pairs. Set labels at the node pool level rather than on individual nodes — this simplifies management by letting you update all nodes through a single node pool change. Key: 1–63 characters; same rules as taint keys. The following prefixes are reserved and cannot be used: kubernetes.io/, k8s.io/, and any prefix ending in these. Usable exceptions: kubelet.kubernetes.io/ and node.kubernetes.io/. For a verification example, see the sketch after this table. | |
| Container image acceleration | Nodes automatically detect whether images support on-demand loading and accelerate container startup accordingly. Requires containerd version 1.6.34 or later. | |
| (Deprecated) CPU policy | The CPU management policy for kubelet. None (default) or Static (enhanced CPU affinity for specific pods). Instead of using this field, customize kubelet parameters directly — see Customize the kubelet parameters of a node pool. | |
| Custom node name | Changes the node name, ECS instance name, and ECS instance hostname. A custom node name consists of a prefix (required), IP substring, and suffix (optional). Length: 2–64 characters; must start and end with a lowercase letter or digit. | |
| Worker RAM role | Assigns a Resource Access Management (RAM) role to the node pool. Use Custom to assign a dedicated role and reduce the risk of sharing one RAM role across all cluster nodes. Requires ACK managed clusters running Kubernetes 1.22 or later. For details, see Use custom worker RAM roles. | |
| Pre-defined custom data | Scripts that run before nodes join the cluster. Requires whitelist approval in the Quota Center console. For details, see User-data scripts. | |
| User data | Scripts that run after nodes join the cluster. To check execution status, log on to a node and run grep cloud-init /var/log/messages. For details, see User-data scripts. | |
| CloudMonitor agent | Installs the CloudMonitor agent on new nodes for monitoring in the CloudMonitor console. Applies to newly added nodes only. | |
| Public IP | Assigns a public IPv4 address to each new node. If enabled, configure Bandwidth Billing Method and Peak Bandwidth. Applies to newly added nodes only. To enable internet access for existing nodes, associate an EIP — see Associate an EIP with an ECS instance. | |
| Custom security group | Select Basic Security Group or Advanced Security Group. The type cannot be changed after creation. Each ECS instance supports up to 5 security groups. If you select an existing security group, configure security group rules manually — see Configure security group rules to enforce access control on ACK clusters. | |
| RDS whitelist | Adds node IP addresses to the whitelist of an ApsaraDB RDS instance. | |
| Deployment set | Distributes ECS instances across different physical servers for high availability. Create the deployment set in the ECS console first, then select it here. The maximum node count per pool is 20 × number of zones (zones = number of vSwitches). Cannot be changed after creation. For details, see Best practices for associating deployment sets with node pools. | |
| Private pool type | Controls whether to use an ECS capacity reservation. Open: automatically matches an open private pool; falls back to the public pool if no match. Do Not Use: uses only the public pool. Specified: uses the specified private pool; fails if unavailable. For details, see Private pools. | |
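The Taints and Node labels parameters above apply to every node in the pool. The sketch below shows how to confirm them on a node and how a pod can target such a pool; the taint (dedicated=example:NoSchedule) and label (pool=example) are hypothetical values, not defaults, and kubectl is assumed to be configured for the cluster.

```bash
# Show the taints and labels that the node pool applied to a node.
kubectl get node <node-name> -o jsonpath='{.spec.taints}{"\n"}'
kubectl get node <node-name> --show-labels

# Example pod that tolerates the hypothetical taint and selects the hypothetical label.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nodepool-demo
spec:
  nodeSelector:
    pool: example
  tolerations:
  - key: dedicated
    operator: Equal
    value: example
    effect: NoSchedule
  containers:
  - name: app
    image: nginx
EOF
```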
Modify a node pool
After a node pool is created, you can edit it on the Node Pools page by clicking Edit in the Actions column.
Key behaviors to know before making changes:
Most configuration changes apply only to newly added nodes. Exceptions: ECS tags, labels, and taints also propagate to existing nodes.
Modifying a node pool does not affect other node pools or their workloads.
When you modify node pool configurations, any changes that you made directly to individual nodes may be overwritten.
Changing Scaling mode:
Manual → Auto: enables auto scaling; configure the minimum and maximum instance counts.
Auto → Manual: disables auto scaling; minimum is set to 0, maximum to 2,000; Expected Nodes is set to the current node count.
During modification, the Status column shows Updating. After completion, it shows Active.
View a node pool
Click the name of a node pool to open its details page, with the following tabs:
Overview: cluster info, node pool configuration, and node settings. If auto scaling is enabled, auto scaling configurations are also shown.
Monitor: node resource metrics from Managed Service for Prometheus, including CPU usage, memory usage, disk usage, and average per-node utilization. For a command-line alternative, see the sketch after this list.
Nodes: the node list. Drain nodes, configure scheduling, perform O&M, or remove nodes. Click Export to download node details as a CSV file.
Scaling activities: scaling event history, instance counts after each event, and failure reasons. For common error codes, see Manually scale a node pool.
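Besides the Monitor tab, you can spot-check node resource usage from the command line. This assumes metrics are available in the cluster (metrics-server) and, for the filtered form, the node pool label key mentioned earlier.

```bash
# Per-node CPU and memory usage.
kubectl top nodes

# Usage for the nodes of one node pool only (label key assumed).
kubectl top nodes -l alibabacloud.com/nodepool-id=<node-pool-id>
```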
Delete a node pool
Node release behavior depends on the billing method and whether Expected Nodes is configured.
| Expected Nodes setting | Pay-as-you-go nodes | Subscription nodes |
|---|---|---|
| Expected Nodes configured | Released when the pool is deleted; all nodes removed from the API server | Retained after deletion; removed from the API server |
| Expected Nodes not configured | Nodes that were not manually added are released; released nodes are removed from the API server | Not released; not removed from the API server |
To release a subscription node: change its billing method to pay-as-you-go first (see Change the billing method from subscription to pay-as-you-go), then release it from the ECS console.
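Before you delete a node pool, you can optionally confirm from the command line that its nodes no longer run important workloads, and drain them if needed. A minimal sketch, assuming kubectl access and the node pool label key mentioned earlier:

```bash
# List the nodes in the pool, then check which pods still run on one of them.
kubectl get nodes -l alibabacloud.com/nodepool-id=<node-pool-id>
kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name>

# Optionally drain a node so pods are rescheduled gracefully before deletion.
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
```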
(Optional) Click the node pool name. On the Overview tab, check whether Expected Nodes is configured. A hyphen (–) indicates it is not.
Find the node pool and choose Delete in the Actions column. Confirm the information and click OK.
What's next
After creating a node pool, the most common follow-up tasks are:
Scale the node pool: adjust capacity manually or configure auto scaling. See Manually scale a node pool and Node scaling.
Monitor node health: view CPU, memory, and disk metrics under the Monitor tab.
Enable managed O&M: configure auto repair, auto update, and CVE patching. See Enable auto repair for nodes.
Remove a node: when a node is no longer needed, see Remove a node.
For additional operations — including cloning a node pool, customizing kubelet or containerd configurations, and changing the OS — refer to the ACK documentation.
Comparison of managed node pool configurations
| Feature | Disabled | Managed node pool | Auto mode |
|---|---|---|---|
| Instance type | Manual | Manual | Manual; ACK provides intelligent recommendations |
| Billing method | Manual | Manual | Pay-as-you-go only |
| OS | Manual | Manual | ContainerOS only |
| System disk | Manual | Manual | 20 GiB (auto-applied) |
| Data disk | Manual | Manual | Configurable (temporary ContainerOS storage) |
| Auto scaling | Optional | Optional | Node instant scaling enabled by default |
| Automated O&M | Responds to ECS system events | Not supported | Enabled by default |
| Node auto repair | Not supported | Optional | Enabled by default |
| Automatic kubelet and runtime upgrade | Not supported | Optional | Enabled by default |
| OS CVE auto repair | Not supported | Optional | Enabled by default |
Auto mode has specific operational constraints:
Default maximum capacity is 50 nodes. Increase the limit using the node pool scaling feature.
ACK manages OS upgrades, software upgrades, vulnerability patching, restarts, and drain evictions. Avoid manual operations on ECS nodes in the pool, such as restarting them, mounting data disks, or modifying configurations by logging in, to prevent conflicts. Set appropriate workload replica counts, PreStop graceful shutdown strategies, and PodDisruptionBudget policies to protect your workloads during node maintenance (a minimal PodDisruptionBudget sketch follows this list).
ContainerOS uses an immutable root file system. Use PVC for persistent storage instead of HostPath.
ARM, GPU, and local disk instance types are not supported. Configure enough instance types to improve scaling resilience.
For shared responsibilities in auto mode, see Shared responsibility model.
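The workload-protection guidance above (replica counts, PreStop hooks, PodDisruptionBudget policies) can be expressed with a minimal PodDisruptionBudget. The application name and minAvailable value below are illustrative:

```bash
# Keep at least 2 replicas of the example app available while nodes are drained
# during automated maintenance.
kubectl apply -f - <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: example-app
EOF
```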
FAQ
How do I create a custom image and use it for a node pool?
Create a custom image from an ECS instance after installing the software and dependencies you need. Instances created from the image inherit all customizations.
Before creating the image:
Base the image on an ACK-supported OS. See OS images.
Do not use a running ECS instance in an ACK cluster as the source. Remove it from the cluster first — see Remove a node.
Custom image behavior may interfere with node initialization, container startup, node updates, and auto repair. Test the image in a non-production environment before deploying it.
Log on to the ECS instance and run the following commands to clear ACK configuration files. For how to connect, see Use Workbench to connect to a Linux instance over SSH.
```bash
chattr -i /etc/acknode/nodeconfig-*
rm -rf /etc/acknode
systemctl disable ack-reconfig
rm -rf /etc/systemd/system/ack-reconfig.service
rm -rf /usr/local/bin/reconfig.sh
rm -rf /var/lib/cloud
systemctl stop kubelet
systemctl disable kubelet
rm -rf /etc/systemd/system/kubelet.service
rm -rf /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
```
Create a custom image from the instance. For the procedure and usage notes, see Create a custom image from an instance.
Create a node pool. For Operating System, select Custom Image and choose the image you created. Configure all other parameters as described in this topic.
References
Node resources are reserved for Kubernetes components and system processes. See Resource reservation policy.
When cluster capacity cannot meet pod scheduling requirements, enable node scaling. See Node scaling.
To increase the maximum number of pods, scale out node pools, upgrade instance specifications, or reset the pod CIDR block. See Increase the maximum number of pods in a cluster.
If a node is no longer needed, see Remove a node.