
Container Service for Kubernetes:Create and manage node pools

Last Updated: Mar 26, 2026

A node pool is a group of nodes that share the same configurations: instance type, operating system, labels, and taints. A cluster can have multiple node pools with different configurations. Creating or modifying one node pool does not affect nodes or workloads in other node pools.

Before creating a node pool, read Node pools to understand the available types, features, and billing rules.

Prerequisites

Before you begin, make sure you have an ACK cluster and the permissions required to manage its node pools.

Create a node pool

You can create a node pool from the ACK console, via the API, or with Terraform. The console is the most common starting point; for API and Terraform, see CreateClusterNodePool and Use Terraform to create a node pool that has auto scaling enabled.

Some parameters — particularly those related to network and security — cannot be changed after creation. Review the Modifiable column in each parameter table before proceeding.

In the parameter tables below:

  • Not modifiable — cannot be modified after creation

  • Modifiable — can be modified after creation

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the target cluster. In the left-side navigation pane, choose Nodes > Node Pools.

  3. On the Node Pools page, click Create Node Pool. In the dialog box, configure the parameters described in the sections below.

  4. (Optional) To generate Terraform or SDK sample code that matches your configuration, click Generate API Request Parameters in the top-left corner. Then click Confirm. After confirmation, the node pool list shows:

    • Initializing — creation in progress

    • Active — creation successful
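
If you use the API route instead of the console, you can draft and validate the CreateClusterNodePool request body locally before sending it with your API client. The field names below follow common ACK OpenAPI naming but are assumptions in this sketch; verify them against the current CreateClusterNodePool reference.

```shell
# Hypothetical sketch: a minimal CreateClusterNodePool request body.
# Field names and the placeholder vSwitch ID are assumptions; check them
# against the CreateClusterNodePool API reference before use.
cat > /tmp/nodepool.json <<'EOF'
{
  "nodepool_info": { "name": "demo-pool" },
  "scaling_group": {
    "instance_types": ["ecs.g6.large"],
    "vswitch_ids": ["vsw-xxxxxxxx"],
    "system_disk_category": "cloud_essd",
    "system_disk_size": 120,
    "desired_size": 2
  },
  "kubernetes_config": { "runtime": "containerd" }
}
EOF

# Validate the JSON locally before submitting it.
python3 -m json.tool /tmp/nodepool.json > /dev/null && echo "body OK"
```

Validating the body locally catches syntax mistakes before they surface as API errors.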

Basic configurations

| Parameter | Description | Modifiable |
| --- | --- | --- |
| Node Pool Name | A name for the node pool. | Modifiable |
| Confidential computing | Encrypts data in use to protect confidentiality and integrity. Requires whitelist approval — submit a ticket to apply. Available only with the containerd runtime. For details, see TEE-based confidential computing. | Not modifiable |
| Container runtime | The runtime for containers in the node pool. containerd is recommended for all Kubernetes versions. Sandboxed-Container supports Kubernetes 1.31 and earlier. Docker is deprecated and supports Kubernetes 1.22 and earlier. For a comparison, see Comparison among Docker, containerd, and Sandboxed-Container. | Not modifiable |
| Scaling mode | Manual: ACK maintains the node count at the Expected Nodes value. Auto: ACK scales nodes automatically when pod scheduling capacity is insufficient, based on configured minimum and maximum instance counts. Clusters running Kubernetes 1.24 and later use node instant scaling by default; earlier versions use node auto scaling. | Modifiable |

Automated O&M configurations

Select one of the following options to set the level of automated O&M for the node pool. For a full comparison, see Comparison of managed node pool configurations.

  • Auto mode: ACK takes full O&M responsibility — OS upgrades, software upgrades, vulnerability patching, and dynamic scaling based on workload demand. Available only for clusters with auto mode enabled.

  • Managed node pool: Configure the following automated O&M parameters. You can also set a maintenance window for scheduled tasks.

    Click to view parameters

    | Parameter | Description | Modifiable |
    | --- | --- | --- |
    | Auto recovery rule | When enabled, ACK monitors node status and automatically repairs faulty nodes. If Restart Faulty Node is selected, ACK may drain the node and replace the system disk. For trigger conditions and repair events, see Enable auto repair for nodes. | Modifiable |
    | Auto update rule | When Automatically Update Kubelet and Containerd is selected, ACK updates kubelet and containerd whenever a new version is available. For details, see Update a node pool. | Modifiable |
    | Auto CVE patching (OS) | Configures automatic patching for high-, medium-, and low-risk CVE vulnerabilities. If Restart Nodes if Necessary to Patch CVE Vulnerabilities is enabled, ACK restarts nodes as needed; otherwise, restart manually. For details, see Patch OS CVE vulnerabilities for node pools. | Modifiable |
    | Maintenance window | Image, runtime, and Kubernetes version updates run during this window. Click Set, then configure Cycle, Started At, and Duration. | Modifiable |
  • Disable: No automated O&M. All node maintenance must be performed manually.

Network configurations

| Parameter | Description | Modifiable |
| --- | --- | --- |
| VPC | The virtual private cloud (VPC) of the cluster. Cannot be changed. | Not modifiable |
| vSwitch | New nodes are created in the zones of the selected vSwitches during scale-out. Select vSwitches in your target zones. If none are available, click Create vSwitch. For details, see Create and manage vSwitches. | Modifiable |

Instance and image

| Parameter | Description | Modifiable |
| --- | --- | --- |
| Billing method | Pay-As-You-Go, Subscription, or Preemptible Instance. Subscription requires a Duration and optionally Auto Renewal. Preemptible instances have a 1-hour protection period; afterward, the system checks spot price and availability every 5 minutes and releases the instance if the market price exceeds your bid or inventory is insufficient. ACK does not allow switching between pay-as-you-go/subscription and preemptible instances. Billing method changes apply only to newly added nodes. For preemptible instance best practices, see Best practices for preemptible instance-based node pools. | Modifiable |
| Instance type | Select ECS instances for the node pool, filtering by vCPU, memory, instance family, and architecture. Select multiple instance types to improve scale-out reliability. For GPU-accelerated instances, you can enable GPU sharing. For unsupported specifications, see ECS specification recommendations for ACK clusters. | Modifiable |
| Operating system | Public image: ACK-provided images including Alibaba Cloud Linux 3 ACK-optimized, ContainerOS, Alibaba Cloud Linux 3, Ubuntu, and Windows. Custom image: An image you create. For details, see OS images. OS changes apply only to newly added nodes. | Modifiable |
| Security hardening | Disable: no hardening. MLPS Security Hardening: aligns with Multi-Level Protection Scheme (MLPS) 2.0 level-3 standards for Alibaba Cloud Linux 2/3 images; SSH root login is blocked — use Virtual Network Computing (VNC) to log in. OS Security Hardening: available for Alibaba Cloud Linux 2/3 images. Cannot be changed after creation. | Not modifiable |
| Logon type | Key Pair, Password, or Later. MLPS Security Hardening supports Password only. ContainerOS supports Key Pair and Later only. Password must be 8–30 characters, containing letters, digits, and special characters. | Modifiable |
| Username | Select root or ecs-user when using Key Pair or Password logon. | Modifiable |
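
The password rule above can be pre-checked locally before you submit the form. This is a rough sketch only; the exact special-character set that ECS accepts is not stated here and may be narrower.

```shell
# Sketch: pre-check a node logon password against the rule stated above
# (8-30 characters, containing letters, digits, and special characters).
# The precise set of special characters ECS accepts is an assumption.
check_password() {
  local pw="$1"
  local n=${#pw}
  [ "$n" -ge 8 ] && [ "$n" -le 30 ] || return 1
  case "$pw" in *[A-Za-z]*) ;; *) return 1 ;; esac   # at least one letter
  case "$pw" in *[0-9]*)    ;; *) return 1 ;; esac   # at least one digit
  case "$pw" in *[!A-Za-z0-9]*) ;; *) return 1 ;; esac  # at least one special char
  return 0
}

check_password 'Example123!' && echo valid    # valid
check_password 'short'       || echo invalid  # invalid (too short)
```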

Storage configurations

| Parameter | Description | Modifiable |
| --- | --- | --- |
| System disk | Supported types: ESSD AutoPL, Enterprise SSD (ESSD), ESSD Entry, Standard SSD, Ultra Disk. Available types depend on the instance family. For ESSD, you can set a custom performance level (PL): PL 2 requires more than 460 GiB; PL 3 requires more than 1,260 GiB. Encryption is available for Enterprise SSD (ESSD) using the default service CMK or a BYOK key from KMS. To improve creation success rate, add fallback disk types under More System Disk Types. | Modifiable |
| Data disk | Supported types: ESSD AutoPL, Enterprise SSD (ESSD), ESSD Entry, SSD, Ultra Disk. Up to 64 data disks per ECS instance (varies by instance type; query via DescribeInstanceTypes → DiskQuantity). Mount the data disk to /var/lib/container; ACK mounts /var/lib/kubelet and /var/lib/containerd under /var/lib/container. You can also use snapshots to create data disks for container image acceleration or fast LLM loading. | Modifiable |
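
The ESSD performance-level thresholds above can be captured in a small helper. This sketch assumes sizes at or below 460 GiB map to PL1, which the table does not state explicitly.

```shell
# Sketch: map a system disk size (GiB) to the highest ESSD performance level
# it qualifies for, per the thresholds above (PL2 > 460 GiB, PL3 > 1,260 GiB).
# Sizes of 460 GiB or less are assumed to map to PL1 here.
max_essd_pl() {
  local size_gib="$1"
  if   [ "$size_gib" -gt 1260 ]; then echo "PL3"
  elif [ "$size_gib" -gt 460  ]; then echo "PL2"
  else echo "PL1"
  fi
}

max_essd_pl 120    # PL1
max_essd_pl 500    # PL2
max_essd_pl 2000   # PL3
```

Note that the thresholds are strict: a 460 GiB disk still maps to PL1, and a 1,260 GiB disk to PL2.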

Instance quantity

| Parameter | Description | Modifiable |
| --- | --- | --- |
| Expected number of nodes | The target node count. Set to at least 2 for cluster components to run as expected. Set to 0 to create an empty node pool and add nodes later. | Modifiable |

Advanced configurations

Click Advanced Options (Optional) to configure the following parameters.

| Parameter | Description | Modifiable |
| --- | --- | --- |
| Resource group | The resource group to which the node pool belongs. Each resource can belong to only one resource group. | Modifiable |
| Scaling mode (advanced) | Requires Auto scaling mode. Standard mode: scales by creating and releasing ECS instances. Swift mode: scales by creating, stopping, and starting ECS instances — stopped nodes incur disk fees only (no compute fees). Swift mode does not apply to local disk instance families such as big data and local SSD. | Modifiable |
| Scaling policy | Priority: scales using vSwitches in the order listed (highest priority first). Cost Optimization: creates instances in ascending vCPU unit price order; preemptible instances are preferred. Distribution Balancing: distributes instances evenly across zones (requires multiple vSwitches). | Modifiable |
| Use pay-as-you-go instances when preemptible instances are insufficient | Requires Preemptible Instance billing. When preemptible instances are reclaimed, the node pool creates pay-as-you-go instances as replacements. | Modifiable |
| Enable supplemental preemptible instances | Requires Preemptible Instance billing. When preemptible instances are reclaimed, the node pool attempts to create replacement preemptible instances. | Modifiable |
| ECS tags | Tags added to ECS instances during auto scaling. ACK and Auto Scaling automatically add 3 tags (ack.aliyun.com:<Cluster ID>, ack.alibabacloud.com/nodepool-id:<Node pool ID>, acs:autoscaling:scalingGroupId:<Scaling group ID>), leaving room for at most 17 custom tags per instance. | Modifiable |
| Taints | Taints control pod scheduling. Set taints at the node pool level rather than on individual nodes — this way, you manage all nodes by updating the node pool once instead of updating each node individually. A taint has a key, value, and effect. Key: 1–63 characters (letters, digits, -, _, .); must start and end with a letter or digit. Value: up to 63 characters; same character set; can be left blank. Effect: NoSchedule (prevents scheduling), NoExecute (evicts existing pods without toleration), or PreferNoSchedule (prefers to avoid scheduling). | Modifiable |
| Node labels | Labels are key-value pairs. Set labels at the node pool level rather than on individual nodes — this simplifies management by letting you update all nodes through a single node pool change. Key: 1–63 characters; same rules as taint keys. The following prefixes are reserved and cannot be used: kubernetes.io/, k8s.io/, and any prefix ending in these. Usable exceptions: kubelet.kubernetes.io/ and node.kubernetes.io. | Modifiable |
| Container image acceleration | Nodes automatically detect whether images support on-demand loading and accelerate container startup accordingly. Requires containerd version 1.6.34 or later. | Modifiable |
| (Deprecated) CPU policy | The CPU management policy for kubelet. None (default) or Static (enhanced CPU affinity for specific pods). Instead of using this field, customize kubelet parameters directly — see Customize the kubelet parameters of a node pool. | Modifiable |
| Custom node name | Changes the node name, ECS instance name, and ECS instance hostname. A custom node name consists of a prefix (required), IP substring, and suffix (optional). Length: 2–64 characters; must start and end with a lowercase letter or digit. | Modifiable |
| Worker RAM role | Assigns a Resource Access Management (RAM) role to the node pool. Use Custom to assign a dedicated role and reduce the risk of sharing one RAM role across all cluster nodes. Requires ACK managed clusters running Kubernetes 1.22 or later. For details, see Use custom worker RAM roles. | Modifiable |
| Pre-defined custom data | Scripts that run before nodes join the cluster. Requires whitelist approval in the Quota Center console. For details, see User-data scripts. | Modifiable |
| User data | Scripts that run after nodes join the cluster. To check execution status, log on to a node and run grep cloud-init /var/log/messages. For details, see User-data scripts. | Modifiable |
| CloudMonitor agent | Installs the CloudMonitor agent on new nodes for monitoring in the CloudMonitor console. Applies to newly added nodes only. | Modifiable |
| Public IP | Assigns a public IPv4 address to each new node. If enabled, configure Bandwidth Billing Method and Peak Bandwidth. Applies to newly added nodes only. To enable internet access for existing nodes, associate an EIP — see Associate an EIP with an ECS instance. | Modifiable |
| Custom security group | Select Basic Security Group or Advanced Security Group. The type cannot be changed after creation. Each ECS instance supports up to 5 security groups. If you select an existing security group, configure security group rules manually — see Configure security group rules to enforce access control on ACK clusters. | Not modifiable |
| RDS whitelist | Adds node IP addresses to the whitelist of an ApsaraDB RDS instance. | Modifiable |
| Deployment set | Distributes ECS instances across different physical servers for high availability. Create the deployment set in the ECS console first, then select it here. The maximum node count per pool is 20 × number of zones (zones = number of vSwitches). Cannot be changed after creation. For details, see Best practices for associating deployment sets with node pools. | Not modifiable |
| Private pool type | Controls whether to use an ECS capacity reservation. Open: automatically matches an open private pool; falls back to the public pool if no match. Do Not Use: uses only the public pool. Specified: uses the specified private pool; fails if unavailable. For details, see Private pools. | Modifiable |
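
The taint and label key rules above translate directly into a regular expression. The sketch below pre-validates a key name before you enter it in the console (reserved label prefixes are not checked here).

```shell
# Sketch: validate a taint/label key against the rules stated above:
# 1-63 characters of letters, digits, '-', '_', '.', starting and ending
# with a letter or digit. Label prefix restrictions are not covered.
valid_key() {
  printf '%s' "$1" | grep -Eq '^[A-Za-z0-9]([A-Za-z0-9._-]{0,61}[A-Za-z0-9])?$'
}

valid_key 'workload-type' && echo ok        # ok
valid_key '-bad-start'    || echo rejected  # rejected (starts with '-')
```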

Modify a node pool

After a node pool is created, you can edit it on the Node Pools page by clicking Edit in the Actions column.

Key behaviors to know before making changes:

  • Most configuration changes apply only to newly added nodes. Exceptions: ECS tags, labels, and taints also propagate to existing nodes.

  • Modifying a node pool does not affect other node pools or their workloads.

  • When modifying node pool configurations, any changes you have made directly to individual nodes may be overwritten.

  • Changing Scaling mode:

    • Manual → Auto: enables auto scaling; configure the minimum and maximum instance counts.

    • Auto → Manual: disables auto scaling; minimum is set to 0, maximum to 2,000; Expected Nodes is set to the current node count.

During modification, the Status column shows Updating. After completion, it shows Active.
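
For the API route, switching from Manual to Auto corresponds to enabling auto scaling in the ModifyClusterNodePool request body. The field names and values below are assumptions in this sketch; verify them against the current API reference before use.

```shell
# Hypothetical sketch: ModifyClusterNodePool request body enabling auto
# scaling with placeholder min/max instance counts. Field names are
# assumptions; confirm them against the API reference.
cat > /tmp/modify-nodepool.json <<'EOF'
{
  "auto_scaling": {
    "enable": true,
    "min_instances": 1,
    "max_instances": 10
  }
}
EOF
python3 -m json.tool /tmp/modify-nodepool.json > /dev/null && echo "body OK"
```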

View a node pool

Click the name of a node pool to open its details page, with the following tabs:

  • Overview: cluster info, node pool configuration, and node settings. If auto scaling is enabled, auto scaling configurations are also shown.

  • Monitor: node resource metrics from Managed Service for Prometheus — CPU usage, memory usage, disk usage, and average per-node utilization.

  • Nodes: the node list. Drain nodes, configure scheduling, perform O&M, or remove nodes. Click Export to download node details as a CSV file.

  • Scaling activities: scaling event history, instance counts after each event, and failure reasons. For common error codes, see Manually scale a node pool.

Delete a node pool

Node release behavior depends on the billing method and whether Expected Nodes is configured.

| Node pool type | Pay-as-you-go nodes | Subscription nodes |
| --- | --- | --- |
| Expected Nodes configured | Released when the pool is deleted; all nodes removed from the API server | Retained after deletion; removed from the API server |
| Expected Nodes not configured | Non-manually-added, non-subscription nodes are released; released nodes are removed from the API server | Not released; not removed from the API server |

To release a subscription node: change its billing method to pay-as-you-go first (see Change the billing method from subscription to pay-as-you-go), then release it from the ECS console.

  1. (Optional) Click the node pool name. On the Overview tab, check whether Expected Nodes is configured. A hyphen (–) indicates it is not.

  2. Find the node pool, click the ⋮ icon in the Actions column, and choose Delete. Confirm the information and click OK.

What's next

After creating a node pool, refer to the ACK documentation for follow-up operations such as cloning a node pool, customizing kubelet or containerd configurations, and changing the OS.

Comparison of managed node pool configurations

| Feature | Disabled | Managed node pool | Auto mode |
| --- | --- | --- | --- |
| Instance type | Manual | Manual | Manual; ACK provides intelligent recommendations |
| Billing method | Manual | Manual | Pay-as-you-go only |
| OS | Manual | Manual | ContainerOS only |
| System disk | Manual | Manual | 20 GiB (auto-applied) |
| Data disk | Manual | Manual | Configurable (temporary ContainerOS storage) |
| Auto scaling | Optional | Optional | Node instant scaling enabled by default |
| Automated O&M | Responds to ECS system events | Not supported | Enabled by default |
| Node auto repair | Not supported | Optional | Enabled by default |
| Automatic kubelet and runtime upgrade | Not supported | Optional | Enabled by default |
| OS CVE auto repair | Not supported | Optional | Enabled by default |
Important

Auto mode has specific operational constraints:

  • Default maximum capacity is 50 nodes. Increase the limit using the node pool scaling feature.

  • ACK manages OS upgrades, software upgrades, vulnerability patching, restarts, and drain evictions. Avoid manual operations on ECS nodes in the pool — such as restarting, mounting data disks, or modifying configurations by logging in — to prevent conflicts. Set appropriate workload replica counts, PreStop graceful shutdown strategies, and PodDisruptionBudget policies to protect your workloads during node maintenance.

  • ContainerOS uses an immutable root file system. Use PVC for persistent storage instead of HostPath.

  • ARM, GPU, and on-premises disk instance types are not supported. Configure enough instance types to improve scaling resilience.

  • For shared responsibilities in auto mode, see Shared responsibility model.
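
To protect workloads during auto mode node maintenance, as advised above, you can define a PodDisruptionBudget. The sketch below writes a minimal manifest for a hypothetical workload labeled app: demo-app; adjust the selector and minAvailable to match your deployment.

```shell
# Sketch: a minimal PodDisruptionBudget keeping at least 2 replicas of a
# hypothetical workload available while ACK drains nodes during maintenance.
cat > /tmp/pdb.yaml <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: demo-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: demo-app   # hypothetical workload label
EOF
grep -q 'minAvailable: 2' /tmp/pdb.yaml && echo "manifest written"
# Apply with: kubectl apply -f /tmp/pdb.yaml
```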

FAQ

How do I create a custom image and use it for a node pool?

Create a custom image from an ECS instance after installing the software and dependencies you need. Instances created from the image inherit all customizations.

Important

Before creating the image:

  • Base the image on an ACK-supported OS. See OS images.

  • Do not use a running ECS instance in an ACK cluster as the source. Remove it from the cluster first — see Remove a node.

  • Custom image behavior may interfere with node initialization, container startup, node updates, and auto repair. Test the image in a non-production environment before deploying it.

  1. Log on to the ECS instance and run the following commands to clear ACK configuration files. For how to connect, see Use Workbench to connect to a Linux instance over SSH.

    # Clear the immutable attribute so the ACK node config files can be removed.
    chattr -i /etc/acknode/nodeconfig-*
    rm -rf /etc/acknode
    # Disable and remove the ACK reconfiguration service.
    systemctl disable ack-reconfig
    rm -rf /etc/systemd/system/ack-reconfig.service
    rm -rf /usr/local/bin/reconfig.sh
    # Clear cloud-init state so instances created from the image run it afresh.
    rm -rf /var/lib/cloud
    # Stop and remove kubelet so new nodes are initialized cleanly by ACK.
    systemctl stop kubelet
    systemctl disable kubelet
    rm -rf /etc/systemd/system/kubelet.service
    rm -rf /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
  2. Create a custom image from the instance. For the procedure and usage notes, see Create a custom image from an instance.

  3. Create a node pool. For Operating System, select Custom Image and choose the image you created. Configure all other parameters as described in this topic.
