You can add existing ECS instances to a cluster as worker nodes, or re-add worker nodes that were removed from a node pool. This allows you to quickly reuse computing resources without interrupting existing workloads.
ACK provides two methods for adding nodes: auto mode and manual mode. An instance's billing method and instance type remain unchanged after it is added.
| Item | Auto mode | Manual mode |
| --- | --- | --- |
| OS reset | Resets and initializes the instance's operating system based on the current configuration of the node pool. | Preserves the instance's original operating system for greater flexibility. |
| Use cases | You want the instance configuration to be consistent with the node pool for standardized management. | You need to preserve the instance's existing operating system or specific configurations. |
Limitations
Before you begin, make sure that your environment and instances meet the following requirements.
| Category | Item | Description |
| --- | --- | --- |
| Instance and node pool | Cluster node quota | The total number of nodes in the cluster cannot exceed the quota. To request a quota increase, go to Quota Center. The default node quota for an ACK Basic cluster is 10. |
| | Instance ownership | The instance and the cluster must be in the same Alibaba Cloud account, region, and VPC. Otherwise, migrate the instance or create a new instance or cluster that meets the requirements. You cannot add an ECS instance from the other end of a VPC peering connection. |
| | Cluster ownership | You cannot add an instance that already belongs to another ACK cluster. You must first remove the node from the original cluster before adding it to the new one. |
| | Scaling group (ESS) ownership | You cannot add an instance that is already part of another scaling group. You must manually remove it from the scaling group first. |
| | Node pool type | |
| | Operating system | |
| | Instance type | |
| Network | API Server access | The IP address of the instance must be in the API Server access whitelist. Otherwise, the node cannot communicate with the control plane. For more information, see Configure access control for the API Server. |
| | Security group | To change the security group type of an instance or to add an instance to the node pool's security group in advance, see Associate a security group with an instance (primary ENI). To request a security group quota increase, see View or increase ECS quotas. |
| | Terway - Maximum pods | The maximum number of pods that the instance supports must meet the following requirements: The maximum number of pods supported in different elastic network interface (ENI) modes depends on the maximum number of ENIs that the instance type supports. For information about how to calculate this limit, see How to calculate the pod quota for a node. If the requirement is not met, upgrade or downgrade node resources or purchase a new instance. |
| | Terway - vSwitch configuration | If the instance and the node pool are in different availability zones, you must update the Terway vSwitch configuration. Otherwise, Terway allocates pod IP addresses from the vSwitch of the node's primary ENI, which can cause pod IP allocation errors. For more information, see Modify pod vSwitches. |
| | Terway - ENI | When you add the instance, its existing bound ENIs are retained, and pod IP addresses are allocated from the vSwitches associated with these ENIs. Ensure the instance has only one primary ENI. If a pod IP address does not belong to a configured vSwitch, remove the node from the cluster, delete all secondary ENIs, and then add the node back to the cluster. |
| | Terway - Worker RAM role | The instance must be bound to the node pool's Worker RAM role to prevent permission issues that could lead to an incorrect calculation of the maximum available pods (MaxPod). On the Node Pools page, click a node pool name to view its Worker RAM role on the Basic Information tab. To grant the RAM role, see Grant a RAM role to an ECS instance. |
| | Terway - IPv6 dual-stack | If IPv6 dual-stack is enabled for the cluster, you must assign an IPv6 address to the instance's primary ENI. For more information, see IPv6 communication. |
| | Flannel | The number of custom route entries in the system route table of the cluster's VPC cannot exceed the route table quota. To request a quota increase, go to Quota Center. |
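Several of the requirements above are simple comparisons that can be checked before you open the console. The following is a minimal sketch, not an ACK API: the `Instance` and `Cluster` records and their field names are hypothetical, and you would populate them from your own inventory. It treats an empty whitelist as "no restriction" and flags any scaling group membership, both of which are simplifying assumptions.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass
class Instance:
    # Hypothetical record describing the ECS instance to add.
    account_id: str
    region: str
    vpc_id: str
    private_ip: str
    cluster_id: Optional[str] = None        # ACK cluster it already belongs to, if any
    scaling_group_id: Optional[str] = None  # scaling group it already belongs to, if any


@dataclass
class Cluster:
    # Hypothetical record describing the target cluster.
    cluster_id: str
    account_id: str
    region: str
    vpc_id: str
    node_count: int
    node_quota: int
    api_whitelist: List[str]  # CIDR blocks allowed to reach the API Server


def preflight(instance: Instance, cluster: Cluster) -> List[str]:
    """Return the violated requirements; an empty list means no blocker was found."""
    problems = []
    if (instance.account_id, instance.region, instance.vpc_id) != (
        cluster.account_id, cluster.region, cluster.vpc_id
    ):
        problems.append("instance and cluster must share account, region, and VPC")
    if instance.cluster_id is not None and instance.cluster_id != cluster.cluster_id:
        problems.append("instance belongs to another ACK cluster; remove it first")
    if instance.scaling_group_id is not None:
        # Simplification: any scaling group membership is flagged.
        problems.append("instance is part of a scaling group; remove it first")
    if cluster.node_count + 1 > cluster.node_quota:
        problems.append("adding the node would exceed the cluster node quota")
    # Assumption: an empty whitelist means access is not restricted.
    if cluster.api_whitelist and not any(
        ip_address(instance.private_ip) in ip_network(cidr, strict=False)
        for cidr in cluster.api_whitelist
    ):
        problems.append("instance IP is not in the API Server access whitelist")
    return problems
```

The sketch covers only the rows whose conditions are stated explicitly in the table. The Terway and Flannel rows depend on values that are described in the linked topics, so they are left out.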
Usage notes
- Data backup: Before you begin, create a manual snapshot of the instance's system disk and data disks to prevent data loss.
  To ensure that you have sufficient snapshot quota and to avoid creation failures, we recommend deleting unnecessary manual and automatic snapshots.
- Instance release and billing: For node pools for which Expected Nodes is not enabled, instances added to the node pool are not released when you delete the cluster or node pool. You must manually remove the nodes. Monitor the ECS billing status to avoid unexpected charges.
Procedure
Time required: The node addition process, which includes system disk replacement (auto mode only) and node initialization, takes about 5 minutes. The actual time may vary depending on network conditions, OS image size, and other factors.
Adding an existing node does not affect the existing nodes and applications in the cluster. To avoid compatibility issues, we recommend that you do not initialize an ECS instance that already has services created on it as a worker node.
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of your cluster. In the left navigation pane, click .
- On the Node Pools page, find the target node pool, click in the Actions column, and then click Add Existing Node.
  If the target ECS instance is not in the server list, it does not meet the conditions for being added to the cluster. You can select Show Unavailable Instances to view the unavailable ECS instances and the reasons. For more information about the reasons, see Limitations and Usage notes.
- Read the on-screen notes carefully and select a method for adding the node.
Manual add
With this method, you obtain an installation command and run it on the target instance. You can add only one ECS instance at a time.
- Set Method to Manual. In the list of existing cloud servers, select the ECS instance to add, and then click Next.
- On the Specify Instance Information page, confirm the cluster and instance details. Configure the data disk and instance name, and then click Next.
  - Data Disk: Specifies whether to store container and image data on a data disk. This separates the system disk from data disks to improve stability.
    - If the ECS instance has a data disk attached and the file system of the last data disk is not initialized, ACK automatically formats the last data disk as ext4. This disk is then used exclusively to store data in /var/lib/containerd or /var/lib/docker (the default data directories for container runtimes) and /var/lib/kubelet (the default data directory for the kubelet component).
      Important:
      - The existing data on the formatted data disk will be lost. We recommend that you create a snapshot to back up your data in advance.
      - If you store containers and images on a data disk, only the ext4 and xfs file systems are supported.
    - If the ECS instance does not have a data disk attached, ACK does not automatically attach a new data disk, regardless of whether you select this option.
  - Retain Instance Name:
    - Enabled: Uses the instance name as the node name.
    - Disabled: ACK renames the node based on the custom node naming rules.
- On the Complete page, copy the node join command automatically generated by ACK for use in a later step, and then click Finish.
- Log on to the ECS console. In the left navigation pane, choose Instances & Images > Instance. Select the region where the cluster is located, and then select the target instance.
- Click Connect for the target instance and select a remote connection method.
- Follow the on-screen instructions to enter and run the command that you copied from the Complete page. The command automatically configures the instance and adds it to the cluster.
  After the script runs successfully, a success message similar to the following appears:
      Worker node joined successfully
      + exit_code=0
      + set +x
  In the node list, wait for the new node's status to change to Ready.
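If you prefer to wait for the Ready status from a terminal instead of the console node list, the check can be scripted. The following is a minimal sketch that assumes `kubectl` is installed and configured for the cluster; the 10-minute timeout and 10-second polling interval are illustrative choices, not values from ACK. It relies on the standard Kubernetes Node object, where `status.conditions` contains an entry of type `Ready`.

```python
import json
import subprocess
import time


def is_ready(node: dict) -> bool:
    """Return True if the Node object reports the condition Ready=True."""
    for cond in node.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            return cond.get("status") == "True"
    return False


def wait_until_ready(node_name: str, timeout_s: int = 600, interval_s: int = 10) -> bool:
    """Poll `kubectl get node` until the node is Ready or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = subprocess.run(
            ["kubectl", "get", "node", node_name, "-o", "json"],
            capture_output=True,
            text=True,
        )
        # A non-zero exit code usually means the node is not registered yet.
        if result.returncode == 0 and is_ready(json.loads(result.stdout)):
            return True
        time.sleep(interval_s)
    return False
```

Call `wait_until_ready` with the node name shown in the node list. The function returns `False` if the node does not become Ready within the timeout.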
Auto add
You can automatically add instances from the console.
- Set Method to Auto. From the list of existing cloud servers, select the desired ECS instances and click Next.
- On the Specify Instance Information page, confirm the cluster and instance information as prompted. Configure the data disk and instance name, and then click Next.
  - Data Disk: Specifies whether to store container and image data on a data disk. This separates the system disk from data disks to improve stability.
    - If the ECS instance has a data disk attached and the file system of the last data disk is not initialized, ACK automatically formats the last data disk as ext4. This disk is then used exclusively to store data in /var/lib/containerd or /var/lib/docker (the default data directories for container runtimes) and /var/lib/kubelet (the default data directory for the kubelet component).
      Important:
      - The existing data on the formatted data disk will be lost. We recommend that you create a snapshot to back up your data in advance.
      - If you store containers and images on a data disk, only the ext4 and xfs file systems are supported.
    - If the ECS instance does not have a data disk attached, ACK does not automatically attach a new data disk, regardless of whether you select this option.
  - Logon method and password: If the node pool's Logon Type is configured as Password, you must reset the instance password.
  - Retain Instance Name:
    - Enabled: Uses the instance name as the node name.
    - Disabled: ACK renames the node based on the custom node naming rules.
- In the dialog box that appears, read the notes carefully and then click OK.
  After the node is added, wait in the node list until its status changes to Ready.
FAQ
Does adding nodes affect workloads?
Adding an existing node, in either manual or auto mode, does not affect existing cluster workloads.
How does instance scaling affect workloads?
Upgrading or downgrading an ECS instance can include changing the instance type, public bandwidth billing method, public bandwidth, or data disk billing method. For more information, see Overview of instance configuration changes. The impact on the ECS instance varies based on the upgrade or downgrade method.
- Operations that do not require a restart: The impact on your business depends on your specific scenario.
- Operations that require an ECS instance restart: Operations such as upgrading or downgrading the instance type cause service disruptions. Before you perform such an operation, we recommend that you check the current workload to determine whether you need to add redundant nodes to take over the pods. Then, drain the node to be upgraded or downgraded and remove it from the scaling group and the ACK cluster. For more information, see Remove nodes.
  After the upgrade or downgrade is complete, add the node back to the cluster by following the instructions in this topic.
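The drain step mentioned above is typically performed with `kubectl drain`. The following is a minimal sketch that only builds the command line, so that you can review it before running it. The node name and the 300-second timeout are hypothetical values. Removing the node from the scaling group and the ACK cluster is a separate step that is described in Remove nodes.

```python
from typing import List


def drain_command(node_name: str, timeout_s: int = 300) -> List[str]:
    """Build a `kubectl drain` command that evicts pods from one node."""
    return [
        "kubectl", "drain", node_name,
        "--ignore-daemonsets",       # DaemonSet-managed pods cannot be evicted; skip them
        "--delete-emptydir-data",    # allow eviction of pods that use emptyDir volumes
        f"--timeout={timeout_s}s",   # stop waiting after this long (illustrative value)
    ]
```

For example, `subprocess.run(drain_command("example-node"), check=True)` would run the command and raise an error if the drain fails. Note that `--delete-emptydir-data` discards emptyDir contents, so confirm that no pod on the node keeps data there that you still need.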
Can I use different instance types?
Yes. ACK allows you to manage nodes of multiple instance types in the same node pool. This helps prevent scale-out failures caused by instance type unavailability or insufficient inventory. Before you add an ECS instance, make sure that its instance type is included in the node pool's list of instance types. Follow these steps:
- Edit or create a node pool and configure the required node instance types. For more information, see Create and manage a node pool.
- Drain and remove the target node. Do not release the ECS instance. For more information, see Remove nodes.
- Add the ECS instances of different instance types to the node pool by following the instructions in the Limitations and Procedure sections of this topic.
How do I move nodes between clusters?
ACK does not support moving nodes directly between clusters. However, you can achieve this by adding an existing node. Follow these steps:
- Drain and remove the target node from the source cluster. Do not release the ECS instance. For more information, see Remove nodes.
- Add the target ECS instance to a node pool in the destination cluster by following the instructions in the Limitations and Procedure sections of this topic.
Can I add a node with an EOL OS?
- Manual mode: You can manually add an existing instance that runs an unsupported operating system to a node pool. However, you must ensure that the OS version of the instance is compatible with the current cluster version. For more information, see Operating systems.
  For example, CentOS 7 and Alibaba Cloud Linux 2 are supported only in clusters of version 1.30 and earlier.
- Auto mode: Yes. ACK initializes the instance using the OS image specified in the node pool configuration.
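The manual mode compatibility rule can be expressed as a version comparison. The following is a minimal sketch whose lookup table contains only the two examples given above; it is not a complete support matrix, and the function names are hypothetical. For any other operating system it returns `None`, which means that you need to consult the Operating systems topic.

```python
from typing import Optional, Tuple

# Highest cluster version (major, minor) that still supports each OS.
# Only the two examples from the text are listed here.
MAX_CLUSTER_VERSION = {
    "CentOS 7": (1, 30),
    "Alibaba Cloud Linux 2": (1, 30),
}


def parse_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '1.30' or 'v1.30.1-aliyun.1'."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor.split("-")[0])


def manual_add_allowed(os_name: str, cluster_version: str) -> Optional[bool]:
    """Return True or False for a listed OS, or None if the OS is not in the table."""
    limit = MAX_CLUSTER_VERSION.get(os_name)
    if limit is None:
        return None
    return parse_version(cluster_version) <= limit
```

With this table, `manual_add_allowed("CentOS 7", "1.31")` is `False`, which matches the statement that CentOS 7 is supported only in clusters of version 1.30 and earlier.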
Is user data overwritten when adding nodes?
Whether the original instance's user data is overwritten depends on the addition method.
- Auto mode: ACK initializes the system disk, overwriting the instance's original user data with the user data configured for the node pool.
- Manual mode: The user data of the original instance is not overwritten. After the instance is added to the node pool, it continues to use its original user data.
How do I fix node addition timeouts?
Check the network connectivity between the node and the API Server. First, verify that the security group meets the requirements. For information about security group limitations when adding an existing node, see Limitations. For information about other network connectivity issues, see Network management FAQ.
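A quick way to test connectivity is to attempt a TCP connection from the instance to the API Server endpoint. The following is a minimal sketch that uses only the Python standard library. The default port 6443 is an assumption based on the common Kubernetes API Server port; use the endpoint address and port shown for your cluster.

```python
import socket


def can_reach(host: str, port: int = 6443, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A result of `False` only shows that the TCP connection failed. It does not identify the cause, so check the security group rules and the API Server access whitelist next.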
Does adding a node change the expected count?
Yes. After you add an existing node, the Expected Nodes count increases by the number of nodes added. For example, if the Expected Nodes for a node pool is set to 5 and you add one ECS instance to the node pool, the count automatically becomes 6.
References
- In addition to the console, you can add ECS instances to an ACK cluster by calling an API operation (Obtain the script for adding existing nodes to a node pool) or running a CLI command (Add existing ECS instances).
- Older clusters created before the node pool feature was released may contain free nodes (nodes that do not belong to any node pool). For centralized management, you can migrate them to a node pool. For more information, see Migrate free nodes to a node pool.
- If a node, pod, or another component is not working as expected, you can troubleshoot the issue. For more information, see Troubleshoot node exceptions, Troubleshoot pod exceptions, and FAQ about nodes and node pools.