ACK regularly releases new operating system image versions that provide new features, optimizations, and bug fixes. You should promptly upgrade the operating system image version of your node pools. You can also switch the operating system type as needed, for example, to replace an operating system that has reached its end of life (EOL) with a supported one.
For more information about the operating system types, latest image versions that ACK supports, and the limitations of some operating systems, see Release notes for OS images.
Precautions
This operation updates the operating system in batches by replacing the system disks of nodes. Do not save important data on system disks, or make sure to back up the data in advance. Data disks are not affected during the upgrade. We recommend that you perform this operation during off-peak hours.
When you update a node by replacing system disks, ACK drains the node and evicts the pods from the node to other available nodes based on PodDisruptionBudget (PDB). To ensure high service availability, we recommend using a multi-replica deployment strategy to distribute workloads across multiple nodes. You can also configure PDB for key services to control the number of pods that are interrupted at the same time.
The default timeout period for node draining is 30 minutes. If the pods cannot be migrated within the timeout period, ACK stops the update to ensure service stability.
When you update a node by replacing the system disk, ACK reinitializes the node according to the current node pool configurations, including node logon methods, labels, taints, operating system images, and runtime versions. Normally, node pool configurations are updated by editing a node pool. If you made changes to the node in other ways, these changes will be overwritten during the update.
If pods on a node use hostPath volumes that point to the system disk, the data in those hostPath volumes is lost when the node is updated by replacing the system disk.
If your cluster uses other custom configurations, such as swap partitions, kubelet configurations modified by using the CLI, or runtime configurations, the cluster may fail to be updated or the custom configurations may be overwritten during the update.
Some ACK operating systems use cgroup v2 by default. For more information about the precautions for cgroup v2, see OS images.
If you have standalone nodes, which are worker nodes not managed by a node pool, you must migrate them to a node pool. For more information, see Migrate standalone nodes to a node pool.
In ContainerOS 3.4.0, the system disk is set to read-only mode. A data disk must be attached to ensure that the system can start. Therefore, when you upgrade to ContainerOS 3.4 or a later version, follow the procedure below. Other versions are not affected.
If you customize the GPU driver version for nodes in a node pool by specifying a version number or using an OSS URL, the operating system and the driver version may be incompatible after you upgrade the OS image. See NVIDIA driver versions supported by ACK and select the latest compatible driver.
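As an illustration of the PDB recommendation above, the following minimal sketch builds a PodDisruptionBudget manifest for a multi-replica workload. The name web-pdb, the namespace default, the label app=web, and the minAvailable value are hypothetical placeholders, not values from ACK. Adjust them to match your own workload.

```python
import json


def build_pdb(name, namespace, match_labels, min_available):
    """Return a policy/v1 PodDisruptionBudget manifest as a plain dict.

    min_available is the number of matching pods that must stay running
    while a node is drained; evictions that would go below it are refused.
    """
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": match_labels},
        },
    }


if __name__ == "__main__":
    # Hypothetical workload: pods labeled app=web, keep at least 2 available.
    pdb = build_pdb("web-pdb", "default", {"app": "web"}, 2)
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(pdb, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl apply -f. Keep in mind that a PDB that is too strict for the number of replicas can prevent pods from being evicted before the node draining timeout is reached.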
Procedure
Follow these steps to update the operating system image to the latest version or replace the operating system type. To avoid compatibility risks, run a precheck scan first.
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, find the cluster to manage and click its name. In the left navigation pane, choose .
In the Node Pools list, find the target node pool, and in the Actions column, select > Change Operating System.
Click Precheck to scan for potential risks of replacing the operating system image and view the check results.
Normal: The precheck is successful. You can proceed to the next step.
Abnormal: The precheck found issues. The current running status of the cluster is not affected. Follow the recommended solutions to fix the issues, and then run the precheck again.
After the precheck is successful, configure the parameters as described in the following table and click Start Replacement.
Configuration Item: Description
Destination Version: Select the target image and version.
Current Version: The current operating system version.
Update Node: Specify the nodes whose operating systems you want to replace. You can select all nodes or some nodes.
Ignore Warnings: Specifies whether to ignore warning-level check items at the node pool level and continue with the upgrade. An example of a warning-level check item is a pod in the node pool that uses a hostPath volume that points to the system disk.
Batch Replace: The following three settings control how nodes are replaced in batches.
Maximum Number of Nodes per Batch: The maximum number of nodes that are updated concurrently in a batch. The system updates the nodes batch by batch based on this value.
Automatic Pause Policy: The policy to pause the replacement of operating systems on nodes.
Interval Between Batches: If you set Automatic Pause Policy to Do Not Pause, you can specify an interval between batches. Valid values: 5 to 120 minutes.
Auto Snapshot: The upgrade is performed by replacing system disks. If the system disks of nodes contain important business data, create snapshots for the nodes before you update the operating system so that you can back up and restore the data. Snapshots incur fees. If the snapshots are no longer needed after the upgrade, delete them promptly.
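To get a feel for how Maximum Number of Nodes per Batch and Interval Between Batches interact, the following sketch splits a list of nodes into batches and estimates the total waiting time between batches. It is only an arithmetic illustration under simple assumptions: the function names and node names are hypothetical, and the order in which ACK actually selects nodes for each batch is not described here.

```python
from typing import List


def plan_batches(nodes: List[str], max_per_batch: int) -> List[List[str]]:
    """Split nodes into consecutive batches of at most max_per_batch nodes."""
    if max_per_batch < 1:
        raise ValueError("max_per_batch must be at least 1")
    return [nodes[i:i + max_per_batch] for i in range(0, len(nodes), max_per_batch)]


def total_interval_minutes(num_batches: int, interval_minutes: int) -> int:
    """Total pause time between batches, excluding the replacement time itself."""
    # The valid range of 5 to 120 minutes comes from the Interval Between Batches setting.
    if not 5 <= interval_minutes <= 120:
        raise ValueError("interval_minutes must be between 5 and 120")
    # There is one pause between each pair of consecutive batches.
    return max(num_batches - 1, 0) * interval_minutes


if __name__ == "__main__":
    # Hypothetical node pool with seven nodes.
    nodes = [f"node-{i}" for i in range(1, 8)]
    batches = plan_batches(nodes, max_per_batch=3)
    for number, batch in enumerate(batches, start=1):
        print(f"Batch {number}: {', '.join(batch)}")
    print("Minutes spent waiting between batches:",
          total_interval_minutes(len(batches), interval_minutes=10))
```

With seven nodes and at most three nodes per batch, this yields three batches. A 10-minute interval therefore adds 20 minutes of waiting on top of the time needed to replace the system disks and drain the nodes.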
Important: To avoid incompatibility risks when you replace an operating system, see Release notes for OS images.
References
For more information about how to upgrade the kubelet and container runtime versions of a node pool, see Update a node pool.
For more information about the procedure for and logic behind upgrading nodes by replacing system disks, see Reference: In-place updates and updates by replacing system disks.