Container Service for Kubernetes: Update an ACK cluster

Last Updated: Jan 22, 2024

Outdated cluster versions may have security and stability issues. To ensure business continuity, Container Service for Kubernetes (ACK) uses in-place updates to update ACK clusters. You can update the Kubernetes version of an ACK cluster in the ACK console, or update the control planes and node pools of the cluster separately. This topic describes the usage notes before and after an update and the procedure for updating an ACK cluster.

Why ACK clusters need updates

ACK guarantees the stability of the latest three Kubernetes major versions. For example, ACK discontinues the support for Kubernetes 1.22 when ACK is updated to support the following even-numbered major versions: Kubernetes 1.24, 1.26, and 1.28. In this case, you can no longer create ACK clusters that run Kubernetes 1.22.

Proactive updates provide the following benefits:

  • Reduced security and stability risks: New Kubernetes versions are usually released to add optimizations and patch security and stability vulnerabilities. Using outdated Kubernetes clusters may pose security and stability risks to your businesses.

  • Improved technical support and customer service: ACK no longer releases security patches or repairs for outdated Kubernetes versions. In addition, ACK does not guarantee the quality of technical support for outdated Kubernetes versions. You can enjoy improved technical support and customer service when using new Kubernetes versions.

  • New features: The iteration of open source Kubernetes usually comes with new features and improvements. ACK will also support these features to optimize your development and maintenance experience.

In addition, for security purposes, ACK reserves the right to force outdated ACK clusters to update to the earliest Kubernetes version supported by ACK. We recommend that you perform the following steps to proactively update your ACK clusters.

Important

When you update an ACK cluster, ACK performs a precheck on the cluster, but ACK does not guarantee that all incompatible features, configurations, and APIs will be detected. According to the Shared responsibility model, we recommend that you pay attention to the release of Kubernetes versions by checking the documentation, information in the console, and internal messages, and learn the update notes of the corresponding version before you update the cluster.

For more information about how ACK supports Kubernetes versions, see Support for Kubernetes versions.

Usage notes (important)

Kubernetes versions

You can update an ACK cluster from a major version only to the next major version. For example, to update the Kubernetes version of an ACK cluster from 1.24 to 1.28, you need to first update the cluster to 1.26 and then update it to 1.28.
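The hop-by-hop rule can be sketched as a small shell helper that lists the intermediate updates between two versions. The function name upgrade_path and the list of supported versions passed to it are illustrative assumptions; take the actual list from the ACK release notes.

```shell
# Hypothetical helper: print the versions to update through, in order.
# Usage: upgrade_path CURRENT TARGET SUPPORTED_VERSION...
# The supported versions must be given in ascending order.
upgrade_path() {
  up_current=$1
  up_target=$2
  shift 2
  up_path=""
  up_seen=0
  for up_v in "$@"; do
    if [ "$up_seen" -eq 1 ]; then
      # Every version after the current one is one hop on the path.
      up_path="${up_path:+$up_path }$up_v"
      if [ "$up_v" = "$up_target" ]; then
        echo "$up_path"
        return 0
      fi
    fi
    if [ "$up_v" = "$up_current" ]; then
      up_seen=1
    fi
  done
  # The target is not reachable from the current version in the given list.
  return 1
}

# Example: two hops are needed to go from 1.24 to 1.28.
upgrade_path 1.24 1.28 1.22 1.24 1.26 1.28
```

The helper returns a non-zero status if the target does not appear after the current version in the list, which also covers the case where the cluster already runs the target version.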

To view the Kubernetes version of an ACK cluster, log on to the ACK console and check the Version column of the cluster on the Clusters page. Before you update to a Kubernetes version, read the release notes for that version to learn the version details, deprecated APIs, and usage notes for updates. This helps you avoid compatibility issues caused by feature changes in new Kubernetes versions.

Note

If the YAML file of your Helm chart uses deprecated resources, modify the file at the earliest opportunity. For more information, see the release notes for the target Kubernetes version and Deprecated APIs.
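One way to see which API versions a chart actually renders is to list them and compare the result with the deprecation list in the release notes. The following is a minimal sketch that assumes Helm 3 is installed locally. The helper name list_api_versions and the chart path ./my-chart are placeholders, not part of ACK.

```shell
# Hypothetical helper: read rendered manifests on stdin and print each
# distinct top-level apiVersion value once.
list_api_versions() {
  sed -n 's/^apiVersion:[[:space:]]*//p' | tr -d "\"'" | sort -u
}

# Typical use against a chart (requires Helm and a chart directory):
#   helm template ./my-chart | list_api_versions
```

Only top-level apiVersion fields are matched, so nested fields such as an apiVersion inside a template reference are ignored.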

Features and custom configurations

If your ACK cluster uses the features listed in the following table, read the considerations and suggested solutions.

FlexVolume

  • Consideration: Object Storage Service (OSS) volumes that are mounted by using FlexVolume 1.11.2.5 or earlier are remounted during a cluster update. After the update is complete, you need to recreate the pods that use OSS volumes.

  • Suggested solution: FlexVolume is deprecated. We recommend that you upgrade from FlexVolume to CSI. For more information, see Upgrade from FlexVolume to CSI.

Auto scaling of nodes

  • Consideration:

    • If auto scaling is enabled, the cluster automatically updates Cluster Autoscaler to the latest version after the cluster is updated. This ensures that the auto scaling feature continues to work as expected.

    • With auto scaling enabled, nodes in swift mode may shut down and fail to be updated.

  • Suggested solution:

    • Make sure that Cluster Autoscaler is updated to the latest version. For more information, see Auto scaling of nodes.

    • If nodes in swift mode fail to be updated after the cluster is updated, we recommend that you manually remove the nodes.

Resource reservation

  • Consideration: After you update the Kubernetes version of an ACK cluster to 1.18, ACK automatically configures resource reservation. If resource reservation is not configured for the cluster and the resource usage of nodes is high, ACK may fail to schedule evicted pods to the nodes after the cluster is updated.

  • Suggested solution: Reserve sufficient resources on the nodes. We recommend that you reserve at least 50% CPU resources and at least 70% memory resources. For more information, see Resource reservation policy.

LoadBalancer configurations

  • Consideration: ACK clusters require Server Load Balancer (SLB) instances to handle external access. However, if an SLB instance is configured with externalTrafficPolicy: Local, traffic is forwarded only to node-local pods. If your application pods are deployed on other nodes, traffic cannot reach these pods.

  • Suggested solution: If the SLB instance cannot forward traffic to the application pods, check whether the SLB instance is configured with externalTrafficPolicy: Local. For more information, see What Can I Do if the Cluster Cannot Access the IP Address of the SLB Instance Exposed by the LoadBalancer Service.

API Server

  • Consideration: When ACK updates a cluster, it attempts to update the control planes without interrupting communication with the applications in the cluster. However, communication with the API server may be temporarily interrupted. The interruption affects applications that strongly rely on the API server. For example, if your application needs to watch (or list) resources, the watch operation is interrupted when the API server restarts. Applications that do not need to access the API server are not affected.

  • Suggested solution: Configure the application to automatically re-establish the watch when an interruption occurs.

Startup probe

  • Consideration: If the pods in a cluster are configured with a startup probe, the pods may temporarily enter the NotReady state after kubelet is restarted.

  • Suggested solution: We recommend that you deploy multiple replicated pods and spread the pods across nodes. This ensures that your application still has sufficient pods when one of the nodes restarts.

kubectl

  • Consideration: After a cluster is updated, we recommend that you update kubectl on your on-premises machine. If you do not update kubectl, the kubectl version may become incompatible with the API server version. As a result, the "invalid object doesn't have additional properties" error may occur.

  • Suggested solution: Install or update kubectl. For more information, see Install kubectl.
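For the API Server consideration, the idea of re-establishing a watch after an interruption can be sketched in shell as a loop that starts the watch command again whenever it ends. The helper name rewatch and the REWATCH_DELAY variable are hypothetical. Applications built on a Kubernetes client library typically rely on the library's own reconnect logic, such as informers in client-go, rather than a wrapper like this.

```shell
# Hypothetical helper: run a watch command and start it again each time it
# ends, for at most MAX_RUNS runs.
# Usage: rewatch MAX_RUNS COMMAND [ARG...]
rewatch() {
  rw_max=$1
  shift
  rw_run=1
  while [ "$rw_run" -le "$rw_max" ]; do
    if ! "$@"; then
      echo "watch ended with an error (run $rw_run of $rw_max)" >&2
    fi
    rw_run=$((rw_run + 1))
    if [ "$rw_run" -le "$rw_max" ]; then
      # Wait briefly before reconnecting so a restarting API server
      # is not flooded with requests.
      sleep "${REWATCH_DELAY:-2}"
    fi
  done
  return 0
}

# Typical use (requires kubectl and cluster access):
#   rewatch 100 kubectl get pods --watch
```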

If your cluster uses custom configurations, read the descriptions in the following table.

  • Network: To update a cluster, you need to use Yum to download the required software packages. If your cluster uses custom network configurations or a custom OS image, you need to ensure that Yum can run as normal. You can run the yum makecache command to check the status of Yum.

  • OS image: Custom OS images are not strictly validated by ACK. ACK does not guarantee the success of cluster updates if your cluster uses a custom OS image.

  • Others: If your cluster uses other custom configurations, such as swap partitions or kubelet configurations modified by using the CLI, the cluster may fail to be updated or the custom configurations may be lost during the update.
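The Yum and swap items above can be checked on a node before an update with a short script. This is a sketch under assumptions: the function names are hypothetical, PKG_CMD exists only so that the check can be dry-run without Yum, and reading /proc/swaps assumes a Linux node.

```shell
# Returns 0 if the package metadata cache can be refreshed.
# PKG_CMD is a hypothetical override for dry runs; it defaults to yum.
yum_ok() {
  "${PKG_CMD:-yum}" makecache >/dev/null 2>&1
}

# Returns 0 if at least one swap area is active.
# Reads /proc/swaps by default; the first line of that file is a header.
swap_active() {
  awk 'NR > 1 { found = 1 } END { exit !found }' "${1:-/proc/swaps}" 2>/dev/null
}

# Typical use on a node:
#   yum_ok || echo "Yum cannot refresh its cache; check network and repositories"
#   swap_active && echo "Swap is enabled; this is a custom configuration"
```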

Update procedure, methods, and duration

Procedure

  • Preparations and precheck:

    • Usage notes: Before you update a cluster, read the release notes for the corresponding Kubernetes version to learn the usage notes for updates. This helps you avoid compatibility issues caused by feature updates. For more information, see the Kubernetes versions section.

    • Precheck: Run a precheck to identify potential update risks. If update risks are identified, follow the instructions in the console or refer to Cluster check items and suggestions on how to fix cluster issues to fix the issues.

  • Cluster update: After the cluster passes the precheck, you can update the cluster, including the control planes and node pools. For more information about the update procedure, see Procedures for updating control planes and node pools. ACK allows you to update a cluster in the following modes:

    • Update concurrently: The control planes and node pools are updated concurrently.

    • Update separately: ACK first updates the control planes and then updates the node pools.

    Control plane updates involve the key component kube-apiserver. Node pool updates involve the kubelet and its dependent components. To ensure cluster stability and reliability, the kubelet must not run a later version than kube-apiserver and can be at most two versions earlier. For this reason, the control planes are always updated before the node pools. If you update them separately, you can update the control planes first and then update the node pools during off-peak hours.

  • After cluster update: Verify the versions of the cluster and kubelet, check whether the node pools run as normal, and check whether the applications in the cluster run as normal.
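The post-update version check can be scripted. The sketch below assumes the rule that the kubelet must not be newer than the control plane and may lag by at most max_skew minor versions (default 2). The function names and the sample version strings are illustrative.

```shell
# Extract the minor version number, for example "v1.26.3-aliyun.1" -> "26".
minor_of() {
  echo "$1" | sed -E 's/^v?[0-9]+\.([0-9]+).*/\1/'
}

# Read "NODE KUBELET_VERSION" lines on stdin and report nodes whose kubelet
# is newer than the control plane or more than MAX_SKEW minor versions older.
# Usage: check_kubelet_skew CONTROL_PLANE_VERSION [MAX_SKEW]
check_kubelet_skew() {
  cs_cp_minor=$(minor_of "$1")
  cs_max=${2:-2}
  cs_rc=0
  while read -r cs_node cs_ver; do
    if [ -z "$cs_node" ]; then
      continue
    fi
    cs_diff=$((cs_cp_minor - $(minor_of "$cs_ver")))
    if [ "$cs_diff" -lt 0 ] || [ "$cs_diff" -gt "$cs_max" ]; then
      echo "$cs_node $cs_ver out-of-range"
      cs_rc=1
    fi
  done
  return "$cs_rc"
}

# Typical use (requires kubectl and cluster access):
#   kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kubeletVersion}{"\n"}{end}' \
#     | check_kubelet_skew v1.28.3
```

The function exits with a non-zero status if any node is out of range, so it can gate a follow-up step in a script.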

Update methods

  • Control planes: Control plane components, such as kube-apiserver, kube-controller-manager, cloud-controller-manager, and kube-scheduler, are updated. For ACK managed clusters and ACK Serverless clusters, ACK directly updates the managed control plane components. For ACK dedicated clusters, ACK uses in-place updates to update the control plane components in order to ensure business continuity and reduce potential risks posed by data migration and configuration modifications.

  • Node pools: During a node pool update, the kubelet, OS image, and container runtime are updated. To replace the OS or update the container runtime from Docker to containerd, ACK needs to replace the system disks of the nodes when performing the update. We recommend that you back up the data in the system disks of the nodes before you update the cluster. In other scenarios, ACK uses in-place updates to update node pools. For more information, see Node pool updates.

Update duration

For ACK managed clusters and ACK Serverless clusters, updating the control planes takes about 5 minutes. For ACK dedicated clusters, ACK updates the master nodes one at a time, and each master node takes about 8 minutes. Nodes in a node pool are updated in batches, and each batch takes about 5 minutes.
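These figures can be combined into a rough planning estimate. The sketch below assumes that the control plane update and the node batches run one after another. The function name and its arguments are illustrative, and actual durations vary.

```shell
# Rough estimate in minutes, using the approximate figures from this topic.
# Usage: estimate_update_minutes CLUSTER_TYPE MASTER_COUNT BATCH_COUNT
# CLUSTER_TYPE is "dedicated" or anything else (treated as managed or Serverless).
estimate_update_minutes() {
  eu_type=$1
  eu_masters=$2
  eu_batches=$3
  case "$eu_type" in
    dedicated) eu_control=$((eu_masters * 8)) ;;  # about 8 minutes per master node
    *)         eu_control=5 ;;                    # about 5 minutes in total
  esac
  echo $((eu_control + eu_batches * 5))           # about 5 minutes per batch
}

# Example: a dedicated cluster with 3 master nodes and 4 node batches.
estimate_update_minutes dedicated 3 4
```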

Procedure

Update control planes and node pools concurrently

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the cluster that you want to upgrade and choose More > Operations > Upgrade Cluster in the Actions column.

  3. On the Upgrade Cluster page, select Control Planes and All Node Pools for Update Mode in the Update Items section and set the Maximum Number of Nodes to Repair per Batch parameter in the Batch Update Policy section. Then, click Precheck.

    After the precheck is completed, click View Details to view the report.

  4. After the cluster passes the precheck, click Start Update.

    During the update, do not add or remove nodes. To add or remove nodes, you need to first cancel the update. You can check the update progress in the lower part of the Upgrade Cluster page and perform the following operations on demand:

    • Pause and resume the update: You can click Pause to pause the update. To resume the update, click Continue.

      After you pause the update, the cluster remains in an intermediate state. Do not perform any operations on the cluster while the update is paused, and complete the update at the earliest opportunity. If the cluster remains in the Paused state for seven days, ACK terminates the update and automatically deletes the events and logs related to the update.

    • Cancel the update: You can click Cancel and then click OK in the message that appears to cancel the update. After you cancel the update, ACK continues to update the nodes in the current batch and the update cannot be rolled back. The remaining batches are not updated.

      Note
      • If errors occur during the update, ACK pauses the update. The cause of failure is displayed in the lower part of the page. You can follow the suggestions on how to fix the error.

      • Do not modify the resources in the kube-upgrade namespace during the update unless an error occurs.

    After the update is complete, you can go to the Clusters page and check the Kubernetes version of your cluster to verify that the control plane components are updated. You can also choose Nodes > Nodes in the left-side navigation pane to view the Kubernetes version of the nodes.

Update only control planes

Before you update the node pools, you need to update the control planes first.

Update control planes

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the cluster that you want to upgrade and choose More > Operations > Upgrade Cluster in the Actions column.

  3. In the Update Items section of the Upgrade Cluster page, set Update Mode to Control Planes Only and click Precheck.

    After the precheck is complete, click View Details to view the report.

    • If Result displays Normal in the report, the cluster passes the precheck and you can update the cluster.

    • If Result displays Abnormal in the report, the cluster can still run as expected and the cluster status does not change. Click the Troubleshoot tab and follow the suggestions displayed on the page to fix the issues. For more information, see Cluster check items and suggestions on how to fix cluster issues.

      Note

      If your cluster runs Kubernetes 1.20 or later, the precheck checks whether deprecated APIs are used in your cluster. The precheck result is for reference only and does not determine whether the cluster is updatable. For more information, see Deprecated APIs.

  4. After the cluster passes the precheck, click Start Update.

    You can view the update progress in the lower part of the Upgrade Cluster page. After the update is complete, you can go to the Clusters page and check the Kubernetes version of your cluster to verify that the control plane components are updated.
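In addition to the deprecated API check in the precheck, upstream Kubernetes exposes the apiserver_requested_deprecated_apis metric, which records deprecated API versions that have been requested. The sketch below filters that metric out of the API server metrics output. It assumes that your account is allowed to read /metrics, and the helper name is hypothetical.

```shell
# Hypothetical helper: keep only the deprecated-API usage series from the
# Prometheus-format metrics text read on stdin.
deprecated_api_requests() {
  grep '^apiserver_requested_deprecated_apis{' || true
}

# Typical use (requires kubectl and permission to read /metrics):
#   kubectl get --raw /metrics | deprecated_api_requests
```

Each output line carries labels such as group, version, resource, and removed_release, which identify the API that needs to be migrated.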

Next step: Update node pools

After the control planes are updated, new nodes are added to the cluster based on the updated Kubernetes version. We recommend that you update the existing nodes during off-peak hours at the earliest opportunity and confirm the kubelet version after the update is complete. For more information, see Node pool updates.

FAQ about cluster updates

What do I do if the update for master nodes in an ACK dedicated cluster times out?

Cause

The self-signed server certificate of the admission webhook component does not contain the Subject Alternative Name field. As a result, the master components fail to start up.

Solution

Run the following commands to check whether the self-signed server certificate of the admission webhooks contains the Subject Alternative Name field. You need to run the following commands on nodes that have kubectl configured.

  1. Run the following command to query the admission webhooks in the cluster:

    kubectl get mutatingwebhookconfigurations

    Expected output

    NAME                                      WEBHOOKS   AGE
    ack-node-local-dns-admission-controller   1          27h

  2. Run the following command to query the Service configured for an admission webhook:

    kubectl get mutatingwebhookconfigurations ack-node-local-dns-admission-controller -oyaml | grep service -A 5

    Expected output

        service:
          name: ack-node-local-dns-admission-controller
          namespace: kube-system
          path: /inject
          port: 443
      failurePolicy: Ignore

  3. Run the following command to query the cluster IP address of the Service:

    kubectl -n kube-system get service ack-node-local-dns-admission-controller

    Expected output

    NAME                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
    ack-node-local-dns-admission-controller   ClusterIP   192.168.XX.XX   <none>        443/TCP   27h

  4. Run the following command to use the cluster IP address to access the admission webhook, obtain its certificate, and check whether the Subject Alternative Name field exists:

    openssl s_client -connect 192.168.XX.XX:443 -showcerts </dev/null 2>/dev/null|openssl x509 -noout -text
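The last check can be automated by searching the certificate text for the extension name that openssl prints, X509v3 Subject Alternative Name. The helper name has_san below is hypothetical, and the address is the placeholder used in the preceding steps.

```shell
# Hypothetical helper: succeed if the certificate text on stdin contains
# the Subject Alternative Name extension.
has_san() {
  grep -q 'X509v3 Subject Alternative Name'
}

# Typical use (requires openssl and network access to the Service):
#   openssl s_client -connect 192.168.XX.XX:443 -showcerts </dev/null 2>/dev/null \
#     | openssl x509 -noout -text | has_san \
#     && echo "SAN present" || echo "SAN missing"
```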

What do I do if a cluster update fails and the following error is returned: the aliyun service is not running on the instance?

Cause

The Cloud Assistant agent becomes unavailable. As a result, the update command fails to be sent to the cluster.

Solution

Restart the Cloud Assistant agent by stopping it and then starting it. Then, update the cluster again. For more information, see Start, stop, or uninstall the Cloud Assistant Agent.

How do I handle the PLEG not healthy error?

This error indicates that the containers or the container runtime on the node are not responding. Restart the affected nodes and then initiate the update again.

Procedures for updating control planes and node pools

Update control planes

ACK updates control planes based on the following procedure.

ACK managed clusters and ACK Serverless clusters

  1. Update control planes and managed components, such as kube-apiserver, kube-controller-manager, and kube-scheduler.

  2. Update Kubernetes components, such as kube-proxy.

ACK dedicated clusters

  1. When ACK identifies that the etcd and container runtime in your cluster need to be updated, it updates the etcd and container runtime on each master node in sequence.

  2. ACK updates only one master node at a time and displays the ID of the master node.

  3. Update master components, such as kube-apiserver, kube-controller-manager, and kube-scheduler.

  4. Update the kubelet on master nodes.

  5. Update Kubernetes components, such as kube-proxy, after all master nodes are updated.

Update node pools

ACK updates the nodes in your cluster in batches.

  • ACK updates node pools one at a time.

  • The nodes in a node pool are updated in batches. The first batch includes one node, and the batch size then grows by powers of two (1, 2, 4, 8, and so on) in subsequent batches. The batch update policy still applies after you resume a paused update. You can specify the maximum batch size on the Node Pool Upgrade page. We recommend that you set it to 10. For more information, see Node pool updates.
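The batch schedule can be illustrated with a short calculation. This sketch assumes that batch sizes double from one node and are capped at the batch size that you configure. The cap behavior and the function name batch_plan are assumptions for illustration, not a description of the ACK implementation.

```shell
# Hypothetical helper: print the size of each batch for a node pool.
# Usage: batch_plan NODE_COUNT MAX_BATCH_SIZE
batch_plan() {
  bp_remaining=$1
  bp_max=$2
  bp_size=1
  bp_plan=""
  while [ "$bp_remaining" -gt 0 ]; do
    # Cap the batch at the configured maximum and at the nodes left.
    if [ "$bp_size" -gt "$bp_max" ]; then
      bp_size=$bp_max
    fi
    if [ "$bp_size" -gt "$bp_remaining" ]; then
      bp_size=$bp_remaining
    fi
    bp_plan="${bp_plan:+$bp_plan }$bp_size"
    bp_remaining=$((bp_remaining - bp_size))
    bp_size=$((bp_size * 2))
  done
  echo "$bp_plan"
}

# Example: 25 nodes with a maximum batch size of 10 are updated in 5 batches.
batch_plan 25 10
```

Under these assumptions, the example prints 1 2 4 8 10. At about 5 minutes per batch, the node pool would take roughly 25 minutes.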
