This topic describes how to update the Kubernetes version of a Container Service for Kubernetes (ACK) cluster in the ACK console. You can view the Kubernetes version of a cluster on the Clusters page of the ACK console and check whether the Kubernetes version can be updated. The stability of ACK clusters that run earlier Kubernetes versions is not guaranteed, and you may fail to update these clusters to the latest version. However, these issues do not adversely affect the applications deployed in these clusters. We recommend that you update your ACK clusters at the earliest opportunity.

Precautions

Usage notes

  • ACK guarantees the stability of the three latest Kubernetes minor versions. You can update ACK clusters of the previous two minor versions to the latest minor version. For example, you can update ACK clusters from Kubernetes V1.20 or Kubernetes V1.22 to Kubernetes V1.24. The stability of ACK clusters that run other earlier Kubernetes minor versions is not guaranteed and you may fail to update these clusters to the latest minor version. However, these issues do not adversely affect your applications deployed in these clusters. We recommend that you update your ACK clusters at the earliest opportunity. For more information about Kubernetes versions supported by ACK, see Support for Kubernetes versions.
  • In most cases, applications that run in the cluster are not interrupted during the update. However, applications that rely heavily on the API server may be temporarily interrupted. We recommend that you perform the update during off-peak hours.
  • The system runs a precheck before updating an ACK cluster. The cluster can be updated only after it passes the precheck.
  • You can update the Kubernetes version of an ACK cluster only to the next version. For example, the Kubernetes version of an ACK cluster is 1.18. If you want to update the Kubernetes version to 1.24, you must manually update the version to 1.20, 1.22, and then 1.24 in sequence.
  • The cluster update policy defines how the ACK cluster is updated. The default policy is batch update. Batch update is performed during the worker node update phase to update worker nodes in batches. The policy works in the following manner:
    • Multiple node pools are updated one after another.
    • The nodes in a node pool are updated in batches. The first batch includes one node, and the number of nodes doubles in each subsequent batch (1, 2, 4, 8, and so on). The batch update policy still applies after you resume a paused update process.
    • The system updates no more than 10 nodes in each batch.
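
The sequential update rule can be illustrated with a short shell sketch. The upgrade_path function is a hypothetical helper, not part of ACK or kubectl. It assumes that versions step by two minor versions, as in the 1.18, 1.20, 1.22, 1.24 example above, and that both versions are written as MAJOR.MINOR. Check the versions that ACK actually offers for your cluster before you rely on this pattern.

```shell
# upgrade_path CURRENT TARGET
# Hypothetical helper: prints every version to update to, in order.
# Assumption: releases step by two minor versions (1.18 -> 1.20 -> 1.22 ...).
upgrade_path() {
  major=${1%%.*}          # "1" from "1.18"
  minor=${1#*.}           # "18" from "1.18"
  target_minor=${2#*.}    # "24" from "1.24"
  path=""
  while [ "$minor" -lt "$target_minor" ]; do
    minor=$((minor + 2))
    path="$path $major.$minor"
  done
  echo "${path# }"
}

upgrade_path 1.18 1.24    # prints: 1.20 1.22 1.24
```

Each version in the output corresponds to one separate update that you start and complete in the console.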

Considerations

  • To update a cluster, you need to use Yum to download the required software packages. If you have modified the network configuration of the nodes or used a custom OS image, make sure that Yum can function as expected on the nodes. You can run the yum makecache command to perform a check.
  • Do not add nodes, remove nodes, or perform other operations on the cluster during the update process.
  • If you specify a custom operating system image, we cannot guarantee a successful node update because custom images are not strictly validated by ACK.
  • If you have modified the configurations of the ACK cluster, such as swap partition changes or kubelet configuration changes by using the CLI, the update may fail or your custom configurations may be lost.
  • After the update is completed, we recommend that you update kubectl on your on-premises machine. Otherwise, the kubectl version may be incompatible with the API server version, and an error similar to invalid object doesn't have additional properties may occur. For more information about how to install kubectl, see Install kubectl.
  • For ACK clusters that run Kubernetes 1.20 or later, the system checks whether deprecated APIs are used in the clusters. For more information, see Deprecated APIs.
  • If the resource usage of the cluster is excessively high, the system may fail to schedule the evicted pods promptly when the node update fails. We recommend that you reserve resources for the cluster nodes. Do not use more than 50% of CPU resources or more than 70% of memory resources.
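
Two of the considerations above can be checked from a machine that has kubectl configured. The following sketch is illustrative only. skew_ok and busy_nodes are hypothetical helper names. skew_ok assumes the upstream Kubernetes rule that kubectl is supported within one minor version of the API server. busy_nodes assumes the usual column layout of kubectl top nodes (NAME, CPU(cores), CPU%, MEMORY(bytes), MEMORY%), which requires metrics-server in the cluster.

```shell
# skew_ok CLIENT_VERSION SERVER_VERSION
# Hypothetical helper: succeeds when kubectl is within one minor version
# of the API server (the skew that upstream Kubernetes supports).
skew_ok() {
  client_minor=${1#*.}; client_minor=${client_minor%%.*}
  server_minor=${2#*.}; server_minor=${server_minor%%.*}
  diff=$((client_minor - server_minor))
  [ "$diff" -ge -1 ] && [ "$diff" -le 1 ]
}

# busy_nodes
# Hypothetical helper: reads `kubectl top nodes` output on stdin and prints
# the nodes that use more than 50% CPU or more than 70% memory.
busy_nodes() {
  awk 'NR > 1 {
    cpu = $3; mem = $5
    sub(/%/, "", cpu); sub(/%/, "", mem)
    if (cpu + 0 > 50 || mem + 0 > 70) print $1
  }'
}

# Usage against a live cluster:
#   kubectl top nodes | busy_nodes
#   skew_ok 1.22.3 1.24.6 || echo "update kubectl"
```

Nodes printed by busy_nodes exceed the recommended thresholds and are candidates for scaling out or rebalancing before the update.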

Kubernetes version descriptions

  • Kubernetes 1.24 no longer supports the Docker runtime. Before you update a cluster to Kubernetes 1.24, you need to use the node pool feature to update the container runtime of the nodes from Docker to containerd in batches. For more information, see Change the container runtime from Docker to containerd. If you use an ACK Pro cluster, the master nodes automatically update from Docker to containerd during the cluster update. All containers on the master nodes will be recreated. Back up your custom containers on the master nodes before you update the cluster.
  • ACK clusters that run Kubernetes 1.20 do not support the selfLink field. If FlexVolume is used and alicloud-nas-controller is installed in your cluster, you must update the image version of alicloud-nas-controller to v1.14.8.17-7b898e5-aliyun or later before you can update an earlier Kubernetes version to 1.20.
  • Before you update an ACK cluster from Kubernetes 1.16 to Kubernetes 1.18, read Usage notes for updating CSI block volumes if the application in the cluster uses a disk volume whose type is Block Volume.
  • Before you update an ACK cluster to Kubernetes 1.14, make sure that the existing configurations allow access to the IP addresses of the Server Load Balancer (SLB) instances that are used by LoadBalancer Services. For more information, see What Can I Do if the Cluster Cannot Access the IP Address of the SLB Instance Exposed by the LoadBalancer Service.
  • Object Storage Service (OSS) volumes that are mounted to the Kubernetes cluster by using FlexVolume 1.11.2.5 or earlier will be remounted during the update. You must recreate the pods that use OSS volumes after the update is completed.
  • After you update an ACK cluster to Kubernetes 1.18, ACK automatically configures resource reservation by default. Workloads on cluster nodes may be evicted when the resource usage of the nodes is high and resource reservation is not configured. For more information about how to configure resource reservation, see Resource reservation policy.
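
Before you update to Kubernetes 1.24, it can help to list the nodes that still run Docker. The following sketch is one way to do this, not an ACK-provided tool. docker_nodes is a hypothetical helper that filters tab-separated "node name, runtime" lines. It assumes that the containerRuntimeVersion field reported by each node starts with docker:// on Docker nodes and containerd:// on containerd nodes.

```shell
# docker_nodes
# Hypothetical helper: reads "NAME<TAB>RUNTIME" lines on stdin and prints
# the names of the nodes whose runtime is Docker.
docker_nodes() {
  awk -F '\t' '$2 ~ /^docker:\/\// { print $1 }'
}

# Usage against a live cluster:
#   kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}' | docker_nodes
```

An empty result suggests that no node reports Docker as its runtime.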

How ACK clusters are updated

The cluster update process consists of the precheck, control plane update, and node update. The following sections describe the cluster update process in detail.

Control plane update

Control plane update for ACK managed and serverless Kubernetes (ASK) clusters
  1. Control plane and management components, including kube-apiserver, kube-controller-manager, and kube-scheduler, are updated.
  2. Kubernetes components, such as kube-proxy, are updated.
Control plane update for ACK Pro clusters
  1. Optional. The etcd and container runtime on the master nodes are updated in sequence.
  2. The system updates only one master node each time and displays the ID of the master node.
  3. Master components, including kube-apiserver, kube-controller-manager, and kube-scheduler, are updated.
  4. The kubelet on master nodes is updated.
  5. Kubernetes components, such as kube-proxy, are updated after all master nodes are updated.

Node update

Cluster nodes are updated based on the batch update policy. The batch update policy specifies the following rules:
  • Multiple node pools are updated one after another.
  • The nodes in a node pool are updated in batches. The first batch includes one node, and the number of nodes doubles in each subsequent batch (1, 2, 4, 8, and so on). The batch update policy still applies after you resume a paused update process.
  • The system updates no more than 10 nodes in each batch.
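
The batch rules above can be sketched as a small calculation. batch_sizes is a hypothetical helper, not an ACK command. It assumes that the batch size doubles from one node and is then capped at 10, which is one reading of the rules above. The actual scheduling is decided by ACK.

```shell
# batch_sizes NODE_COUNT
# Hypothetical helper: prints the number of nodes in each batch for a node
# pool. Assumption: sizes double from 1 and are capped at 10 per batch.
batch_sizes() {
  remaining=$1
  size=1
  out=""
  while [ "$remaining" -gt 0 ]; do
    n=$size
    if [ "$n" -gt 10 ]; then n=10; fi                  # cap per batch
    if [ "$n" -gt "$remaining" ]; then n=$remaining; fi # last, partial batch
    out="$out $n"
    remaining=$((remaining - n))
    if [ "$size" -lt 10 ]; then size=$((size * 2)); fi  # stop doubling at the cap
  done
  echo "${out# }"
}

batch_sizes 30    # prints: 1 2 4 8 10 5
```

Under these assumptions, a node pool of 30 nodes is updated in six batches.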

Step 1: Perform a precheck

Note If you have clusters that are not deployed in the production environment, we recommend that you update one of these clusters first to verify that the update requirements are met before you update clusters in the production environment.

Before you update a cluster, you must check the health status of the cluster and make sure that the cluster meets the update requirements.

  1. Log on to the ACK console and click Clusters in the left-side navigation pane.
  2. On the Clusters page, find the cluster on which you want to perform a check and choose More > Cluster Check in the Actions column.
  3. In the left-side navigation pane of the Container Intelligence Service page, choose Cluster Check > Upgrade Check.
  4. On the Upgrade Check page, click Start.
  5. In the Upgrade Check panel, select the check box under Warning and click Start.
    After the precheck is completed, click Details.
    • If the result is normal in the report, the cluster passes the precheck and you can update the cluster.
    • If the result is abnormal in the report, follow the suggestions displayed on the page to fix the issues.
      Note
      • The precheck is performed only before clusters are updated. A cluster can still run as normal and the cluster status does not change if the cluster fails to pass the precheck.
      • If your cluster runs Kubernetes 1.20 or later, the precheck checks whether deprecated APIs are used in your cluster. The precheck result is for reference only and does not determine whether the cluster is updatable. For more information, see Deprecated APIs.

Step 2: Update the cluster

Note If your cluster shows that the Kubernetes version can be updated to the latest version, the current Kubernetes version is discontinued. This does not adversely affect the applications that are deployed in the cluster. In this scenario, we recommend that you update your cluster during off-peak hours at the earliest opportunity.
  1. Log on to the ACK console and click Clusters in the left-side navigation pane.
  2. On the Clusters page, find the cluster that you want to update and choose More > Upgrade Cluster in the Actions column.
  3. Click Upgrade.
    Note If you click Precheck, the system performs a precheck on the cluster as described in Step 1: Perform a precheck.
  4. In the message that appears, click Confirm.
    You can view the progress of the update.
    Note
    • If you want to pause the update, click Pause. After you pause the update, you can click Continue to resume the update process. Do not perform operations on the cluster while the update is paused. We recommend that you resume and complete the update at your earliest convenience. If the update is paused for more than 7 days, the system automatically terminates the update process and deletes the events and log data that were generated during the update process.
    • After the update is paused, you can click Cancel to cancel the update. After you cancel the update, the system will complete updating the nodes in the current batch and skip the nodes that have not been updated. After the update is canceled, you cannot roll back the nodes that have been updated. However, you can continue to update the nodes that have not been updated.
After the update is completed, you can go to the Clusters page and check the Kubernetes version of your cluster to verify that the control plane components are updated. You can go to the Nodes > Nodes page and check the kubelet version to verify that the cluster nodes are updated.
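
The kubelet version can also be checked from the command line. The following sketch assumes the default column layout of kubectl get nodes (NAME, STATUS, ROLES, AGE, VERSION). outdated_nodes is a hypothetical helper name, and the target version 1.24 is only an example.

```shell
# outdated_nodes TARGET_VERSION
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints the nodes whose kubelet version does not start with vTARGET.
outdated_nodes() {
  awk -v want="v$1." 'index($5, want) != 1 { print $1, $5 }'
}

# Usage against a live cluster:
#   kubectl get nodes --no-headers | outdated_nodes 1.24
```

An empty result suggests that every node reports a kubelet of the target minor version.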
Note
  • If an error occurs during the update, the system automatically pauses the update process and displays the cause of the failure in the lower part of the page. Follow the suggestions to troubleshoot the error, delete the failed pods in the kube-upgrade namespace, and then restart the update.
  • Do not modify the resources in the kube-upgrade namespace during the update process unless an error occurs.

Deprecated APIs

If your cluster runs Kubernetes 1.20 or later, the precheck checks whether deprecated APIs are used in your cluster. After the precheck is complete, the deprecated APIs that are used by your cluster are displayed on the precheck result page.

For example, before you update your cluster from Kubernetes 1.20 to Kubernetes 1.22, the system checks whether deprecated APIs are used in your cluster by scanning the audit logs that were generated the previous day.
  • The precheck result is for reference only. You can proceed with the update even if your cluster uses deprecated APIs.
  • If you continue to use the deprecated APIs in Kubernetes 1.22, potential security risks may exist.
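
Independently of the ACK precheck, which scans audit logs, upstream Kubernetes 1.19 and later exposes an apiserver_requested_deprecated_apis metric on the API server. The following sketch summarizes that metric. It is a different technique from the one ACK uses, and it assumes that your account is allowed to read the /metrics endpoint and that the metric labels appear in the order group, removed_release, resource, subresource, version. deprecated_apis is a hypothetical helper name.

```shell
# deprecated_apis
# Hypothetical helper: reads API server metrics on stdin and prints one line
# per deprecated API that has been requested, in the form
# "GROUP/VERSION RESOURCE (removed in RELEASE)".
deprecated_apis() {
  grep '^apiserver_requested_deprecated_apis{' |
    sed -E 's/.*group="([^"]*)".*removed_release="([^"]*)".*,resource="([^"]*)".*,version="([^"]*)".*/\1\/\4 \3 (removed in \2)/'
}

# Usage against a live cluster:
#   kubectl get --raw /metrics | deprecated_apis
```

The metric shows which deprecated APIs were requested, not which client requested them, so the precheck result page remains the place to identify the calling components.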
The following table describes the types of deprecated APIs. Before you update a cluster that uses deprecated APIs, we recommend that you refer to the Type column of the following table and perform operations that correspond to the type of deprecated API used by the cluster.

| Type | Suggestion | Example |
| --- | --- | --- |
| core | Key Kubernetes components: ACK updates key Kubernetes components. You do not need to update the components. Information about the components is not displayed on the precheck page. | apiserver, scheduler, and kube-controller-manager |
| ack | ACK components: ACK components require manual update. You can update ACK components based on the instructions on the Add-ons page of the ACK console. Note: You can go to the Container Service for Kubernetes (ACK) console, choose Operations > Add-ons in the left-side navigation pane, and then update the components. After the components are updated, no deprecated APIs are displayed. Deprecated APIs in the precheck result are for reference only. You can proceed with the update even if your cluster uses deprecated APIs. After your cluster is updated, your cluster uses new APIs. To avoid security risks, we recommend that you do not use deprecated APIs after you update your cluster. | metrics-server and nginx-ingress-controller |
| OpenSource | Open source components: ACK lists some open source components in the console. You can decide whether to update the components. Other open source components may be classified into the unknown type. Note: Deprecated APIs in the precheck result are for reference only. You can proceed with the update even if your cluster uses deprecated APIs. Update the components based on your business requirements. | rancher and elasticsearch-operator |
| unknown | Unknown sources: Deprecated APIs that do not belong to the previous types are classified into the unknown type. ACK lists the unknown components in the console. You can decide whether to update the components. The components require manual update. Note: Deprecated APIs in the precheck result are for reference only. You can proceed with the update even if your cluster uses deprecated APIs. Update the components based on your business requirements. | kubectl, agent, Go-http-client, and okhttp |

Troubleshoot cluster update failures

What do I do if the master node update of an ACK dedicated cluster times out?

Cause

The self-signed server certificate of the admission webhook component does not contain the Subject Alternative Name field. As a result, the master component fails to start up.

Solution

Run the following commands to check whether the self-signed server certificate of the admission webhooks contains the Subject Alternative Name field. You need to run the following commands on nodes that have kubectl configured.

  1. Run the following command to query the admission webhooks in the cluster:
    kubectl get mutatingwebhookconfigurations
    Expected output:
    NAME                                      WEBHOOKS   AGE
    ack-node-local-dns-admission-controller   1          27h
  2. Run the following command to query the Service configured for an admission webhook:
    kubectl get mutatingwebhookconfigurations ack-node-local-dns-admission-controller -oyaml | grep service -A 5
    Expected output:
        service:
          name: ack-node-local-dns-admission-controller
          namespace: kube-system
          path: /inject
          port: 443
      failurePolicy: Ignore
  3. Run the following command to query the cluster IP of the Service:
    kubectl -n kube-system get service ack-node-local-dns-admission-controller
    Expected output:
    NAME                                      TYPE        CLUSTER-IP        EXTERNAL-IP   PORT(S)   AGE
    ack-node-local-dns-admission-controller   ClusterIP   192.168.XX.XX   <none>        443/TCP   27h
  4. Run the following command to use the cluster IP to access the admission webhook, obtain its certificate, and check whether the Subject Alternative Name field exists:
    openssl s_client -connect 192.168.XX.XX:443 -showcerts </dev/null 2>/dev/null|openssl x509 -noout -text
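
To avoid reading the full certificate text by eye, the output of the command in step 4 can be piped through a small filter. has_san is a hypothetical helper name. It assumes that openssl prints the extension under the label X509v3 Subject Alternative Name, which is the label that openssl x509 -text uses.

```shell
# has_san
# Hypothetical helper: reads `openssl x509 -noout -text` output on stdin and
# reports whether the Subject Alternative Name extension is present.
has_san() {
  if grep -q 'X509v3 Subject Alternative Name'; then
    echo "SAN present"
  else
    echo "SAN missing"
  fi
}

# Usage (replace the IP address with the cluster IP from step 3):
#   openssl s_client -connect 192.168.XX.XX:443 -showcerts </dev/null 2>/dev/null | openssl x509 -noout -text | has_san
```

A result of SAN missing matches the cause described above.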

What do I do if the etcd update of an ACK dedicated cluster fails?

Cause

Cloud Assistant is unavailable. As a result, the update command fails to be issued.

Solution

Update Cloud Assistant and try again. For more information about how to update Cloud Assistant, see Upgrade or disable upgrades for the Cloud Assistant client.

What do I do if the PLEG module of nodes is unhealthy?

If the containers or the container runtime on a node do not respond, restart the node and then initiate the update again.
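
To find the nodes that may be affected, you can list the nodes that are not in the Ready state and then inspect their conditions. The following is a minimal sketch. not_ready_nodes is a hypothetical helper name. It assumes the default column layout of kubectl get nodes and that the kubelet reports an unhealthy PLEG in the node conditions with a message that contains the word PLEG.

```shell
# not_ready_nodes
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints the nodes whose STATUS does not start with Ready.
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}

# Usage against a live cluster:
#   kubectl get nodes --no-headers | not_ready_nodes
#   kubectl describe node <node-name> | grep -i pleg
```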