If cGPU Basic Edition is installed in a dedicated Kubernetes cluster, cGPU does not work as expected after you migrate the cluster workloads to a professional Kubernetes cluster, because professional Kubernetes clusters support only cGPU Professional Edition. After the migration is complete, you must upgrade cGPU Basic Edition to cGPU Professional Edition in the professional Kubernetes cluster. This topic describes how to perform the upgrade.
- Log on to the ACK console.
- In the left-side navigation pane of the ACK console, click Clusters.
- On the Clusters page, find the cluster that you want to manage and click the name of the cluster or click Details in the Actions column. The details page of the cluster appears.
- In the left-side navigation pane of the details page, go to the Jobs page.
- On the Jobs page, click Create from YAML in the upper-right corner.
- On the Create page, set Sample Template to Custom. Copy the following YAML template to the Template section. This template is used to create a Job that uninstalls cGPU Basic Edition and modifies the labels of GPU-accelerated nodes.
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: gpushare-migration
      namespace: kube-system
    spec:
      backoffLimit: 0
      template:
        spec:
          serviceAccount: admin
          containers:
          - name: gpushare-migration
            # Replace cn-beijing in the following image address with the ID of the region where the cluster is deployed.
            image: registry-vpc.cn-beijing.aliyuncs.com/acs/gpushare-migration:v0.1.0
            env:
            - name: CHANGE_LABELS_INFO
              value: "cgpu=true::ack.node.gpu.schedule=cgpu,gpushare=true::ack.node.gpu.schedule=share"
          restartPolicy: Never
- Click Create. Then click the Job name, gpushare-migration, to view its progress.
On the details page of the gpushare-migration Job, click the Pods tab. If the pod is in the Completed state, the Job succeeded.
- Install cGPU Professional Edition. For more information, see Step 1: Install ack-ai-installer.
- Install a GPU memory inspection tool in the cluster. For more information, see Step 4: Install and use the GPU scheduling inspection tool.
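
Before you create the gpushare-migration Job, the two template values most worth checking are the region in the image address and the CHANGE_LABELS_INFO value. The following Python sketch shows one way to review them. The helper names `image_for_region` and `parse_label_rules` are hypothetical, and `cn-hangzhou` is only an example region ID. The reading of CHANGE_LABELS_INFO as comma-separated `old label::new label` rules is an assumption inferred from the shape of the value and from the statement that the Job modifies node labels. It is not documented behavior of the migration image.

```python
from typing import List, Tuple

# Values copied from the Job template in this topic.
IMAGE_TEMPLATE = "registry-vpc.{region}.aliyuncs.com/acs/gpushare-migration:v0.1.0"
CHANGE_LABELS_INFO = (
    "cgpu=true::ack.node.gpu.schedule=cgpu,"
    "gpushare=true::ack.node.gpu.schedule=share"
)


def image_for_region(region_id: str) -> str:
    """Return the Job image address with the region ID substituted."""
    if not region_id:
        raise ValueError("region_id must not be empty")
    return IMAGE_TEMPLATE.format(region=region_id)


def parse_label_rules(value: str) -> List[Tuple[str, str]]:
    """Split the value into (old_label, new_label) pairs.

    Assumption: rules are separated by "," and each rule uses "::"
    to separate the old label from the new label.
    """
    rules = []
    for rule in value.split(","):
        old, sep, new = rule.partition("::")
        if not sep or not old.strip() or not new.strip():
            raise ValueError(f"malformed rule: {rule!r}")
        rules.append((old.strip(), new.strip()))
    return rules


if __name__ == "__main__":
    # Example region ID only; use the region where your cluster is deployed.
    print(image_for_region("cn-hangzhou"))
    for old, new in parse_label_rules(CHANGE_LABELS_INFO):
        print(f"{old} -> {new}")
```

Under this reading, nodes labeled `cgpu=true` would be relabeled `ack.node.gpu.schedule=cgpu`, and nodes labeled `gpushare=true` would be relabeled `ack.node.gpu.schedule=share`. Confirm the actual labels on your nodes after the Job completes.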