Install ack-kserve, an Alibaba Cloud-optimized KServe distribution, to serve ML models on ACK.
Installation involves deploying cert-manager and ack-kserve.
Prerequisites
Ensure the following:
-
An ACK managed Pro cluster running Kubernetes 1.22 or later is created.
-
The NGINX Ingress controller is installed.
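The Kubernetes version prerequisite can also be checked in a script. The following is a minimal sketch, not part of the ack-kserve tooling: version_at_least is a hypothetical helper that compares a version string, such as the one reported by kubectl version, against 1.22. The sample version string is illustrative only, and the sketch does not check the NGINX Ingress controller.

```shell
# Hypothetical helper: succeeds when a Kubernetes version string such as
# "v1.22.15-aliyun.1" is at least the given major.minor (default 1.22).
version_at_least() {
  version=${1#v}            # drop the leading "v"
  want_major=${2:-1}
  want_minor=${3:-22}
  major=${version%%.*}      # text before the first dot
  rest=${version#*.}
  minor=${rest%%.*}         # text between the first and second dot
  minor=${minor%%[!0-9]*}   # strip suffixes such as "+" in "22+"
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# Example with an illustrative version string copied from `kubectl version`.
if version_at_least "v1.22.15-aliyun.1"; then
  echo "version OK"
fi
```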
Step 1: Install cert-manager
ack-kserve depends on cert-manager for TLS certificate management.
-
Log on to the ACK console. In the left-side navigation pane, click Clusters.
-
On the Clusters page, click the name of your cluster. In the left-side navigation pane, choose Applications > Helm.
-
On the Helm page, click Deploy. In Basic Information, set Application Name, select cert-manager in the Chart section, and click Next.
-
In Parameters, review Chart Version and Parameters, then click OK.
After deployment, cert-manager appears on the Helm page.
Step 2: Install ack-kserve
By default, ack-kserve is deployed in RawDeployment mode and integrated with the NGINX Ingress controller.
-
Log on to the ACK console. In the left-side navigation pane, click Clusters.
-
On the Clusters page, click the name of your cluster. In the left-side navigation pane, choose Applications > Helm.
-
On the Helm page, click Deploy. In Basic Information, set Application Name, select ack-kserve in the Chart section, and click Next.
-
In Parameters, review Chart Version and Parameters, then click OK. After deployment, ack-kserve appears on the Helm page.
-
Verify that ack-kserve is running:
    kubectl get pod -n kserve
If all pods show Running in the STATUS column, ack-kserve is installed.
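In an automated setup, the same verification can block until the pods are ready instead of relying on a manual check. The following is a minimal sketch, assuming kubectl is configured for the cluster and that the ack-kserve pods run in the kserve namespace, as in the command above. wait_for_kserve_pods is a hypothetical helper name, and the 300s default timeout is an arbitrary choice.

```shell
# Hypothetical helper: waits until every pod in the kserve namespace reports
# the Ready condition, or fails when the timeout (default 300s) expires.
# kubectl wait also fails if the namespace contains no pods at all.
wait_for_kserve_pods() {
  timeout=${1:-300s}
  kubectl wait pod --all -n kserve --for=condition=Ready --timeout="$timeout"
}
```

Calling wait_for_kserve_pods 120s waits for up to two minutes. A non-zero exit status means that at least one pod did not become ready in time.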
(Optional) Step 3: View or update ack-kserve
View component details
On the Helm page, find ack-kserve and click View Details in the Actions column. The details page shows Basic Information, Parameters, and History tabs.
Update the component
On the Helm page, find ack-kserve and click Update in the Actions column. In the Update Release panel, change the chart version or modify parameters.
(Optional) Step 4: Uninstall the ack-kserve add-on
Uninstall components in this order to avoid orphaned resources:
-
Delete KServe custom resources
-
Delete KServe CustomResourceDefinitions (CRDs)
-
Uninstall ack-kserve
-
Uninstall cert-manager
-
Delete cert-manager CRDs
Delete KServe custom resources and CRDs
Deleting a CRD automatically deletes all associated custom resources. Custom resources cannot be restored after deletion. Verify that your InferenceService resources are no longer needed.
-
Back up and delete all InferenceService custom resources:
    # List all InferenceService resources in the cluster
    kubectl get isvc --all-namespaces
    # Back up all resources to a file
    kubectl get isvc --all-namespaces -oyaml > isvc.yaml.bak
    # Delete all resources in every namespace
    kubectl delete isvc --all --all-namespaces
-
Delete the KServe CRDs.
Note: Delete all custom resources before deleting a CRD. Otherwise, CRD deletion fails.
    kubectl delete crd clusterservingruntimes.serving.kserve.io
    kubectl delete crd clusterstoragecontainers.serving.kserve.io
    kubectl delete crd inferencegraphs.serving.kserve.io
    kubectl delete crd inferenceservices.serving.kserve.io
    kubectl delete crd predictors.serving.kserve.io
    kubectl delete crd servingruntimes.serving.kserve.io
    kubectl delete crd trainedmodels.serving.kserve.io
Uninstall ack-kserve
-
Log on to the ACK console. In the left-side navigation pane, click Clusters.
-
On the Clusters page, click the name of your cluster. In the left-side navigation pane, choose Applications > Helm.
-
On the Helm page, find ack-kserve and click Delete in the Actions column. In the dialog box, click OK.
Uninstall cert-manager
Before uninstalling, ensure that no other cluster components depend on cert-manager. Removing cert-manager while it is in use causes service interruptions.
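One way to look for remaining consumers is to list Certificate resources across all namespaces. The following is a minimal sketch under two assumptions: a namespace that still holds Certificate resources depends on cert-manager, and the certificates that belong to ack-kserve live in the kserve namespace. namespaces_using_cert_manager is a hypothetical helper, not an ACK or cert-manager command.

```shell
# Hypothetical helper: reads the output of
#   kubectl get certificates.cert-manager.io --all-namespaces --no-headers
# on stdin (first column: namespace) and prints each namespace, other than
# the excluded one (default: kserve), that still holds Certificate resources.
namespaces_using_cert_manager() {
  exclude=${1:-kserve}
  awk -v skip="$exclude" 'NF > 0 && $1 != skip { print $1 }' | sort -u
}

# Usage (requires cluster access):
#   kubectl get certificates.cert-manager.io --all-namespaces --no-headers \
#     | namespaces_using_cert_manager
```

Empty output suggests that no other namespace uses Certificate resources. Issuer and ClusterIssuer resources may be worth checking in the same way.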
-
Log on to the ACK console. In the left-side navigation pane, click Clusters.
-
On the Clusters page, click the name of your cluster. In the left-side navigation pane, choose Applications > Helm.
-
On the Helm page, find cert-manager and click Delete in the Actions column. In the dialog box, click OK.
-
Delete the cert-manager CRDs:
    kubectl delete crd certificaterequests.cert-manager.io
    kubectl delete crd certificates.cert-manager.io
    kubectl delete crd challenges.acme.cert-manager.io
    kubectl delete crd clusterissuers.cert-manager.io
    kubectl delete crd issuers.cert-manager.io
    kubectl delete crd orders.acme.cert-manager.io
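To confirm that no CRDs are left behind after the uninstall, the CRD list can be filtered by API group. The following is a minimal sketch. crds_in_group is a hypothetical helper that reads the output of kubectl get crd -o name and prints the CRDs that belong to a given group, such as cert-manager.io or serving.kserve.io.

```shell
# Hypothetical helper: reads `kubectl get crd -o name` output on stdin, e.g.
#   customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io
# and prints the CRD names that end with ".<group>".
crds_in_group() {
  group=$1
  sed 's|.*/||' | awk -v g="$group" '
    { n = length($0) - length(g) }
    n > 0 && substr($0, n + 1) == g && substr($0, n, 1) == "." { print }
  '
}

# Usage (requires cluster access):
#   kubectl get crd -o name | crds_in_group cert-manager.io
#   kubectl get crd -o name | crds_in_group serving.kserve.io
```

Empty output for both groups indicates that the CRDs listed in this topic were removed.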
Troubleshooting
Error during ack-kserve installation: TLS certificate verification failed
failed to call webhook: Post "https://cert-manager-webhook.cert-manager.svc:443/validate?timeout=30s": tls: failed to verify certificate: x509: certificate signed by unknown authority
ack-kserve requires cert-manager to be fully ready. This error occurs when cert-manager is not installed or its pods are not yet running.
-
Check whether cert-manager is installed:
    kubectl get crd | grep certificates.cert-manager.io
Expected output if cert-manager is installed:
    certificates.cert-manager.io 2024-05-06T07:09:17Z
If there is no output, install cert-manager first.
-
Check whether cert-manager pods are ready:
    kubectl -n cert-manager get po
All pods should show 1/1 in the READY column:
    NAME                                       READY   STATUS    RESTARTS   AGE
    cert-manager-7f4bb44d5b-jrrfn              1/1     Running   0          23h
    cert-manager-cainjector-79544456cc-qp5pp   1/1     Running   0          23h
    cert-manager-webhook-f74ccb647-7m5dt       1/1     Running   0          23h
If all pods are ready, uninstall ack-kserve and reinstall it.
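The readiness check can also be scripted by parsing the READY column. The following is a minimal sketch, assuming the column layout shown in the sample output for this step. check_pods_ready is a hypothetical helper that reads the pod list without its header row.

```shell
# Hypothetical helper: reads `kubectl -n cert-manager get po --no-headers`
# output on stdin and prints every pod whose READY column (second column,
# "ready/total") is not full. Exit status is 0 only when at least one pod
# is listed and all listed pods are fully ready.
check_pods_ready() {
  awk '
    NF >= 2 {
      seen++
      split($2, r, "/")
      if (r[1] != r[2]) { print $1 " is " $2; bad++ }
    }
    END { if (seen > 0 && bad == 0) exit 0; exit 1 }
  '
}

# Usage (requires cluster access):
#   kubectl -n cert-manager get po --no-headers | check_pods_ready
```

A zero exit status with no output corresponds to the case in which all pods show full readiness, such as 1/1.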