Deploying machine learning models on Kubernetes requires setting up model serving infrastructure that integrates with your cluster's storage, networking, and observability stack. ack-kserve is an Alibaba Cloud-optimized distribution of open-source KServe that handles this integration automatically—connecting with Alibaba Cloud storage, logging, and network capabilities so you can focus on serving models rather than configuring infrastructure.
This topic describes how to install, manage, and uninstall ack-kserve in an ACK cluster.
Prerequisites
Before you begin, ensure that you have:
- An ACK managed Pro cluster running Kubernetes 1.22 or later. See Create an ACK managed cluster.
- The NGINX Ingress controller installed. See Manage the NGINX Ingress controller.
Step 1: Install cert-manager
ack-kserve depends on cert-manager for TLS certificate management. Install cert-manager before installing ack-kserve.
- Log on to the ACK console. In the left-side navigation pane, click Clusters.
- On the Clusters page, click the name of your cluster. In the left-side navigation pane, choose Applications > Helm.
- In the upper-left corner of the Helm page, click Deploy. In the Basic Information step, set the Application Name field, select cert-manager in the Chart section, and click Next.
- In the Parameters step, review the Chart Version and Parameters values, then click OK.
After deployment, the cert-manager component appears on the Helm page.
Step 2: Install ack-kserve
By default, ack-kserve is deployed in RawDeployment mode and integrated with the NGINX Ingress controller.
- Log on to the ACK console. In the left-side navigation pane, click Clusters.
- On the Clusters page, click the name of your cluster. In the left-side navigation pane, choose Applications > Helm.
- In the upper-left corner of the Helm page, click Deploy. In the Basic Information step, set the Application Name field, select ack-kserve in the Chart section, and click Next.
- In the Parameters step, review the Chart Version and Parameters values, then click OK. After deployment, the ack-kserve component appears on the Helm page.
- Verify that ack-kserve is running:

      kubectl get pod -n kserve

  If Running is displayed in the STATUS column of the output, the ack-kserve component is installed.
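For scripted installs, the manual status check above can be replaced by a blocking wait. The following is a minimal sketch, assuming the component's pods run in the `kserve` namespace as in the command above. The function name `wait_for_pods` and the 300-second default timeout are illustrative choices, not part of ack-kserve.

```shell
#!/usr/bin/env bash
# Hypothetical helper: block until every pod in a namespace reports Ready.
# Usage: wait_for_pods <namespace> [timeout-seconds]
wait_for_pods() {
  local namespace="$1"
  local timeout="${2:-300}"   # assumed default; tune for your cluster

  # `kubectl wait` exits non-zero if the condition is not met in time.
  if kubectl wait --for=condition=Ready pod --all \
       -n "$namespace" --timeout="${timeout}s"; then
    echo "All pods in ${namespace} are Ready."
  else
    echo "Pods in ${namespace} did not become Ready within ${timeout}s." >&2
    return 1
  fi
}
```

For example, `wait_for_pods kserve 120` waits up to two minutes and returns a non-zero status on timeout, which makes it usable as a gate in a deployment script.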
(Optional) Step 3: View or update ack-kserve
View component details
On the Helm page, find the ack-kserve component and click View Details in the Actions column. The details page shows the Basic Information, Parameters, and History tabs.
Update the component
On the Helm page, find the ack-kserve component and click Update in the Actions column. In the Update Release panel, change the chart version or modify the parameter settings.
(Optional) Step 4: Uninstall ack-kserve components
Uninstall components in the following order to avoid leaving orphaned resources:
- Delete KServe custom resources
- Delete KServe CustomResourceDefinitions (CRDs)
- Uninstall ack-kserve
- Uninstall cert-manager
- Delete cert-manager CRDs
Delete KServe custom resources and CRDs
Custom resources cannot be restored after deletion. Before proceeding, make sure your InferenceService resources are no longer needed by your workloads.
- Back up and delete all InferenceService custom resources:

      # List all InferenceService resources in the cluster
      kubectl get isvc --all-namespaces
      # Back up all resources to a file
      kubectl get isvc --all-namespaces -o yaml > isvc.yaml.bak
      # Delete all resources in every namespace
      kubectl delete isvc --all --all-namespaces

- Delete the KServe CRDs.

  Note: Before you delete a CRD, you must delete all relevant custom resources. Otherwise, the CRD deletion may fail or hang.

      kubectl delete crd clusterservingruntimes.serving.kserve.io
      kubectl delete crd clusterstoragecontainers.serving.kserve.io
      kubectl delete crd inferencegraphs.serving.kserve.io
      kubectl delete crd inferenceservices.serving.kserve.io
      kubectl delete crd predictors.serving.kserve.io
      kubectl delete crd servingruntimes.serving.kserve.io
      kubectl delete crd trainedmodels.serving.kserve.io
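Because CRD deletion depends on the custom resources being gone first, a script can check that precondition explicitly. The sketch below is one possible guard. The function names `count_isvc` and `crds_safe_to_delete` are hypothetical, and the check covers only InferenceService resources, so other KServe resource types would need a similar check.

```shell
# Hypothetical helper: print how many InferenceService resources remain
# across all namespaces.
count_isvc() {
  # --no-headers prints one line per resource. Errors (for example, the CRD
  # is already gone) are silenced and counted as zero resources.
  kubectl get isvc --all-namespaces --no-headers 2>/dev/null | wc -l | tr -d ' '
}

# Hypothetical guard: succeed only when no InferenceService resources remain.
crds_safe_to_delete() {
  local remaining
  remaining="$(count_isvc)"
  if [ "$remaining" -eq 0 ]; then
    echo "No InferenceService resources remain; KServe CRDs can be deleted."
  else
    echo "${remaining} InferenceService resource(s) still exist; delete them first." >&2
    return 1
  fi
}
```

A script could then run `crds_safe_to_delete && kubectl delete crd inferenceservices.serving.kserve.io` so that the CRD is removed only after the guard passes.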
Uninstall ack-kserve
- Log on to the ACK console. In the left-side navigation pane, click Clusters.
- On the Clusters page, click the name of your cluster. In the left-side navigation pane, choose Applications > Helm.
- On the Helm page, find ack-kserve and click Delete in the Actions column. In the dialog box, click OK.
Uninstall cert-manager
Before you uninstall cert-manager, make sure that cert-manager is not used by other components in the cluster. This prevents service interruptions caused by the uninstallation of cert-manager.
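One way to look for other consumers is to list cert-manager Certificate resources across namespaces. The sketch below assumes that ack-kserve keeps its own certificates in the `kserve` namespace (the namespace used for the pod check earlier) and treats any Certificate elsewhere as a sign that another component relies on cert-manager. The function name `certs_outside_namespace` is hypothetical.

```shell
# Hypothetical helper: list Certificate resources that live outside a given
# namespace, printed as "namespace/name" lines.
certs_outside_namespace() {
  local own_namespace="$1"
  # With --all-namespaces and --no-headers, column 1 is the namespace and
  # column 2 is the resource name.
  kubectl get certificates.cert-manager.io --all-namespaces --no-headers 2>/dev/null \
    | awk -v ns="$own_namespace" '$1 != ns { print $1 "/" $2 }'
}
```

Running `certs_outside_namespace kserve` and getting no output suggests that no other namespace holds certificates. It is not conclusive: Issuer and ClusterIssuer resources, or Ingress annotations that request certificates, are also worth checking.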
- Log on to the ACK console. In the left-side navigation pane, click Clusters.
- On the Clusters page, click the name of your cluster. In the left-side navigation pane, choose Applications > Helm.
- On the Helm page, find cert-manager and click Delete in the Actions column. In the dialog box, click OK.
- Delete the cert-manager CRDs:

      kubectl delete crd certificaterequests.cert-manager.io
      kubectl delete crd certificates.cert-manager.io
      kubectl delete crd challenges.acme.cert-manager.io
      kubectl delete crd clusterissuers.cert-manager.io
      kubectl delete crd issuers.cert-manager.io
      kubectl delete crd orders.acme.cert-manager.io
Troubleshooting
Error during ack-kserve installation: TLS certificate verification failed
failed to call webhook: Post "https://cert-manager-webhook.cert-manager.svc:443/validate?timeout=30s": tls: failed to verify certificate: x509: certificate signed by unknown authority
ack-kserve requires cert-manager to be fully ready before installation. This error occurs when cert-manager is not installed or its pods are not yet running.
- Check whether cert-manager is installed:

      kubectl get crd | grep certificates.cert-manager.io

  If cert-manager is installed, the output is similar to:

      certificates.cert-manager.io   2024-05-06T07:09:17Z

  If there is no output, install cert-manager first. See Step 1: Install cert-manager.

- Check whether cert-manager pods are ready:

      kubectl -n cert-manager get po

  All pods should show 1/1 in the READY column. The expected output is similar to:

      NAME                                       READY   STATUS    RESTARTS   AGE
      cert-manager-7f4bb44d5b-jrrfn              1/1     Running   0          23h
      cert-manager-cainjector-79544456cc-qp5pp   1/1     Running   0          23h
      cert-manager-webhook-f74ccb647-7m5dt       1/1     Running   0          23h

  If all pods are ready, uninstall ack-kserve and reinstall it. See Uninstall ack-kserve.
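The readiness check above can also be automated by parsing the READY column. The following is a minimal sketch, assuming the default `kubectl get po` column layout shown in the expected output (NAME first, READY second). The function name `not_ready_pods` is hypothetical.

```shell
# Hypothetical helper: print the names of pods whose READY column shows fewer
# ready containers than expected (for example, 0/1).
not_ready_pods() {
  local namespace="$1"
  # --no-headers drops the header row, so every line is a pod.
  # Column 2 has the form "<ready>/<total>".
  kubectl -n "$namespace" get po --no-headers 2>/dev/null \
    | awk '{ split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}
```

If `not_ready_pods cert-manager` prints nothing, every listed pod is ready. Any printed name identifies a pod to inspect before reinstalling ack-kserve. Pods that have run to completion also show 0/1, so a printed name is a prompt to look closer rather than proof of a fault.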