ack-kserve is Alibaba Cloud's optimized distribution of open source KServe, integrated with Alibaba Cloud storage, logging, and network capabilities. It gives Kubernetes engineers a declarative API for deploying and managing machine learning inference services in Container Service for Kubernetes (ACK) clusters.
By default, ack-kserve runs in RawDeployment mode with the NGINX Ingress Controller. This mode gives you direct control over Kubernetes resources—suitable for most inference workloads, including GPU-accelerated models and long-running inference requests.
This topic walks you through installing, verifying, updating, and uninstalling ack-kserve.
Prerequisites
Before you begin, make sure you have:
-
An ACK Edge cluster running version 1.22 or later. For more information, see Create an ACK Edge cluster.
-
The NGINX Ingress Controller installed. For more information, see Deploy the Ingress Controller in an ACK Edge cluster.
Step 1: Install cert-manager
ack-kserve requires cert-manager to provision TLS certificates for its webhook. Install cert-manager before installing ack-kserve.
-
Log on to the Container Service Management Console. In the left navigation pane, click Clusters.
-
On the Clusters page, click your cluster name. In the left navigation pane, choose Applications > Helm.
-
In the upper-left corner of the Helm page, click Deploy. In the Basic Information step of the Deploy panel, set Application Name, select cert-manager in the Chart section, and then click Next.
-
In the Parameters step, review the Chart Version and the Parameters field, then click OK.
After deployment, cert-manager appears on the Helm page.
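To confirm from the command line that cert-manager is ready before you continue, a minimal sketch along the following lines may help. It assumes kubectl is configured for the cluster and that cert-manager runs in the cert-manager namespace, as the webhook URL in the Troubleshooting section suggests. The function name check_cert_manager and the 120-second timeout are illustrative, not part of ack-kserve.

```shell
# Hypothetical helper: returns 0 when the cert-manager CRD exists and all pods are Ready.
check_cert_manager() {
  # The namespace is an assumption; pass a different one as the first argument if needed.
  ns="${1:-cert-manager}"

  # The Certificate CRD is registered only after cert-manager is installed.
  if ! kubectl get crd certificates.cert-manager.io >/dev/null 2>&1; then
    echo "cert-manager CRDs not found"
    return 1
  fi

  # Block until every pod in the namespace reports Ready, or give up after the timeout.
  if ! kubectl -n "$ns" wait --for=condition=Ready pod --all --timeout=120s >/dev/null 2>&1; then
    echo "cert-manager pods are not ready"
    return 1
  fi

  echo "cert-manager is ready"
}
```

Run check_cert_manager with no arguments to use the assumed default namespace. A non-zero exit status indicates that Step 2 is likely to hit the webhook error described in Troubleshooting.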
Step 2: Install ack-kserve
-
Log on to the Container Service Management Console. In the left navigation pane, click Clusters.
-
On the Clusters page, click your cluster name. In the left navigation pane, choose Applications > Helm.
-
In the upper-left corner of the Helm page, click Deploy. In the Basic Information step of the Deploy panel, set Application Name, select ack-kserve in the Chart section, and then click Next.
-
In the Parameters step, review the Chart Version and the Parameters field, then click OK. After deployment, ack-kserve appears on the Helm page.
-
Update the inferenceservice-config configuration to point to your NGINX Ingress class. On the Helm page, click kserve, then click inferenceservice-config. Click Edit YAML and set the ingressClassName field to the ingressClassResource.name value you specified when you installed the NGINX Ingress Controller.
-
Verify that ack-kserve is running.
kubectl get pod -n kserve
If Running is returned for the STATUS column, the ack-kserve component is installed successfully.
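The ingress class setting from the steps above can also be checked from the command line. The sketch below assumes the ConfigMap is named inferenceservice-config, lives in the kserve namespace, and stores its ingress settings as JSON under the ingress key, which is the upstream KServe layout; ack-kserve may differ. Both function names are hypothetical.

```shell
# Hypothetical helper: prints the ingress section of the KServe ConfigMap.
# Assumes the upstream layout: ConfigMap inferenceservice-config in namespace kserve,
# with the ingress settings stored as a JSON string under the "ingress" key.
show_kserve_ingress_config() {
  kubectl -n kserve get configmap inferenceservice-config \
    -o jsonpath='{.data.ingress}'
}

# Hypothetical helper: succeeds if ingressClassName equals the expected class name.
ingress_class_matches() {
  expected="$1"
  show_kserve_ingress_config |
    grep -q "\"ingressClassName\"[[:space:]]*:[[:space:]]*\"$expected\""
}
```

For example, run kubectl get ingressclass to list the class names available in the cluster, then call ingress_class_matches with the name you expect. An exit status of 0 means the ConfigMap already points at that class.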
(Optional) Step 3: View or update ack-kserve
-
Log on to the Container Service Management Console. In the left navigation pane, click Clusters.
-
On the Clusters page, click your cluster name. In the left navigation pane, choose Applications > Helm.
-
To view component details, find ack-kserve and click View Details in the Actions column. The details page shows the Basic Information, Parameters, and History tabs.
-
To update the component, find ack-kserve and click Update in the Actions column. In the Update Release panel, change the version and modify parameters as needed.
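After an update, one way to see which images are actually running is to list the Deployments in the component's namespace. This is only a sketch: it assumes the component runs in the kserve namespace (the namespace used in the verification command in Step 2), and the function name list_kserve_images is hypothetical.

```shell
# Hypothetical helper: prints one line per Deployment in the form
# "<deployment name><TAB><container images>".
list_kserve_images() {
  # The namespace defaults to kserve; pass another one as the first argument.
  ns="${1:-kserve}"
  kubectl -n "$ns" get deployments \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.template.spec.containers[*].image}{"\n"}{end}'
}
```

Comparing the image tags before and after the update gives a quick indication of whether the new chart version has rolled out.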
(Optional) Step 4: Uninstall ack-kserve
Deleting a CustomResourceDefinition (CRD) also deletes all associated custom resources. This action cannot be undone. Before proceeding, make sure the resources are no longer needed.
Follow this order: delete KServe custom resources → delete KServe CRDs → uninstall ack-kserve → uninstall cert-manager → delete cert-manager CRDs.
Delete KServe custom resources
Back up and delete all InferenceService resources before uninstalling the component.
# List all InferenceService resources across namespaces
kubectl get isvc --all-namespaces
# Save a backup
kubectl get isvc --all-namespaces -oyaml > isvc.yaml.bak
# Delete all InferenceService resources in every namespace
kubectl delete isvc --all --all-namespaces
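The commands above cover InferenceService only. If the cluster also holds other KServe resource types, a loop such as the following sketch could back up each one before deletion. The kind list is taken from the CRD names in this topic; the function name, the default directory kserve-backup, and the one-file-per-kind layout are illustrative assumptions.

```shell
# KServe resource kinds, taken from the CRD names listed in this topic.
KSERVE_KINDS="clusterservingruntimes clusterstoragecontainers inferencegraphs inferenceservices predictors servingruntimes trainedmodels"

# Hypothetical helper: writes one YAML backup file per kind into the given directory.
backup_kserve_resources() {
  dir="${1:-kserve-backup}"
  mkdir -p "$dir" || return 1
  for kind in $KSERVE_KINDS; do
    # A kind whose CRD is not installed is skipped instead of aborting the loop.
    if kubectl get "$kind.serving.kserve.io" --all-namespaces -o yaml > "$dir/$kind.yaml" 2>/dev/null; then
      echo "saved $dir/$kind.yaml"
    else
      rm -f "$dir/$kind.yaml"
      echo "skipped $kind"
    fi
  done
}
```

Calling backup_kserve_resources ./backup leaves one file per installed kind in ./backup, so a single resource type can be restored later without touching the others.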
Delete KServe CRDs
Delete all KServe CustomResourceDefinitions (CRDs). You must delete all associated custom resources before deleting a CRD—otherwise the deletion fails.
kubectl delete crd clusterservingruntimes.serving.kserve.io
kubectl delete crd clusterstoragecontainers.serving.kserve.io
kubectl delete crd inferencegraphs.serving.kserve.io
kubectl delete crd inferenceservices.serving.kserve.io
kubectl delete crd predictors.serving.kserve.io
kubectl delete crd servingruntimes.serving.kserve.io
kubectl delete crd trainedmodels.serving.kserve.io
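To confirm that no KServe CRDs are left after running the commands above, a check along these lines can be used. It is a sketch that filters on the serving.kserve.io API group shown in the CRD names; the function name is hypothetical.

```shell
# Hypothetical helper: prints any KServe CRDs that are still registered.
# Returns 1 if at least one remains, 0 if none remain.
remaining_kserve_crds() {
  # "|| true" keeps an empty grep result from being treated as an error.
  left="$(kubectl get crd -o name | grep 'serving\.kserve\.io$' || true)"
  if [ -n "$left" ]; then
    echo "$left"
    return 1
  fi
  echo "no KServe CRDs remain"
}
```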
Uninstall ack-kserve
-
Log on to the Container Service Management Console. In the left navigation pane, click Clusters.
-
On the Clusters page, click your cluster name. In the left navigation pane, choose Applications > Helm.
-
On the Helm page, find ack-kserve and click Delete in the Actions column. In the Delete dialog box, click OK.
Uninstall cert-manager
Before uninstalling cert-manager, confirm that no other components in the cluster depend on it. Removing cert-manager while other components still use it causes service interruptions.
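One way to look for remaining dependents is to list the cert-manager resources that still exist in the cluster. The sketch below checks only Certificate, Issuer, and ClusterIssuer objects, so an empty result is a hint rather than proof that nothing depends on cert-manager. The function names are hypothetical.

```shell
# Hypothetical helper: lists Certificate, Issuer, and ClusterIssuer objects in all namespaces.
list_cert_manager_resources() {
  kubectl get certificates.cert-manager.io,issuers.cert-manager.io,clusterissuers.cert-manager.io \
    --all-namespaces -o name 2>/dev/null
}

# Hypothetical helper: succeeds if at least one such object still exists.
cert_manager_in_use() {
  [ -n "$(list_cert_manager_resources)" ]
}
```

If cert_manager_in_use succeeds, review the output of list_cert_manager_resources and identify the owners of those objects before you delete cert-manager.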
-
Log on to the Container Service Management Console. In the left navigation pane, click Clusters.
-
On the Clusters page, click your cluster name. In the left navigation pane, choose Applications > Helm.
-
On the Helm page, find cert-manager and click Delete in the Actions column. In the Delete dialog box, click OK.
-
Delete the cert-manager CRDs.
kubectl delete crd certificaterequests.cert-manager.io
kubectl delete crd certificates.cert-manager.io
kubectl delete crd challenges.acme.cert-manager.io
kubectl delete crd clusterissuers.cert-manager.io
kubectl delete crd issuers.cert-manager.io
kubectl delete crd orders.acme.cert-manager.io
Troubleshooting
TLS webhook error during ack-kserve installation
Error message:
failed to call webhook: Post "https://cert-manager-webhook.cert-manager.svc:443/validate?timeout=30s": tls: failed to verify certificate: x509: certificate signed by unknown authority
This error means cert-manager is not installed or its pods are not ready yet. ack-kserve depends on cert-manager to issue TLS certificates for its admission webhook.
Check whether cert-manager is installed:
kubectl get crd | grep certificates.cert-manager.io
If cert-manager is installed, the output includes:
certificates.cert-manager.io 2024-05-06T07:09:17Z
If no output is returned, install cert-manager. For more information, see Step 1: Install cert-manager.
Check whether cert-manager pods are ready:
kubectl -n cert-manager get po
All three pods should be 1/1 Running:
NAME                                       READY   STATUS    RESTARTS   AGE
cert-manager-7f4bb44d5b-jrrfn              1/1     Running   0          23h
cert-manager-cainjector-79544456cc-qp5pp   1/1     Running   0          23h
cert-manager-webhook-f74ccb647-7m5dt       1/1     Running   0          23h
If all pods are ready but the error persists, uninstall ack-kserve and reinstall it. For more information, see Uninstall ack-kserve.
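Before reinstalling, it may be worth checking whether the cert-manager webhook configuration has received its CA bundle, because an x509 unknown-authority error can also appear while the cainjector pod has not yet injected it. The sketch below assumes the ValidatingWebhookConfiguration is named cert-manager-webhook, which is cert-manager's default naming and may differ in your installation. The function name is hypothetical.

```shell
# Hypothetical helper: succeeds when the webhook configuration carries a non-empty CA bundle.
# The resource name cert-manager-webhook is an assumption based on cert-manager's default naming.
webhook_ca_injected() {
  name="${1:-cert-manager-webhook}"
  bundle="$(kubectl get validatingwebhookconfiguration "$name" \
    -o jsonpath='{.webhooks[0].clientConfig.caBundle}' 2>/dev/null)"
  [ -n "$bundle" ]
}
```

If webhook_ca_injected fails even though all three pods are Running, waiting briefly and retrying the ack-kserve installation may be enough.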