Container Compute Service: Deploy the KServe component

Last Updated: Feb 27, 2026

KServe is a Kubernetes-based framework for serving machine learning models. It lets you deploy trained models, such as those served by TFServing, TorchServe, or Triton inference servers, as Kubernetes custom resources defined by CustomResourceDefinitions (CRDs), which simplifies and accelerates deploying, updating, and scaling models. You can install the core component, the KServe controller, from the ACS console.

How KServe works

The KServe controller manages InferenceService custom resources and creates a Knative Service for each of them, which automates resource scaling.

When traffic increases, the Deployment behind the Knative Service is scaled out accordingly. When no requests are received, the Service's pods are scaled to zero, so an idle model does not hold compute resources.
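The scaling behavior described above can be illustrated with a small arithmetic sketch. This is not the actual autoscaler logic, which also takes time windows and other signals into account. It only shows the basic idea under the assumption of a per-pod concurrency target. The function name and parameters are hypothetical.

```python
import math


def desired_replicas(observed_concurrency: float,
                     target_per_pod: float,
                     min_replicas: int = 0,
                     max_replicas: int = 10) -> int:
    """Simplified, illustrative scaling rule (an assumption, not KServe code).

    Returns how many pods would be needed so that each pod handles
    at most `target_per_pod` concurrent requests.
    """
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    if observed_concurrency <= 0:
        # No in-flight requests: fall back to the floor, which may be zero.
        return min_replicas
    wanted = math.ceil(observed_concurrency / target_per_pod)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(wanted, max_replicas))


if __name__ == "__main__":
    for load in (0, 3, 25, 500):
        print(load, "concurrent requests ->",
              desired_replicas(load, target_per_pod=10), "pods")
```

With a floor of zero, an idle service needs no pods at all, while a floor of one keeps a warm pod available and avoids the cold start of the first request.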

Model serving runtimes

KServe includes two built-in model serving runtimes:

Runtime        Description
ModelServer    A Python-based runtime that implements KServe prediction protocol v1
MLServer       A runtime that implements KServe prediction protocol v2 with both REST and gRPC support

Both runtimes provide out-of-the-box model serving. For more complex use cases, build a custom model server using KServe's API primitives or tools such as BentoML.
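To make the difference between the two prediction protocols more concrete, the following sketch builds the REST path and JSON body for each one. It follows the request shapes of the upstream KServe v1 and v2 (Open Inference Protocol) REST APIs. The model name, input tensor name, and sample values are hypothetical, and the helper functions are illustrative only.

```python
import json


def v1_request(model_name, instances):
    """Return (path, body) for a prediction protocol v1 REST call."""
    return f"/v1/models/{model_name}:predict", {"instances": instances}


def v2_request(model_name, input_name, shape, datatype, data):
    """Return (path, body) for a prediction protocol v2 REST call.

    In v2 each input tensor carries an explicit name, shape, and datatype.
    """
    body = {
        "inputs": [
            {
                "name": input_name,
                "shape": shape,
                "datatype": datatype,
                "data": data,
            }
        ]
    }
    return f"/v2/models/{model_name}/infer", body


if __name__ == "__main__":
    # "sklearn-iris" and "input-0" are placeholder names for illustration.
    sample = [6.8, 2.8, 4.8, 1.4]
    for path, body in (
        v1_request("sklearn-iris", [sample]),
        v2_request("sklearn-iris", "input-0", [1, 4], "FP32", sample),
    ):
        print("POST", path)
        print(json.dumps(body))
```

The v1 body is a plain list of instances, whereas the v2 body describes each tensor explicitly, which is what allows the same payload structure to be used over both REST and gRPC.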

Serverless features

After you deploy a model as a KServe InferenceService backed by Knative, the following serverless features become available:

  • Scale to zero

  • Auto scaling based on requests per second (RPS), concurrency, and CPU and GPU metrics

  • Version management

  • Traffic management

  • Security authentication

  • Out-of-the-box metrics
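Several of these features are typically configured on the InferenceService itself. The sketch below assembles a manifest as a Python dictionary, using field names from the upstream KServe v1beta1 API (minReplicas, maxReplicas, scaleMetric, scaleTarget, canaryTrafficPercent). The service name, storage URI, and all values are hypothetical, and whether every field is honored in an ACS cluster is an assumption that should be checked against the ACS documentation.

```python
import json


def build_inference_service(name, storage_uri, model_format="sklearn",
                            min_replicas=0, max_replicas=3,
                            scale_metric="concurrency", scale_target=10,
                            canary_percent=None):
    """Build an InferenceService manifest as a dict (illustrative sketch)."""
    predictor = {
        "minReplicas": min_replicas,   # 0 allows scale to zero
        "maxReplicas": max_replicas,
        "scaleMetric": scale_metric,   # for example "concurrency" or "rps"
        "scaleTarget": scale_target,
        "model": {
            "modelFormat": {"name": model_format},
            "storageUri": storage_uri,
        },
    }
    if canary_percent is not None:
        # Share of traffic routed to the newest revision.
        predictor["canaryTrafficPercent"] = canary_percent
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {"predictor": predictor},
    }


if __name__ == "__main__":
    # Placeholder name and storage location, for illustration only.
    manifest = build_inference_service(
        "sklearn-iris", "s3://example-bucket/models/iris", canary_percent=10)
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and passed to kubectl apply -f, because kubectl accepts JSON manifests as well as YAML.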

Deploy KServe

Prerequisites

Before you begin, ensure that you have:

  • Knative deployed in your ACS cluster. For more information, see Deploy Knative.

Procedure

  1. Log on to the ACS console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find the target cluster and click its ID. In the left-side navigation pane, choose Applications > Knative.

  3. On the Components tab, find KServe and click Deploy in the Actions column. Click Confirm in the dialog box. The deployment may take several minutes to complete.

Verify the deployment

After the deployment completes, check the Status column of the KServe component on the Components tab. A status of Deployed confirms that the component is installed.
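As an optional cross-check from the command line, you could also look for the InferenceService CRD in the cluster. The sketch below assumes that kubectl is installed and configured for the target cluster, and that the component registers the upstream CRD name inferenceservices.serving.kserve.io. Neither assumption is confirmed by this page. The helper names are hypothetical.

```python
import subprocess

# Upstream KServe CRD name; assumed to be what the ACS component installs.
KSERVE_CRD = "inferenceservices.serving.kserve.io"


def crd_installed(kubectl_output: str, crd_name: str = KSERVE_CRD) -> bool:
    """Check the output of `kubectl get crd -o name` for a CRD name.

    Each output line looks like
    `customresourcedefinition.apiextensions.k8s.io/<crd-name>`.
    """
    names = {
        line.strip().split("/", 1)[-1]
        for line in kubectl_output.splitlines()
        if line.strip()
    }
    return crd_name in names


def check_cluster() -> bool:
    """Run kubectl against the current context and look for the CRD."""
    result = subprocess.run(
        ["kubectl", "get", "crd", "-o", "name"],
        capture_output=True, text=True, check=True,
    )
    return crd_installed(result.stdout)
```

Calling check_cluster() returns True when the CRD is present. The Status column in the console remains the check that this page documents.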

Next steps

Quickly deploy an InferenceService using KServe