KServe is an open source, cloud-native model serving platform that simplifies deploying and running machine learning models on Kubernetes. KServe supports multiple machine learning frameworks and provides autoscaling. You deploy a model by describing it in a short YAML configuration file through the KServe declarative API, which makes model services easy to configure and manage.
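To make the declarative approach concrete, the following minimal sketch builds an InferenceService manifest as a Python dictionary and prints it as JSON, which kubectl accepts in place of YAML. The `serving.kserve.io/v1beta1` API group and the `modelFormat` and `storageUri` fields follow the upstream KServe API. The helper name `build_inference_service`, the service name, and the storage URI are assumptions made for illustration only.

```python
import json


def build_inference_service(name, model_format, storage_uri, namespace="default"):
    """Return a KServe InferenceService manifest as a plain dict.

    Hypothetical helper for illustration; it is not part of KServe.
    """
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "predictor": {
                "model": {
                    # Framework of the stored model, for example "sklearn".
                    "modelFormat": {"name": model_format},
                    # Location of the model artifacts.
                    "storageUri": storage_uri,
                }
            }
        },
    }


manifest = build_inference_service(
    name="sklearn-iris",
    model_format="sklearn",
    # Placeholder URI (assumption); replace with your own model location.
    storage_uri="gs://example-bucket/models/sklearn/iris",
)

# The printed JSON can be saved to a file and applied with kubectl.
print(json.dumps(manifest, indent=2))
```

Saving the output to a file and running `kubectl apply -f` on it should create the service, assuming KServe is already installed in the cluster.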
Framework
KServe provides a set of CustomResourceDefinitions (CRDs) to manage and deliver machine learning model services. It offers easy-to-use, high-level interfaces and standardized data plane protocols for a wide range of model frameworks, such as TensorFlow, XGBoost, scikit-learn, PyTorch, and Hugging Face Transformers/LLMs. In addition, KServe encapsulates the complexity of autoscaling, networking, health checking, and server configuration, and provides features such as GPU autoscaling, Scale to Zero, and Canary Rollouts. These features simplify the deployment and maintenance of AI models.
For more information, see KServe.
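As an illustration of the standardized data plane protocols mentioned above, the following sketch builds request URLs and JSON bodies for the two protocol versions defined upstream: the V1 `:predict` endpoint and the V2 (Open Inference Protocol) `/infer` endpoint. No network call is made. The host name, model name, tensor name, and sample values are assumptions, and the helper functions are hypothetical.

```python
import json


def v1_predict_request(host, model_name, instances):
    """Build the URL and JSON body for a V1 protocol predict call."""
    url = f"http://{host}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances})
    return url, body


def v2_infer_request(host, model_name, tensor_name, shape, datatype, data):
    """Build the URL and JSON body for a V2 (Open Inference Protocol) call."""
    url = f"http://{host}/v2/models/{model_name}/infer"
    body = json.dumps(
        {
            "inputs": [
                {
                    "name": tensor_name,
                    "shape": shape,
                    "datatype": datatype,
                    "data": data,
                }
            ]
        }
    )
    return url, body


# Hypothetical host and sample feature vector (assumptions).
sample = [[6.8, 2.8, 4.8, 1.4]]
v1_url, v1_body = v1_predict_request("sklearn-iris.example.com", "sklearn-iris", sample)
v2_url, v2_body = v2_infer_request(
    "sklearn-iris.example.com", "sklearn-iris", "input-0", [1, 4], "FP32", sample
)

print(v1_url)
print(v1_body)
print(v2_url)
print(v2_body)
```

Because every model framework is served through the same request shapes, a client written against one of these protocols does not need to change when the framework behind the model changes.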
Deployment modes
KServe provides three deployment modes: Raw Deployment, Serverless, and ModelMesh. The supported KServe features vary by deployment mode.
Deployment mode | Description | References |
Raw Deployment | Raw Deployment is the simplest KServe deployment mode and depends only on cert-manager and gateways. This deployment mode supports features such as autoscaling, Prometheus monitoring, Canary Rollouts with specific gateways, and GPU autoscaling. | |
Serverless | The Serverless deployment mode depends on cert-manager, gateways, and Knative. This deployment mode supports features such as autoscaling, Scale to Zero, Canary Rollouts, and GPU autoscaling. | |
ModelMesh | The ModelMesh deployment mode depends on cert-manager, Knative, and ModelMesh. For example, ModelMesh is used to deploy Service Mesh (ASM). This deployment mode supports features such as autoscaling, Scale to Zero, Canary Rollouts, and GPU autoscaling. | N/A |
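In upstream KServe, the deployment mode of an individual service can be selected with the `serving.kserve.io/deploymentMode` annotation, and Scale to Zero is requested by setting `minReplicas` to 0 on the predictor. The sketch below applies both to a manifest held as a Python dictionary. The mode values listed are the ones used by upstream KServe releases at the time of writing and may differ in your version, so treat them as an assumption. The helper names and the storage URI are hypothetical.

```python
import copy
import json

DEPLOYMENT_MODE_ANNOTATION = "serving.kserve.io/deploymentMode"
# Assumed annotation values; verify them against your KServe version.
KNOWN_MODES = ("RawDeployment", "Serverless", "ModelMesh")


def with_deployment_mode(manifest, mode):
    """Return a copy of the manifest annotated with the given deployment mode."""
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown deployment mode: {mode!r}")
    updated = copy.deepcopy(manifest)
    annotations = updated.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[DEPLOYMENT_MODE_ANNOTATION] = mode
    return updated


def with_scale_to_zero(manifest):
    """Return a copy of the manifest whose predictor may scale down to zero."""
    updated = copy.deepcopy(manifest)
    updated["spec"]["predictor"]["minReplicas"] = 0
    return updated


base = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {"name": "sklearn-iris"},
    "spec": {
        "predictor": {
            "model": {
                "modelFormat": {"name": "sklearn"},
                # Placeholder URI (assumption).
                "storageUri": "gs://example-bucket/models/sklearn/iris",
            }
        }
    },
}

raw = with_deployment_mode(base, "RawDeployment")
serverless = with_scale_to_zero(with_deployment_mode(base, "Serverless"))

print(json.dumps(raw["metadata"], indent=2))
print(json.dumps(serverless["spec"]["predictor"], indent=2))
```

Both helpers return modified copies rather than changing `base` in place, so one base manifest can be reused to compare modes side by side.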
ack-kserve installation
For more information about how to deploy and manage ack-kserve in a Container Service for Kubernetes (ACK) cluster, see Install ack-kserve.