The ACK Gateway with Inference Extension component is an enhanced component built on the Kubernetes community Gateway API and its Inference Extension. It supports Layer 4 and Layer 7 routing in Kubernetes and provides intelligent load balancing for large language model (LLM) inference services. This topic introduces the component and describes its usage notes and release notes.
Component information
The ACK Gateway with Inference Extension component is built on the Envoy Gateway project. It is compatible with Gateway API and integrates the Gateway API Inference Extension. The component provides load balancing and routing for LLM inference services.
Usage notes
The installation and use of the ACK Gateway with Inference Extension component depend on the CustomResourceDefinitions (CRDs) provided by the Gateway API component. Before you install ACK Gateway with Inference Extension, make sure that the Gateway API component is installed in the cluster. For more information, see Install the application backup component.
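The CRD prerequisite can be checked before installation. The following Python snippet is a minimal sketch, not an official tool: it compares the CRDs in the cluster against a list of core Gateway API CRDs. The `REQUIRED_CRDS` list and both function names are assumptions made for illustration. This topic does not state which CRDs the component requires, so adjust the list to match your component version.

```python
import subprocess

# Assumption: these core Gateway API CRDs are used for illustration only.
# The exact set that the component requires is not listed in this topic.
REQUIRED_CRDS = (
    "gatewayclasses.gateway.networking.k8s.io",
    "gateways.gateway.networking.k8s.io",
    "httproutes.gateway.networking.k8s.io",
)


def missing_crds(installed, required=REQUIRED_CRDS):
    """Return the required CRD names that are absent from `installed`."""
    # `kubectl get crd -o name` prints entries in the form
    # "customresourcedefinition.apiextensions.k8s.io/<crd-name>",
    # so keep only the part after the last slash.
    present = {
        name.strip().rsplit("/", 1)[-1] for name in installed if name.strip()
    }
    return [crd for crd in required if crd not in present]


def list_installed_crds():
    """List the CRDs in the current cluster by calling kubectl.

    Requires kubectl on PATH and a kubeconfig that points to the cluster.
    """
    result = subprocess.run(
        ["kubectl", "get", "crd", "-o", "name"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


# Example usage (requires cluster access):
#   print(missing_crds(list_installed_crds()))
```

If the returned list is empty, the checked CRDs are present. Otherwise, install the Gateway API component before you install ACK Gateway with Inference Extension.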
ACK Gateway with Inference Extension is available only to whitelisted users. If you cannot find this component on the Add-ons page in the ACK console, submit a ticket.
Release notes
March 2025
Version number | Release date | Description | Impact |
v1.3.0-aliyun.1 | 2025-03-12 | | No impact on workloads. |