Container Service for Kubernetes: ACK Gateway with Inference Extension

Last Updated: Mar 21, 2025

The ACK Gateway with Inference Extension component is an enhanced component based on the Kubernetes community Gateway API and its Inference Extension. It supports Layer 4 and Layer 7 routing in Kubernetes and provides intelligent load balancing for large language model (LLM) inference services. This topic introduces the component and provides its usage notes and release notes.

Component information

The ACK Gateway with Inference Extension component is built on the Envoy Gateway project. It is compatible with Gateway API and integrates the Inference Extension of Gateway API. The component provides load balancing and routing for LLM inference services.

Usage notes

Installing and using the ACK Gateway with Inference Extension component depends on the CustomResourceDefinitions (CRDs) provided by the Gateway API component. Before you install ACK Gateway with Inference Extension, make sure that the Gateway API component is installed in the cluster. For more information, see Install the application backup component.
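The prerequisite above can be checked before installation. The following is a minimal sketch, assuming that kubectl is configured for the cluster and that the three core Gateway API CRDs listed in REQUIRED_CRDS are representative of what the component needs; that list and the helper names missing_gateway_api_crds and list_installed_crds are illustrative assumptions, not part of the ACK documentation.

```python
import subprocess

# Core Gateway API CRD names. Treating exactly these three as the required
# set is an assumption made for illustration.
REQUIRED_CRDS = (
    "gatewayclasses.gateway.networking.k8s.io",
    "gateways.gateway.networking.k8s.io",
    "httproutes.gateway.networking.k8s.io",
)


def missing_gateway_api_crds(installed, required=REQUIRED_CRDS):
    """Return the required CRD names that are absent from `installed`.

    `installed` is an iterable of lines as printed by
    `kubectl get crd -o name`, for example
    "customresourcedefinition.apiextensions.k8s.io/<crd-name>".
    """
    # Keep only the part after the last slash so that both prefixed and
    # bare CRD names are accepted.
    names = {line.strip().rsplit("/", 1)[-1] for line in installed if line.strip()}
    return [crd for crd in required if crd not in names]


def list_installed_crds():
    """Query the cluster for installed CRDs (requires kubectl on PATH)."""
    result = subprocess.run(
        ["kubectl", "get", "crd", "-o", "name"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()
```

On a machine with cluster access, missing_gateway_api_crds(list_installed_crds()) returns an empty list when all of the listed CRDs are present, and otherwise returns the names that are still missing.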

Note

ACK Gateway with Inference Extension is available only to whitelisted users. If you cannot find this component on the Add-ons page in the ACK console, submit a ticket.

Release notes

March 2025

Version number: v1.3.0-aliyun.1

Release date: 2025-03-12

Description:

  • Gateway API v1.2 is supported.

  • Inference Extension is supported to provide intelligent load balancing for LLM inference scenarios.

Impact: No impact on workloads.