Container Service for Kubernetes: Use Gateway with Inference Extension to expose Services

Last Updated: Mar 26, 2026

Gateway with Inference Extension supports all basic Gateway API features as well as the open-source Envoy Gateway extensions. This topic covers four common traffic management tasks: HTTP path routing, request header modification, proportional traffic splitting, and TLS termination.

How it works

Gateway with Inference Extension is built on the Envoy Gateway project. Its architecture has two layers:

  • Control plane (Envoy Gateway components): monitors traffic routing rules, then dynamically creates and manages Envoy Proxy instances. The control plane updates forwarding rules but does not directly forward traffic.

  • Data plane (Envoy Proxy instances): processes and forwards traffic efficiently and reliably.

Prerequisites

Before you begin, ensure that you have:

Set up the environment

Complete this section once before running any of the traffic management tasks below.

Step 1: Deploy the backend applications

Save the following YAML as backend.yaml, then apply it to your cluster with this command:

kubectl apply -f backend.yaml

Contents of backend.yaml:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: backend
---
apiVersion: v1
kind: Service
metadata:
  name: backend
  labels:
    app: backend
    service: backend
spec:
  ports:
    - name: http
      port: 3000
      targetPort: 3000
  selector:
    app: backend
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend
      version: v1
  template:
    metadata:
      labels:
        app: backend
        version: v1
    spec:
      serviceAccountName: backend
      containers:
        - image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/envoygateway-echo-basic:v20231214-v1.0.0-140-gf544a46e
          imagePullPolicy: IfNotPresent
          name: backend
          ports:
            - containerPort: 3000
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace

Create a second backend application by copying backend.yaml, replacing every occurrence of backend with backend-2, then applying the updated manifest.
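
The copy-and-replace step can be scripted. The following is a minimal sketch, assuming backend.yaml is in the current directory. The function name make_second_backend and the output file name backend-2.yaml are illustrative and not part of the original procedure.

```shell
# Hypothetical helper: write a copy of a manifest in which every occurrence
# of "backend" is replaced with "backend-2".
make_second_backend() {
  src=$1   # source manifest, for example backend.yaml
  dst=$2   # output manifest, for example backend-2.yaml
  sed 's/backend/backend-2/g' "$src" > "$dst"
}
```

For example, run make_second_backend backend.yaml backend-2.yaml, review the generated file, and then run kubectl apply -f backend-2.yaml.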

Step 2: Verify the GatewayClass

Installing Gateway with Inference Extension automatically creates a default GatewayClass named ack-gateway. Verify that it exists:

kubectl get gatewayclass

The output should show ACCEPTED: True, which confirms that Envoy Gateway is managing the GatewayClass:

NAME          CONTROLLER                                      ACCEPTED   AGE
ack-gateway   gateway.envoyproxy.io/gatewayclass-controller   True       2m31s
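
In a script, the ACCEPTED column can be extracted instead of read by eye. The following is a minimal sketch, assuming the default column layout shown above (NAME, CONTROLLER, ACCEPTED, AGE). The function name gatewayclass_accepted is hypothetical.

```shell
# Hypothetical helper: read `kubectl get gatewayclass` output on stdin and
# print the ACCEPTED column for the named GatewayClass.
gatewayclass_accepted() {
  awk -v name="$1" 'NR > 1 && $1 == name { print $3 }'
}
```

Piping kubectl get gatewayclass into gatewayclass_accepted ack-gateway prints True for the sample output above. The helper prints nothing if no GatewayClass with that name is listed.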

If the GatewayClass was not created automatically, create it manually:

  1. Save the following YAML as gatewayclass.yaml:

    apiVersion: gateway.networking.k8s.io/v1
    kind: GatewayClass
    metadata:
      name: ack-gateway
    spec:
      controllerName: gateway.envoyproxy.io/gatewayclass-controller
  2. Apply the manifest:

    kubectl apply -f gatewayclass.yaml

Step 3: Create the Gateway

Save the following YAML as gateway.yaml, then apply it with this command:

kubectl apply -f gateway.yaml

Contents of gateway.yaml:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: ack-gateway
spec:
  gatewayClassName: ack-gateway
  listeners:
    - name: http
      protocol: HTTP
      port: 80

The control plane automatically provisions an Envoy Proxy Deployment and a LoadBalancer Service based on your Gateway configuration. The Service listens on the ports defined in the Gateway spec. For billing details on the provisioned Server Load Balancer (SLB) instance, see SLB billing.

To customize the Envoy Proxy Deployment specs and Service parameters, see Customize EnvoyProxy. To enable the Horizontal Pod Autoscaler (HPA) for the gateway, see EnvoyProxy API reference.

Step 4: Get the gateway IP address

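Store the gateway address in the GATEWAY_HOST environment variable. The test commands in the following sections use it:

export GATEWAY_HOST=$(kubectl get gateway/ack-gateway -o jsonpath='{.status.addresses[0].value}')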
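
GATEWAY_HOST can end up empty if the Gateway status does not list an address yet, for example while the load balancer is still being provisioned. The following is a minimal sketch of a guard to run before the test commands. The function name require_gateway_host is hypothetical.

```shell
# Hypothetical guard: fail early if GATEWAY_HOST is unset or empty.
require_gateway_host() {
  if [ -z "${GATEWAY_HOST:-}" ]; then
    echo "GATEWAY_HOST is empty; the gateway may not have an address yet" >&2
    return 1
  fi
  echo "Using gateway address: $GATEWAY_HOST"
}
```

If the guard fails, wait a short time, rerun the export command, and check again.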

Route HTTP traffic by path prefix

An HTTPRoute matches incoming requests and forwards them to a backend Service. The following example routes requests for the host www.example.com whose path starts with the /get prefix to the backend Service.

  1. Save the following YAML as httproute.yaml, then apply it with this command:

    kubectl apply -f httproute.yaml

    Contents of httproute.yaml:

    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: backend
    spec:
      parentRefs:
        - name: ack-gateway
      hostnames:
        - "www.example.com"
      rules:
        - backendRefs:
            - group: ""
              kind: Service
              name: backend
              port: 3000
              weight: 1
          matches:
            - path:
                type: PathPrefix
                value: /get
  2. Send a test request:

    curl -H "Host: www.example.com" http://$GATEWAY_HOST/get

    The backend echo application returns the request details. A response similar to the following confirms the route is working:

    {
     "path": "/get",
     "host": "www.example.com",
     "method": "GET",
     "proto": "HTTP/1.1",
     "headers": {
      "Accept": ["*/*"],
      "User-Agent": ["curl/8.9.1"],
      "X-Envoy-External-Address": ["115.XX.XXX.55"],
      "X-Forwarded-For": ["115.XX.XXX.55"],
      "X-Forwarded-Proto": ["http"],
      "X-Request-Id": ["953b2f8f-26d3-4ba9-93ba-a482b197b1ff"]
     },
     "namespace": "default",
     "ingress": "",
     "service": "",
     "pod": "backend-5bff7XXXXX-XXXXX"
    }
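
In the Gateway API specification, a PathPrefix match works on whole path segments rather than on raw string prefixes. The following sketch mimics that rule locally to show which request paths the /get prefix covers. It is an illustration only, not the gateway's implementation, and the function name path_prefix_matches is hypothetical.

```shell
# Illustrative only: mimic PathPrefix matching on whole path segments.
path_prefix_matches() {
  prefix=${1%/}   # a trailing slash on the prefix is ignored
  path=$2
  case "$path" in
    "$prefix" | "$prefix"/*) return 0 ;;
    *) return 1 ;;
  esac
}

for p in /get /get/ /get/items /getx /other; do
  if path_prefix_matches /get "$p"; then
    echo "$p: match"
  else
    echo "$p: no match"
  fi
done
```

The loop reports a match for /get, /get/, and /get/items, and no match for /getx and /other.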

Add a request header

Use the RequestHeaderModifier filter to inject headers into requests before they reach the backend. The following example adds added-header: foo to every request matching the /get prefix.

  1. Update httproute.yaml with the content shown below, then apply it with this command:

    kubectl apply -f httproute.yaml

    Updated contents of httproute.yaml:

    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: backend
    spec:
      parentRefs:
        - name: ack-gateway
      hostnames:
        - "www.example.com"
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /get
        backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
          weight: 1
        filters:
        - type: RequestHeaderModifier
          requestHeaderModifier:
            add:
            - name: "added-header"
              value: "foo"
  2. Send a test request:

    curl -H "Host: www.example.com" http://$GATEWAY_HOST/get

    The response should include "Added-Header": ["foo"], which confirms the header was injected:

    {
     "path": "/get",
     "host": "www.example.com",
     "method": "GET",
     "proto": "HTTP/1.1",
     "headers": {
      "Accept": ["*/*"],
      "Added-Header": ["foo"],
      "User-Agent": ["curl/8.9.1"],
      "X-Envoy-External-Address": ["115.XX.XXX.55"],
      "X-Forwarded-For": ["115.XX.XXX.55"],
      "X-Forwarded-Proto": ["http"],
      "X-Request-Id": ["d37f19e5-25c1-45cf-90e5-51453e7ae3ed"]
     },
     "namespace": "default",
     "ingress": "",
     "service": "",
     "pod": "backend-5bff7XXXXX-XXXXX"
    }
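
The header check can also be automated. The following is a minimal sketch, assuming the echo application formats its response as shown above, with one header per line. The function name has_echoed_header is hypothetical.

```shell
# Hypothetical helper: read the echo response on stdin and succeed only if
# the given header appears with the given value.
has_echoed_header() {
  grep -qF "\"$1\": [\"$2\"]"
}
```

For example, pipe the curl command from step 2 (with the -sS options added) into has_echoed_header Added-Header foo. The helper exits with status 0 when the header is present and with a nonzero status otherwise.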

Split traffic proportionally

Set weight values in backendRefs to distribute traffic across multiple Services. Each Service receives a share equal to its weight divided by the sum of all weights, so the weights do not need to add up to 100.

The following example sends 80% of traffic to backend and 20% to backend-2 by setting weights of 8 and 2.
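
The weight arithmetic can be checked with a short calculation. The following sketch computes each weight's share as a whole-number percentage (integer division, so fractions are truncated). The function name weight_shares is hypothetical.

```shell
# Print each weight's share of traffic as a whole-number percentage.
weight_shares() {
  total=0
  for w in "$@"; do
    total=$((total + w))
  done
  [ "$total" -gt 0 ] || return 1   # no weights given, or all weights are zero
  for w in "$@"; do
    echo "$((100 * w / total))%"
  done
}

weight_shares 8 2   # prints 80% and 20%, one per line
```

Weights of 4 and 1, or 80 and 20, produce the same 80/20 split.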

  1. Update httproute.yaml with the content shown below, then apply it with this command:

    kubectl apply -f httproute.yaml

    Updated contents of httproute.yaml:

    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: backend
    spec:
      parentRefs:
        - name: ack-gateway
      hostnames:
        - "www.example.com"
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /get
        backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
          weight: 8
        - group: ""
          kind: Service
          name: backend-2
          port: 3000
          weight: 2
  2. Send 20 requests and check the distribution:

    for i in $(seq 1 20); do curl -sS -H "Host: www.example.com" http://$GATEWAY_HOST/get |grep backend; done | \
        sed -E 's/".*"(backend(-2)?)-[0-9a-zA-Z]*-.*/\1/'

    The output shows which backend handled each request. About 80% should go to backend and 20% to backend-2:

     backend-2
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend-2
     backend-2
     backend
     backend
     backend-2
     backend
     backend
     backend
     backend
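
Counting the lines per backend makes the split easier to read than the raw list. The following is a minimal sketch that tallies the output of the loop in step 2. The function name tally_backends is hypothetical.

```shell
# Hypothetical helper: count how many lines on stdin name each backend.
tally_backends() {
  sed 's/^[[:space:]]*//' | sort | uniq -c
}
```

Append | tally_backends to the command in step 2 to get one count per backend. For the sample output above, the counts are 16 for backend and 4 for backend-2. With only 20 requests, the observed split can deviate noticeably from 80/20.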

Terminate TLS

Add a TLS listener to the Gateway to accept HTTPS traffic and terminate the TLS connection at the gateway.

  1. Create a self-signed root certificate, use it to sign a server certificate for www.example.com, and store the server certificate and key in a Secret:

    openssl req -x509 -sha256 -nodes -days 365 -newkey rsa:2048 -subj '/O=example Inc./CN=example.com' -keyout example.com.key -out example.com.crt
    openssl req -out www.example.com.csr -newkey rsa:2048 -nodes -keyout www.example.com.key -subj "/CN=www.example.com/O=example organization"
    openssl x509 -req -days 365 -CA example.com.crt -CAkey example.com.key -set_serial 0 -in www.example.com.csr -out www.example.com.crt
    kubectl create secret tls example-cert --key=www.example.com.key --cert=www.example.com.crt
  2. Add a TLS listener to the Gateway that references the certificate Secret:

    kubectl patch gateway ack-gateway --type=json --patch '
      - op: add
        path: /spec/listeners/-
        value:
          name: https
          protocol: HTTPS
          port: 443
          tls:
            mode: Terminate
            certificateRefs:
            - kind: Secret
              group: ""
              name: example-cert
      '
  3. Verify the Gateway spec now includes both listeners:

    kubectl get gateway/ack-gateway -o yaml | grep spec: -A 20

    The output should show both an HTTP listener on port 80 and an HTTPS listener on port 443:

    spec:
      gatewayClassName: ack-gateway
      listeners:
      - allowedRoutes:
          namespaces:
            from: Same
        name: http
        port: 80
        protocol: HTTP
      - allowedRoutes:
          namespaces:
            from: Same
        name: https
        port: 443
        protocol: HTTPS
        tls:
          certificateRefs:
          - group: ""
            kind: Secret
            name: example-cert
          mode: Terminate
    status:
  4. Send a test request over HTTPS:

    curl -H "Host: www.example.com" --resolve "www.example.com:443:${GATEWAY_HOST}" \
    --cacert example.com.crt https://www.example.com/get

    A successful response confirms TLS termination is working. Note that X-Forwarded-Proto is https:

    {
     "path": "/get",
     "host": "www.example.com",
     "method": "GET",
     "proto": "HTTP/1.1",
     "headers": {
      "Accept": ["*/*"],
      "User-Agent": ["curl/8.9.1"],
      "X-Envoy-External-Address": ["115.XX.XXX.55"],
      "X-Forwarded-For": ["115.XX.XXX.55"],
      "X-Forwarded-Proto": ["https"],
      "X-Request-Id": ["ac539756-3826-474b-be2f-5e57fdd49dac"]
     },
     "namespace": "default",
     "ingress": "",
     "service": "",
     "pod": "backend-5bff7XXXXX-XXXXX"
    }
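
If the curl command in step 4 fails with a certificate error, it can help to confirm locally that the server certificate chains to the root certificate generated in step 1. The following is a minimal sketch that uses the openssl verify command. The function name verify_server_cert is hypothetical.

```shell
# Hypothetical helper: check that a server certificate chains to a given CA
# certificate. Arguments: CA certificate file, server certificate file.
verify_server_cert() {
  openssl verify -CAfile "$1" "$2"
}
```

For example, run verify_server_cert example.com.crt www.example.com.crt in the directory that contains the files from step 1. On success, openssl prints the certificate file name followed by OK.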

What's next