Container Service for Kubernetes: Access services through Gateway with Inference Extension

Last Updated: Oct 23, 2025

The Gateway API is an official Kubernetes project that provides the next-generation API for routing and load balancing in Kubernetes. This topic describes how to use the Gateway API to configure the basic features supported by Gateway with Inference Extension.

How it works

Gateway with Inference Extension is a component built on the Envoy Gateway project. It supports all basic Gateway API features and open source Envoy Gateway extension resources.

The Envoy Gateway architecture includes:

  • Control plane: Consists of Envoy Gateway components that monitor traffic rules in the cluster. The control plane dynamically creates and manages Envoy proxy instances and updates their forwarding rules in real time. The control plane does not directly forward service traffic.

  • Data plane: Consists of running Envoy proxy instances that process and forward service traffic to ensure efficient and reliable communication.

Scenarios

Preparations

  1. Create the backend and backend-2 test applications. Save the following YAML content as backend.yaml, and then run the kubectl apply -f backend.yaml command.

    Note

    To create the backend-2 application, replace all instances of backend in the YAML file with backend-2.

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: backend
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: backend
      labels:
        app: backend
        service: backend
    spec:
      ports:
        - name: http
          port: 3000
          targetPort: 3000
      selector:
        app: backend
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: backend
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: backend
          version: v1
      template:
        metadata:
          labels:
            app: backend
            version: v1
        spec:
          serviceAccountName: backend
          containers:
            - image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/envoygateway-echo-basic:v20231214-v1.0.0-140-gf544a46e
              imagePullPolicy: IfNotPresent
              name: backend
              ports:
                - containerPort: 3000
              env:
                - name: POD_NAME
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.name
                - name: NAMESPACE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.namespace
  2. A GatewayClass is created by default when you install Gateway with Inference Extension. Run the following command to view the GatewayClass.

    kubectl get gatewayclass
    NAME          CONTROLLER                                      ACCEPTED   AGE
    ack-gateway   gateway.envoyproxy.io/gatewayclass-controller   True       2m31s

    If the GatewayClass resource is not found, you can create it manually.

    Create a GatewayClass

    Save the following YAML content as gatewayclass.yaml, and then run the kubectl apply -f gatewayclass.yaml command.

    apiVersion: gateway.networking.k8s.io/v1
    kind: GatewayClass
    metadata:
      name: ack-gateway
    spec:
      controllerName: gateway.envoyproxy.io/gatewayclass-controller
  3. Create a Gateway resource. Save the following YAML content as gateway.yaml, and then run the kubectl apply -f gateway.yaml command.

    apiVersion: gateway.networking.k8s.io/v1
    kind: Gateway
    metadata:
      name: ack-gateway
    spec:
      gatewayClassName: ack-gateway
      listeners:
        - name: http
          protocol: HTTP
          port: 80

    The control plane of the Gateway with Inference Extension component creates an EnvoyProxy deployment and a corresponding LoadBalancer Service based on the Gateway resource. This Service listens on the specified port. For more information about the billing of Server Load Balancer instances, see Billing of Server Load Balancer.

    You can also customize the specifications of the EnvoyProxy deployment and Service parameters, or enable Horizontal Pod Autoscaling (HPA) for the gateway.

  4. Retrieve the gateway address.

    export GATEWAY_HOST=$(kubectl get gateway/ack-gateway -o jsonpath='{.status.addresses[0].value}')
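
The address may still be empty shortly after the Gateway is created, for example while the load balancer is being provisioned. The following is a minimal sketch of a retry wrapper for the lookup in step 4. The helper name retry_until_nonempty and its arguments are hypothetical and are not part of the component or of kubectl.

```shell
# Hypothetical helper: runs a lookup command repeatedly until it prints
# a non-empty value, then prints that value.
# Usage: retry_until_nonempty <attempts> <delay-seconds> <command...>
retry_until_nonempty() {
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # A failing lookup is treated the same as an empty result.
    value="$("$@" 2>/dev/null || true)"
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "no value after $attempts attempts" >&2
  return 1
}
```

With this helper defined, the lookup could be written as follows (30 attempts and a 5-second delay are arbitrary illustrative values):

    export GATEWAY_HOST=$(retry_until_nonempty 30 5 kubectl get gateway/ack-gateway -o jsonpath='{.status.addresses[0].value}')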

HTTP routing based on path prefix matching

The following example shows how to configure an HTTPRoute to match the /get path prefix and test the configuration.

  1. Create an HTTPRoute resource. Save the following YAML content as httproute.yaml, and then run the kubectl apply -f httproute.yaml command.

    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: backend
    spec:
      parentRefs:
        - name: ack-gateway
      hostnames:
        - "www.example.com"
      rules:
        - backendRefs:
            - group: ""
              kind: Service
              name: backend
              port: 3000
              weight: 1
          matches:
            - path:
                type: PathPrefix
                value: /get
  2. Test the access.

    curl -H "Host: www.example.com" http://$GATEWAY_HOST/get

    Expected output:

    {
     "path": "/get",
     "host": "www.example.com",
     "method": "GET",
     "proto": "HTTP/1.1",
     "headers": {
      "Accept": [
       "*/*"
      ],
      "User-Agent": [
       "curl/8.9.1"
      ],
      "X-Envoy-External-Address": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-For": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-Proto": [
       "http"
      ],
      "X-Request-Id": [
       "953b2f8f-26d3-4ba9-93ba-a482b197b1ff"
      ]
     },
     "namespace": "default",
     "ingress": "",
     "service": "",
     "pod": "backend-5bff7XXXXX-XXXXX"
    }
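
In the Gateway API, a PathPrefix match compares the path element by element, so the /get prefix also matches /get/users but not /getter. The following sketch models that rule in shell to make the behavior concrete. The function path_prefix_matches is a hypothetical illustration of the matching semantics, not the code that the gateway runs.

```shell
# Sketch of Gateway API PathPrefix semantics: the prefix is compared
# path element by element, so /get matches /get and /get/a but not /getter.
# Usage: path_prefix_matches <prefix> <request-path>
path_prefix_matches() {
  prefix="${1%/}"   # a trailing slash on the prefix is ignored
  case "$2" in
    "$prefix" | "$prefix"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Try a few request paths against the /get prefix used in this topic.
for p in /get /get/ /get/users /getter /post; do
  if path_prefix_matches /get "$p"; then
    echo "$p: match"
  else
    echo "$p: no match"
  fi
done
```

Only the first three paths are reported as matches. A request to /getter would not be routed by the rule above.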

Add a request header

Update the HTTPRoute configuration to add a header to routed requests.

  1. Update the HTTPRoute resource. Save the following YAML content as httproute.yaml, and then run the kubectl apply -f httproute.yaml command.

    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: backend
    spec:
      parentRefs:
        - name: ack-gateway
      hostnames:
        - "www.example.com"
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /get
        backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
          weight: 1
        filters:
        - type: RequestHeaderModifier
          requestHeaderModifier:
            add:
            - name: "added-header"
              value: "foo"
  2. Test the access.

    curl -H "Host: www.example.com" http://$GATEWAY_HOST/get

    Expected output: The sample application echoes the request information in its response. The response contains the Added-Header header, which indicates that the header was added to the routed request.

    {
     "path": "/get",
     "host": "www.example.com",
     "method": "GET",
     "proto": "HTTP/1.1",
     "headers": {
      "Accept": [
       "*/*"
      ],
      "Added-Header": [
       "foo"
      ],
      "User-Agent": [
       "curl/8.9.1"
      ],
      "X-Envoy-External-Address": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-For": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-Proto": [
       "http"
      ],
      "X-Request-Id": [
       "d37f19e5-25c1-45cf-90e5-51453e7ae3ed"
      ]
     },
     "namespace": "default",
     "ingress": "",
     "service": "",
     "pod": "backend-5bff7XXXXX-XXXXX"
    }
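
To check for the header without reading the full response, the output can be piped through a small filter. This is a minimal sketch. The helper name response_has_header is hypothetical, and it assumes the response has the JSON layout shown above.

```shell
# Hypothetical helper: succeeds if the echoed JSON on stdin lists the
# given header name. The match is case-insensitive because the sample
# output shows the name as "Added-Header" although the HTTPRoute
# configures it as "added-header".
response_has_header() {
  grep -qi "\"$1\":"
}
```

For example:

    curl -sS -H "Host: www.example.com" http://$GATEWAY_HOST/get | response_has_header added-header && echo "header present"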

Split traffic by weight

Update the HTTPRoute configuration to add a routing rule for the backend-2 service and configure traffic weights for the backend and backend-2 services.

Note

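The Gateway API does not require the weights in backendRefs to sum to 100. The proportion of traffic sent to a single service is calculated as follows: weight of the service / sum of the weights of all services in backendRefs. All request traffic is distributed according to this ratio. For example, weights of 8 and 2 send 8/10 = 80% of requests to the first service and 2/10 = 20% to the second.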
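
As a quick illustration of the arithmetic, the sketch below divides each weight by the sum of all weights in the rule, which is how the Gateway API defines the proportion. The helper name traffic_shares is hypothetical.

```shell
# Prints each weight's share of traffic as a percentage,
# computed as weight / (sum of all weights).
# Usage: traffic_shares <weight>...
traffic_shares() {
  total=0
  for w in "$@"; do
    total=$((total + w))
  done
  # Avoid dividing by zero when no positive weight is given.
  [ "$total" -gt 0 ] || { echo "sum of weights must be positive" >&2; return 1; }
  for w in "$@"; do
    awk -v w="$w" -v t="$total" 'BEGIN { printf "%d/%d = %.1f%%\n", w, t, 100 * w / t }'
  done
}

# Weights used for backend and backend-2 in this topic.
traffic_shares 8 2
```

For the weights 8 and 2, this prints 8/10 = 80.0% and 2/10 = 20.0%.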

  1. Update the HTTPRoute resource. Save the following YAML content as httproute.yaml, and then run the kubectl apply -f httproute.yaml command.

    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: backend
    spec:
      parentRefs:
        - name: ack-gateway
      hostnames:
        - "www.example.com"
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /get
        backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
          weight: 8
        - group: ""
          kind: Service
          name: backend-2
          port: 3000
          weight: 2
  2. Test the access by sending 20 consecutive requests and check the traffic ratio between the two services.

    # The sed command reduces each response to the name of the serving backend (backend or backend-2).
    for i in $(seq 1 20); do curl -sS -H "Host: www.example.com" http://$GATEWAY_HOST/get |grep backend; done | \
        sed -E 's/".*"(backend(-2)?)-[0-9a-zA-Z]*-.*/\1/'

    Expected output: The traffic is split between the two services at a ratio of approximately 80% to 20%.

     backend-2
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend-2
     backend-2
     backend
     backend
     backend-2
     backend
     backend
     backend
     backend
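
Counting the lines by eye is error-prone, so the output can be tallied instead. The following is a minimal sketch. The helper name tally_backends is hypothetical, and it assumes one backend name per line, as produced by the command in step 2.

```shell
# Counts how many responses each backend served.
# Reads one backend name per line on stdin; leading spaces are ignored.
# Prints "<name> <count>" for each distinct name.
tally_backends() {
  sed -E 's/^[[:space:]]+//' | sort | uniq -c | awk '{ printf "%s %d\n", $2, $1 }'
}
```

Appending `| tally_backends` to the pipeline in step 2 prints one line per backend with its request count. With only 20 requests, the observed counts can deviate noticeably from the configured 8:2 ratio. A larger number of requests gives a closer approximation.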

Process TLS traffic

Update the Gateway resource by configuring a certificate and adding a TLS listener to verify that TLS traffic is processed correctly.

  1. Generate a certificate and create a Secret.

    openssl req -x509 -sha256 -nodes -days 365 -newkey rsa:2048 -subj '/O=example Inc./CN=example.com' -keyout example.com.key -out example.com.crt
    openssl req -out www.example.com.csr -newkey rsa:2048 -nodes -keyout www.example.com.key -subj "/CN=www.example.com/O=example organization"
    openssl x509 -req -days 365 -CA example.com.crt -CAkey example.com.key -set_serial 0 -in www.example.com.csr -out www.example.com.crt
    kubectl create secret tls example-cert --key=www.example.com.key --cert=www.example.com.crt
  2. Update the Gateway resource. Add a TLS listener and reference the certificate created in the previous step.

    kubectl patch gateway ack-gateway --type=json --patch '
      - op: add
        path: /spec/listeners/-
        value:
          name: https
          protocol: HTTPS
          port: 443
          tls:
            mode: Terminate
            certificateRefs:
            - kind: Secret
              group: ""
              name: example-cert
      '
  3. Check if the changes have taken effect.

    kubectl get gateway/ack-gateway -o yaml | grep spec: -A 20

    Expected output:

    spec:
      gatewayClassName: ack-gateway
      listeners:
      - allowedRoutes:
          namespaces:
            from: Same
        name: http
        port: 80
        protocol: HTTP
      - allowedRoutes:
          namespaces:
            from: Same
        name: https
        port: 443
        protocol: HTTPS
        tls:
          certificateRefs:
          - group: ""
            kind: Secret
            name: example-cert
          mode: Terminate
    status:

    The output shows that the changes are in effect.

  4. Test the access.

    curl -H "Host: www.example.com" --resolve "www.example.com:443:${GATEWAY_HOST}" \
    --cacert example.com.crt https://www.example.com/get

    Expected output:

    {
     "path": "/get",
     "host": "www.example.com",
     "method": "GET",
     "proto": "HTTP/1.1",
     "headers": {
      "Accept": [
       "*/*"
      ],
      "User-Agent": [
       "curl/8.9.1"
      ],
      "X-Envoy-External-Address": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-For": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-Proto": [
       "https"
      ],
      "X-Request-Id": [
       "ac539756-3826-474b-be2f-5e57fdd49dac"
      ]
     },
     "namespace": "default",
     "ingress": "",
     "service": "",
     "pod": "backend-5bff7XXXXX-XXXXX"
    }