Container Service for Kubernetes: Use Gateway with Inference Extension to expose Services

Last Updated: Jan 22, 2026

Gateway API is an official Kubernetes project that defines the next generation of Ingress and load balancing APIs. You can use Gateway API to configure traffic routing rules. This topic describes how to use Gateway with Inference Extension to configure the basic features of Gateway API.

How it works

Gateway with Inference Extension is built on the Envoy Gateway project. It provides all basic features of Gateway API together with the open source Envoy Gateway extensions.

The architecture of Envoy Gateway comprises the following components:

  • Control plane: Envoy Gateway components.

    • The control plane is responsible for monitoring the traffic routing rules of the cluster and dynamically creating and managing Envoy Proxy instances.

    • The control plane updates the traffic forwarding rules of Envoy Proxy instances but is not directly used to forward traffic.

  • Data plane: Envoy Proxy instances.

    • The data plane is responsible for implementing efficient and reliable traffic processing and forwarding.
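
To see this split in a running cluster, you can list the data plane resources that the control plane created for a Gateway. The following is a minimal sketch rather than an official procedure. It assumes that the proxy Pods and Services carry the open source Envoy Gateway label gateway.envoyproxy.io/owning-gateway-name, and that the Gateway is named ack-gateway as in the steps later in this topic. The function name list_data_plane is illustrative.

```shell
# List the Envoy Proxy Pods and Services (data plane) that the control plane
# created for one Gateway.
# Assumption: the open source Envoy Gateway label
# gateway.envoyproxy.io/owning-gateway-name is set on these resources.
list_data_plane() {
  gateway_name="${1:-ack-gateway}"
  kubectl get pods,services --all-namespaces \
    -l "gateway.envoyproxy.io/owning-gateway-name=${gateway_name}"
}
```

After the Gateway exists, run list_data_plane ack-gateway. If nothing is returned, the label assumption may not hold for your installation.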

Prerequisites

Preparations

  1. Create the backend application. Save the following YAML manifest as backend.yaml and run the command kubectl apply -f backend.yaml to apply it to your cluster.

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: backend
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: backend
      labels:
        app: backend
        service: backend
    spec:
      ports:
        - name: http
          port: 3000
          targetPort: 3000
      selector:
        app: backend
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: backend
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: backend
          version: v1
      template:
        metadata:
          labels:
            app: backend
            version: v1
        spec:
          serviceAccountName: backend
          containers:
            - image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/envoygateway-echo-basic:v20231214-v1.0.0-140-gf544a46e
              imagePullPolicy: IfNotPresent
              name: backend
              ports:
                - containerPort: 3000
              env:
                - name: POD_NAME
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.name
                - name: NAMESPACE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.namespace
  2. Create the backend-2 application. Copy the backend.yaml file, replace every occurrence of the string backend with backend-2 in the copy, and then apply the new manifest.

  3. Verify the GatewayClass. Installing Gateway with Inference Extension automatically creates a default GatewayClass. Run the following command to check that it exists.

    kubectl get gatewayclass
    NAME          CONTROLLER                                      ACCEPTED   AGE
    ack-gateway   gateway.envoyproxy.io/gatewayclass-controller   True       2m31s

    If the GatewayClass resource was not created automatically, you must create it manually.

    Create the GatewayClass

    1. Save the following YAML content as gatewayclass.yaml.

      apiVersion: gateway.networking.k8s.io/v1
      kind: GatewayClass
      metadata:
        name: ack-gateway
      spec:
        controllerName: gateway.envoyproxy.io/gatewayclass-controller
    2. Apply the manifest.

      kubectl apply -f gatewayclass.yaml
  4. Create the Gateway resource. Save the following YAML content as gateway.yaml. Then, run the command kubectl apply -f gateway.yaml to apply it.

    apiVersion: gateway.networking.k8s.io/v1
    kind: Gateway
    metadata:
      name: ack-gateway
    spec:
      gatewayClassName: ack-gateway
      listeners:
        - name: http
          protocol: HTTP
          port: 80

    Based on the Gateway resource that you created, the control plane of Gateway with Inference Extension automatically provisions an Envoy Proxy Deployment and a corresponding Service of type LoadBalancer. The Service listens on the ports specified in your Gateway configuration. For information about the billing of the provisioned Server Load Balancer (SLB) instance, see SLB billing.

    You can also customize the Envoy Proxy Deployment specifications and the Service parameters, or enable the Horizontal Pod Autoscaler (HPA) for the gateway.

  5. Get the IP address of the gateway.

    export GATEWAY_HOST=$(kubectl get gateway/ack-gateway -o jsonpath='{.status.addresses[0].value}')
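
Steps 2 and 5 can be scripted. The sketch below is one possible way to do it and is not part of the product. The function names are illustrative. The sed substitution assumes that the string backend appears in backend.yaml only where it should be renamed, which holds for the manifest in step 1. The wait step assumes that the Gateway reports the standard Gateway API Programmed condition.

```shell
# Step 2: derive the backend-2 manifest from backend.yaml by renaming every
# occurrence of "backend" to "backend-2".
make_backend_2() {
  src="${1:-backend.yaml}"
  dst="${2:-backend-2.yaml}"
  sed 's/backend/backend-2/g' "$src" > "$dst"
}

# Step 5: wait until the Gateway is programmed, then print its first address.
# Without the wait, the address may still be empty right after creation.
gateway_address() {
  name="${1:-ack-gateway}"
  kubectl wait --for=condition=Programmed "gateway/${name}" --timeout=300s >/dev/null &&
    kubectl get "gateway/${name}" -o jsonpath='{.status.addresses[0].value}'
}
```

For example, run make_backend_2 followed by kubectl apply -f backend-2.yaml, and set the variable with export GATEWAY_HOST=$(gateway_address ack-gateway).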

Configure HTTP routing based on path prefix matching

The following sample YAML file is used to create an HTTPRoute that matches the /get prefix.

  1. Create an HTTPRoute. Save the following YAML content as httproute.yaml. Then, run the command kubectl apply -f httproute.yaml to apply it.

    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: backend
    spec:
      parentRefs:
        - name: ack-gateway
      hostnames:
        - "www.example.com"
      rules:
        - backendRefs:
            - group: ""
              kind: Service
              name: backend
              port: 3000
              weight: 1
          matches:
            - path:
                type: PathPrefix
                value: /get
  2. Access the application.

    curl -H "Host: www.example.com" http://$GATEWAY_HOST/get

    Expected output:

    {
     "path": "/get",
     "host": "www.example.com",
     "method": "GET",
     "proto": "HTTP/1.1",
     "headers": {
      "Accept": [
       "*/*"
      ],
      "User-Agent": [
       "curl/8.9.1"
      ],
      "X-Envoy-External-Address": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-For": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-Proto": [
       "http"
      ],
      "X-Request-Id": [
       "953b2f8f-26d3-4ba9-93ba-a482b197b1ff"
      ]
     },
     "namespace": "default",
     "ingress": "",
     "service": "",
     "pod": "backend-5bff7XXXXX-XXXXX"
    }

Add a request header

Update the HTTPRoute so that the gateway adds a header to each matching request before forwarding it to the backend.

  1. Update the HTTPRoute resource. Save the following YAML content as httproute.yaml. Then, run the command kubectl apply -f httproute.yaml to apply it.

    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: backend
    spec:
      parentRefs:
        - name: ack-gateway
      hostnames:
        - "www.example.com"
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /get
        backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
          weight: 1
        filters:
        - type: RequestHeaderModifier
          requestHeaderModifier:
            add:
            - name: "added-header"
              value: "foo"
  2. Access the application.

    curl -H "Host: www.example.com" http://$GATEWAY_HOST/get

    Expected output:

    {
     "path": "/get",
     "host": "www.example.com",
     "method": "GET",
     "proto": "HTTP/1.1",
     "headers": {
      "Accept": [
       "*/*"
      ],
      "Added-Header": [
       "foo"
      ],
      "User-Agent": [
       "curl/8.9.1"
      ],
      "X-Envoy-External-Address": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-For": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-Proto": [
       "http"
      ],
      "X-Request-Id": [
       "d37f19e5-25c1-45cf-90e5-51453e7ae3ed"
      ]
     },
     "namespace": "default",
     "ingress": "",
     "service": "",
     "pod": "backend-5bff7XXXXX-XXXXX"
    }

    Since the sample application echoes the request information in its response, you will see the added-header in the output. This confirms that the operation was successful.
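
If you want to check for the header in a script instead of reading the JSON by eye, a small filter is enough. The following sketch assumes the echo response format shown above, where each header appears as "Name": [ on its own line. has_echoed_header is an illustrative name, not an existing tool.

```shell
# Succeed when the echo server's JSON response (read from stdin) lists the
# given header name. The match ignores case because the sample output shows
# "added-header" echoed back as "Added-Header".
has_echoed_header() {
  grep -qiF "\"$1\": ["
}
```

For example: curl -s -H "Host: www.example.com" "http://$GATEWAY_HOST/get" | has_echoed_header added-header && echo "header present".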

Configure proportional traffic splitting

Update the HTTPRoute based on the following YAML template. The template adds the backend-2 Service to the routing rule and specifies traffic weights for the backend and backend-2 Services.

Note

The sum of the weights that you specify in the backendRefs parameter does not need to be 100. Proportion of traffic routed to a Service = Weight of the Service / Sum of the weights of all Services in backendRefs. All requests are routed to Services based on their traffic proportions.
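
The share of each Service is its weight divided by the sum of all weights in the same backendRefs list. As a quick sanity check of that arithmetic, the sketch below computes the shares for any list of weights. traffic_shares is an illustrative helper, not a gateway command.

```shell
# Print each backend's share of traffic:
#   share = weight / sum of all weights in the same backendRefs list
# Usage: traffic_shares WEIGHT [WEIGHT...]
traffic_shares() {
  printf '%s\n' "$@" | awk '
    { w[NR] = $1; total += $1 }
    END {
      for (i = 1; i <= NR; i++)
        printf "weight %d -> %.1f%%\n", w[i], (total > 0 ? 100 * w[i] / total : 0)
    }'
}

# Weights used in the HTTPRoute in this section.
traffic_shares 8 2
```

For the weights 8 and 2, this prints 80.0% and 20.0%. The weights 4 and 1 give the same split.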

  1. Update the HTTPRoute. Save the following YAML content as httproute.yaml. Then, run the command kubectl apply -f httproute.yaml to apply it.

    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: backend
    spec:
      parentRefs:
        - name: ack-gateway
      hostnames:
        - "www.example.com"
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /get
        backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
          weight: 8
        - group: ""
          kind: Service
          name: backend-2
          port: 3000
          weight: 2
  2. Access the application 20 times and then check the actual traffic ratio between the two Services.

    The following command sends 20 requests and processes the output to show only the backend Service (backend or backend-2) that handled each request.

    for i in $(seq 1 20); do curl -sS -H "Host: www.example.com" http://$GATEWAY_HOST/get |grep backend; done | \
        sed -E 's/".*"(backend(-2)?)-[0-9a-zA-Z]*-.*/\1/'

    Expected output:

     backend-2
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend-2
     backend-2
     backend
     backend
     backend-2
     backend
     backend
     backend
     backend

    The output shows that about 80% of the requests are forwarded to the backend Service and the remaining 20% are forwarded to the backend-2 Service.
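
With only 20 requests, the observed split can deviate from the configured weights, so it helps to tally the results instead of counting lines by hand. The following sketch reads the names printed by the loop above and summarizes them. count_backends is an illustrative helper.

```shell
# Read one backend name per line and print, for each backend, the number of
# requests it handled and its percentage of the total.
count_backends() {
  sed 's/^[[:space:]]*//' | awk '
    NF { n[$0]++; total++ }
    END { for (b in n) printf "%s %d %.0f%%\n", b, n[b], 100 * n[b] / total }
  ' | LC_ALL=C sort
}
```

Append | count_backends to the end of the sed command. For the sample output above, the tally is 16 requests for backend and 4 for backend-2. Sending more requests, for example 200, usually brings the observed ratio closer to the configured weights.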

Process TLS traffic

Update the Gateway by adding a certificate and a TLS listener.

  1. Generate a certificate and create a Secret.

    openssl req -x509 -sha256 -nodes -days 365 -newkey rsa:2048 -subj '/O=example Inc./CN=example.com' -keyout example.com.key -out example.com.crt
    openssl req -out www.example.com.csr -newkey rsa:2048 -nodes -keyout www.example.com.key -subj "/CN=www.example.com/O=example organization"
    openssl x509 -req -days 365 -CA example.com.crt -CAkey example.com.key -set_serial 0 -in www.example.com.csr -out www.example.com.crt
    kubectl create secret tls example-cert --key=www.example.com.key --cert=www.example.com.crt
  2. Update the Gateway resource. Add a TLS listener and the certificate you created in the preceding step.

    kubectl patch gateway ack-gateway --type=json --patch '
      - op: add
        path: /spec/listeners/-
        value:
          name: https
          protocol: HTTPS
          port: 443
          tls:
            mode: Terminate
            certificateRefs:
            - kind: Secret
              group: ""
              name: example-cert
      '
  3. Check whether the modification is successful.

    kubectl get gateway/ack-gateway -o yaml | grep spec: -A 20

    Expected output:

    spec:
      gatewayClassName: ack-gateway
      listeners:
      - allowedRoutes:
          namespaces:
            from: Same
        name: http
        port: 80
        protocol: HTTP
      - allowedRoutes:
          namespaces:
            from: Same
        name: https
        port: 443
        protocol: HTTPS
        tls:
          certificateRefs:
          - group: ""
            kind: Secret
            name: example-cert
          mode: Terminate
    status:

    The output shows that the modification is successful.

  4. Access the application.

    curl -H "Host: www.example.com" --resolve "www.example.com:443:${GATEWAY_HOST}" \
    --cacert example.com.crt https://www.example.com/get

    Expected output:

    {
     "path": "/get",
     "host": "www.example.com",
     "method": "GET",
     "proto": "HTTP/1.1",
     "headers": {
      "Accept": [
       "*/*"
      ],
      "User-Agent": [
       "curl/8.9.1"
      ],
      "X-Envoy-External-Address": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-For": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-Proto": [
       "https"
      ],
      "X-Request-Id": [
       "ac539756-3826-474b-be2f-5e57fdd49dac"
      ]
     },
     "namespace": "default",
     "ingress": "",
     "service": "",
     "pod": "backend-5bff7XXXXX-XXXXX"
    }
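
If the request fails with a certificate error, it can help to confirm that the server certificate from step 1 actually chains to the self-signed CA that you pass to curl with --cacert. The following is a minimal sketch that uses the file names from step 1 as defaults. verify_leaf_cert is an illustrative name.

```shell
# Check that the server certificate was signed by the CA certificate, then
# print its subject and expiry date. openssl verify prints "<file>: OK" on
# success and returns a non-zero exit code on failure.
verify_leaf_cert() {
  ca="${1:-example.com.crt}"
  leaf="${2:-www.example.com.crt}"
  openssl verify -CAfile "$ca" "$leaf" &&
    openssl x509 -in "$leaf" -noout -subject -enddate
}
```

Run verify_leaf_cert in the directory that contains the generated files. If verification fails, regenerate the certificate and re-create the example-cert Secret.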