Container Compute Service: Expose Services using Gateway with Inference Extension

Last Updated: Jul 28, 2025

Gateway API is an official Kubernetes project for next-generation Ingress and load balancing APIs in Kubernetes. You can use Gateway API to configure traffic routing rules. This topic describes how to use Gateway with Inference Extension to configure the basic features of Gateway API.

Background information

Gateway with Inference Extension is built on the Envoy Gateway project. It provides all basic features of Gateway API as well as the open source Envoy Gateway extensions.

The architecture of Envoy Gateway consists of the following components:

  • Control plane: consists of Envoy Gateway components. The control plane is responsible for monitoring the traffic routing rules of the cluster and dynamically creating and managing Envoy proxy instances. The control plane updates the traffic forwarding rules of Envoy proxy instances but is not directly used to forward traffic.

  • Data plane: consists of running Envoy proxy instances. The data plane is responsible for implementing efficient and reliable traffic processing and forwarding.

Prerequisites

Preparations

  1. Create two applications: backend and backend-2.

    Note

    After you create the backend application, modify its YAML file by replacing every occurrence of backend with backend-2, and then use the modified YAML file to create the backend-2 application.

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: backend
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: backend
      labels:
        app: backend
        service: backend
    spec:
      ports:
        - name: http
          port: 3000
          targetPort: 3000
      selector:
        app: backend
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: backend
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: backend
          version: v1
      template:
        metadata:
          labels:
            app: backend
            version: v1
        spec:
          serviceAccountName: backend
          containers:
            - image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/envoygateway-echo-basic:v20231214-v1.0.0-140-gf544a46e
              imagePullPolicy: IfNotPresent
              name: backend
              ports:
                - containerPort: 3000
              env:
                - name: POD_NAME
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.name
                - name: NAMESPACE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.namespace
  2. Create a GatewayClass and set controllerName to gateway.envoyproxy.io/gatewayclass-controller.

    apiVersion: gateway.networking.k8s.io/v1
    kind: GatewayClass
    metadata:
      name: eg
    spec:
      controllerName: gateway.envoyproxy.io/gatewayclass-controller
  3. Create a Gateway.

    apiVersion: gateway.networking.k8s.io/v1
    kind: Gateway
    metadata:
      name: eg
    spec:
      gatewayClassName: eg
      listeners:
        - name: http
          protocol: HTTP
          port: 80

    After you create a Gateway, the control plane of Gateway with Inference Extension automatically uses the Gateway configuration to create an EnvoyProxy Deployment and a corresponding LoadBalancer Service that listens on the specified port. For more information about the billing rules of Server Load Balancer (SLB) instances, see SLB billing.

    You can customize the configurations of the EnvoyProxy Deployment and customize the parameters of the Service. You can also enable horizontal pod autoscaling for the gateway.

  4. Obtain the IP address of the gateway.

    export GATEWAY_HOST=$(kubectl get gateway/eg -o jsonpath='{.status.addresses[0].value}')
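
The preparation steps above can be scripted. The following is a minimal sketch, not part of the product tooling: the function names make_backend2_manifest and gateway_address, the file name backend.yaml, and the 120-second timeout are assumptions made for illustration. It assumes that the backend manifest from step 1 is saved to a file and that the Gateway reports the standard Gateway API Programmed condition once it is ready.

```shell
# Print a backend-2 manifest derived from the backend manifest by
# replacing every occurrence of "backend" with "backend-2".
# $1: path to the backend manifest (the file name is an assumption).
make_backend2_manifest() {
  sed 's/backend/backend-2/g' "$1"
}

# Wait until the Gateway is programmed, then print its first address.
# $1: Gateway name, for example eg. The 120s timeout is an arbitrary choice.
gateway_address() {
  kubectl wait --for=condition=Programmed "gateway/$1" --timeout=120s >/dev/null || return 1
  kubectl get "gateway/$1" -o jsonpath='{.status.addresses[0].value}'
}

# Example usage:
#   make_backend2_manifest backend.yaml | kubectl apply -f -
#   export GATEWAY_HOST=$(gateway_address eg)
```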

Configure HTTP routing based on path prefix matching

The following sample YAML file is used to create an HTTPRoute that matches the /get prefix.

  1. Create an HTTPRoute.

    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: backend
    spec:
      parentRefs:
        - name: eg
      hostnames:
        - "www.example.com"
      rules:
        - backendRefs:
            - group: ""
              kind: Service
              name: backend
              port: 3000
              weight: 1
          matches:
            - path:
                type: PathPrefix
                value: /get
  2. Access the application.

    curl -H "Host: www.example.com" http://$GATEWAY_HOST/get

    Expected output:

    {
     "path": "/get",
     "host": "www.example.com",
     "method": "GET",
     "proto": "HTTP/1.1",
     "headers": {
      "Accept": [
       "*/*"
      ],
      "User-Agent": [
       "curl/8.9.1"
      ],
      "X-Envoy-External-Address": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-For": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-Proto": [
       "http"
      ],
      "X-Request-Id": [
       "953b2f8f-26d3-4ba9-93ba-a482b197b1ff"
      ]
     },
     "namespace": "default",
     "ingress": "",
     "service": "",
     "pod": "backend-5bff7XXXXX-XXXXX"
    }
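
In Gateway API, PathPrefix matching works on whole path elements rather than on raw string prefixes, so the /get rule above matches /get and /get/anything but not /getfoo. The following local sketch imitates that rule for illustration only. path_prefix_matches is a hypothetical helper, not part of the gateway, and it ignores query strings.

```shell
# Return 0 if the request path matches the prefix element by element.
# $1: prefix configured in the HTTPRoute, $2: request path.
path_prefix_matches() {
  prefix="${1%/}"   # a trailing slash on the prefix is ignored
  path="$2"
  case "$path" in
    "$prefix"|"$prefix"/*) return 0 ;;   # exact match or a deeper path element
    *) return 1 ;;
  esac
}

# Example usage:
#   path_prefix_matches /get /get/anything && echo "matches"
#   path_prefix_matches /get /getfoo || echo "does not match"
```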

Add a request header

Update the HTTPRoute so that it adds a header to each request before the request is forwarded to the backend.

  1. Update the HTTPRoute.

    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: backend
    spec:
      parentRefs:
        - name: eg
      hostnames:
        - "www.example.com"
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /get
        backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
          weight: 1
        filters:
        - type: RequestHeaderModifier
          requestHeaderModifier:
            add:
            - name: "added-header"
              value: "foo"
  2. Access the application.

    curl -H "Host: www.example.com" http://$GATEWAY_HOST/get

    Expected output:

    {
     "path": "/get",
     "host": "www.example.com",
     "method": "GET",
     "proto": "HTTP/1.1",
     "headers": {
      "Accept": [
       "*/*"
      ],
      "Added-Header": [
       "foo"
      ],
      "User-Agent": [
       "curl/8.9.1"
      ],
      "X-Envoy-External-Address": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-For": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-Proto": [
       "http"
      ],
      "X-Request-Id": [
       "d37f19e5-25c1-45cf-90e5-51453e7ae3ed"
      ]
     },
     "namespace": "default",
     "ingress": "",
     "service": "",
     "pod": "backend-5bff7XXXXX-XXXXX"
    }

    The application echoes the request that it receives. If Added-Header appears in the headers field of the output, the header was added.
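
Because the demo application returns the request as JSON, the check can be automated by searching the response for the header name. This is a minimal sketch: has_header is a hypothetical helper that performs a case-insensitive text match instead of parsing the JSON, which is enough for the sample output shown above.

```shell
# Return 0 if the echoed response on stdin contains the given header name.
# $1: header name; matching is case-insensitive because the echoed name
# is capitalized (Added-Header) while the route defines added-header.
has_header() {
  grep -qi "\"$1\":"
}

# Example usage:
#   curl -s -H "Host: www.example.com" "http://$GATEWAY_HOST/get" \
#     | has_header added-header && echo "header added"
```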

Configure proportional traffic splitting

Update the HTTPRoute based on the following YAML template. In the following YAML template, routing rules for the backend-2 Service are added and traffic weights are specified for the backend and backend-2 Services.

Note

The sum of the weights that you specify in the backendRefs parameter does not need to be 100. Proportion of traffic routed to a Service = Weight of the Service / Sum of the weights of all Services in backendRefs. Requests are distributed among the Services according to these proportions.
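
As a worked example, the proportion for each Service is its weight divided by the sum of all weights in backendRefs. The following sketch computes these proportions locally. weight_share is a hypothetical helper used only to illustrate the arithmetic.

```shell
# Print the traffic share of each weight as a percentage, one per line.
# Arguments: the weights in the order they appear in backendRefs.
weight_share() {
  total=0
  for w in "$@"; do total=$((total + w)); done
  # A total of 0 has no meaningful proportions.
  [ "$total" -gt 0 ] || return 1
  for w in "$@"; do
    awk -v w="$w" -v t="$total" 'BEGIN { printf "%.1f%%\n", w * 100 / t }'
  done
}

# Example usage with the weights of backend and backend-2:
#   weight_share 8 2
```

With weights 8 and 2, the shares are 8 / 10 = 80% and 2 / 10 = 20%.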

  1. Update the HTTPRoute.

    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: backend
    spec:
      parentRefs:
        - name: eg
      hostnames:
        - "www.example.com"
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /get
        backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
          weight: 8
        - group: ""
          kind: Service
          name: backend-2
          port: 3000
          weight: 2
  2. Access the application 20 times and then check the actual traffic ratio between the two Services.

    Note

    The following command filters each response so that only the name of the Service whose pod handled the request, backend or backend-2, is displayed.

    for i in $(seq 1 20); do curl -sS -H "Host: www.example.com" http://$GATEWAY_HOST/get |grep backend; done | \
        sed -E 's/".*"(backend(-2)?)-[0-9a-zA-Z]*-.*/\1/'

    Expected output:

     backend-2
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend
     backend-2
     backend-2
     backend
     backend
     backend-2
     backend
     backend
     backend
     backend

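    The output shows that about 80% of the requests are forwarded to the backend Service and the remaining 20% are forwarded to the backend-2 Service. Because requests are distributed probabilistically, the ratio observed over a small number of requests can deviate from the configured weights.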
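
Counting the lines by hand becomes tedious for larger samples. The following sketch tallies the names printed by the command in step 2 and reports each Service's observed share. tally_backends is a hypothetical helper, not part of the original procedure.

```shell
# Read one Service name per line on stdin and print, for each name,
# the request count and its share of all requests.
tally_backends() {
  awk '
    { gsub(/^[ \t]+|[ \t]+$/, "") }        # trim surrounding whitespace
    $0 != "" { count[$0]++; total++ }      # skip empty lines
    END {
      for (name in count)
        printf "%s %d %.0f%%\n", name, count[name], count[name] * 100 / total
    }
  ' | LC_ALL=C sort
}

# Example usage: append the function to the command in step 2.
#   for i in $(seq 1 20); do ...; done | sed -E '...' | tally_backends
```

For the sample output above, which contains 16 backend lines and 4 backend-2 lines, the tally is 80% and 20%.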

Process TLS traffic

Update the Gateway by adding a certificate and a TLS listener.

  1. Generate a certificate and use it to create a Secret.

    openssl req -x509 -sha256 -nodes -days 365 -newkey rsa:2048 -subj '/O=example Inc./CN=example.com' -keyout example.com.key -out example.com.crt
    openssl req -out www.example.com.csr -newkey rsa:2048 -nodes -keyout www.example.com.key -subj "/CN=www.example.com/O=example organization"
    openssl x509 -req -days 365 -CA example.com.crt -CAkey example.com.key -set_serial 0 -in www.example.com.csr -out www.example.com.crt
    kubectl create secret tls example-cert --key=www.example.com.key --cert=www.example.com.crt
  2. Add a TLS listener and the certificate you created in the preceding step.

    kubectl patch gateway eg --type=json --patch '
      - op: add
        path: /spec/listeners/-
        value:
          name: https
          protocol: HTTPS
          port: 443
          tls:
            mode: Terminate
            certificateRefs:
            - kind: Secret
              group: ""
              name: example-cert
      '
  3. Check whether the modification is successful.

    kubectl get gateway/eg -o yaml | grep spec: -A 20

    Expected output:

    spec:
      gatewayClassName: eg
      listeners:
      - allowedRoutes:
          namespaces:
            from: Same
        name: http
        port: 80
        protocol: HTTP
      - allowedRoutes:
          namespaces:
            from: Same
        name: https
        port: 443
        protocol: HTTPS
        tls:
          certificateRefs:
          - group: ""
            kind: Secret
            name: example-cert
          mode: Terminate
    status:

    The output shows that the https listener, which references the example-cert Secret, was added to the Gateway.

  4. Access the application.

    curl -H Host:www.example.com --resolve "www.example.com:443:${GATEWAY_HOST}" \
    --cacert example.com.crt https://www.example.com/get

    Expected output:

    {
     "path": "/get",
     "host": "www.example.com",
     "method": "GET",
     "proto": "HTTP/1.1",
     "headers": {
      "Accept": [
       "*/*"
      ],
      "User-Agent": [
       "curl/8.9.1"
      ],
      "X-Envoy-External-Address": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-For": [
       "115.XX.XXX.55"
      ],
      "X-Forwarded-Proto": [
       "https"
      ],
      "X-Request-Id": [
       "ac539756-3826-474b-be2f-5e57fdd49dac"
      ]
     },
     "namespace": "default",
     "ingress": "",
     "service": "",
     "pod": "backend-5bff7XXXXX-XXXXX"
    }
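
If the curl command reports a certificate error, one thing to check is whether the server certificate chains to the CA certificate that is passed to --cacert. The following sketch wraps the standard openssl verify command. verify_chain is a hypothetical helper name, and the file names in the usage example are the ones generated in step 1.

```shell
# Return 0 if the leaf certificate is signed by the given CA certificate.
# $1: CA certificate file, $2: leaf certificate file.
verify_chain() {
  openssl verify -CAfile "$1" "$2" >/dev/null 2>&1
}

# Example usage with the files generated in step 1:
#   verify_chain example.com.crt www.example.com.crt && echo "chain OK"
```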