Use Gateway with Inference Extension to expose Services

Gateway API is an official Kubernetes project that defines the next generation of Ingress and load balancing APIs. You can use Gateway API to configure traffic routing rules. This topic describes how to use Gateway with Inference Extension to configure the basic features of Gateway API.

How it works

Gateway with Inference Extension is built on the Envoy Gateway project. It provides all basic features of Gateway API as well as the open source Envoy Gateway extensions.

The architecture of Envoy Gateway comprises the following components:

Control plane: Envoy Gateway components.
- The control plane is responsible for monitoring the traffic routing rules of the cluster and dynamically creating and managing Envoy Proxy instances.
- The control plane updates the traffic forwarding rules of Envoy Proxy instances but is not directly used to forward traffic.
Data plane: Envoy Proxy instances.
- The data plane is responsible for implementing efficient and reliable traffic processing and forwarding.

Prerequisites

The Container Service for Kubernetes (ACK) managed cluster is running Kubernetes 1.30 or later. Upgrade your cluster if needed.
Gateway with Inference Extension is installed in the cluster.

Preparations

Create the backend application. Save the following YAML manifest as backend.yaml and run the command kubectl apply -f backend.yaml to apply it to your cluster.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: backend
---
apiVersion: v1
kind: Service
metadata:
  name: backend
  labels:
    app: backend
    service: backend
spec:
  ports:
    - name: http
      port: 3000
      targetPort: 3000
  selector:
    app: backend
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend
      version: v1
  template:
    metadata:
      labels:
        app: backend
        version: v1
    spec:
      serviceAccountName: backend
      containers:
        - image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/envoygateway-echo-basic:v20231214-v1.0.0-140-gf544a46e
          imagePullPolicy: IfNotPresent
          name: backend
          ports:
            - containerPort: 3000
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace

Create the backend-2 application. Copy the backend.yaml file, replace every occurrence of the string backend with backend-2 in the copy, and then apply the new manifest.
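
The copy-and-replace step can be scripted. The following is a minimal sketch that assumes backend.yaml is the manifest saved in the previous step. The helper name make_backend2 and the output file name backend-2.yaml are illustrative and not part of any tool.

```shell
# Print a copy of a manifest with every "backend" replaced by "backend-2".
# make_backend2 is a hypothetical helper name.
make_backend2() {
  sed 's/backend/backend-2/g' "$1"
}

# Example (run it against the original file only; running it on its own
# output would produce "backend-2-2"):
#   make_backend2 backend.yaml > backend-2.yaml
#   kubectl apply -f backend-2.yaml
```

Review the generated file before you apply it, because the substitution is purely textual and also rewrites label values and the ServiceAccount name.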

Installing Gateway with Inference Extension automatically creates a default GatewayClass. Run the following command to verify that it exists.

kubectl get gatewayclass
NAME          CONTROLLER                                      ACCEPTED   AGE
ack-gateway   gateway.envoyproxy.io/gatewayclass-controller   True       2m31s
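
In a script, the same check can be done without reading the table by eye. The sketch below assumes the column layout shown in the output above (NAME first, ACCEPTED third). The helper name gatewayclass_accepted is hypothetical.

```shell
# Read `kubectl get gatewayclass` table output on stdin and succeed only if
# the named GatewayClass reports True in the ACCEPTED (third) column.
# gatewayclass_accepted is a hypothetical helper name.
gatewayclass_accepted() {
  awk -v name="$1" '$1 == name && $3 == "True" { ok = 1 } END { exit ok ? 0 : 1 }'
}

# Example:
#   kubectl get gatewayclass | gatewayclass_accepted ack-gateway && echo "accepted"
```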

If the GatewayClass resource was not created automatically, you must create it manually.

Create the GatewayClass

Save the following YAML content as gatewayclass.yaml.

apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: ack-gateway
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller

Apply the manifest.
```
kubectl apply -f gatewayclass.yaml
```

Create the Gateway resource. Save the following YAML content as gateway.yaml. Then, run the command kubectl apply -f gateway.yaml to apply it.
```
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: ack-gateway
spec:
  gatewayClassName: ack-gateway
  listeners:
    - name: http
      protocol: HTTP
      port: 80
```
The control plane of Gateway with Inference Extension automatically provisions an Envoy Proxy Deployment and a corresponding Service of type LoadBalancer based on the Gateway resource you created. The Service listens on the ports specified in your Gateway configuration. For information about the billing of the provisioned Server Load Balancer (SLB) instance, see SLB billing.
You can also customize the Envoy Proxy Deployment specifications and the Service parameters, or enable the Horizontal Pod Autoscaler (HPA) for the gateway.

Get the IP address of the gateway.

export GATEWAY_HOST=$(kubectl get gateway/ack-gateway -o jsonpath='{.status.addresses[0].value}')
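
If the load balancer has not been assigned an address yet, the jsonpath query returns nothing and GATEWAY_HOST is empty, which makes the later curl commands fail in a confusing way. The following sketch guards against that case. The helper name check_gateway_host is hypothetical.

```shell
# Fail early if GATEWAY_HOST is empty, which happens when the Gateway has
# no entry in .status.addresses yet.
# check_gateway_host is a hypothetical helper name.
check_gateway_host() {
  if [ -z "${GATEWAY_HOST:-}" ]; then
    echo "GATEWAY_HOST is empty; the gateway address is not available yet" >&2
    return 1
  fi
  echo "Gateway address: ${GATEWAY_HOST}"
}
```

Call check_gateway_host after the export command. If it fails, wait a short time and run the export command again.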

Configure HTTP routing based on path prefix matching

The following sample YAML file is used to create an HTTPRoute that matches the /get prefix.

Create an HTTPRoute. Save the following YAML content as httproute.yaml. Then, run the command kubectl apply -f httproute.yaml to apply it.

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: backend
spec:
  parentRefs:
    - name: ack-gateway
  hostnames:
    - "www.example.com"
  rules:
    - backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
          weight: 1
      matches:
        - path:
            type: PathPrefix
            value: /get

Access the application.

curl -H "Host: www.example.com" http://$GATEWAY_HOST/get

Expected output:

{
 "path": "/get",
 "host": "www.example.com",
 "method": "GET",
 "proto": "HTTP/1.1",
 "headers": {
  "Accept": [
   "*/*"
  ],
  "User-Agent": [
   "curl/8.9.1"
  ],
  "X-Envoy-External-Address": [
   "115.XX.XXX.55"
  ],
  "X-Forwarded-For": [
   "115.XX.XXX.55"
  ],
  "X-Forwarded-Proto": [
   "http"
  ],
  "X-Request-Id": [
   "953b2f8f-26d3-4ba9-93ba-a482b197b1ff"
  ]
 },
 "namespace": "default",
 "ingress": "",
 "service": "",
 "pod": "backend-5bff7XXXXX-XXXXX"
}

Add a request header

Update the HTTPRoute so that the gateway adds a header to each request before forwarding it to the backend.

Update the HTTPRoute resource. Save the following YAML content as httproute.yaml. Then, run the command kubectl apply -f httproute.yaml to apply it.

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: backend
spec:
  parentRefs:
    - name: ack-gateway
  hostnames:
    - "www.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /get
    backendRefs:
    - group: ""
      kind: Service
      name: backend
      port: 3000
      weight: 1
    filters:
    - type: RequestHeaderModifier
      requestHeaderModifier:
        add:
        - name: "added-header"
          value: "foo"

Access the application.

curl -H "Host: www.example.com" http://$GATEWAY_HOST/get

Expected output:

{
 "path": "/get",
 "host": "www.example.com",
 "method": "GET",
 "proto": "HTTP/1.1",
 "headers": {
  "Accept": [
   "*/*"
  ],
  "Added-Header": [
   "foo"
  ],
  "User-Agent": [
   "curl/8.9.1"
  ],
  "X-Envoy-External-Address": [
   "115.XX.XXX.55"
  ],
  "X-Forwarded-For": [
   "115.XX.XXX.55"
  ],
  "X-Forwarded-Proto": [
   "http"
  ],
  "X-Request-Id": [
   "d37f19e5-25c1-45cf-90e5-51453e7ae3ed"
  ]
 },
 "namespace": "default",
 "ingress": "",
 "service": "",
 "pod": "backend-5bff7XXXXX-XXXXX"
}

Because the sample application echoes the request information in its response, the header that the gateway added appears in the output as Added-Header. This confirms that the RequestHeaderModifier filter took effect.
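
The check can also be automated. The sketch below assumes the echo response format shown above, in which each header name is a JSON key. The helper name has_echoed_header is hypothetical.

```shell
# Read the echo response (JSON) on stdin and succeed if it contains the given
# header name as a JSON key. The match is case-insensitive because the sample
# output shows "Added-Header" although the route sets "added-header".
# has_echoed_header is a hypothetical helper name.
has_echoed_header() {
  grep -qiF -- "\"$1\":"
}

# Example:
#   curl -sS -H "Host: www.example.com" "http://$GATEWAY_HOST/get" \
#     | has_echoed_header added-header && echo "header present"
```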

Configure proportional traffic splitting

Update the HTTPRoute based on the following YAML template, which adds the backend-2 Service as a second backend and specifies traffic weights for the backend and backend-2 Services.

Note

The sum of the weights that you specify in the backendRefs parameter does not need to be 100. Proportion of traffic routed to a Service = $\frac{\text{Service weight}}{\text{Sum of all Service weights}}$. All requests are routed to Services based on their traffic proportions.
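
As a worked example of the formula, the following sketch computes the expected share for a Service from a list of weights. The helper name traffic_share is hypothetical, and the weights 8 and 2 are example values.

```shell
# Print the share of traffic for one backend as a percentage.
# Usage: traffic_share <weight> <all weights...>
# traffic_share is a hypothetical helper name.
traffic_share() (
  w=$1
  shift
  sum=0
  for x in "$@"; do
    sum=$((sum + x))
  done
  if [ "$sum" -eq 0 ]; then
    echo "sum of weights is 0" >&2
    exit 1
  fi
  awk -v w="$w" -v s="$sum" 'BEGIN { printf "%.1f%%\n", 100 * w / s }'
)

traffic_share 8 8 2   # weights 8 and 2: prints 80.0%
traffic_share 2 8 2   # prints 20.0%
```

Weights of 80 and 20, or 4 and 1, produce the same split, because only the ratio matters.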

Update the HTTPRoute. Save the following YAML content as httproute.yaml. Then, run the command kubectl apply -f httproute.yaml to apply it.

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: backend
spec:
  parentRefs:
    - name: ack-gateway
  hostnames:
    - "www.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /get
    backendRefs:
    - group: ""
      kind: Service
      name: backend
      port: 3000
      weight: 8
    - group: ""
      kind: Service
      name: backend-2
      port: 3000
      weight: 2

Access the application 20 times and then check the actual traffic ratio between the two Services.
The following command sends 20 requests and processes the output to show only the backend Service (backend or backend-2) that handled each request.
```
for i in $(seq 1 20); do curl -sS -H "Host: www.example.com" http://$GATEWAY_HOST/get |grep backend; done | \
    sed -E 's/".*"(backend(-2)?)-[0-9a-zA-Z]*-.*/\1/'
```
Expected output:
```
 backend-2
 backend
 backend
 backend
 backend
 backend
 backend
 backend
 backend
 backend
 backend
 backend-2
 backend-2
 backend
 backend
 backend-2
 backend
 backend
 backend
 backend
```
In this sample, 16 of the 20 requests (80%) are forwarded to the backend Service and the remaining 4 (20%) are forwarded to the backend-2 Service. With only 20 requests, the split that you observe can deviate from the configured 8:2 ratio.
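
Counting the lines by hand is error-prone. The following sketch tallies the per-Service counts from output in the format shown above. The helper name count_backends is hypothetical.

```shell
# Count how many lines on stdin name each backend and print "name count"
# pairs in a stable order. Leading spaces and blank lines are ignored.
# count_backends is a hypothetical helper name.
count_backends() {
  awk 'NF { count[$1]++ } END { for (name in count) print name, count[name] }' | LC_ALL=C sort
}
```

Pipe the output of the preceding command into count_backends. For the sample output above, it prints `backend 16` and `backend-2 4`.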

Process TLS traffic

Update the Gateway by adding a certificate and a TLS listener.

Generate a certificate and create a Secret.

openssl req -x509 -sha256 -nodes -days 365 -newkey rsa:2048 -subj '/O=example Inc./CN=example.com' -keyout example.com.key -out example.com.crt
openssl req -out www.example.com.csr -newkey rsa:2048 -nodes -keyout www.example.com.key -subj "/CN=www.example.com/O=example organization"
openssl x509 -req -days 365 -CA example.com.crt -CAkey example.com.key -set_serial 0 -in www.example.com.csr -out www.example.com.crt
kubectl create secret tls example-cert --key=www.example.com.key --cert=www.example.com.crt

Update the Gateway resource. Add an HTTPS listener that terminates TLS and references the Secret you created in the preceding step.

kubectl patch gateway ack-gateway --type=json --patch '
  - op: add
    path: /spec/listeners/-
    value:
      name: https
      protocol: HTTPS
      port: 443
      tls:
        mode: Terminate
        certificateRefs:
        - kind: Secret
          group: ""
          name: example-cert
  '

Check whether the modification is successful.

kubectl get gateway/ack-gateway -o yaml | grep spec: -A 20

Expected output:

spec:
  gatewayClassName: ack-gateway
  listeners:
  - allowedRoutes:
      namespaces:
        from: Same
    name: http
    port: 80
    protocol: HTTP
  - allowedRoutes:
      namespaces:
        from: Same
    name: https
    port: 443
    protocol: HTTPS
    tls:
      certificateRefs:
      - group: ""
        kind: Secret
        name: example-cert
      mode: Terminate
status:

The https listener with the example-cert certificate reference appears under spec.listeners, which shows that the modification is successful.
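
As an alternative to scanning the YAML, a script can query only the listener names. The sketch below assumes that kubectl prints the values matched by a `[*]` jsonpath expression separated by spaces. The helper name has_listener is hypothetical.

```shell
# Read a space-separated list of listener names on stdin and succeed only if
# the given name is present as a whole word.
# has_listener is a hypothetical helper name.
has_listener() {
  tr ' ' '\n' | grep -qxF -- "$1"
}

# Example:
#   kubectl get gateway/ack-gateway -o jsonpath='{.spec.listeners[*].name}' \
#     | has_listener https && echo "https listener present"
```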

Access the application.

curl -H "Host: www.example.com" --resolve "www.example.com:443:${GATEWAY_HOST}" \
--cacert example.com.crt https://www.example.com/get

Expected output:

{
 "path": "/get",
 "host": "www.example.com",
 "method": "GET",
 "proto": "HTTP/1.1",
 "headers": {
  "Accept": [
   "*/*"
  ],
  "User-Agent": [
   "curl/8.9.1"
  ],
  "X-Envoy-External-Address": [
   "115.XX.XXX.55"
  ],
  "X-Forwarded-For": [
   "115.XX.XXX.55"
  ],
  "X-Forwarded-Proto": [
   "https"
  ],
  "X-Request-Id": [
   "ac539756-3826-474b-be2f-5e57fdd49dac"
  ]
 },
 "namespace": "default",
 "ingress": "",
 "service": "",
 "pod": "backend-5bff7XXXXX-XXXXX"
}