Configure global rate limiting rules with Gateway with Inference Extension - Container Service for Kubernetes (ACK) - Alibaba Cloud

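The Gateway with Inference Extension component supports enabling global rate limiting for a cluster, which helps keep the system stable under high concurrency or abnormal traffic. This topic describes how to configure global rate limiting with the Gateway with Inference Extension component and the rate limiting scenarios that it supports.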

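Feature description

Rate limiting is a mechanism that restricts the number of requests sent to a server. It specifies the maximum number of requests that a client can send to the server within a given time period, usually expressed as a number of requests per period, for example 300 requests per minute or 10 requests per second. After global rate limiting is enabled, the Gateway with Inference Extension component automatically deploys a global rate limit service. This service centrally manages and dynamically serves the global rate limiting policies and real-time traffic data. Gateway with Inference Extension interacts with the global rate limit service through a built-in rate limit filter (such as the Rate Limit Filter), retrieves the preset thresholds in real time (for example, requests per second or concurrent connections), and rate limits incoming requests based on these policies.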
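
To make the idea of a request quota per time window concrete, the following is a minimal Python sketch of a fixed-window counter. It is an illustration only and does not claim to reflect how the component's rate limit service is implemented. The class name `FixedWindowLimiter` and its parameters are hypothetical.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` window (illustrative only)."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the example is deterministic
        self.window_start = None
        self.count = 0

    def allow(self):
        now = self.clock()
        # Start a new window when none exists or the current one has expired.
        if self.window_start is None or now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


# 10 requests per second: the 11th request inside the same second is rejected.
fake_now = [0.0]
limiter = FixedWindowLimiter(limit=10, window_seconds=1, clock=lambda: fake_now[0])
results = [limiter.allow() for _ in range(11)]

fake_now[0] = 1.0                   # one second later a new window begins
after_reset = limiter.allow()
```

Here `results` holds ten accepted requests followed by one rejection, and `after_reset` is accepted because a new window has started.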

Prerequisites

Gateway with Inference Extension is installed, and its version is 1.4.0 or later.
The steps in Preparations have been completed.

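Procedure

Step 1: Enable global rate limiting

The rate limit service that is automatically deployed for global rate limiting depends on a Redis service as its global store. This topic uses a self-managed Redis service. You can also use Tair (Redis OSS-compatible) to quickly create a Redis instance, and then update the related configuration in the ack-gateway-config ConfigMap in the envoy-gateway-system namespace. For details about the configuration, see Envoy Gateway.

Create redis-service.yaml.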

kind: Namespace
apiVersion: v1
metadata:
  name: redis-system
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
  namespace: redis-system
  labels:
    app: redis
spec:
  serviceName: "redis"
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - image: registry-cn-hangzhou.ack.aliyuncs.com/dev/redis:6.0.6-for-ack-gateway
          name: redis
          ports:
            - containerPort: 6379
          resources:
            limits:
              cpu: 1500m
              memory: 512Mi
            requests:
              cpu: 200m
              memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: redis
  namespace: redis-system
  labels:
    app: redis
spec:
  ports:
    - name: redis
      port: 6379
      protocol: TCP
      targetPort: 6379
  selector:
    app: redis

Create enable-global-rate-limit.yaml.

apiVersion: v1
kind: ConfigMap
metadata:
  name: ack-gateway-config
  namespace: envoy-gateway-system
data:
  ack-gateway.yaml: |
    apiVersion: gateway.envoyproxy.io/v1alpha1
    kind: EnvoyGateway
    rateLimit:
      backend:
        type: Redis
        redis:
          url: redis.redis-system.svc.cluster.local:6379

Deploy the Redis service and enable global rate limiting.

kubectl apply -f redis-service.yaml
kubectl apply -f enable-global-rate-limit.yaml
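
The reason a shared store is needed can be sketched in a few lines of Python: when every gateway replica increments the same counter, the limit applies to their combined traffic. The `FakeStore` class below is a hypothetical in-memory stand-in for Redis, and the increment-with-expiry pattern is an assumption about how such a counter could be kept, not a description of the deployed rate limit service.

```python
class FakeStore:
    """In-memory stand-in for a shared key-value store such as Redis (hypothetical)."""

    def __init__(self):
        self.data = {}      # key -> (count, expires_at)

    def incr(self, key, ttl_seconds, now):
        count, expires_at = self.data.get(key, (0, now + ttl_seconds))
        if now >= expires_at:               # window expired: start counting again
            count, expires_at = 0, now + ttl_seconds
        count += 1
        self.data[key] = (count, expires_at)
        return count


def allow(store, key, limit, window_seconds, now):
    """Return True while the shared counter for `key` is within `limit`."""
    return store.incr(key, window_seconds, now) <= limit


shared = FakeStore()
decisions = []
# Two replicas take turns handling requests but consult the same store.
for t, replica in enumerate(["gateway-a", "gateway-b", "gateway-a", "gateway-b"]):
    decisions.append((replica, allow(shared, "http-ratelimit", 3, 3600, now=t)))
```

With a limit of 3 per hour, the fourth request is rejected even though `gateway-b` has handled only two requests itself.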

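Step 2: Deploy the HTTPRoute resource

Create an HTTPRoute resource for the tests that follow. The rate limiting rules in later steps are applied to this resource.

Create httproute.yaml.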

---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: http-ratelimit
spec:
  parentRefs:
  - name: eg
  hostnames:
  - ratelimit.example 
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - group: ""
      kind: Service
      name: backend
      port: 3000

Deploy the HTTPRoute resource.
```
kubectl apply -f httproute.yaml
```

Obtain the public IP address of the Gateway.

export GATEWAY_HOST=$(kubectl get gateway/eg -o jsonpath='{.status.addresses[0].value}')

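Step 3: Scenarios

Rate limit a specific user

Configure a global rate limiting rule that allows requests whose x-user-id request header has the value one only 3 times per hour.

Create backendtrafficpolicy.yaml.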

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy 
metadata:
  name: policy-httproute
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: http-ratelimit
  rateLimit:
    type: Global
    global:
      rules:
      - clientSelectors:
        - headers:
          - name: x-user-id
            value: one
        limit:
          requests: 3
          unit: Hour

Deploy the rate limiting rule.

kubectl apply -f backendtrafficpolicy.yaml

Test rate limiting for requests that carry the x-user-id: one request header.

for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: one" http://$GATEWAY_HOST/get ; sleep 1; done

Expected output:

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:47:49 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 2
x-ratelimit-reset: 731

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:47:50 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 1
x-ratelimit-reset: 730

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:47:52 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 728

HTTP/1.1 429 Too Many Requests
x-envoy-ratelimited: true
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 727
date: Tue, 27 May 2025 07:47:52 GMT
transfer-encoding: chunked

The first three requests return 200 and the fourth returns 429, which shows that the rule limits requests that carry the x-user-id: one request header.
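
The x-ratelimit-* response headers in the output can be read programmatically when you script such a test. The following Python sketch parses one `curl -I` response. The function names are hypothetical, and the interpretation of the fields (a limit of 3, a window of `w=3600` seconds, and a reset value in seconds) is an assumption based on the common RateLimit header convention and is not stated in this topic.

```python
def parse_response(raw):
    """Split `curl -I` style output into a status code and lower-cased headers."""
    lines = raw.strip().splitlines()
    status = int(lines[0].split()[1])           # "HTTP/1.1 200 OK" -> 200
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")    # split on the first colon only
        headers[name.strip().lower()] = value.strip()
    return status, headers


def summarize(raw):
    """Extract the rate limit fields from a single response."""
    status, headers = parse_response(raw)
    limit_field = headers["x-ratelimit-limit"]  # e.g. "3, 3;w=3600"
    return {
        "status": status,
        "limit": int(limit_field.split(",")[0]),
        "window_seconds": int(limit_field.split("w=")[1]),
        "remaining": int(headers["x-ratelimit-remaining"]),
        "reset_seconds": int(headers["x-ratelimit-reset"]),
    }


sample = """HTTP/1.1 429 Too Many Requests
x-envoy-ratelimited: true
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 727
date: Tue, 27 May 2025 07:47:52 GMT
transfer-encoding: chunked
"""
info = summarize(sample)
```

For the rejected request, `info` reports status 429 with no remaining quota in a 3600-second window.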

Test rate limiting for requests that carry the x-user-id: two request header.

for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: two" http://$GATEWAY_HOST/get ; sleep 1; done

Expected output:

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:50:11 GMT
content-length: 504

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:50:12 GMT
content-length: 504

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:50:14 GMT
content-length: 504

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:50:15 GMT
content-length: 504

All four requests return 200, which shows that the rule does not limit requests that carry the x-user-id: two request header.

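Rate limit each user separately, except the administrator

Update the global rate limiting rule so that requests whose x-user-id request header has the value admin are not limited, and requests with any other value of that header are limited to 3 per hour for each value.

Edit the rate limiting rule.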

kubectl edit BackendTrafficPolicy policy-httproute

Update the rate limiting rule with the following content.

...
  rateLimit:
    type: Global
    global:
      rules:
      - clientSelectors:
        - headers:
          - type: Distinct
            name: x-user-id
          - name: x-user-id
            value: admin
            invert: true
        limit:
          requests: 3
          unit: Hour

After you save and exit, the rule takes effect immediately.

Test rate limiting for requests that carry the x-user-id: one request header.

for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: one" http://$GATEWAY_HOST/get ; sleep 1; done

Expected output:

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:47:49 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 2
x-ratelimit-reset: 731

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:47:50 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 1
x-ratelimit-reset: 730

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:47:52 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 728

HTTP/1.1 429 Too Many Requests
x-envoy-ratelimited: true
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 727
date: Tue, 27 May 2025 07:47:52 GMT
transfer-encoding: chunked

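The first three requests return 200 and the fourth returns 429, which shows that the rule limits requests that carry the x-user-id: one request header.

Test rate limiting for requests that carry the x-user-id: two request header.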

for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: two" http://$GATEWAY_HOST/get ; sleep 1; done

Expected output:

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:53:38 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 2
x-ratelimit-reset: 382

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:53:39 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 1
x-ratelimit-reset: 381

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:53:41 GMT
content-length: 504
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 379

HTTP/1.1 429 Too Many Requests
x-envoy-ratelimited: true
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 378
date: Tue, 27 May 2025 07:53:41 GMT
transfer-encoding: chunked

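The first three requests return 200 and the fourth returns 429, which shows that the rule limits requests that carry the x-user-id: two request header.

Test rate limiting for requests that carry the x-user-id: admin request header.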

for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: admin" http://$GATEWAY_HOST/get ; sleep 1; done

Expected output:

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:57:44 GMT
content-length: 506

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:57:45 GMT
content-length: 506

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:57:46 GMT
content-length: 506

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 07:57:47 GMT
content-length: 506

All four requests return 200, which shows that the rule does not limit requests that carry the x-user-id: admin request header.
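
The combined effect of the two header matches in this scenario can be sketched in Python. The `Distinct` match gives every x-user-id value its own counter, and the inverted exact match leaves admin unselected. The functions `bucket_key` and `simulate` are hypothetical helpers for illustration, and treating a request without the header as unlimited is an assumption rather than documented behavior.

```python
from collections import Counter


def bucket_key(headers):
    """Return the counter key for a request, or None if the rule does not select it."""
    user = headers.get("x-user-id")
    if user is None:          # assumption: a Distinct match needs the header to be present
        return None
    if user == "admin":       # inverted exact match: "admin" is excluded
        return None
    return f"x-user-id={user}"


def simulate(requests, limit=3):
    """Return the status code each request would receive within a single window."""
    counts = Counter()
    statuses = []
    for headers in requests:
        key = bucket_key(headers)
        if key is None:
            statuses.append(200)
            continue
        counts[key] += 1
        statuses.append(200 if counts[key] <= limit else 429)
    return statuses


one = simulate([{"x-user-id": "one"}] * 4)
admin = simulate([{"x-user-id": "admin"}] * 4)
mixed = simulate([{"x-user-id": "one"}, {"x-user-id": "two"}] * 3)
```

The `mixed` run interleaves two users, and all six requests succeed because each user stays within its own quota of 3.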

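Rate limit all requests

Update the global rate limiting rule so that all requests combined are limited to 3 per hour.

Edit the rate limiting rule.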

kubectl edit BackendTrafficPolicy policy-httproute

Update the rate limiting rule with the following content.

...
  rateLimit:
    type: Global
    global:
      rules:
      - limit:
          requests: 3
          unit: Hour

After you save and exit, the rule takes effect immediately.

Test rate limiting for ordinary requests.

for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" http://$GATEWAY_HOST/get ; sleep 1; done

Expected output:

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 08:02:53 GMT
content-length: 473
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 2
x-ratelimit-reset: 3427

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 08:02:55 GMT
content-length: 473
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 1
x-ratelimit-reset: 3425

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 08:02:56 GMT
content-length: 473
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 3424

HTTP/1.1 429 Too Many Requests
x-envoy-ratelimited: true
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 3423
date: Tue, 27 May 2025 08:02:57 GMT
transfer-encoding: chunked

The first three requests return 200 and the fourth returns 429, which shows that the rate limiting rule has taken effect.
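
In the sample output, adding the x-ratelimit-reset value to the date header lands on the top of the hour, which suggests that the hourly window is aligned to the clock rather than started by the first request. This is an observation from the sample output only, and reading x-ratelimit-reset as a number of seconds is an assumption. The helper name `reset_time` in the following Python sketch is hypothetical.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def reset_time(date_header, reset_seconds):
    """Return the moment at which the current window would end."""
    return parsedate_to_datetime(date_header) + timedelta(seconds=reset_seconds)


# Values copied from the first and third responses in the sample output.
first = reset_time("Tue, 27 May 2025 08:02:53 GMT", 3427)
third = reset_time("Tue, 27 May 2025 08:02:56 GMT", 3424)

first_clock = first.strftime("%H:%M:%S")
third_clock = third.strftime("%H:%M:%S")
```

Both values resolve to 09:00:00, so a client that was rejected can expect its quota to be available again at the start of the next hour.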

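Rate limit by client IP address

Update the global rate limiting rule so that each IP address within a CIDR block is limited separately to 3 requests per hour.

Note

For ease of demonstration, this scenario uses the CIDR block 0.0.0.0/0. Adjust it to fit your environment.

Edit the rate limiting rule.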

kubectl edit BackendTrafficPolicy policy-httproute

Update the rate limiting rule with the following content.

...
  rateLimit:
    type: Global
    global:
      rules:
      - clientSelectors:
        - sourceCIDR: 
            value: 0.0.0.0/0
            type: Distinct
        limit:
          requests: 3
          unit: Hour

After you save and exit, the rule takes effect immediately.

Test rate limiting for ordinary requests.

for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" http://$GATEWAY_HOST/get ; sleep 1; done

Expected output:

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 08:02:53 GMT
content-length: 473
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 2
x-ratelimit-reset: 3427

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 08:02:55 GMT
content-length: 473
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 1
x-ratelimit-reset: 3425

HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 27 May 2025 08:02:56 GMT
content-length: 473
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 3424

HTTP/1.1 429 Too Many Requests
x-envoy-ratelimited: true
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 0
x-ratelimit-reset: 3423
date: Tue, 27 May 2025 08:02:57 GMT
transfer-encoding: chunked

The first three requests return 200 and the fourth returns 429, which shows that rate limiting for 0.0.0.0/0 has taken effect.
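
The per-IP behavior of a `Distinct` sourceCIDR match can be sketched with Python's standard `ipaddress` module: an address inside the CIDR block gets its own counter, and an address outside the block is not selected by the rule. The helper names and the sample addresses are hypothetical and serve only to illustrate the matching logic.

```python
import ipaddress
from collections import Counter


def bucket_key(client_ip, cidr="0.0.0.0/0"):
    """Return a per-IP counter key if client_ip is inside cidr, else None."""
    if ipaddress.ip_address(client_ip) in ipaddress.ip_network(cidr):
        return f"source-ip={client_ip}"
    return None


def simulate(client_ips, cidr="0.0.0.0/0", limit=3):
    """Return the status code each request would receive within a single window."""
    counts = Counter()
    statuses = []
    for ip in client_ips:
        key = bucket_key(ip, cidr)
        if key is None:                     # outside the block: not limited
            statuses.append(200)
            continue
        counts[key] += 1
        statuses.append(200 if counts[key] <= limit else 429)
    return statuses


same_ip = simulate(["10.0.0.1"] * 4)
two_ips = simulate(["10.0.0.1", "10.0.0.2"] * 3)
outside = simulate(["10.0.0.1"] * 4, cidr="192.168.0.0/16")
```

Narrowing the CIDR block, as in the `outside` run, leaves addresses outside the block unrestricted by this rule.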

(Optional) Step 4: Clean up test resources

Delete the rate limiting rule.

kubectl delete BackendTrafficPolicy policy-httproute

Delete the other resources.

kubectl delete -f httproute.yaml
kubectl delete -f redis-service.yaml
kubectl delete -f enable-global-rate-limit.yaml