
Container Service for Kubernetes: Implement global rate limiting with Gateway with Inference Extension

Last updated: Mar 07, 2026

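The Gateway with Inference Extension component supports global rate limiting for a cluster, which helps keep the system stable under high concurrency or abnormal traffic. This topic describes how to configure global rate limiting with the Gateway with Inference Extension component and the rate limiting scenarios that it supports.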

Feature description

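Rate limiting is a mechanism that restricts the number of requests sent to a server. It specifies the maximum number of requests that a client can send to the server within a given period, typically expressed as a number of requests per unit of time, such as 300 requests per minute or 10 requests per second. After global rate limiting is enabled, the Gateway with Inference Extension component automatically deploys a global rate limiting service. This service centrally manages and dynamically serves the global rate limiting policies and real-time traffic data. Gateway with Inference Extension interacts with the global rate limiting service through its built-in rate limiting filter (such as the Rate Limit Filter), retrieves the configured rate limiting thresholds in real time (for example, requests per second or concurrent connections), and rate-limits incoming requests based on these policies.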
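
To make "N requests per period" concrete, the following is a minimal fixed-window counter sketch in POSIX shell. It is only an illustration of the concept: this topic does not describe the algorithm that the global rate limiting service uses, and the function name `allow_request` and its variables are hypothetical.

```shell
# Fixed-window counter for a single client key (illustrative only).
# State lives in two shell variables; a real service keeps it in shared storage.
window_index=-1
count=0

# allow_request NOW LIMIT WINDOW
#   NOW     current time in seconds
#   LIMIT   maximum number of requests per window
#   WINDOW  window length in seconds
# Returns 0 if the request is allowed, 1 if it should be rejected.
allow_request() {
  now=$1
  limit=$2
  window=$3
  index=$((now / window))
  # A new window starts whenever the window index changes.
  if [ "$index" -ne "$window_index" ]; then
    window_index=$index
    count=0
  fi
  if [ "$count" -lt "$limit" ]; then
    count=$((count + 1))
    return 0
  fi
  return 1
}
```

With a limit of 3 and a window of 3600 seconds, the first three calls in a window succeed, later calls in the same window fail, and the counter starts over in the next window.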

Prerequisites

Procedure

Step 1: Enable global rate limiting

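The rate limiting service that is automatically deployed for global rate limiting depends on a Redis service for global storage. This topic uses a self-managed Redis service. You can also use Tair (Redis OSS-compatible) to quickly create a Redis instance and then update the relevant configuration in the ack-gateway-config ConfigMap in the envoy-gateway-system namespace. For more information about the configuration, see Envoy Gateway.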

  1. Create redis-service.yaml.

    kind: Namespace
    apiVersion: v1
    metadata:
      name: redis-system
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: redis
      namespace: redis-system
      labels:
        app: redis
    spec:
      serviceName: "redis"
      replicas: 1
      selector:
        matchLabels:
          app: redis
      template:
        metadata:
          labels:
            app: redis
        spec:
          containers:
            - image: registry-cn-hangzhou.ack.aliyuncs.com/dev/redis:6.0.6-for-ack-gateway
              name: redis
              ports:
                - containerPort: 6379
              resources:
                limits:
                  cpu: 1500m
                  memory: 512Mi
                requests:
                  cpu: 200m
                  memory: 256Mi
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: redis
      namespace: redis-system
      labels:
        app: redis
    spec:
      ports:
        - name: redis
          port: 6379
          protocol: TCP
          targetPort: 6379
      selector:
        app: redis
    
  2. Create enable-global-rate-limit.yaml.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ack-gateway-config
      namespace: envoy-gateway-system
    data:
      ack-gateway.yaml: |
        apiVersion: gateway.envoyproxy.io/v1alpha1
        kind: EnvoyGateway
        rateLimit:
          backend:
            type: Redis
            redis:
              url: redis.redis-system.svc.cluster.local:6379
  3. Deploy the Redis service and enable global rate limiting.

    kubectl apply -f redis-service.yaml
    kubectl apply -f enable-global-rate-limit.yaml
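
The `url` value in enable-global-rate-limit.yaml is the in-cluster DNS name of the Redis Service plus its port. The sketch below shows how that address is composed, which may help if you rename the Service or the namespace. The helper name `service_dns_name` is hypothetical, and the default cluster domain `cluster.local` is an assumption that does not hold for clusters with a custom domain.

```shell
# service_dns_name SERVICE NAMESPACE PORT [CLUSTER_DOMAIN]
# Prints the in-cluster address of a Service.
# CLUSTER_DOMAIN defaults to cluster.local (an assumption; clusters can override it).
service_dns_name() {
  printf '%s.%s.svc.%s:%s\n' "$1" "$2" "${4:-cluster.local}" "$3"
}
```

For the Service defined in redis-service.yaml, `service_dns_name redis redis-system 6379` prints `redis.redis-system.svc.cluster.local:6379`, which matches the `url` configured above.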

Step 2: Deploy an HTTPRoute resource

Create an HTTPRoute resource for the subsequent tests. The rate limiting rules in the following steps are applied to this resource.

  1. Create httproute.yaml.

    ---
    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: http-ratelimit
    spec:
      parentRefs:
      - name: eg
      hostnames:
      - ratelimit.example 
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /
        backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
    
  2. Deploy the HTTPRoute resource.

    kubectl apply -f httproute.yaml
  3. Obtain the public IP address of the Gateway.

    export GATEWAY_HOST=$(kubectl get gateway/eg -o jsonpath='{.status.addresses[0].value}')
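
If the Gateway has not been assigned an address yet, the jsonpath query above returns an empty string and every later curl command fails in a confusing way. A small guard such as the following can catch that early. The function name `require_gateway_host` is hypothetical and is not part of the component.

```shell
# require_gateway_host
# Returns 0 if GATEWAY_HOST is set and non-empty.
# Otherwise prints a hint to stderr and returns 1.
require_gateway_host() {
  if [ -z "${GATEWAY_HOST:-}" ]; then
    echo "GATEWAY_HOST is empty; the Gateway may not have an address yet" >&2
    return 1
  fi
  return 0
}
```

Call `require_gateway_host` before running the tests in Step 3. If it returns 1, rerun the export command after the Gateway has an address.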

Step 3: Scenario demonstrations

Rate limit a specific user

Configure a global rate limiting rule that limits requests whose x-user-id request header has the value one to 3 requests per hour.

  1. Create backendtrafficpolicy.yaml.

    apiVersion: gateway.envoyproxy.io/v1alpha1
    kind: BackendTrafficPolicy 
    metadata:
      name: policy-httproute
    spec:
      targetRefs:
      - group: gateway.networking.k8s.io
        kind: HTTPRoute
        name: http-ratelimit
      rateLimit:
        type: Global
        global:
          rules:
          - clientSelectors:
            - headers:
              - name: x-user-id
                value: one
            limit:
              requests: 3
              unit: Hour
  2. Deploy the rate limiting rule.

    kubectl apply -f backendtrafficpolicy.yaml
  3. Test rate limiting for requests that carry the x-user-id: one request header.

    for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: one" http://$GATEWAY_HOST/get ; sleep 1; done

    Expected output:

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:49 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 731

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:50 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 730

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:52 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 728

    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 727
    date: Tue, 27 May 2025 07:47:52 GMT
    transfer-encoding: chunked

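    The first 3 requests return 200 and the 4th request returns 429, which indicates that the rate limiting rule applies to requests that carry the x-user-id: one request header.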

  4. Test rate limiting for requests that carry the x-user-id: two request header.

    for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: two" http://$GATEWAY_HOST/get ; sleep 1; done

    Expected output:

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:50:11 GMT
    content-length: 504
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:50:12 GMT
    content-length: 504
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:50:14 GMT
    content-length: 504
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:50:15 GMT
    content-length: 504

    All 4 requests return 200, which indicates that the rate limiting rule does not limit requests that carry the x-user-id: two request header.
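
The x-ratelimit-* headers in the rate-limited responses above can be read as follows, assuming they follow the usual RateLimit header convention: `x-ratelimit-limit: 3, 3;w=3600` is a quota of 3 requests under a policy of 3 requests per 3600-second window, `x-ratelimit-remaining` is the quota left in the current window, and `x-ratelimit-reset` is the number of seconds until the window resets. In the sample output the reset values are consistent with windows aligned to the clock hour. This is an inference from the sample data, not documented behavior, and the helper name below is hypothetical.

```shell
# seconds_until_next_hour HH:MM:SS
# Prints the number of seconds from the given time to the next full hour.
seconds_until_next_hour() {
  rest=${1#*:}     # drop the hour, leaving MM:SS
  mm=${rest%%:*}
  ss=${rest#*:}
  # Strip one leading zero so that values such as 08 are not parsed as octal.
  mm=${mm#0}
  ss=${ss#0}
  echo $((3600 - (mm * 60 + ss)))
}
```

For the first response above, dated 07:47:49, `seconds_until_next_hour 07:47:49` prints 731, which equals the `x-ratelimit-reset` value in that response.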

Rate limit each user separately, except the administrator

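Update the global rate limiting rule so that requests whose x-user-id request header has the value admin are not rate limited, and requests with any other x-user-id value are limited to 3 requests per hour for each distinct value.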

  1. Edit the rate limiting rule.

    kubectl edit BackendTrafficPolicy policy-httproute

    Update the rate limiting rule with the following content.

    ...
      rateLimit:
        type: Global
        global:
          rules:
          - clientSelectors:
            - headers:
              - type: Distinct
                name: x-user-id
              - name: x-user-id
                value: admin
                invert: true
            limit:
              requests: 3
              unit: Hour

    After you save and exit, the rate limiting rule takes effect immediately.

  2. Test rate limiting for requests that carry the x-user-id: one request header.

    for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: one" http://$GATEWAY_HOST/get ; sleep 1; done

    Expected output:

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:49 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 731

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:50 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 730

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:52 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 728

    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 727
    date: Tue, 27 May 2025 07:47:52 GMT
    transfer-encoding: chunked

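    The first 3 requests return 200 and the 4th request returns 429, which indicates that the rate limiting rule applies to requests that carry the x-user-id: one request header.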

  3. Test rate limiting for requests that carry the x-user-id: two request header.

    for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: two" http://$GATEWAY_HOST/get ; sleep 1; done

    Expected output:

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:53:38 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 382
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:53:39 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 381
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:53:41 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 379
    
    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 378
    date: Tue, 27 May 2025 07:53:41 GMT
    transfer-encoding: chunked

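    The first 3 requests return 200 and the 4th request returns 429, which indicates that the rate limiting rule applies to requests that carry the x-user-id: two request header.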

  4. Test rate limiting for requests that carry the x-user-id: admin request header.

    for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: admin" http://$GATEWAY_HOST/get ; sleep 1; done

    Expected output:

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:57:44 GMT
    content-length: 506
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:57:45 GMT
    content-length: 506
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:57:46 GMT
    content-length: 506
    
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:57:47 GMT
    content-length: 506

    All 4 requests return 200, which indicates that the rate limiting rule does not limit requests that carry the x-user-id: admin request header.
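
The combined effect of the two header matches in this scenario can be modeled as one counter per x-user-id value, with admin excluded. The sketch below is a simplified in-memory model under stated assumptions: the function name `allow_user` is hypothetical, the hourly window is omitted, and requests without an x-user-id header are not modeled.

```shell
# Per-user request log: one user ID per line (illustrative in-memory state).
seen=""

# allow_user USER_ID LIMIT
# Returns 0 if the request is allowed, 1 if it should be rejected.
allow_user() {
  user=$1
  limit=$2
  # invert: true on the value "admin" means admin requests do not match the rule.
  if [ "$user" = "admin" ]; then
    return 0
  fi
  # type: Distinct means every other header value gets its own counter.
  used=$(printf '%s\n' "$seen" | grep -c -x -F -- "$user" || true)
  if [ "$used" -lt "$limit" ]; then
    seen="$seen
$user"
    return 0
  fi
  return 1
}
```

With a limit of 3, the fourth `allow_user one 3` call is rejected while `allow_user two 3` is still allowed, and `allow_user admin 3` always succeeds, mirroring the three test results above.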

Rate limit all requests

Update the global rate limiting rule so that all requests are limited to 3 per hour.

  1. Edit the rate limiting rule.

    kubectl edit BackendTrafficPolicy policy-httproute

    Update the rate limiting rule with the following content.

    ...
      rateLimit:
        type: Global
        global:
          rules:
          - limit:
              requests: 3
              unit: Hour

    After you save and exit, the rate limiting rule takes effect immediately.

  2. Test rate limiting for ordinary requests.

    for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" http://$GATEWAY_HOST/get ; sleep 1; done

    Expected output:

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:53 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 3427

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:55 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 3425

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:56 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 3424

    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 3423
    date: Tue, 27 May 2025 08:02:57 GMT
    transfer-encoding: chunked

    The first 3 requests return 200 and the 4th request returns 429, which indicates that the rate limiting rule has taken effect.
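
Instead of reading each response by eye, the output of the test loop can be summarized by status code. The following sketch counts status lines in `curl -I` output read from standard input. The function name `count_status` is hypothetical, and the usage shown in the comment is an assumption that drops `-it` and adds curl's `-s` flag so that the output can be piped.

```shell
# count_status CODE
# Reads `curl -I` output on stdin and prints how many responses
# had the given HTTP status code.
count_status() {
  awk -v code="$1" '$1 ~ /^HTTP\// && $2 == code { n++ } END { print n + 0 }'
}

# Example usage against the cluster from this topic (not run here):
#   for i in 1 2 3 4; do
#     kubectl exec deployment/sleep -- curl -sI --header "Host: ratelimit.example" "http://$GATEWAY_HOST/get"
#   done | count_status 429
```

For the expected output above, counting 200 would give 3 and counting 429 would give 1.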

Rate limit by client IP address

Update the global rate limiting rule so that each IP address within a CIDR block is separately limited to 3 requests per hour.

Note

For demonstration purposes, this scenario applies the limit to the CIDR block 0.0.0.0/0. You can adjust the CIDR block as needed.

  1. Edit the rate limiting rule.

    kubectl edit BackendTrafficPolicy policy-httproute

    Update the rate limiting rule with the following content.

    ...
      rateLimit:
        type: Global
        global:
          rules:
          - clientSelectors:
            - sourceCIDR: 
                value: 0.0.0.0/0
                type: Distinct
            limit:
              requests: 3
              unit: Hour

    After you save and exit, the rate limiting rule takes effect immediately.

  2. Test rate limiting for ordinary requests.

    for i in {1..4}; do kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" http://$GATEWAY_HOST/get ; sleep 1; done

    Expected output:

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:53 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 3427

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:55 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 3425

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:56 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 3424

    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 3423
    date: Tue, 27 May 2025 08:02:57 GMT
    transfer-encoding: chunked

    The first 3 requests return 200 and the 4th request returns 429, which indicates that rate limiting for the CIDR block 0.0.0.0/0 has taken effect.
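
When you narrow the CIDR block from 0.0.0.0/0 to a real range, it can be useful to check beforehand which client addresses the rule would cover. The following is a small IPv4-only sketch in POSIX shell. The function names `ip_to_int` and `ip_in_cidr` are hypothetical, the code assumes 64-bit shell arithmetic and octets without leading zeros, and it does not reproduce how the gateway itself evaluates `sourceCIDR`.

```shell
# ip_to_int A.B.C.D
# Prints the IPv4 address as a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# ip_in_cidr IP CIDR
# Returns 0 if IP falls inside CIDR (for example 10.0.0.0/8), 1 otherwise.
ip_in_cidr() {
  ip_int=$(ip_to_int "$1")
  net_int=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  if [ "$bits" -eq 0 ]; then
    mask=0             # a /0 prefix matches every address
  else
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  fi
  [ $((ip_int & mask)) -eq $((net_int & mask)) ]
}
```

For example, `ip_in_cidr 192.168.1.7 0.0.0.0/0` succeeds for any address, while `ip_in_cidr 11.0.0.1 10.0.0.0/8` fails because the address lies outside the block.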

(Optional) Step 4: Clean up test resources

  1. Delete the rate limiting rule.

    kubectl delete BackendTrafficPolicy policy-httproute
  2. Delete the other resources.

    kubectl delete -f httproute.yaml
    kubectl delete -f redis-service.yaml
    kubectl delete -f enable-global-rate-limit.yaml