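The Gateway with Inference Extension component supports global rate limiting for a cluster, which helps the system stay stable under high concurrency or abnormal traffic. This topic describes how to configure global rate limiting with the Gateway with Inference Extension component and the rate limiting scenarios that it supports.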
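Feature description
Rate limiting is a mechanism that restricts the number of requests sent to a server. It specifies the maximum number of requests that a client can send to the server within a given period, usually expressed as a number of requests per unit of time, such as 300 requests per minute or 10 requests per second. After global rate limiting is enabled, the Gateway with Inference Extension component automatically deploys a global rate limiting service. This service centrally manages and dynamically serves the global rate limiting policies and real-time traffic data. Gateway with Inference Extension uses a built-in rate limiting filter (such as the Rate Limit Filter) to interact with the global rate limiting service, retrieves the configured rate limiting thresholds in real time (for example, requests per second or concurrent connections), and rate limits incoming requests based on these policies.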
Prerequisites
Gateway with Inference Extension is installed, and its version is 1.4.0 or later.
The steps in Preparations are complete.
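Procedure
Step 1: Enable global rate limiting
The rate limiting service that is automatically deployed for global rate limiting depends on a Redis service for global storage. This topic uses a self-managed Redis service. You can also use Tair (Redis OSS-compatible) to quickly create a Redis instance, and then update the related configuration in the ack-gateway-config ConfigMap in the envoy-gateway-system namespace. For more information about the configuration, see Envoy Gateway.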
Create redis-service.yaml.
    kind: Namespace
    apiVersion: v1
    metadata:
      name: redis-system
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: redis
      namespace: redis-system
      labels:
        app: redis
    spec:
      serviceName: "redis"
      replicas: 1
      selector:
        matchLabels:
          app: redis
      template:
        metadata:
          labels:
            app: redis
        spec:
          containers:
          - image: registry-cn-hangzhou.ack.aliyuncs.com/dev/redis:6.0.6-for-ack-gateway
            name: redis
            ports:
            - containerPort: 6379
            resources:
              limits:
                cpu: 1500m
                memory: 512Mi
              requests:
                cpu: 200m
                memory: 256Mi
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: redis
      namespace: redis-system
      labels:
        app: redis
    spec:
      ports:
      - name: redis
        port: 6379
        protocol: TCP
        targetPort: 6379
      selector:
        app: redis
Create enable-global-rate-limit.yaml.
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ack-gateway-config
      namespace: envoy-gateway-system
    data:
      ack-gateway.yaml: |
        apiVersion: gateway.envoyproxy.io/v1alpha1
        kind: EnvoyGateway
        rateLimit:
          backend:
            type: Redis
            redis:
              url: redis.redis-system.svc.cluster.local:6379
Deploy the Redis service and enable global rate limiting.
    kubectl apply -f redis-service.yaml
    kubectl apply -f enable-global-rate-limit.yaml
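Before moving on, it can help to confirm that Redis is ready and that the configuration was stored as intended. The following is a minimal sketch, assuming kubectl points at the target cluster. The function name check_global_ratelimit_setup is hypothetical; the resource names are taken from the two manifests in this step.

```shell
# Hypothetical helper (not part of the component): checks the resources
# created by redis-service.yaml and enable-global-rate-limit.yaml.
check_global_ratelimit_setup() {
  # Wait until the single Redis replica of the StatefulSet is ready.
  kubectl -n redis-system rollout status statefulset/redis --timeout=120s || return 1
  # Print the embedded EnvoyGateway configuration so the Redis URL can be reviewed.
  kubectl -n envoy-gateway-system get configmap ack-gateway-config \
    -o jsonpath='{.data.ack-gateway\.yaml}'
}
```

Running check_global_ratelimit_setup should end by printing the rateLimit section that points at redis.redis-system.svc.cluster.local:6379.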
Step 2: Deploy an HTTPRoute resource
Create an HTTPRoute resource for the tests that follow. The rate limiting rules in later steps are applied to this resource.
Create httproute.yaml.
    ---
    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: http-ratelimit
    spec:
      parentRefs:
      - name: eg
      hostnames:
      - ratelimit.example
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /
        backendRefs:
        - group: ""
          kind: Service
          name: backend
          port: 3000
Deploy the HTTPRoute resource.
    kubectl apply -f httproute.yaml
Obtain the public IP address of the gateway.
    export GATEWAY_HOST=$(kubectl get gateway/eg -o jsonpath='{.status.addresses[0].value}')
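If the gateway has not been assigned an address yet, the jsonpath query returns an empty string and every later curl command fails in a confusing way. The following is a small guard, offered as a sketch; the function name require_gateway_host is hypothetical.

```shell
# Hypothetical guard: fails early when GATEWAY_HOST is empty or unset.
require_gateway_host() {
  if [ -z "${GATEWAY_HOST:-}" ]; then
    echo "GATEWAY_HOST is empty; the gateway may not have an address yet" >&2
    return 1
  fi
  echo "Using gateway address: ${GATEWAY_HOST}"
}
```

Calling require_gateway_host right after the export makes an empty address visible before any test request is sent.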
Step 3: Scenario demonstrations
Rate limit a specific user
Configure a global rate limiting rule that allows only 3 requests per hour for requests whose x-user-id request header has the value one.
Create backendtrafficpolicy.yaml.
    apiVersion: gateway.envoyproxy.io/v1alpha1
    kind: BackendTrafficPolicy
    metadata:
      name: policy-httproute
    spec:
      targetRefs:
      - group: gateway.networking.k8s.io
        kind: HTTPRoute
        name: http-ratelimit
      rateLimit:
        type: Global
        global:
          rules:
          - clientSelectors:
            - headers:
              - name: x-user-id
                value: one
            limit:
              requests: 3
              unit: Hour
Deploy the rate limiting rule.
    kubectl apply -f backendtrafficpolicy.yaml
Test rate limiting for requests that carry the x-user-id: one request header.
    for i in {1..4}; do
      kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: one" http://$GATEWAY_HOST/get
      sleep 1
    done
Expected output:
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:49 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 731

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:50 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 730

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:52 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 728

    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 727
    date: Tue, 27 May 2025 07:47:52 GMT
    transfer-encoding: chunked
The first 3 requests return 200 and the 4th request returns 429, which shows that the rule limits requests that carry the x-user-id: one request header.
Test rate limiting for requests that carry the x-user-id: two request header.
    for i in {1..4}; do
      kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: two" http://$GATEWAY_HOST/get
      sleep 1
    done
Expected output:
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:50:11 GMT
    content-length: 504

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:50:12 GMT
    content-length: 504

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:50:14 GMT
    content-length: 504

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:50:15 GMT
    content-length: 504
All 4 requests return 200, which shows that the rule does not limit requests that carry the x-user-id: two request header.
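Reading four full header dumps to find the status lines gets tedious when the test is repeated. The following is a sketch of a helper that tallies the status codes in curl -I output; the function name summarize_statuses is hypothetical, and it only looks at lines that start with HTTP/.

```shell
# Hypothetical helper: counts HTTP status codes in `curl -I` output read from stdin.
summarize_statuses() {
  # tr strips the carriage returns that terminate HTTP header lines.
  tr -d '\r' | awk '
    /^HTTP\// { count[$2]++ }
    END { for (code in count) print code, count[code] }
  ' | sort
}
```

One possible way to use it is to pipe the test loop into the function, for example `for i in 1 2 3 4; do kubectl exec deployment/sleep -- curl -sI --header "Host: ratelimit.example" --header "x-user-id: one" http://$GATEWAY_HOST/get; done | summarize_statuses`. Dropping -it and adding -s here is an adjustment for piped output, not part of the original commands. For a rate limited user, the expected tally is 3 responses with 200 and 1 response with 429.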
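Rate limit each user separately, except the administrator
Update the global rate limiting rule so that requests whose x-user-id request header has the value admin are not rate limited, while every other value of the header is limited separately to 3 requests per hour.
Edit the rate limiting rule.
    kubectl edit BackendTrafficPolicy policy-httproute
Update the rate limiting rule with the following content.
    ...
      rateLimit:
        type: Global
        global:
          rules:
          - clientSelectors:
            - headers:
              - type: Distinct
                name: x-user-id
              - name: x-user-id
                value: admin
                invert: true
            limit:
              requests: 3
              unit: Hour
After you save and exit, the rate limiting rule takes effect immediately.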
Test rate limiting for requests that carry the x-user-id: one request header.
    for i in {1..4}; do
      kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: one" http://$GATEWAY_HOST/get
      sleep 1
    done
Expected output:
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:49 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 731

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:50 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 730

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:47:52 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 728

    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 727
    date: Tue, 27 May 2025 07:47:52 GMT
    transfer-encoding: chunked
The first 3 requests return 200 and the 4th request returns 429, which shows that the rule limits requests that carry the x-user-id: one request header.
Test rate limiting for requests that carry the x-user-id: two request header.
    for i in {1..4}; do
      kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: two" http://$GATEWAY_HOST/get
      sleep 1
    done
Expected output:
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:53:38 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 382

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:53:39 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 381

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:53:41 GMT
    content-length: 504
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 379

    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 378
    date: Tue, 27 May 2025 07:53:41 GMT
    transfer-encoding: chunked
The first 3 requests return 200 and the 4th request returns 429, which shows that the rule limits requests that carry the x-user-id: two request header.
Test rate limiting for requests that carry the x-user-id: admin request header.
    for i in {1..4}; do
      kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" --header "x-user-id: admin" http://$GATEWAY_HOST/get
      sleep 1
    done
Expected output:
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:57:44 GMT
    content-length: 506

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:57:45 GMT
    content-length: 506

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:57:46 GMT
    content-length: 506

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 07:57:47 GMT
    content-length: 506
All 4 requests return 200, which shows that the rule does not limit requests that carry the x-user-id: admin request header.
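The combination of a Distinct header selector and an inverted match on admin can be pictured as one counter per x-user-id value, with admin left out. The following is a toy model of that behavior as observed in the three tests, not the component's implementation. The function name simulate_requests is hypothetical, the limit of 3 mirrors the rule, and the model covers only requests that carry the header, within a single one-hour window.

```shell
# Toy model (an assumption, not Envoy's code): each distinct x-user-id value
# gets its own counter; the value "admin" is excluded from the rule.
# Arguments are the x-user-id values of successive requests.
simulate_requests() {
  printf '%s\n' "$@" | awk -v limit=3 '
    $1 == "admin" { print $1, 200; next }   # invert: true, so admin is never counted
    { hits[$1]++; print $1, (hits[$1] <= limit ? 200 : 429) }
  '
}
```

For example, `simulate_requests one one one one two admin` prints 200 three times for one, then 429 for one, then 200 for two and 200 for admin, which matches the pattern in the expected outputs: the counter for one does not affect two, and admin is never limited.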
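Rate limit all requests
Update the global rate limiting rule so that all requests together are limited to 3 requests per hour.
Edit the rate limiting rule.
    kubectl edit BackendTrafficPolicy policy-httproute
Update the rate limiting rule with the following content.
    ...
      rateLimit:
        type: Global
        global:
          rules:
          - limit:
              requests: 3
              unit: Hour
After you save and exit, the rate limiting rule takes effect immediately.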
Test rate limiting for ordinary requests.
    for i in {1..4}; do
      kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" http://$GATEWAY_HOST/get
      sleep 1
    done
Expected output:
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:53 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 3427

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:55 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 3425

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:56 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 3424

    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 3423
    date: Tue, 27 May 2025 08:02:57 GMT
    transfer-encoding: chunked
The first 3 requests return 200 and the 4th request returns 429, which shows that the rate limiting rule has taken effect.
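The x-ratelimit-reset values in the expected outputs are consistent with fixed one-hour windows aligned to the top of the hour: 08:02:53 plus 3427 seconds is 09:00:00, and 07:47:49 plus 731 seconds is 08:00:00. This is an inference from the sample outputs, not documented behavior of the component. Under that assumption, the remaining time in a window can be sketched as follows; the function name seconds_until_reset is hypothetical.

```shell
# Seconds until the current fixed window ends, ASSUMING windows are aligned
# to multiples of the window size counted from the Unix epoch.
# $1: current time in epoch seconds, $2: window size in seconds.
seconds_until_reset() {
  local now="$1" window="$2"
  echo $(( window - now % window ))
}
```

For instance, 2025-05-27 08:02:53 GMT corresponds to epoch second 1748332973, and `seconds_until_reset 1748332973 3600` prints 3427, matching the first response above. In practice this means a quota used up late in an hour is restored at the next hour boundary rather than one hour after the first request.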
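Rate limit by client IP address
Update the global rate limiting rule so that each IP address within a CIDR block is limited separately to 3 requests per hour.
For ease of demonstration, this scenario uses the CIDR block 0.0.0.0/0. You can adjust it as needed.
Edit the rate limiting rule.
    kubectl edit BackendTrafficPolicy policy-httproute
Update the rate limiting rule with the following content.
    ...
      rateLimit:
        type: Global
        global:
          rules:
          - clientSelectors:
            - sourceCIDR:
                value: 0.0.0.0/0
                type: Distinct
            limit:
              requests: 3
              unit: Hour
After you save and exit, the rate limiting rule takes effect immediately.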
Test rate limiting for ordinary requests.
    for i in {1..4}; do
      kubectl exec deployment/sleep -it -- curl -I --header "Host: ratelimit.example" http://$GATEWAY_HOST/get
      sleep 1
    done
Expected output:
    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:53 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 2
    x-ratelimit-reset: 3427

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:55 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 1
    x-ratelimit-reset: 3425

    HTTP/1.1 200 OK
    content-type: application/json
    x-content-type-options: nosniff
    date: Tue, 27 May 2025 08:02:56 GMT
    content-length: 473
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 3424

    HTTP/1.1 429 Too Many Requests
    x-envoy-ratelimited: true
    x-ratelimit-limit: 3, 3;w=3600
    x-ratelimit-remaining: 0
    x-ratelimit-reset: 3423
    date: Tue, 27 May 2025 08:02:57 GMT
    transfer-encoding: chunked
The first 3 requests return 200 and the 4th request returns 429, which shows that rate limiting for 0.0.0.0/0 has taken effect.
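The block 0.0.0.0/0 matches every IPv4 address, which is why the test client is limited no matter where it runs. When narrowing the block, it can be useful to check in advance whether a given client address falls inside it. The following is a sketch of standard IPv4 CIDR matching in shell arithmetic; the function names ip_to_int and ip_in_cidr are hypothetical, and IPv6 is not handled.

```shell
# Converts a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  set -- $1   # split the address on "." into four octets
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeeds (exit status 0) when IPv4 address $1 is inside CIDR block $2.
ip_in_cidr() {
  local ip="$1" net="${2%/*}" bits="${2#*/}"
  # A prefix length of 0 yields mask 0, so every address matches.
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}
```

For example, `ip_in_cidr 10.1.2.3 10.0.0.0/8` succeeds, `ip_in_cidr 192.168.1.1 10.0.0.0/8` fails, and any address succeeds against 0.0.0.0/0. Keep in mind that the address the gateway sees may differ from the client's own address when a load balancer or NAT sits in between.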
(Optional) Step 4: Clean up test resources
Delete the rate limiting rule.
    kubectl delete BackendTrafficPolicy policy-httproute
Delete the other resources.
    kubectl delete -f httproute.yaml
    kubectl delete -f redis-service.yaml
    kubectl delete -f enable-global-rate-limit.yaml