Throttling is a mechanism that limits the number of requests sent to a service. Envoy uses the token bucket algorithm to implement local throttling. This topic describes how to configure local throttling for a Service Mesh (ASM) instance.
Prerequisites
- An ASM instance is created and meets the following requirements:
  - If the ASM instance is of Enterprise Edition or Ultimate Edition, the version of the ASM instance must be 1.14.3 or later. If the version of the ASM instance is earlier than 1.14.3, upgrade the ASM instance. For more information, see Update an ASM instance.
  - If the ASM instance is of Standard Edition, the version of the ASM instance must be 1.9 or later. In addition, you can use only the native rate limiting feature of Istio to implement local throttling for the ASM instance. The reference document varies with the Istio version. For more information about how to configure local throttling for the latest Istio version, see Enabling Rate Limits using Envoy.
- An ACK managed cluster is created. For more information, see Create an ACK managed cluster.
- The cluster is added to the ASM instance. For more information, see Add a cluster to an ASM instance.
- Automatic sidecar injection is enabled for the default namespace in the ACK cluster. For more information, see Enable automatic sidecar proxy injection.
What is throttling?
Concept of throttling
Throttling is a mechanism that limits the number of requests sent to a service. It specifies the maximum number of requests that clients can send to a service in a given period of time, such as 300 requests per minute or 10 requests per second. The aim of throttling is to prevent a service from being overloaded because it receives excessive requests from a specific client IP address or from global clients.
For example, if you limit the number of requests sent to a service to 300 per minute, the 301st request within that minute is denied, and HTTP status code 429 (Too Many Requests) is returned.
Throttling modes
Envoy proxies implement throttling in the following modes.
Mode | Description |
Global or distributed throttling | All Envoy proxies in the mesh share one rate limit quota for a service. Each proxy queries an external, centralized rate limit service to decide whether to allow a request. This mode enforces a mesh-wide limit regardless of the number of service replicas. |
Local throttling | Each Envoy proxy applies the rate limit independently to the traffic of the workload instance it proxies, without calling an external service. The overall limit for a service therefore scales with the number of its replicas. |
How local throttling works
Envoy uses the token bucket algorithm to implement local throttling. The token bucket algorithm limits the number of requests sent to services based on the number of tokens in a bucket. Tokens are added to the bucket at a constant rate. When a request is sent to a service, a token is removed from the bucket. When the bucket is empty, requests are denied. Generally, you need to specify the following parameters:
- The interval at which the bucket is refilled
- The number of tokens added to the bucket each time
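The refill-and-consume behavior described above can be sketched as follows. This is an illustrative model, not Envoy's actual implementation; the class name `TokenBucket` is hypothetical, while `fill_interval` and `quota` mirror the policy fields used later in this topic.

```python
import time

class TokenBucket:
    """Illustrative token bucket (not Envoy's implementation).

    fill_interval: seconds between refills
    quota: tokens added per refill (also the bucket capacity here)
    """
    def __init__(self, fill_interval, quota):
        self.fill_interval = fill_interval
        self.quota = quota
        self.tokens = quota                  # the bucket starts full
        self.last_fill = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill the bucket at a constant rate.
        if now - self.last_fill >= self.fill_interval:
            self.tokens = self.quota
            self.last_fill = now
        if self.tokens > 0:
            self.tokens -= 1                 # each request consumes one token
            return True                      # request is allowed
        return False                         # bucket empty: deny with HTTP 429

bucket = TokenBucket(fill_interval=60, quota=10)
results = [bucket.allow() for _ in range(12)]
print(results.count(True), results.count(False))  # 10 allowed, 2 denied
```

With `fill_interval` of 60 seconds and `quota` of 10, any requests beyond the first 10 in a refill window are denied, which matches the behavior demonstrated later in this topic.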
By default, Envoy returns the 429 HTTP status code when a request is denied and the x-envoy-ratelimited header is set. You can customize the HTTP status code and response header.
Take note of the following concepts when you use the throttling feature:
- http_filter_enabled: the percentage of requests for which the local rate limit is checked but not enforced.
- http_filter_enforcing: the percentage of requests for which the local rate limit is enforced.
For example, you can set http_filter_enabled to 10% of requests and http_filter_enforcing to 5% of requests. This way, you can test the effect of throttling on a fraction of traffic before applying it to all requests.
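The interaction of the two percentages can be sketched as follows. This is a sketch of the sampling logic only, with a hypothetical function name; it is not Envoy's internal code.

```python
import random

def local_rate_limit(bucket_has_token, enabled_pct, enforced_pct):
    """Illustrates how the enabled/enforced percentages gate the limiter.

    enabled_pct  - percentage of requests for which the limit is checked
    enforced_pct - percentage of checked, over-limit requests actually denied
    """
    if random.randrange(100) >= enabled_pct:
        return "not checked"                 # limiter skipped entirely
    if bucket_has_token:
        return "ok"                          # token available, request passes
    if random.randrange(100) < enforced_pct:
        return "429"                         # limit enforced: request denied
    return "would limit"                     # counted as limited, but allowed

# With the percentages at 0 or 100, the behavior is deterministic:
print(local_rate_limit(True, 100, 100))   # ok
print(local_rate_limit(False, 100, 100))  # 429
print(local_rate_limit(False, 0, 100))    # not checked
```

Requests that fall into the "would limit" case are what lets you observe how often throttling would trigger before you enforce it on all traffic.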
Configure local throttling
Step 1: Deploy sample services
Create an httpbin.yaml file that contains the following content:
```yaml
##################################################################################################
# Example httpbin service
##################################################################################################
apiVersion: v1
kind: ServiceAccount
metadata:
  name: httpbin
---
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  labels:
    app: httpbin
    service: httpbin
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
    spec:
      serviceAccountName: httpbin
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        ports:
        - containerPort: 80
```
Run the following command to create the httpbin service:
kubectl apply -f httpbin.yaml -n default
Create a sleep.yaml file that contains the following content:
```yaml
##################################################################################################
# Example sleep service
##################################################################################################
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sleep
---
apiVersion: v1
kind: Service
metadata:
  name: sleep
  labels:
    app: sleep
    service: sleep
spec:
  ports:
  - port: 80
    name: http
  selector:
    app: sleep
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sleep
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sleep
  template:
    metadata:
      labels:
        app: sleep
    spec:
      terminationGracePeriodSeconds: 0
      serviceAccountName: sleep
      containers:
      - name: sleep
        image: curlimages/curl
        command: ["/bin/sleep", "infinity"]
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - mountPath: /etc/sleep/tls
          name: secret-volume
      volumes:
      - name: secret-volume
        secret:
          secretName: sleep-secret
          optional: true
---
```
Run the following command to create the sleep service:
kubectl apply -f sleep.yaml -n default
Go to the pod where the sleep service resides, and run the following command to send multiple requests to the httpbin service:
while true; do curl http://httpbin:8000/headers; done
Step 2: Define and execute a throttling policy
You can customize a response or use the default response when a request is denied.
Log on to the ASM console.
In the left-side navigation pane, choose .
On the Mesh Management page, find the ASM instance that you want to configure. Click the name of the ASM instance or click Manage in the Actions column.
On the details page of the ASM instance, choose in the left-side navigation pane.
On the LocalRateLimiter page, click Create. On the Create page, set the Namespace parameter to default, select a template, and copy one of the following code blocks to the YAML code editor:
Use the default response:
```yaml
apiVersion: istio.alibabacloud.com/v1beta1
kind: ASMLocalRateLimiter
metadata:
  name: httpbin
  namespace: default
spec:
  workloadSelector:
    labels:
      app: httpbin
  configs:
  - match:
      vhost:
        name: "*"
        port: 8000
        route:
          header_match:
          - name: ":path"
            prefix_match: "/"
    limit:
      fill_interval:
        seconds: 60
      quota: 10
```
Customize a response:
```yaml
apiVersion: istio.alibabacloud.com/v1beta1
kind: ASMLocalRateLimiter
metadata:
  name: httpbin
  namespace: default
spec:
  workloadSelector:
    labels:
      app: httpbin
  configs:
  - match:
      vhost:
        name: "*"
        port: 8000
        route:
          header_match:
          - name: ":path"
            prefix_match: "/"
    limit:
      fill_interval:
        seconds: 60
      quota: 10
      custom_response_body: '{"custom": "custom message", "message": "Your request be limited" }'
      response_header_to_add:
        x-rate-limited: 'TOO_MANY_REQUESTS'
        x-local-rate-limit: 'enabled'
```
Run the following command multiple times to send requests. After 10 requests are sent within 1 minute, throttling is triggered:
curl -v httpbin:8000/headers
Expected output:
Default response:
```
*   Trying 192.168.250.89:8000...
* Connected to httpbin (192.168.250.89) port 8000 (#0)
> GET /headers HTTP/1.1
> Host: httpbin:8000
> User-Agent: curl/7.85.0-DEV
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 429 Too Many Requests
< x-local-rate-limit: true
< content-length: 18
< content-type: text/plain
< date: Tue, 27 Sep 2022 07:42:08 GMT
< server: envoy
< x-envoy-upstream-service-time: 0
```
The 429 HTTP status code and the x-local-rate-limit response header are returned.
Custom response:
```
*   Trying 192.168.250.89:8000...
* Connected to httpbin (192.168.250.89) port 8000 (#0)
> GET /headers HTTP/1.1
> Host: httpbin:8000
> User-Agent: curl/7.85.0-DEV
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 429 Too Many Requests
< x-local-rate-limit: enabled
< x-rate-limited: TOO_MANY_REQUESTS
< content-length: 67
< content-type: text/plain
< date: Tue, 27 Sep 2022 11:45:45 GMT
< server: envoy
< x-envoy-upstream-service-time: 0
<
* Connection #0 to host httpbin left intact
{"custom": "custom message", "message": "Your request be limited" }
```
The custom response body and response headers are returned together with the 429 HTTP status code.
Throttling metrics
The following table describes the throttling metrics that Envoy automatically generates.
Metric | Description |
<stat_prefix>.http_local_rate_limit.enabled | Total number of requests for which the rate limit was checked (the limiter was enabled) |
<stat_prefix>.http_local_rate_limit.ok | Total number of requests that obtained a token from the token bucket |
<stat_prefix>.http_local_rate_limit.rate_limited | Total number of requests for which no token was available (but throttling was not necessarily enforced) |
<stat_prefix>.http_local_rate_limit.enforced | Total number of requests to which throttling was applied (for example, HTTP 429 was returned) |
The preceding metrics are prefixed with <stat_prefix>.http_local_rate_limit, where stat_prefix indicates the value that you configured in the stat_prefix field.
View throttling metrics in Prometheus Service
Add annotations to spec.template.metadata in the Deployment YAML file to enable collection of throttling statistics from Envoy. Run the following command:

```shell
kubectl patch deployment httpbin --type merge -p '{"spec":{"template":{"metadata":{"annotations":{"proxy.istio.io/config":"proxyStatsMatcher:\n inclusionRegexps:\n - \".*http_local_rate_limit.*\""}}}}}'
```
After the pod is automatically restarted, run the following command multiple times to send requests to the httpbin service:
curl -v httpbin:8000/headers
Then, query the Prometheus metrics of the httpbin sidecar proxy (for example, on port 15090 of the pod at the /stats/prometheus path). Expected output:
```
envoy_http_local_rate_limiter_http_local_rate_limit_enabled{} 37
envoy_http_local_rate_limiter_http_local_rate_limit_enforced{} 17
envoy_http_local_rate_limiter_http_local_rate_limit_ok{} 20
envoy_http_local_rate_limiter_http_local_rate_limit_rate_limited{} 17
```
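As a quick consistency check on the counters above, every request for which the limiter was enabled is counted either as ok (a token was available) or as rate_limited, and enforced can be at most rate_limited. The following sketch parses the sample output shown above:

```python
# Sample Envoy counters as shown in the expected output above.
sample = """
envoy_http_local_rate_limiter_http_local_rate_limit_enabled{} 37
envoy_http_local_rate_limiter_http_local_rate_limit_enforced{} 17
envoy_http_local_rate_limiter_http_local_rate_limit_ok{} 20
envoy_http_local_rate_limiter_http_local_rate_limit_rate_limited{} 17
""".strip()

metrics = {}
for line in sample.splitlines():
    name, value = line.split()
    # Keep the suffix after the last "_rate_limit_" and drop the empty label set.
    key = name.rsplit("_rate_limit_", 1)[1].rstrip("{}")
    metrics[key] = int(value)

# Every checked request is either "ok" or "rate_limited" (20 + 17 = 37),
# and only a subset of rate-limited requests is necessarily enforced.
assert metrics["enabled"] == metrics["ok"] + metrics["rate_limited"]
assert metrics["enforced"] <= metrics["rate_limited"]
print(metrics)
```

If enforced is smaller than rate_limited in your output, some over-limit requests were counted but still allowed through, which is what the http_filter_enforcing percentage controls.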
View throttling metrics by using a self-managed Prometheus instance or in the ARMS console.
To view throttling metrics in the ARMS console, perform the following operations:
Log on to the ARMS console. In the left-side navigation pane, choose .
Click the instance that you want to use. In the left-side navigation pane, click Dashboards, and then click the dashboard that you want to view.
In the left-side navigation pane, click the icon to view metrics.
The following figure provides an example.