
Configure the connectionPool field to implement circuit breaking

Last Updated: Dec 15, 2023

Circuit breaking is a traffic management mechanism used to protect your system from further damage in the event of a system failure or overload. In traditional Java services, frameworks such as Resilience4j can be used to implement circuit breaking. Compared with the traditional approaches, Istio allows you to implement circuit breaking at the network level without integrating circuit breaking into the application code of each service. This topic describes how to configure the connectionPool field to implement the circuit breaking feature.

Prerequisites

The cluster is added to the ASM instance.

connectionPool settings

Before you enable the circuit breaking feature, you must create a destination rule to configure circuit breaking for the desired destination service. For more information about the fields in a destination rule, see Destination Rule.

The connectionPool field defines parameters related to circuit breaking. The following table describes the parameters of the connectionPool field.

| Parameter | Type | Required | Description | Default |
| --- | --- | --- | --- | --- |
| tcp.maxConnections | int32 | No | The maximum number of HTTP/1.1 or TCP connections to a destination host. | 2³²-1 |
| http.http1MaxPendingRequests | int32 | No | The maximum number of requests that are queued while waiting for a ready connection pool connection. | 1024 |
| http.http2MaxRequests | int32 | No | The maximum number of active requests to a backend service. | 1024 |
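
The following sketch shows a destination rule that sets all three parameters together. It is only an illustration: the resource name, host, and values are assumptions for demonstration purposes, not recommendations, and must be tuned for your own workload.

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: my-service-circuit-breaker   # hypothetical name for illustration
spec:
  host: my-service                   # hypothetical destination service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 10           # maximum HTTP/1.1 or TCP connections to a destination host
      http:
        http1MaxPendingRequests: 5   # maximum requests queued while waiting for a connection
        http2MaxRequests: 100        # maximum active requests to the backend service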

It is clear how these parameters work in a simple scenario where only one client and one destination service instance exist. In Kubernetes environments, an instance is equivalent to a pod. However, in production environments, we are more likely to see the following scenarios:

  • One client instance and multiple destination service instances

  • Multiple client instances and a single destination service instance

  • Multiple client instances and multiple destination service instances

In different scenarios, you need to adjust the values of these parameters based on your business requirements to ensure that the connection pool can adapt to high-load and complex environments and provide good performance and reliability. The following section describes how to configure the connectionPool field in the preceding scenarios to help you find the optimal parameter configurations for your business.

Configuration examples

In this topic, two Python scripts are created: one for the destination service (server) and another for the calling service (client).

  • The server script creates a Flask application and defines a single endpoint at the /hello route. When you access this endpoint, the server sleeps for 5 seconds and then returns a "hello world!" string.


    #! /usr/bin/env python3
    from flask import Flask
    import time
    
    app = Flask(__name__)
    
    # A deliberately slow endpoint: each request takes about 5 seconds to complete.
    @app.route('/hello')
    def get():
        time.sleep(5)
        return 'hello world!'
    
    if __name__ == '__main__':
        app.run(debug=True, host='0.0.0.0', port=9080, threaded=True)
  • The client script calls the server endpoint by sending 10 requests in parallel at a time, then sleeps for some time before sending the next batch of 10 requests. The script does this in an infinite loop. To ensure that all of the pods send a batch of 10 requests at the same time when multiple pods of the client are running, batches of 10 requests are sent at the 0th, 20th, and 40th second of every minute (according to the system time) in this example.


    #! /usr/bin/env python3
    import requests
    import time
    import sys
    from datetime import datetime
    import _thread
    
    def timedisplay(t):
      return t.strftime("%H:%M:%S")
    
    def get(url):
      try:
        stime = datetime.now()
        start = time.time()
        response = requests.get(url)
        etime = datetime.now()
        end = time.time()
        elapsed = end-start
        sys.stderr.write("Status: " + str(response.status_code) + ", Start: " + timedisplay(stime) + ", End: " + timedisplay(etime) + ", Elapsed Time: " + str(elapsed)+"\n")
        sys.stdout.flush()
      except Exception as myexception:
        sys.stderr.write("Exception: " + str(myexception)+"\n")
        sys.stdout.flush()
    
    time.sleep(30)
    
    while True:
      sc = int(datetime.now().strftime('%S'))
      time_range = [0, 20, 40]
    
      if sc not in time_range:
        time.sleep(1)
        continue
    
      sys.stderr.write("\n----------Info----------\n")
      sys.stdout.flush()
    
      # Send 10 requests in parallel
      for i in range(10):
        _thread.start_new_thread(get, ("http://circuit-breaker-sample-server:9080/hello", ))
    
      time.sleep(2)

Deploy sample applications

  1. Create a YAML file that contains the following content and then run the kubectl apply -f ${name of the YAML file}.yaml command to deploy sample applications.


    ##################################################################################################
    #  circuit-breaker-sample-server services
    ##################################################################################################
    apiVersion: v1
    kind: Service
    metadata:
      name: circuit-breaker-sample-server
      labels:
        app: circuit-breaker-sample-server
        service: circuit-breaker-sample-server
    spec:
      ports:
      - port: 9080
        name: http
      selector:
        app: circuit-breaker-sample-server
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: circuit-breaker-sample-server
      labels:
        app: circuit-breaker-sample-server
        version: v1
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: circuit-breaker-sample-server
          version: v1
      template:
        metadata:
          labels:
            app: circuit-breaker-sample-server
            version: v1
        spec:
          containers:
          - name: circuit-breaker-sample-server
            image: registry.cn-hangzhou.aliyuncs.com/acs/istio-samples:circuit-breaker-sample-server.v1
            imagePullPolicy: Always
            ports:
            - containerPort: 9080
    ---
    ##################################################################################################
    #  circuit-breaker-sample-client services
    ##################################################################################################
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: circuit-breaker-sample-client
      labels:
        app: circuit-breaker-sample-client
        version: v1
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: circuit-breaker-sample-client
          version: v1
      template:
        metadata:
          labels:
            app: circuit-breaker-sample-client
            version: v1
        spec:
          containers:
          - name: circuit-breaker-sample-client
            image: registry.cn-hangzhou.aliyuncs.com/acs/istio-samples:circuit-breaker-sample-client.v1
            imagePullPolicy: Always
            
  2. Run the following command to view the client and server pods:

    kubectl get po |grep circuit  

    Expected output:

    circuit-breaker-sample-client-d4f64d66d-fwrh4   2/2     Running   0             1m22s
    circuit-breaker-sample-server-6d6ddb4b-gcthv    2/2     Running   0             1m22s

If no limits are defined in the destination rule, the server can handle 10 concurrent requests from the client. Therefore, the response code returned by the server is always 200. The following code block shows the logs of the client:

----------Info----------
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.016539812088013
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.012614488601685
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.015984535217285
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.015599012374878
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.012874364852905
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.018714904785156
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.010422468185425
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.012431621551514
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.011001348495483
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.01432466506958

Configure the connectionPool field

To enable circuit breaking for a destination service by using the service mesh technology, you only need to define a corresponding destination rule for the destination service.

Use the following content to create a destination rule for the sample destination service. For more information, see Manage destination rules. This destination rule limits the number of TCP connections to the destination service to 5.

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: circuit-breaker-sample-server
spec:
  host: circuit-breaker-sample-server
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 5

Scenario 1: One client pod and one pod for the destination service

  1. Start the client pod and monitor logs.

    We recommend that you restart the client pod to obtain clearer statistics. You should see logs similar to the following:

    ----------Info----------
    Status: 200, Start: 02:49:40, End: 02:49:45, Elapsed Time: 5.0167787075042725
    Status: 200, Start: 02:49:40, End: 02:49:45, Elapsed Time: 5.011920690536499
    Status: 200, Start: 02:49:40, End: 02:49:45, Elapsed Time: 5.017078161239624
    Status: 200, Start: 02:49:40, End: 02:49:45, Elapsed Time: 5.018405437469482
    Status: 200, Start: 02:49:40, End: 02:49:45, Elapsed Time: 5.018689393997192
    Status: 200, Start: 02:49:40, End: 02:49:50, Elapsed Time: 10.018936395645142
    Status: 200, Start: 02:49:40, End: 02:49:50, Elapsed Time: 10.016417503356934
    Status: 200, Start: 02:49:40, End: 02:49:50, Elapsed Time: 10.019930601119995
    Status: 200, Start: 02:49:40, End: 02:49:50, Elapsed Time: 10.022735834121704
    Status: 200, Start: 02:49:40, End: 02:49:55, Elapsed Time: 15.02303147315979

    The preceding logs show that all the requests succeed, but only five requests in each batch are responded to in about 5 seconds. The remaining requests are responded to in 10 or more seconds. This implies that setting only tcp.maxConnections causes excess requests to be queued while they wait for connections to be freed up. By default, the number of requests that can be queued is 2³² - 1.

  2. Use the following content to update the destination rule to allow only one pending request. For more information, see Manage destination rules.

    To achieve circuit breaking (fail-fast behavior), you must also set http.http1MaxPendingRequests, which limits the number of requests that can be queued. Its default value is 1024. If you set its value to 0, it falls back to the default value. Therefore, you must set it to at least 1.

    apiVersion: networking.istio.io/v1alpha3
    kind: DestinationRule
    metadata:
      name: circuit-breaker-sample-server
    spec:
      host: circuit-breaker-sample-server
      trafficPolicy:
        connectionPool:
          tcp:
            maxConnections: 5
          http:
            http1MaxPendingRequests: 1
  3. Restart the client pod to obtain correct statistics and monitor logs.

    Sample logs:

    ----------Info----------
    Status: 503, Start: 02:56:40, End: 02:56:40, Elapsed Time: 0.005339622497558594
    Status: 503, Start: 02:56:40, End: 02:56:40, Elapsed Time: 0.007254838943481445
    Status: 503, Start: 02:56:40, End: 02:56:40, Elapsed Time: 0.0044133663177490234
    Status: 503, Start: 02:56:40, End: 02:56:40, Elapsed Time: 0.008964776992797852
    Status: 200, Start: 02:56:40, End: 02:56:45, Elapsed Time: 5.018309116363525
    Status: 200, Start: 02:56:40, End: 02:56:45, Elapsed Time: 5.017424821853638
    Status: 200, Start: 02:56:40, End: 02:56:45, Elapsed Time: 5.019804954528809
    Status: 200, Start: 02:56:40, End: 02:56:45, Elapsed Time: 5.01643180847168
    Status: 200, Start: 02:56:40, End: 02:56:45, Elapsed Time: 5.025975227355957
    Status: 200, Start: 02:56:40, End: 02:56:50, Elapsed Time: 10.01716136932373

    The logs indicate that four requests were immediately throttled with 503 responses, five requests were sent to the destination service over the available connections, and one request was queued as the single allowed pending request.

  4. Run the following command to view the number of active connections that the Istio proxy of the client establishes with the pod of the destination service:

    kubectl exec $(kubectl get pod --selector app=circuit-breaker-sample-client --output jsonpath='{.items[0].metadata.name}') -c istio-proxy -- curl -X POST http://localhost:15000/clusters | grep circuit-breaker-sample-server | grep cx_active

    Expected output:

    outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local::172.20.192.124:9080::cx_active::5

    The output indicates that five active connections are established between the Istio proxy of the client and the pod of the destination service.

Scenario 2: One client pod and multiple pods for the destination service

This section verifies whether the connection limit is applied at the pod level or the service level. Assume that one client pod and three pods for the destination service exist.

  • If the connection limit is applied at the pod level, each pod of the destination service has a maximum of five connections.

    In this case, no throttling or queuing should occur because a maximum of 15 connections is allowed (3 pods multiplied by 5 connections per pod). Because only 10 requests are sent at a time, all requests should succeed and be responded to in about 5 seconds.

  • If the connection limit is applied at the service level, no matter how many pods are running for the destination service, a maximum of five connections are allowed in total.

    In this case, the same throttling and queuing as in Scenario 1 occurs: four requests are immediately throttled, five requests are sent to the destination service, and one request is queued.

  1. Run the following command to scale the destination service deployment to three replicas:

    kubectl scale deployment/circuit-breaker-sample-server  --replicas=3
  2. Restart the client pod and monitor logs.

    Sample logs:

    ----------Info----------
    Status: 503, Start: 03:06:20, End: 03:06:20, Elapsed Time: 0.011791706085205078
    Status: 503, Start: 03:06:20, End: 03:06:20, Elapsed Time: 0.0032286643981933594
    Status: 503, Start: 03:06:20, End: 03:06:20, Elapsed Time: 0.012153387069702148
    Status: 503, Start: 03:06:20, End: 03:06:20, Elapsed Time: 0.011871814727783203
    Status: 200, Start: 03:06:20, End: 03:06:25, Elapsed Time: 5.012892484664917
    Status: 200, Start: 03:06:20, End: 03:06:25, Elapsed Time: 5.013102769851685
    Status: 200, Start: 03:06:20, End: 03:06:25, Elapsed Time: 5.016939163208008
    Status: 200, Start: 03:06:20, End: 03:06:25, Elapsed Time: 5.014261484146118
    Status: 200, Start: 03:06:20, End: 03:06:25, Elapsed Time: 5.01246190071106
    Status: 200, Start: 03:06:20, End: 03:06:30, Elapsed Time: 10.021712064743042

    The logs show throttling and queuing similar to that observed in Scenario 1, which means that increasing the number of instances of the destination service does not raise the connection limit for the client. This indicates that the connection limit is applied at the service level.

  3. Run the following command to view the number of active connections that the Istio proxy of the client establishes with the pods of the destination service:

    kubectl exec $(kubectl get pod --selector app=circuit-breaker-sample-client --output jsonpath='{.items[0].metadata.name}') -c istio-proxy -- curl -X POST http://localhost:15000/clusters | grep circuit-breaker-sample-server | grep cx_active

    Expected output:

    outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local::172.20.192.124:9080::cx_active::2
    outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local::172.20.192.158:9080::cx_active::2
    outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local::172.20.192.26:9080::cx_active::2

    The output indicates that the Istio proxy of the client establishes two active connections with each pod of the destination service. A total of six rather than five connections are established. As mentioned in both Envoy and Istio documentation, a proxy allows some leeway in terms of the number of connections.

Scenario 3: Multiple client pods and one pod for the destination service

  1. Run the following commands to adjust the number of replicas for the destination service and the client:

    kubectl scale deployment/circuit-breaker-sample-server --replicas=1 
    kubectl scale deployment/circuit-breaker-sample-client --replicas=3
  2. Restart the client pod and monitor logs.


    Client 1
    
    ----------Info----------
    Status: 503, Start: 03:10:40, End: 03:10:40, Elapsed Time: 0.008828878402709961
    Status: 503, Start: 03:10:40, End: 03:10:40, Elapsed Time: 0.010806798934936523
    Status: 503, Start: 03:10:40, End: 03:10:40, Elapsed Time: 0.012855291366577148
    Status: 503, Start: 03:10:40, End: 03:10:40, Elapsed Time: 0.004465818405151367
    Status: 503, Start: 03:10:40, End: 03:10:40, Elapsed Time: 0.007823944091796875
    Status: 503, Start: 03:10:40, End: 03:10:40, Elapsed Time: 0.06221342086791992
    Status: 503, Start: 03:10:40, End: 03:10:40, Elapsed Time: 0.06922149658203125
    Status: 503, Start: 03:10:40, End: 03:10:40, Elapsed Time: 0.06859922409057617
    Status: 200, Start: 03:10:40, End: 03:10:45, Elapsed Time: 5.015282392501831
    Status: 200, Start: 03:10:40, End: 03:10:50, Elapsed Time: 9.378434181213379
    
    Client 2
    
    ----------Info----------
    Status: 503, Start: 03:11:00, End: 03:11:00, Elapsed Time: 0.007795810699462891
    Status: 503, Start: 03:11:00, End: 03:11:00, Elapsed Time: 0.00595545768737793
    Status: 503, Start: 03:11:00, End: 03:11:00, Elapsed Time: 0.013380765914916992
    Status: 503, Start: 03:11:00, End: 03:11:00, Elapsed Time: 0.004278898239135742
    Status: 503, Start: 03:11:00, End: 03:11:00, Elapsed Time: 0.010999202728271484
    Status: 200, Start: 03:11:00, End: 03:11:05, Elapsed Time: 5.015426874160767
    Status: 200, Start: 03:11:00, End: 03:11:05, Elapsed Time: 5.0184690952301025
    Status: 200, Start: 03:11:00, End: 03:11:05, Elapsed Time: 5.019806146621704
    Status: 200, Start: 03:11:00, End: 03:11:05, Elapsed Time: 5.0175628662109375
    Status: 200, Start: 03:11:00, End: 03:11:05, Elapsed Time: 5.031521558761597
    
    Client 3
    
    ----------Info----------
    Status: 503, Start: 03:13:20, End: 03:13:20, Elapsed Time: 0.012019157409667969
    Status: 503, Start: 03:13:20, End: 03:13:20, Elapsed Time: 0.012546539306640625
    Status: 503, Start: 03:13:20, End: 03:13:20, Elapsed Time: 0.013760805130004883
    Status: 503, Start: 03:13:20, End: 03:13:20, Elapsed Time: 0.014089822769165039
    Status: 503, Start: 03:13:20, End: 03:13:20, Elapsed Time: 0.014792442321777344
    Status: 503, Start: 03:13:20, End: 03:13:20, Elapsed Time: 0.015463829040527344
    Status: 503, Start: 03:13:20, End: 03:13:20, Elapsed Time: 0.01661539077758789
    Status: 200, Start: 03:13:20, End: 03:13:20, Elapsed Time: 0.02904224395751953
    Status: 200, Start: 03:13:20, End: 03:13:20, Elapsed Time: 0.03912043571472168
    Status: 200, Start: 03:13:20, End: 03:13:20, Elapsed Time: 0.06436014175415039

    The logs indicate that the number of 503 errors on each client increases. The system allows only five concurrent requests across all three client pods.

  3. View the logs of the client proxies.


    {"authority":"circuit-breaker-sample-server:9080","bytes_received":"0","bytes_sent":"81","downstream_local_address":"192.168.142.207:9080","downstream_remote_address":"172.20.192.31:44610","duration":"0","istio_policy_status":"-","method":"GET","path":"/hello","protocol":"HTTP/1.1","request_id":"d9d87600-cd01-421f-8a6f-dc0ee0ac8ccd","requested_server_name":"-","response_code":"503","response_flags":"UO","route_name":"default","start_time":"2023-02-28T03:14:00.095Z","trace_id":"-","upstream_cluster":"outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local","upstream_host":"-","upstream_local_address":"-","upstream_service_time":"-","upstream_transport_failure_reason":"-","user_agent":"python-requests/2.21.0","x_forwarded_for":"-"}
    
    {"authority":"circuit-breaker-sample-server:9080","bytes_received":"0","bytes_sent":"81","downstream_local_address":"192.168.142.207:9080","downstream_remote_address":"172.20.192.31:43294","duration":"58","istio_policy_status":"-","method":"GET","path":"/hello","protocol":"HTTP/1.1","request_id":"931d080a-3413-4e35-91f4-0c906e7ee565","requested_server_name":"-","response_code":"503","response_flags":"URX","route_name":"default","start_time":"2023-02-28T03:12:20.995Z","trace_id":"-","upstream_cluster":"outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local","upstream_host":"172.20.192.84:9080","upstream_local_address":"172.20.192.31:58742","upstream_service_time":"57","upstream_transport_failure_reason":"-","user_agent":"python-requests/2.21.0","x_forwarded_for":"-"}
    

    You can see two different types of logs for the requests that were throttled. The logs indicate that the RESPONSE_FLAGS field has two values: UO and URX.

    • UO: indicates upstream overflow (circuit breaking).

    • URX: indicates that the request was rejected because the upstream HTTP retry limit or the maximum number of TCP connection attempts was reached.

    Requests with the UO flag are throttled locally by the client proxies, and requests with the URX flag are rejected by the destination service proxy. The values of other fields in the logs, such as DURATION, UPSTREAM_HOST, and UPSTREAM_CLUSTER, also corroborate the preceding conclusion.

  4. Check the logs of the destination service proxy.


    {"authority":"circuit-breaker-sample-server:9080","bytes_received":"0","bytes_sent":"81","downstream_local_address":"172.20.192.84:9080","downstream_remote_address":"172.20.192.31:59510","duration":"0","istio_policy_status":"-","method":"GET","path":"/hello","protocol":"HTTP/1.1","request_id":"7684cbb0-8f1c-44bf-b591-40c3deff6b0b","requested_server_name":"outbound_.9080_._.circuit-breaker-sample-server.default.svc.cluster.local","response_code":"503","response_flags":"UO","route_name":"default","start_time":"2023-02-28T03:14:00.095Z","trace_id":"-","upstream_cluster":"inbound|9080||","upstream_host":"-","upstream_local_address":"-","upstream_service_time":"-","upstream_transport_failure_reason":"-","user_agent":"python-requests/2.21.0","x_forwarded_for":"-"}
    {"authority":"circuit-breaker-sample-server:9080","bytes_received":"0","bytes_sent":"81","downstream_local_address":"172.20.192.84:9080","downstream_remote_address":"172.20.192.31:58218","duration":"0","istio_policy_status":"-","method":"GET","path":"/hello","protocol":"HTTP/1.1","request_id":"2aa351fa-349d-4283-a5ea-dc74ecbdff8c","requested_server_name":"outbound_.9080_._.circuit-breaker-sample-server.default.svc.cluster.local","response_code":"503","response_flags":"UO","route_name":"default","start_time":"2023-02-28T03:12:20.996Z","trace_id":"-","upstream_cluster":"inbound|9080||","upstream_host":"-","upstream_local_address":"-","upstream_service_time":"-","upstream_transport_failure_reason":"-","user_agent":"python-requests/2.21.0","x_forwarded_for":"-"}

    The destination service proxy also returns 503 responses with the UO response flag because it throttles excess inbound requests. This explains why the logs of the client proxies contain "response_code":"503" and "response_flags":"URX".

In summary, each client proxy enforces the limit of five connections to the destination service and throttles or queues excess requests locally, marking them with the UO response flag. The three client proxies can therefore send up to 15 requests in parallel at the start of a batch. However, only five of them succeed because the destination service proxy also limits the number of inbound connections to five: it accepts five requests and throttles the rest, which appear in the logs of the client proxies with the URX response flag.

The following figure shows how requests are sent from multiple client pods to a single destination service pod in the preceding scenario:

(Figure: request flow from multiple client pods to a single destination service pod)

Scenario 4: Multiple pods for both the client and the destination service

When you increase the number of replicas of the destination service, the overall success rate of requests rises because each destination service proxy allows five parallel requests. In this way, throttling can be observed on both the client proxies and the destination service proxies.

  1. Run the following command to increase the number of replicas of the destination service to 2:

    kubectl scale deployment/circuit-breaker-sample-server --replicas=2

    After the number of replicas of the destination service is increased to 2, you should see that 10 requests are successful out of the 30 requests generated by all 3 client proxies in a batch.

  2. Run the following command to increase the number of replicas of the destination service to 3:

    kubectl scale deployment/circuit-breaker-sample-server --replicas=3

    After the number of replicas of the destination service is increased to 3, you should see that 15 requests are successful.

  3. Run the following command to increase the number of replicas of the destination service to 4:

    kubectl scale deployment/circuit-breaker-sample-server --replicas=4

    After the number of replicas of the destination service is increased to 4, you still see only 15 successful requests. This is because the limit on client proxies applies to the entire destination service rather than to individual replicas. Therefore, regardless of how many replicas the destination service has, each client proxy can send a maximum of five concurrent requests to the destination service.

Summary

| Role | Description |
| --- | --- |
| Client | Each client proxy enforces the limit independently. If the limit on the number of requests is 100, each client proxy can have 100 outstanding requests before local throttling is applied. If N clients call the destination service, the maximum number of outstanding requests that are supported is 100 × N. The limit on a client proxy applies to the entire destination service, not to a single replica of the destination service. Even if the destination service runs in 200 active pods, each client proxy allows a maximum of 100 outstanding requests. |
| Destination service | The limit applies to each destination service proxy. If the service runs in 50 active pods, each pod can have a maximum of 100 outstanding requests sent from client proxies before the proxy performs throttling and returns 503. |
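
For reference, the following sketch restates the destination rule used in this topic and annotates, based on the preceding scenarios, where each limit is enforced. The values are the ones used in this example, not recommendations.

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: circuit-breaker-sample-server
spec:
  host: circuit-breaker-sample-server
  trafficPolicy:
    connectionPool:
      tcp:
        # Enforced independently by each client proxy for the entire destination service,
        # and also by each destination service proxy for its own pod.
        maxConnections: 5
      http:
        # Requests beyond maxConnections wait in this queue; any further excess
        # fails fast with a 503 response and the UO response flag.
        http1MaxPendingRequests: 1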