When microservices experience overload or partial failures, uncontrolled traffic can cascade across the system and bring down healthy services. Circuit breaking at the network level -- through Service Mesh (ASM) sidecar proxies -- rejects excess traffic before it reaches the backend, without requiring code changes in each service. Traditional approaches such as Resilience4j require embedding circuit breaking logic directly into application code.
Configure the connectionPool field in a DestinationRule to cap concurrent connections and pending requests to a destination service. The sections below cover the parameters, demonstrate behavior across four pod scaling topologies, and explain how to monitor circuit breaking in production.
Prerequisites
Before you begin, make sure that you have:
A Container Service for Kubernetes (ACK) cluster added to your Service Mesh (ASM) instance (see Add a cluster to an ASM instance)
How connection pool circuit breaking works
Create a DestinationRule with connectionPool settings to enable circuit breaking for a target service. For the full field reference, see Destination Rule.
Three parameters control connection pool behavior:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| tcp.maxConnections | int32 | No | 2^32-1 | Maximum HTTP/1.1 or TCP connections to a destination host. Enforced on sidecar proxies on both the client and server sides. A single client pod cannot open more than this number of connections, and a single server pod cannot accept more. Effective server-side capacity: min(client pods, server pods) x maxConnections. |
| http.http1MaxPendingRequests | int32 | No | 1024 | Maximum requests queued while waiting for an available connection. Setting this to 0 falls back to the default (1024), so use a value of at least 1. |
| http.http2MaxRequests | int32 | No | 1024 | Maximum active requests to a backend. |
The connectionPool field limits connections and queued requests but does not eject unhealthy hosts from the load balancing pool. For host ejection based on error rates, combine connectionPool with outlierDetection in the same DestinationRule. See Destination Rule for outlierDetection fields.
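For reference, a DestinationRule that combines both fields might look like the following. This is a sketch for the sample service in this topic; the outlierDetection values are illustrative, not tuning recommendations.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: circuit-breaker-sample-server
spec:
  host: circuit-breaker-sample-server
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 5
      http:
        http1MaxPendingRequests: 1
    outlierDetection:
      consecutive5xxErrors: 3      # eject a host after 3 consecutive 5xx responses
      interval: 10s                # analysis sweep interval
      baseEjectionTime: 30s        # minimum time an ejected host stays out of the pool
      maxEjectionPercent: 50       # never eject more than half of the hosts
```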
With a single client and a single server pod, these parameters behave predictably. In production, services typically run multiple pods. The following scenarios show how circuit breaking behaves across four common topologies:
One client pod, one destination service pod
One client pod, multiple destination service pods
Multiple client pods, one destination service pod
Multiple client pods, multiple destination service pods
Deploy the sample applications
The sample setup has two components:
Server: A Flask application listening on port 9080 at the /hello endpoint. Each request takes 5 seconds to process (simulating a slow backend).

Client: A Python script that sends 10 parallel requests per batch. Batches fire at the 0th, 20th, and 40th second of each minute so that multiple client pods send requests simultaneously.
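The client's batching logic can be sketched as follows. This is a minimal illustration, not the exact sample script; the service URL and output format are assumptions.

```python
import threading
import time
import urllib.error
import urllib.request

# Assumed values for illustration; the real sample script may differ.
SERVER_URL = "http://circuit-breaker-sample-server:9080/hello"
BATCH_SECONDS = (0, 20, 40)   # fire a batch at these seconds of every minute
PARALLEL_REQUESTS = 10

def seconds_until_next_batch(second_of_minute: int) -> int:
    """Sleep time so that the next batch starts at second 0, 20, or 40."""
    for s in BATCH_SECONDS:
        if second_of_minute < s:
            return s - second_of_minute
    return 60 - second_of_minute  # wrap around to second 0 of the next minute

def send_request() -> None:
    start = time.time()
    try:
        with urllib.request.urlopen(SERVER_URL) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:  # throttled requests surface as 503
        status = err.code
    print(f"Status: {status}, Elapsed Time: {time.time() - start}")

def run_batch() -> None:
    """Send PARALLEL_REQUESTS requests concurrently, one thread each."""
    threads = [threading.Thread(target=send_request) for _ in range(PARALLEL_REQUESTS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

def main() -> None:  # not invoked here; loops forever when run as the client
    while True:
        time.sleep(seconds_until_next_batch(time.localtime().tm_sec))
        run_batch()
```

Synchronizing batches to fixed seconds of the minute matters in the multi-client scenarios later: it guarantees the client pods hit the destination service at the same instant.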
Save the following YAML and run kubectl apply -f <file-name>.yaml to deploy the sample applications.

Verify that the pods are running:

```shell
kubectl get po | grep circuit
```

Expected output:

```
circuit-breaker-sample-client-d4f64d66d-fwrh4   2/2   Running   0   1m22s
circuit-breaker-sample-server-6d6ddb4b-gcthv    2/2   Running   0   1m22s
```
Without a DestinationRule, the server handles all 10 concurrent requests and every response returns 200:
```
----------Info----------
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.016539812088013
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.012614488601685
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.015984535217285
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.015599012374878
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.012874364852905
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.018714904785156
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.010422468185425
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.012431621551514
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.011001348495483
Status: 200, Start: 02:39:20, End: 02:39:25, Elapsed Time: 5.01432466506958
```

Create a DestinationRule for circuit breaking
Define a DestinationRule for the destination service to enable circuit breaking. For more information, see Manage destination rules.
The following rule limits TCP connections to 5:
```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: circuit-breaker-sample-server
spec:
  host: circuit-breaker-sample-server
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 5
```

Scenario 1: One client pod, one destination service pod
Restart the client pod and check its logs. All 10 requests succeed, but only 5 finish in approximately 5 seconds. The rest wait 10 or more seconds because they queue until a connection frees up. With only tcp.maxConnections set, excess requests queue rather than fail -- the default queue depth is 2^32 - 1.

```
----------Info----------
Status: 200, Start: 02:49:40, End: 02:49:45, Elapsed Time: 5.0167787075042725
Status: 200, Start: 02:49:40, End: 02:49:45, Elapsed Time: 5.011920690536499
Status: 200, Start: 02:49:40, End: 02:49:45, Elapsed Time: 5.017078161239624
Status: 200, Start: 02:49:40, End: 02:49:45, Elapsed Time: 5.018405437469482
Status: 200, Start: 02:49:40, End: 02:49:45, Elapsed Time: 5.018689393997192
Status: 200, Start: 02:49:40, End: 02:49:50, Elapsed Time: 10.018936395645142
Status: 200, Start: 02:49:40, End: 02:49:50, Elapsed Time: 10.016417503356934
Status: 200, Start: 02:49:40, End: 02:49:50, Elapsed Time: 10.019930601119995
Status: 200, Start: 02:49:40, End: 02:49:50, Elapsed Time: 10.022735834121704
Status: 200, Start: 02:49:40, End: 02:49:55, Elapsed Time: 15.02303147315979
```

For fail-fast circuit breaking, also limit http.http1MaxPendingRequests. Update the DestinationRule. For more information, see Manage destination rules.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: circuit-breaker-sample-server
spec:
  host: circuit-breaker-sample-server
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 5
      http:
        http1MaxPendingRequests: 1
```

Restart the client pod and check its logs. Four requests are immediately rejected (503), five reach the destination, and one is queued (completing in approximately 10 seconds after waiting for a free connection).
```
----------Info----------
Status: 503, Start: 02:56:40, End: 02:56:40, Elapsed Time: 0.005339622497558594
Status: 503, Start: 02:56:40, End: 02:56:40, Elapsed Time: 0.007254838943481445
Status: 503, Start: 02:56:40, End: 02:56:40, Elapsed Time: 0.0044133663177490234
Status: 503, Start: 02:56:40, End: 02:56:40, Elapsed Time: 0.008964776992797852
Status: 200, Start: 02:56:40, End: 02:56:45, Elapsed Time: 5.018309116363525
Status: 200, Start: 02:56:40, End: 02:56:45, Elapsed Time: 5.017424821853638
Status: 200, Start: 02:56:40, End: 02:56:45, Elapsed Time: 5.019804954528809
Status: 200, Start: 02:56:40, End: 02:56:45, Elapsed Time: 5.01643180847168
Status: 200, Start: 02:56:40, End: 02:56:45, Elapsed Time: 5.025975227355957
Status: 200, Start: 02:56:40, End: 02:56:50, Elapsed Time: 10.01716136932373
```

Verify the active connection count from the client's sidecar proxy:

```shell
kubectl exec $(kubectl get pod --selector app=circuit-breaker-sample-client --output jsonpath='{.items[0].metadata.name}') -c istio-proxy -- curl -X POST http://localhost:15000/clusters | grep circuit-breaker-sample-server | grep cx_active
```

Expected output: five active connections from the client proxy to the destination pod, matching the maxConnections limit.

```
outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local::172.20.192.124:9080::cx_active::5
```
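The split in Scenario 1 (5 served, 1 queued, 4 rejected) follows directly from the two limits. The following helper is an illustrative sketch of that arithmetic, not part of the sample code:

```python
def classify_batch(total: int, max_connections: int, max_pending: int) -> tuple[int, int, int]:
    """Model how the connection pool splits a burst of concurrent requests.

    Returns (served_immediately, queued, rejected_503).
    """
    served = min(total, max_connections)        # requests that get a connection right away
    queued = min(total - served, max_pending)   # requests parked in the pending queue
    rejected = total - served - queued          # overflow fails fast with 503
    return served, queued, rejected

# 10 concurrent requests, maxConnections=5, http1MaxPendingRequests=1:
print(classify_batch(10, 5, 1))            # (5, 1, 4)
# Same burst with only maxConnections set (queue depth effectively unbounded):
print(classify_batch(10, 5, 2**32 - 1))    # (5, 5, 0) -- everything eventually succeeds
```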
Scenario 2: One client pod, multiple destination service pods
This scenario tests whether the connection limit applies per pod or per service. With one client and three destination pods:
Per-pod limit: Each pod allows 5 connections, totaling 15. All 10 requests should succeed in approximately 5 seconds.
Per-service limit: Only 5 connections total, regardless of pod count. Throttling behavior matches Scenario 1.
Scale the destination service to three replicas.
```shell
kubectl scale deployment/circuit-breaker-sample-server --replicas=3
```

Restart the client pod and check its logs. The throttling pattern is identical to Scenario 1. Adding more destination pods does not increase the client's connection limit. The connection limit applies per service, not per pod.
```
----------Info----------
Status: 503, Start: 03:06:20, End: 03:06:20, Elapsed Time: 0.011791706085205078
Status: 503, Start: 03:06:20, End: 03:06:20, Elapsed Time: 0.0032286643981933594
Status: 503, Start: 03:06:20, End: 03:06:20, Elapsed Time: 0.012153387069702148
Status: 503, Start: 03:06:20, End: 03:06:20, Elapsed Time: 0.011871814727783203
Status: 200, Start: 03:06:20, End: 03:06:25, Elapsed Time: 5.012892484664917
Status: 200, Start: 03:06:20, End: 03:06:25, Elapsed Time: 5.013102769851685
Status: 200, Start: 03:06:20, End: 03:06:25, Elapsed Time: 5.016939163208008
Status: 200, Start: 03:06:20, End: 03:06:25, Elapsed Time: 5.014261484146118
Status: 200, Start: 03:06:20, End: 03:06:25, Elapsed Time: 5.01246190071106
Status: 200, Start: 03:06:20, End: 03:06:30, Elapsed Time: 10.021712064743042
```

Verify the active connection distribution:

```shell
kubectl exec $(kubectl get pod --selector app=circuit-breaker-sample-client --output jsonpath='{.items[0].metadata.name}') -c istio-proxy -- curl -X POST http://localhost:15000/clusters | grep circuit-breaker-sample-server | grep cx_active
```

Expected output: the proxy distributes connections across pods, two per pod for six in total rather than five. As noted in both the Envoy and Istio documentation, the proxy allows some leeway in the number of connections.

```
outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local::172.20.192.124:9080::cx_active::2
outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local::172.20.192.158:9080::cx_active::2
outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local::172.20.192.26:9080::cx_active::2
```
Scenario 3: Multiple client pods, one destination service pod
Adjust replicas: scale the server to 1 and the client to 3.
```shell
kubectl scale deployment/circuit-breaker-sample-server --replicas=1
kubectl scale deployment/circuit-breaker-sample-client --replicas=3
```

Restart the client pods and check their logs.
The 503 error rate increases on each client. Each client proxy enforces its own 5-connection limit independently, but the single destination service proxy also enforces a 5-connection limit. Only 5 requests from across all clients can succeed concurrently.
Check the client proxy logs for response flags.
Throttled requests return a 503 with one of two response flags:
| Flag | Meaning | Where it happens |
|---|---|---|
| UO | Upstream overflow (circuit breaking) | The client proxy throttles the request locally. |
| URX | Upstream retry/connection limit exceeded | The destination service proxy rejects the request. |

Distinguish the two by examining DURATION, UPSTREAM_HOST, and UPSTREAM_CLUSTER in the access log. UO requests have no upstream host (throttled before sending), while URX requests reached the destination proxy and were rejected there.

Confirm by checking the destination service proxy logs.
The destination service proxy also returns 503 with the UO flag. This confirms that URX entries in the client proxy logs originate from the destination service proxy rejecting excess connections.
Request flow summary:
Each client proxy enforces a 5-connection limit independently. With 3 clients, up to 15 requests can leave the client proxies in parallel. However, the single destination service proxy also enforces a 5-connection limit, so it accepts only 5 and rejects the rest. The rejected requests appear as URX in the client proxy logs.
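This two-stage flow can be modeled numerically. The sketch below is a simplified illustration (the helper name is made up, and the per-client pending queue is ignored):

```python
def two_level_throttle(clients: int, requests_per_client: int,
                       client_limit: int, server_limit: int) -> tuple[int, int, int]:
    """Model Scenario 3: per-client proxy limits, then a single server proxy limit.

    Returns (served, rejected_uo, rejected_urx).
    """
    # UO: each client proxy throttles locally beyond its own connection limit.
    leaving_clients = min(requests_per_client, client_limit) * clients
    rejected_uo = clients * requests_per_client - leaving_clients
    # URX: the lone destination proxy accepts only server_limit concurrent connections.
    served = min(leaving_clients, server_limit)
    rejected_urx = leaving_clients - served
    return served, rejected_uo, rejected_urx

# 3 client pods x 10 requests each, limit 5 on both sides:
print(two_level_throttle(3, 10, 5, 5))  # (5, 15, 10): 5 served, 15 UO, 10 URX
```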
Scenario 4: Multiple client pods, multiple destination service pods
Scaling the destination service increases the overall success rate because each destination pod's proxy independently allows 5 connections.
Set the server to 2 replicas and the client to 3. With 2 destination pods (each accepting 5), 10 out of 30 total requests (from 3 clients) succeed per batch.
```shell
kubectl scale deployment/circuit-breaker-sample-server --replicas=2
kubectl scale deployment/circuit-breaker-sample-client --replicas=3
```

Scale the server to 3 replicas. 15 requests succeed per batch.
```shell
kubectl scale deployment/circuit-breaker-sample-server --replicas=3
```

Scale the server to 4 replicas. Still only 15 requests succeed. The client proxy limit caps at 5 per client regardless of how many destination pods are available. With 3 clients, the maximum is 3 x 5 = 15 successful concurrent requests.
```shell
kubectl scale deployment/circuit-breaker-sample-server --replicas=4
```
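The scaling results above reduce to a simple bound: the client side can originate at most client_pods x maxConnections connections, and the server side can accept at most server_pods x maxConnections, so the smaller bound wins. A sketch (the helper name is illustrative):

```python
def max_concurrent_successes(client_pods: int, server_pods: int, max_connections: int) -> int:
    """Upper bound on concurrent successful requests across the mesh.

    Client proxies cap outbound at client_pods * max_connections;
    server proxies cap inbound at server_pods * max_connections.
    """
    return min(client_pods * max_connections, server_pods * max_connections)

# 3 client pods, maxConnections=5, varying server replicas:
for servers in (2, 3, 4):
    print(servers, max_concurrent_successes(3, servers, 5))  # 2 -> 10, 3 -> 15, 4 -> 15
```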
Client and server constraint summary
| Role | How the limit applies |
|---|---|
| Client | Each client proxy enforces the limit independently. If maxConnections is 100 and there are N client pods, up to N x 100 requests can be in flight across all clients. The limit applies to the entire destination service, not to individual destination pods. Even with 200 destination pods, a single client proxy caps at 100 connections. |
| Destination service | Each destination pod's proxy enforces the limit independently. With 50 active pods and maxConnections set to 100, each pod accepts up to 100 connections from client proxies before returning 503. |
Monitor circuit breaking metrics
When circuit breaking activates, Envoy generates metrics for detecting and diagnosing throttling.
| Metric | Type | Description |
|---|---|---|
| envoy_cluster_circuit_breakers_default_cx_open | Gauge | 1 if the connection pool circuit breaker is open (active); 0 otherwise. |
| envoy_cluster_circuit_breakers_default_rq_pending_open | Gauge | 1 if the pending request queue has exceeded its limit; 0 otherwise. |
Enable circuit breaking metrics
Configure proxyStatsMatcher for the sidecar proxy. Select Regular Expression Match and set the value to .*circuit_breaker.*. For more information, see proxyStatsMatcher.

Redeploy the circuit-breaker-sample-server and circuit-breaker-sample-client Deployments. For more information, see Redeploy workloads.

Re-run the circuit breaking test from the preceding scenarios.
Query the metrics from the client proxy:

```shell
kubectl exec -it deploy/circuit-breaker-sample-client -c istio-proxy -- curl localhost:15090/stats/prometheus | grep circuit_breaker | grep circuit-breaker-sample-server
```

Expected output:

```
envoy_cluster_circuit_breakers_default_cx_open{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 1
envoy_cluster_circuit_breakers_default_cx_pool_open{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 0
envoy_cluster_circuit_breakers_default_remaining_cx{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 0
envoy_cluster_circuit_breakers_default_remaining_cx_pools{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 18446744073709551613
envoy_cluster_circuit_breakers_default_remaining_pending{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 1
envoy_cluster_circuit_breakers_default_remaining_retries{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 4294967295
envoy_cluster_circuit_breakers_default_remaining_rq{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 4294967295
envoy_cluster_circuit_breakers_default_rq_open{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 0
envoy_cluster_circuit_breakers_default_rq_pending_open{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 0
envoy_cluster_circuit_breakers_default_rq_retry_open{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 0
envoy_cluster_circuit_breakers_high_cx_open{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 0
envoy_cluster_circuit_breakers_high_cx_pool_open{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 0
envoy_cluster_circuit_breakers_high_rq_open{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 0
envoy_cluster_circuit_breakers_high_rq_pending_open{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 0
envoy_cluster_circuit_breakers_high_rq_retry_open{cluster_name="outbound|9080||circuit-breaker-sample-server.default.svc.cluster.local"} 0
```
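As an alternative to the console setting, the same stats inclusion can be declared per workload through the proxy.istio.io/config pod annotation. The fragment below is a sketch following the Istio ProxyConfig API, assuming standard sidecar injection; apply it to both sample Deployments:

```yaml
spec:
  template:
    metadata:
      annotations:
        proxy.istio.io/config: |-
          proxyStatsMatcher:
            inclusionRegexps:
            - ".*circuit_breaker.*"
```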
Set up alerts for circuit breaking
Use Managed Service for Prometheus to collect circuit breaking metrics and set up alert rules. For component integration details, see Manage components.
If you already use a self-managed Prometheus instance to collect ASM metrics (see Monitor ASM instances by using a self-managed Prometheus instance), skip step 1.
In Managed Service for Prometheus, connect the data plane cluster to the Alibaba Cloud ASM component or upgrade it to the latest version.
Create an alert rule with a custom PromQL statement. For more information, see Use a custom PromQL statement to create an alert rule. Use the following parameters as a reference:
| Parameter | Example | Description |
|---|---|---|
| Custom PromQL statements | (sum by(cluster_name, pod_name, namespace) (envoy_cluster_circuit_breakers_default_cx_open)) != 0 | Checks whether circuit breaking is active in any connection pool. Groups by upstream service name, pod, and namespace so that you can pinpoint where throttling occurs. |
| Alert message | Circuit breaking is active. The TCP connection limit has been reached. Namespace: {{$labels.namespace}}, Pod: {{$labels.pod_name}}, Upstream service: {{$labels.cluster_name}} | Identifies the affected pod, its namespace, and the upstream service. |