When services in your mesh experience overload or cascading failures, you need visibility into how traffic protection mechanisms respond. Service Mesh (ASM) provides non-intrusive circuit breaking and throttling. This topic describes how to set up Grafana dashboards and Prometheus alert rules to monitor these protections in real time.
When to use each mechanism
ASM supports three traffic protection mechanisms, each serving a different purpose:
| Mechanism | Scope | How it works | Best for |
|---|---|---|---|
| Circuit breaking | Per-proxy (east-west traffic) | Each ASM proxy tracks failure rates and response timeouts for upstream services. When a threshold is reached, the proxy rejects further requests. | Preventing cascading failures between services |
| Global throttling | Cluster-wide (centralized) | A central Envoy rate limit service enforces request rate limits across the entire ASM instance based on predefined rules and quotas. | Controlling request rates to shared backend services |
| Local throttling | Per-proxy (decentralized) | Each Envoy proxy uses a token bucket algorithm to limit request rates independently. Tokens refill at a constant rate; requests are denied when the bucket is empty. | Protecting individual service instances from burst traffic |
Global and local throttling can be combined to provide layered rate limiting at different granularities.
Metrics reference
The following Prometheus metrics power the dashboards and alert rules described in this topic.
Circuit breaking metrics
| Metric | Description |
|---|---|
| istio_requests_total | Total requests received by the proxy |
| envoy_asm_circuit_breaker_total_broken_requests | Requests rejected by circuit breaking |
Global throttling metrics
| Metric | Description |
|---|---|
| envoy_cluster_ratelimit_ok | Requests that passed the global rate limit check |
| envoy_cluster_ratelimit_over_limit | Requests rejected by the global rate limit service |
Local throttling metrics
| Metric | Description |
|---|---|
| envoy_http_local_rate_limiter_http_local_rate_limit_enabled | Total requests evaluated by the local rate limiter |
| envoy_http_local_rate_limiter_http_local_rate_limit_ok | Requests that passed the local rate limit check |
| envoy_http_local_rate_limiter_http_local_rate_limit_enforced | Requests rejected by the local rate limiter |
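All of these metrics are monotonically increasing counters, so they combine naturally into rates and ratios. For example, a sketch of the local-throttling rejection ratio over the past five minutes (grouping labels omitted for brevity; add them as needed):

```promql
# Fraction of evaluated requests rejected by the local rate limiter
# over the past 5 minutes.
  sum(rate(envoy_http_local_rate_limiter_http_local_rate_limit_enforced[5m]))
/ sum(rate(envoy_http_local_rate_limiter_http_local_rate_limit_enabled[5m]))
```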
Circuit breaking
Circuit breaking is an overload protection mechanism that prevents cascading failures caused by traffic bursts. In east-west traffic between cloud-native services, a single failing service, whether it responds slowly or fails more often, can propagate failures across the entire call chain.
You can configure circuit breaking rules to reject requests to an upstream service when its failure rate or number of response timeouts reaches the configured threshold. This protects the upstream service and prevents a single fault from propagating along the call chain and bringing down the entire system.
After you configure a circuit breaking rule, each ASM proxy independently tracks the failure rate and response timeout count based on the requests it handles. Because each proxy calculates independently, the exact moment circuit breaking activates may differ slightly across proxies for the same upstream service.
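To observe this per-proxy behavior, you can break the rejection counter down by proxy instance. For example (pod is an assumed label here; substitute whatever instance label your Prometheus setup attaches):

```promql
# Circuit-broken requests per proxy over the past minute.
sum by (pod) (increase(envoy_asm_circuit_breaker_total_broken_requests[1m]))
```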
Dashboard
The circuit breaking dashboard includes the following panels:

| Panel | Description |
|---|---|
| Requests | Time-series chart showing three metrics: total requests (total_requests), circuit-broken requests (throttled_requests), and successful requests (ok_requests). |
| Requests Total | Total request count over the selected time range. |
| Requests OK | Total successful requests (not circuit-broken) over the selected time range. |
| Requests Throttled | Total requests rejected by circuit breaking over the selected time range. |
| Requests OK Percent | Ratio of successful requests to total requests, displayed as a gauge. Turns red below 90%. |
| Requests Throttled Percent | Ratio of circuit-broken requests to total requests, displayed as a gauge. Turns red above 10%. |
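The gauge panels are ratios of the two counters listed in the metrics reference. A plausible reconstruction of the Requests OK Percent query (a sketch, not necessarily the dashboard's exact expression; $__range is Grafana's time-range variable) is:

```promql
# Share of requests NOT rejected by circuit breaking over the
# dashboard's selected time range.
100 * (
    sum(increase(istio_requests_total[$__range]))
  - sum(increase(envoy_asm_circuit_breaker_total_broken_requests[$__range]))
)
/ sum(increase(istio_requests_total[$__range]))
```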
Import the following JSON into Grafana to create this dashboard. The JSON is also available on GitHub.
Alert rule
The following example fires an alert when more than 10 requests are rejected by circuit breaking within one minute, grouped by namespace and service.
| Configuration | Example | Description |
|---|---|---|
| Custom PromQL statement | (sum by(cluster, namespace) (increase(envoy_asm_circuit_breaker_total_broken_requests[1m]))) > 10 | Counts circuit-broken requests in the past minute, grouped by the namespace and the service that triggered circuit breaking. Fires when the count exceeds 10. |
| Alert message | Service-level circuit breaking occurred! Namespace: {{$labels.namespace}}, Service that triggers circuit breaking: {{$labels.cluster}}. Number of requests rejected by circuit breaking in the past minute: {{$value}} | Displays the namespace, the triggering service, and the number of rejected requests in the past minute. |
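If you manage alerts as Prometheus rule files rather than through the console, the same condition can be expressed roughly as follows (the group name, alert name, and severity label are invented for this sketch):

```yaml
groups:
  - name: asm-circuit-breaking        # illustrative group name
    rules:
      - alert: ServiceCircuitBreaking # illustrative alert name
        expr: |
          (sum by (cluster, namespace) (increase(envoy_asm_circuit_breaker_total_broken_requests[1m]))) > 10
        labels:
          severity: warning
        annotations:
          summary: >-
            Service-level circuit breaking occurred!
            Namespace: {{ $labels.namespace }},
            Service that triggers circuit breaking: {{ $labels.cluster }}.
            Requests rejected in the past minute: {{ $value }}
```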
Global throttling
Global throttling controls request rates across an entire ASM instance. It relies on the Envoy rate limit service, which centrally processes traffic and enforces rate limits based on predefined rules and quotas.
Global throttling can be combined with local throttling to provide layered rate limiting at different granularities.
Dashboard
The global throttling dashboard includes the following panels:

| Panel | Description |
|---|---|
| Requests | Time-series chart showing three metrics: total requests (total_requests), requests that passed the rate limit check (unlimited_requests), and requests rejected by throttling (limited_requests). |
| Requests Total | Total request count over the selected time range. |
| Requests OK | Total requests that passed the rate limit check over the selected time range. |
| Requests Reached Limits | Total requests rejected by throttling over the selected time range. |
| Requests OK Percent | Ratio of successful requests to total requests, displayed as a gauge. Turns red below 90%. |
| Requests Reached Limits Percent | Ratio of throttled requests to total requests, displayed as a gauge. Turns red above 10%. |
Import the following JSON into Grafana to create this dashboard. The JSON is also available on GitHub.
Alert rule
The following example fires an alert when more than 10 requests are rejected by global throttling within one minute, grouped by namespace and service.
| Configuration | Example | Description |
|---|---|---|
| Custom PromQL statement | sum (increase(envoy_cluster_ratelimit_over_limit[1m])) by (namespace, service_istio_io_canonical_name) > 10 | Counts requests rejected by the global rate limit service in the past minute, grouped by namespace and service name. Fires when the count exceeds 10. |
| Alert message | Throttling triggered! Namespace: {{$labels.namespace}}, Service that triggers throttling: {{$labels.service_istio_io_canonical_name}}. Number of requests rejected by throttling in the past minute: {{$value}} | Displays the namespace, the triggering service, and the number of rejected requests in the past minute. |
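The example above alerts on an absolute count. For services with highly variable traffic, a ratio-based variant may be more robust; a sketch (the 10% threshold is illustrative):

```promql
# Fire when more than 10% of rate-limit-checked requests were rejected
# in the past minute, per namespace and service.
  sum by (namespace, service_istio_io_canonical_name) (increase(envoy_cluster_ratelimit_over_limit[1m]))
/ (
    sum by (namespace, service_istio_io_canonical_name) (increase(envoy_cluster_ratelimit_ok[1m]))
  + sum by (namespace, service_istio_io_canonical_name) (increase(envoy_cluster_ratelimit_over_limit[1m]))
  )
> 0.1
```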
Local throttling
Local throttling runs on each Envoy proxy independently, using a token bucket algorithm. Tokens refill at a constant rate. Each incoming request consumes one token; when the bucket is empty, the proxy denies further requests.
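As a rough illustration of the mechanism (not Envoy's actual implementation), the token bucket can be sketched in a few lines of Python; the class and parameter names are invented for this example:

```python
import time


class TokenBucket:
    """Minimal sketch of a token bucket, as used conceptually by Envoy's
    local rate limiter. Names and structure are illustrative only."""

    def __init__(self, max_tokens, fill_rate, now=time.monotonic):
        self.max_tokens = max_tokens  # bucket capacity (maximum burst size)
        self.fill_rate = fill_rate    # tokens added per second
        self.tokens = max_tokens      # the bucket starts full
        self.now = now                # injectable clock, useful for testing
        self.last_fill = now()

    def allow(self):
        """Consume one token if available; otherwise deny the request."""
        current = self.now()
        # Refill at a constant rate, capped at the bucket capacity.
        self.tokens = min(self.max_tokens,
                          self.tokens + (current - self.last_fill) * self.fill_rate)
        self.last_fill = current
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With a capacity of 2 and a fill rate of 1 token per second, the third back-to-back request is denied, and one more request is admitted after a one-second pause.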
Local throttling can be combined with global throttling to provide layered rate limiting at different granularities.
Dashboard
The local throttling dashboard includes the following panels:

| Panel | Description |
|---|---|
| Requests | Time-series chart showing three metrics: total requests evaluated by the rate limiter (total_requests), requests that passed (unlimited_requests), and requests rejected (limited_requests). |
| Requests Total | Total request count evaluated by the local rate limiter over the selected time range. |
| Requests OK | Total requests that passed the local rate limit check over the selected time range. |
| Requests Reached Limits | Total requests rejected by local throttling over the selected time range. |
| Requests OK Percent | Ratio of successful requests to total requests, displayed as a gauge. Turns red below 90%. |
| Requests Reached Limits Percent | Ratio of throttled requests to total requests, displayed as a gauge. Turns red above 10%. |
Import the following JSON into Grafana to create this dashboard. The JSON is also available on GitHub.
Alert rule
The following example fires an alert when more than 10 requests are rejected by local throttling within one minute, grouped by namespace and service.
| Configuration | Example | Description |
|---|---|---|
| Custom PromQL statement | sum (increase(envoy_http_local_rate_limiter_http_local_rate_limit_enforced[1m])) by (namespace, service_istio_io_canonical_name) > 10 | Counts requests rejected by local throttling in the past minute, grouped by namespace and service name. Fires when the count exceeds 10. |
| Alert message | Throttling triggered! Namespace: {{$labels.namespace}}, Service that triggers throttling: {{$labels.service_istio_io_canonical_name}}. Number of requests rejected by throttling in the past minute: {{$value}} | Displays the namespace, the triggering service, and the number of rejected requests in the past minute. |
Import a dashboard into Grafana
Import any of the dashboard JSON files above into Grafana through Application Real-Time Monitoring Service (ARMS):
1. Log on to the ARMS console.
2. In the left navigation pane, click Integration Management.
3. On the Integrated Environments tab, select Container Service. Search for your cluster by name, click the target environment name, and then click Dashboard Directory.
4. On the Dashboards tab, click Import.
5. Paste the dashboard JSON into the Import via panel json section, then click Load. Keep the default settings and click Import.

You can also upload a JSON file directly to import the dashboard.