Service Mesh (ASM) collects three types of telemetry data from sidecar proxies and gateway pods on the data plane: logs, metrics, and traces. These map to the four golden signals of monitoring -- latency, traffic, errors, and saturation -- giving you real-time visibility into how services communicate, where bottlenecks form, and when failures occur.

ASM uses Istio Telemetry CustomResourceDefinitions (CRDs) to provide a centralized way to configure how telemetry data is generated and collected. Define collection rules separately for sidecar proxies and gateway pods, and route the data to managed cloud services or self-managed backends.

Telemetry CRD configuration model
Telemetry CRDs follow a three-tier inheritance model. Each tier overrides the settings from the tier above it:
- Mesh-wide -- A single Telemetry CRD named `default` in the `istio-system` root namespace. This sets the baseline for all workloads.
- Namespace-level -- One Telemetry CRD named `default` (with an empty workload selector) per namespace. Fields specified here fully override the mesh-wide configuration for that namespace.
- Workload-level -- An additional Telemetry CRD with a workload selector in the target namespace. This overrides namespace-level settings for the selected workloads only.
Configuration rules:
- ASM allows exactly one Telemetry CRD named `default` in the `istio-system` namespace.
- Each namespace can have only one Telemetry CRD with an empty workload selector, also named `default`.
- If two Telemetry CRDs select the same workload, the behavior is undefined. Avoid duplicate selectors.
- If no metrics are configured in the `istio-system` Telemetry CRD, no metrics are generated.
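To make the inheritance model concrete, the following sketch shows a mesh-wide baseline plus a workload-level override that disables access logging for one set of pods. The resource kinds and fields follow the upstream Istio Telemetry API; the `myapp` namespace, name, and label are hypothetical.

```yaml
# Mesh-wide baseline: exactly one Telemetry CRD named "default" in istio-system.
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: default
  namespace: istio-system
spec:
  accessLogging:
  - providers:
    - name: envoy
---
# Workload-level override: the selector limits this CRD to matching pods,
# which stop emitting access logs. Namespace and label are hypothetical.
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: myapp-logging
  namespace: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  accessLogging:
  - disabled: true
```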
Logs
Access logs record every request that passes through sidecar proxies and gateways. Proxies write logs to stdout or stderr; use a logging agent to aggregate them into a central logging system.
ASM provides two log capabilities: log filtering (control which requests generate log entries) and log formatting (control which fields appear in each entry).
Customize the log format
Customize the fields in Envoy access logs through the ASM console or the Telemetry CRD. For step-by-step instructions, see Customize access logs on the data plane.


The following YAML is equivalent to the console configuration:
```yaml
envoyFileAccessLog:
  logFormat:
    text: '{"bytes_received":"%BYTES_RECEIVED%","bytes_sent":"%BYTES_SENT%","downstream_local_address":"%DOWNSTREAM_LOCAL_ADDRESS%","downstream_remote_address":"%DOWNSTREAM_REMOTE_ADDRESS%","duration":"%DURATION%","istio_policy_status":"%DYNAMIC_METADATA(istio.mixer:status)%","method":"%REQ(:METHOD)%","path":"%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%","protocol":"%PROTOCOL%","request_id":"%REQ(X-REQUEST-ID)%","requested_server_name":"%REQUESTED_SERVER_NAME%","response_code":"%RESPONSE_CODE%","response_flags":"%RESPONSE_FLAGS%","route_name":"%ROUTE_NAME%","start_time":"%START_TIME%","trace_id":"%REQ(X-B3-TRACEID)%","upstream_cluster":"%UPSTREAM_CLUSTER%","upstream_host":"%UPSTREAM_HOST%","upstream_local_address":"%UPSTREAM_LOCAL_ADDRESS%","upstream_service_time":"%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%","upstream_transport_failure_reason":"%UPSTREAM_TRANSPORT_FAILURE_REASON%","user_agent":"%REQ(USER-AGENT)%","x_forwarded_for":"%REQ(X-FORWARDED-FOR)%","authority_for":"%REQ(:AUTHORITY)%","upstream_response_time":"%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%","xff":"%REQ(X-FORWARDED-FOR)%","app_service_name":"%UPSTREAM_CLUSTER%"}'
  path: /dev/stdout
```

To filter logs -- for example, to log only responses with status codes 400 and above:
```yaml
accessLogging:
- disabled: false
  filter:
    expression: response.code >= 400
  providers:
  - name: envoy
```

Collect data plane logs
Container Service for Kubernetes (ACK) integrates with Simple Log Service (SLS) to collect access logs from sidecar proxies on the data plane. Configure collection rules to control how logs are stored and how long they are retained. For details, see Use Simple Log Service to collect access logs on the data plane.

Collect control plane logs and set up alerting
The ASM control plane pushes configurations to sidecar proxies and ingress gateways on the data plane. If a configuration conflict occurs, the affected proxy or gateway cannot receive the update. It continues running on its last known good configuration, but will fail if the pod restarts.
Enable control plane log collection and log-based alerting to detect configuration push failures early. For setup instructions, see the control plane log collection and alerting topics.
Metrics
Istio uses the Prometheus agent to collect and store metrics from Envoy proxies. These metrics cover the four golden signals of monitoring -- latency, traffic, errors, and saturation -- and support use cases such as real-time dashboards, anomaly detection, and auto scaling.
Configure metric generation rules
Enable data plane metrics to generate operational data from gateways and sidecar proxies. Send these metrics to Managed Service for Prometheus for monitoring dashboards, or to a self-managed Prometheus instance.
Configure custom metrics through the ASM console or the Telemetry CRD. For details, see Create custom metrics in ASM.
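As a sketch of what such a Telemetry CRD can express, the following adds one dimension to server-side `REQUEST_COUNT` metrics and removes another. The fields follow the upstream Istio Telemetry API; the added `request_host` dimension is a hypothetical example.

```yaml
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: default
  namespace: istio-system
spec:
  metrics:
  - providers:
    - name: prometheus
    overrides:
    - match:
        metric: REQUEST_COUNT
        mode: SERVER
      tagOverrides:
        request_host:            # hypothetical extra dimension
          value: request.host    # expression evaluated per request
        grpc_response_status:
          operation: REMOVE      # drop an existing dimension
```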

Metric considerations
Managed Service for Prometheus is a paid service. When enabling it for the first time, scope the metrics to what your business actually needs. Monitoring too many metrics incurs unnecessary cost. If you previously enabled metrics, your earlier settings are preserved. To monitor a gateway, enable client-side metrics.
Mesh Topology depends on specific metrics. Disabling certain metrics affects Mesh Topology:
| Disabled metric | Impact |
|---|---|
| Server-side `REQUEST_COUNT` | HTTP and gRPC service topology cannot be generated |
| Server-side `TCP_SENT_BYTES` | TCP service topology cannot be generated |
| Server-side `REQUEST_SIZE`, `REQUEST_DURATION`, and client-side `REQUEST_SIZE` | Some node monitoring data may not display |
Configure metric collection
After enabling Managed Service for Prometheus, collect metrics for storage and analysis. For integration steps, see Integrate Managed Service for Prometheus to monitor ASM instances.
Collection interval. The default interval is 15 seconds. This may be too frequent for production workloads. Adjust it in the Application Real-Time Monitoring Service (ARMS) console based on your requirements. See Configure data collection rules.
Histogram metrics. Metrics such as istio_request_duration_milliseconds_bucket, istio_request_bytes_bucket, and istio_response_bytes_bucket generate large volumes of data and incur ongoing custom metric costs. To reduce costs, discard these metrics in the ARMS console. See Configure metrics.
Self-managed Prometheus. Deploy your own Prometheus instance to monitor ASM. See Monitor ASM instances by using a self-managed Prometheus instance.
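For a self-managed Prometheus instance, the same histogram trimming can be approximated at scrape time with `metric_relabel_configs`. This is a minimal sketch; the job name is hypothetical, and the service-discovery section is abbreviated to the pod role only.

```yaml
scrape_configs:
- job_name: istio-mesh          # hypothetical job name
  kubernetes_sd_configs:
  - role: pod
  metric_relabel_configs:
  # Drop high-cardinality histogram buckets to reduce storage cost.
  - source_labels: [__name__]
    regex: istio_(request_duration_milliseconds|request_bytes|response_bytes)_bucket
    action: drop
```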
View collected metrics on a Grafana dashboard:

Merge Istio metrics with application metrics
Prometheus can only scrape one metrics endpoint per pod. If your application already exposes its own Prometheus metrics, enable metric merging so that sidecar proxies serve both Istio and application metrics from a single endpoint (:15020/stats/prometheus).
When enabled, ASM adds prometheus.io annotations to all pods on the data plane. If these annotations already exist, they are overwritten. Prometheus then scrapes the merged metrics from the unified endpoint.
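The annotations in question are the standard Prometheus scrape hints. On a pod with metric merging enabled, they point at the sidecar's merged endpoint, roughly like this (the pod name is hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                        # hypothetical pod
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "15020"            # sidecar's merged metrics port
    prometheus.io/path: /stats/prometheus  # serves Istio + application metrics
```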
For configuration steps, see Merge Istio metrics with application metrics.
Mesh Topology
Mesh Topology provides a visual map of service-to-service communication in your mesh. Use it to identify dependencies, trace traffic flows, and spot anomalies across your application.

For setup instructions, see Enable Mesh Topology to improve observability.
Service level objectives (SLOs)
A service level indicator (SLI) measures a specific dimension of service health. A service level objective (SLO) sets a target or target range for one or more SLIs. SLOs provide application developers, platform operators, and operations teams a shared benchmark to measure and continuously improve service quality.
SLO examples:
Average queries per second (QPS) > 100,000/s
99th percentile latency < 500 ms
99th percentile bandwidth > 200 MB/s (measured per minute)
Supported SLI types:
| SLI type | Plugin type | Description | Failure criteria |
|---|---|---|---|
| Availability | availability | Proportion of requests that receive a successful response | HTTP status code 429 or 5XX |
| Latency | latency | Time to return a response | Response time exceeds the configured threshold |

After you configure SLOs in ASM, a Prometheus rule is automatically generated. Import this rule into your Prometheus instance for the SLOs to take effect. The Alertmanager component collects and routes alerts to your specified contacts.
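The generated rule is a standard Prometheus alerting rule. As an illustration only -- the group and alert names below are hypothetical, not the exact output ASM produces -- an availability SLO allowing at most 1% failed requests could translate to something like:

```yaml
groups:
- name: slo-availability            # hypothetical group name
  rules:
  - alert: AvailabilitySLOBreached  # hypothetical alert name
    # Failure criteria from the table above: HTTP 429 or 5XX responses.
    expr: |
      sum(rate(istio_requests_total{response_code=~"429|5.."}[5m]))
        / sum(rate(istio_requests_total[5m])) > 0.01
    for: 5m
```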

For full SLO configuration, see SLO management.
Distributed tracing
Distributed tracing tracks requests as they flow through multiple services. Each unit of work is a span. Spans can be nested, and a collection of related spans forms a trace -- representing the full lifecycle of a single request.
ASM supports distributed tracing through tools such as Jaeger and Zipkin.
Header propagation requirement
Istio proxies automatically generate spans, but applications must propagate the following HTTP headers from inbound to outbound requests to correlate spans into a complete trace:
`x-request-id`, `x-b3-traceid`, `x-b3-spanid`, `x-b3-parentspanid`, `x-b3-sampled`, `x-b3-flags`, and `x-ot-span-context`
Configure tracing data generation rules
Configure tracing parameters -- such as sampling rate and custom tags -- through the ASM console or the Telemetry CRD.

The following YAML is equivalent to the console configuration:
```yaml
tracing:
- customTags:
    mytag1:
      literal:
        value: fixedvalue
    mytag2:
      header:
        defaultValue: value1
        name: myheader1
    mytag3:
      environment:
        defaultValue: value1
        name: myenv1
  providers:
  - name: zipkin
  randomSamplingPercentage: 90
```

Configure tracing data collection
Send tracing data to a managed service or a self-managed backend:
Managed service. Use the cloud-native application management service to collect and analyze traces. See Enable distributed tracing in ASM.
Self-managed service. Use open-source tools such as Zipkin or Jaeger. See Export ASM tracing data to a self-managed system.
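For the self-managed case, the `zipkin` provider referenced in the Telemetry CRD must exist in the mesh configuration. In upstream Istio this is declared as an extension provider; the following is a sketch, and the collector service address is hypothetical.

```yaml
meshConfig:
  extensionProviders:
  - name: zipkin
    zipkin:
      # Hypothetical collector address; point at your Zipkin or Jaeger service.
      service: zipkin.istio-system.svc.cluster.local
      port: 9411
```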