Application Real-Time Monitoring Service: What is Managed Service for OpenTelemetry?

Last Updated: Mar 11, 2026

In distributed systems, a single user request often passes through dozens of microservices before completing. When latency spikes or errors occur, pinpointing the root cause across these services requires end-to-end visibility into the request path.

Managed Service for OpenTelemetry, a component of Application Real-Time Monitoring Service (ARMS), provides distributed tracing for microservice architectures. It collects trace data from your applications, aggregates it in real time, and generates trace details, performance metrics, and service topology maps so you can quickly identify and resolve performance bottlenecks.

Core concepts

The following concepts are central to distributed tracing:

  • Trace: A record of a single request as it travels through multiple services. Each trace has a unique ID that ties together all the operations involved in fulfilling that request.

  • Span: A single operation within a trace. Each span captures the operation name, start time, duration, and the parent span that triggered it. A trace consists of multiple spans arranged in a parent-child hierarchy.

  • Topology: A visual map of how your services call each other, generated automatically from trace data.
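
To make these three concepts concrete, the following is a minimal sketch in plain Python. It assumes a hypothetical `Span` record whose field names are chosen for illustration only; it is not the service's data model or any SDK's API. It groups spans by trace ID, rebuilds the parent-child hierarchy of one trace, and derives service-to-service topology edges from the same parent links.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Illustrative span record; field names are assumptions, not a real schema."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span of a trace
    service: str
    operation: str
    start_ms: int
    duration_ms: int


def group_by_trace(spans):
    """Group spans by trace ID so each request can be inspected on its own."""
    traces = defaultdict(list)
    for span in spans:
        traces[span.trace_id].append(span)
    return dict(traces)


def render_tree(trace_spans):
    """Return the parent-child hierarchy of one trace as indented text lines."""
    children = defaultdict(list)
    for span in trace_spans:
        children[span.parent_id].append(span)

    lines = []

    def walk(span, depth):
        indent = "  " * depth
        lines.append(f"{indent}{span.service}:{span.operation} ({span.duration_ms} ms)")
        for child in sorted(children[span.span_id], key=lambda s: s.start_ms):
            walk(child, depth + 1)

    for root in sorted(children[None], key=lambda s: s.start_ms):
        walk(root, 0)
    return lines


def topology_edges(spans):
    """Derive caller -> callee service edges from parent-child span links."""
    by_id = {(s.trace_id, s.span_id): s for s in spans}
    edges = set()
    for span in spans:
        parent = by_id.get((span.trace_id, span.parent_id))
        if parent is not None and parent.service != span.service:
            edges.add((parent.service, span.service))
    return sorted(edges)


# Made-up sample data for a single request that touches four services.
SPANS = [
    Span("t1", "a", None, "gateway", "GET /checkout", 0, 120),
    Span("t1", "b", "a", "orders", "create_order", 10, 90),
    Span("t1", "c", "b", "payments", "charge", 20, 60),
    Span("t1", "d", "b", "inventory", "reserve", 85, 10),
]

if __name__ == "__main__":
    for line in render_tree(group_by_trace(SPANS)["t1"]):
        print(line)
    print(topology_edges(SPANS))
```

In this sketch the topology is a by-product of the trace data: every parent-child link that crosses a service boundary becomes an edge, which mirrors how a topology can be generated from traces without separate configuration.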

Architecture

The following diagram shows how Managed Service for OpenTelemetry collects and processes trace data.

[Figure: Architecture of Managed Service for OpenTelemetry]

Data flow

  1. Instrument your application. Integrate the client SDK into your application to capture service call data. Managed Service for OpenTelemetry provides client SDKs for multiple programming languages and is compatible with open source tracing libraries such as Jaeger and Zipkin. The SDKs support the OpenTracing standard.

  2. Process and visualize. After the SDK reports data, the service aggregates and persists it in real time. Three types of monitoring data are generated:

    Data type | Description
    Trace details | The full span-by-span breakdown of each request, used for root-cause analysis.
    Performance overview | Latency, throughput, and error rate metrics across your services.
    Real-time topology | A live map of service dependencies and call relationships.

    Use this data to troubleshoot slow requests, identify failing services, and understand call patterns.

  3. Forward to downstream services. Send trace data to other Alibaba Cloud services for further analysis:

    Service | Use case
    Simple Log Service | Correlate traces with application logs and set up alerting rules.
    MaxCompute | Run large-scale offline analysis on historical trace data.
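
To illustrate the performance overview described in step 2, here is a minimal sketch of how latency, throughput, and error rate could be derived from reported spans. The `SpanRecord` type, its fields, and the `summarize` function are hypothetical names introduced for this example, and the fixed time window is an assumption. The service's actual aggregation logic is not described in this document.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanRecord:
    """Hypothetical reported span; field names are illustrative assumptions."""
    service: str
    duration_ms: float
    is_error: bool


def summarize(records, window_seconds):
    """Compute per-service request count, throughput, mean latency, and error rate."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")

    grouped = defaultdict(list)
    for record in records:
        grouped[record.service].append(record)

    summary = {}
    for service, items in grouped.items():
        count = len(items)
        errors = sum(1 for item in items if item.is_error)
        summary[service] = {
            "requests": count,
            "throughput_per_s": count / window_seconds,
            "avg_latency_ms": sum(item.duration_ms for item in items) / count,
            "error_rate": errors / count,
        }
    return summary


# Made-up sample data covering a two-second window.
RECORDS = [
    SpanRecord("orders", 80.0, False),
    SpanRecord("orders", 120.0, False),
    SpanRecord("orders", 100.0, True),
    SpanRecord("orders", 100.0, False),
    SpanRecord("payments", 50.0, False),
    SpanRecord("payments", 150.0, True),
]

if __name__ == "__main__":
    for service, stats in summarize(RECORDS, window_seconds=2).items():
        print(service, stats)
```

The sketch uses a mean for latency to stay short. A percentile such as p95 or p99 is often more useful for spotting slow requests, since a mean can hide a small number of very slow calls.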

Capabilities

Goal | How it helps
Trace requests across services | Collects all spans from distributed microservices and assembles them into end-to-end traces for query and root-cause analysis.
Monitor application performance | Captures request-level data and analyzes service and resource performance in real time, surfacing latency, error rates, and throughput.
Map service dependencies | Automatically discovers how your microservices and related PaaS products call each other, and renders a real-time topology.
Integrate with open source libraries | Works with Jaeger, Zipkin, and other open source tracing libraries built on the OpenTracing standard.
Stream data to analysis platforms | Sends trace data to Simple Log Service for log correlation and alerting, and to MaxCompute for offline analysis.

Next steps

  • Get started by instrumenting your first application with the Managed Service for OpenTelemetry SDK.

  • Explore the trace query interface to search, filter, and analyze distributed traces.

  • Set up alerting rules in Simple Log Service to get notified when trace metrics exceed thresholds.