This topic describes how to send metrics to Managed Service for Prometheus from applications that are deployed in Container Service for Kubernetes (ACK) clusters and instrumented with the OpenTelemetry SDK. You can unify the collection of native OpenTelemetry (OTel) metrics, custom business metrics, and metrics converted from trace spans to achieve full application observability. Converting trace data into metrics can significantly reduce data ingestion volume.
OpenTelemetry overview
OpenTelemetry (OTel) is an open-source observability framework that provides a unified set of APIs, SDKs, and tools to generate, collect, and export telemetry data from distributed systems, including metrics, traces, and logs. Its core goal is to resolve the fragmentation of observability data across different tools and systems.
OpenTelemetry metrics are structured data that quantify system behavior and are used to monitor system performance and health. The source of metrics depends on the SDK or agent instrumentation for a specific language. For a Java application, metrics typically include standard Java Virtual Machine (JVM) metrics, custom user-instrumented metrics, and metrics derived from trace data.
OpenTelemetry metrics are designed to be compatible with Prometheus. Using the OpenTelemetry Collector, you can convert metrics to the Prometheus format for seamless integration with Managed Service for Prometheus.
Role of the OpenTelemetry Collector
The OpenTelemetry Collector is an extensible data processing pipeline that collects telemetry data from sources such as applications and services, and converts the data into the format required by a target system such as Prometheus.
Data collection
Applications use the OpenTelemetry SDK to generate metric data and send it to the Collector using the OpenTelemetry Protocol (OTLP) or other protocols, such as the native Prometheus protocol.
Data conversion
Community Collector extensions provide two exporters to convert OTel metrics to the Prometheus format.
- The Prometheus Exporter converts OpenTelemetry metrics to the Prometheus format and provides an endpoint for a Prometheus agent to scrape data.
  - Metric name mapping: Converts OpenTelemetry metric names to a Prometheus-compatible format.
  - Label handling: Preserves or renames labels to comply with Prometheus naming rules.
  - Data type conversion:
    - gauge → Prometheus gauge
    - sum → Prometheus counter or gauge (based on the Monotonic property)
    - histogram → Prometheus histogram (by using bucket and sum sub-metrics)
  The following configuration example exposes a metric scraping endpoint on port 1234.
  exporters:
    prometheus:
      endpoint: "0.0.0.0:1234"
      namespace: "acs"
      const_labels:
        label1: value1
      send_timestamps: true
      metric_expiration: 5m
      enable_open_metrics: true
      add_metric_suffixes: false
      resource_to_telemetry_conversion:
        enabled: true
- The Prometheus Remote Write Exporter converts OpenTelemetry metrics to the Prometheus format and writes them directly to the target Prometheus service by using the Remote Write protocol.
  Similar to the Prometheus Exporter, this exporter also converts the data format. The following code provides a configuration example:
  exporters:
    prometheusremotewrite:
      endpoint: http://<Prometheus Endpoint>/api/v1/write
      namespace: "acs"
      resource_to_telemetry_conversion:
        enabled: true
      timeout: 10s
      headers:
        Prometheus-Remote-Write-Version: "0.1.0"
      external_labels:
        data-mode: metrics
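The name and type mapping that both exporters perform can be illustrated with a short Python sketch. This is a simplified approximation for illustration only, not the Collector's actual implementation: the helper names prometheus_name and prometheus_type are hypothetical, and the real exporters apply further rules (for example, unit and type suffixes when add_metric_suffixes is enabled).

```python
import re


def prometheus_name(otel_name: str, namespace: str = "") -> str:
    """Approximate a Prometheus-compatible metric name (simplified sketch)."""
    # Characters that Prometheus does not allow in names (such as ".") become "_".
    name = re.sub(r"[^a-zA-Z0-9_:]", "_", otel_name)
    # The exporter's "namespace" setting is applied as a name prefix.
    if namespace:
        name = f"{namespace}_{name}"
    # A Prometheus metric name must not start with a digit.
    if name and name[0].isdigit():
        name = "_" + name
    return name


def prometheus_type(otel_type: str, monotonic: bool = False) -> str:
    """Map an OpenTelemetry metric type to a Prometheus type (simplified sketch)."""
    if otel_type == "gauge":
        return "gauge"
    if otel_type == "sum":
        # A monotonic sum only ever increases, so it behaves like a counter.
        return "counter" if monotonic else "gauge"
    if otel_type == "histogram":
        return "histogram"
    raise ValueError(f"unsupported metric type in this sketch: {otel_type}")
```

With the namespace set to "acs" as in the examples above, a dotted OpenTelemetry name is rewritten with underscores and prefixed with acs_.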
Best practices for applications in ACK clusters
Step 1: Prepare Managed Service for Prometheus
- Prometheus monitoring is enabled for the cluster
  If Prometheus monitoring is enabled for your ACK cluster, a Prometheus instance already exists. Log on to the CloudMonitor console. In the left-side navigation pane, choose . Find the Prometheus instance that has the same name as your ACK cluster and confirm that the instance is in the Running state in the specified region.
- Prometheus monitoring is not enabled for the cluster
  Log on to the ACK console. Click the name of the target cluster and go to the Add-ons page. On the Logging and Monitoring tab, find and install the ack-arms-prometheus component. After installation, Prometheus monitoring is automatically enabled for the cluster. Verify that the component is installed.
Step 2: Deploy the Collector in sidecar mode
For accurate metric calculations, all metrics and traces from a single pod must be sent to the same Collector. Deploying in gateway mode requires complex load balancing configurations. Therefore, we recommend that you deploy the Collector in sidecar mode.
Prometheus Exporter mode
With this method, you do not need to configure authentication for the Prometheus write path. You can also adjust the metric collection interval by changing the scraping configuration.
Architecture diagram
Deployment configuration example
# Kubernetes Deployment example
# Fragment only: metadata.name and spec.selector are omitted. Merge these fields into your own Deployment.
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    metadata:
      labels:
        # Add a specific label to the pod. The label is usually named after the application to facilitate metric scraping configuration.
        observability: opentelemetry-collector
    spec:
      volumes:
        - name: otel-config-volume
          configMap:
            # This ConfigMap is created from the Collector configuration example that follows.
            name: otel-config
      containers:
        - name: app
          image: your-app:latest
          env:
            # 4317 is the Collector's OTLP gRPC port. Use 4318 if the SDK exports over OTLP/HTTP.
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: http://localhost:4317
        - name: otel-collector
          # You can use the Collector image that we provide. The image includes Prometheus-related extensions.
          # Replace <regionId> in the image name with the ID of your region.
          image: registry-<regionId>.ack.aliyuncs.com/acs/otel-collector:v0.128.0-7436f91
          args: ["--config=/etc/otel/config/otel-config.yaml"]
          ports:
            - containerPort: 1234 # Prometheus endpoint
              name: metrics
          volumeMounts:
            - name: otel-config-volume
              mountPath: /etc/otel/config
Collector configuration example
Configure the resource limits (CPU and memory) for the Collector based on your application's request volume to ensure that it can process all data.
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-config
  namespace: <app-namespace>
data:
  otel-config.yaml: |
    extensions:
      zpages:
        endpoint: localhost:55679
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    processors:
      batch:
      memory_limiter:
        # 75% of maximum memory up to 2 GB
        limit_mib: 1536
        # 25% of limit up to 2 GB
        spike_limit_mib: 512
        check_interval: 5s
      resource:
        attributes:
          - key: process.runtime.description
            action: delete
          - key: process.command_args
            action: delete
          - key: telemetry.distro.version
            action: delete
          - key: telemetry.sdk.name
            action: delete
          - key: telemetry.sdk.version
            action: delete
          - key: service.instance.id
            action: delete
          - key: process.runtime.name
            action: delete
          - key: process.runtime.description
            action: delete
          - key: process.pid
            action: delete
          - key: process.executable.path
            action: delete
          - key: process.command.args
            action: delete
          - key: os.description
            action: delete
          - key: instance
            action: delete
          - key: container.id
            action: delete
    connectors:
      spanmetrics:
        histogram:
          explicit:
            buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10]
        dimensions:
          - name: http.method
            default: "GET"
          - name: http.response.status_code
          - name: http.route
          # Custom attribute
          - name: user.id
        metrics_flush_interval: 15s
        exclude_dimensions:
        metrics_expiration: 3m
        events:
          enabled: true
          dimensions:
            - name: default
              default: "GET"
    exporters:
      debug:
        verbosity: detailed
      prometheus:
        endpoint: "0.0.0.0:1234"
        namespace: "acs"
        const_labels:
          label1: value1
        send_timestamps: true
        metric_expiration: 5m
        enable_open_metrics: true
        add_metric_suffixes: false
        resource_to_telemetry_conversion:
          enabled: true
    service:
      pipelines:
        logs:
          receivers: [otlp]
          exporters: [debug]
        traces:
          receivers: [otlp]
          processors: [resource]
          exporters: [spanmetrics]
        metrics:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [prometheus]
        metrics/2:
          receivers: [spanmetrics]
          exporters: [prometheus]
      extensions: [zpages]
This configuration processes incoming metrics and traces. The resource processor discards environment attributes that are typically not of interest, which keeps the volume of metric data from growing too large. The spanmetrics connector converts key span statistics into metrics.
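To make the span-to-metric conversion more concrete, the following Python sketch mimics, in a much simplified form, how spans can be aggregated into a call counter and a duration histogram per dimension set. It is an illustration under stated assumptions, not the connector's implementation: the span dictionary shape and the function name aggregate_spans are hypothetical, the bucket bounds are read as seconds, and the real connector also adds default dimensions such as service.name and span.name.

```python
from bisect import bisect_left
from collections import defaultdict

# Upper bucket bounds copied from the configuration above (assumed to be seconds).
BUCKETS = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10]

# Dimension name -> default value used when the span lacks the attribute.
DIMENSIONS = {
    "http.method": "GET",
    "http.response.status_code": None,
    "http.route": None,
}


def aggregate_spans(spans):
    """Collapse spans into per-dimension call counts and duration histograms."""
    calls = defaultdict(int)
    # One extra slot at the end acts as the +Inf bucket.
    histograms = defaultdict(lambda: [0] * (len(BUCKETS) + 1))
    for span in spans:
        attributes = span.get("attributes", {})
        key = tuple(attributes.get(name, default) for name, default in DIMENSIONS.items())
        calls[key] += 1
        # Index of the first bucket whose upper bound is >= the duration.
        histograms[key][bisect_left(BUCKETS, span["duration_s"])] += 1
    return dict(calls), dict(histograms)
```

However many spans share the same dimension values, they collapse into one counter series and one histogram series, which is why span-derived metrics can take far less storage than the raw traces. High-cardinality dimensions reduce that benefit because each distinct value creates a new series.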
Scraping task configuration
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: opentelemetry-collector-podmonitor
  namespace: default
  annotations:
    arms.prometheus.io/discovery: "true"
spec:
  selector:
    matchLabels:
      observability: opentelemetry-collector
  podMetricsEndpoints:
    - port: metrics
      interval: 15s
      scheme: http
      path: /metrics
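Before relying on the scraping task, it can help to confirm that the Collector sidecar actually serves metrics. The following Python sketch is one possible check, assuming that port 1234 of the pod has been forwarded to the local machine (for example with kubectl port-forward). The function names fetch_exposition and metric_names are hypothetical, and the parser handles only the basic "name{labels} value" line shape of the text exposition format.

```python
import urllib.request


def fetch_exposition(url: str = "http://localhost:1234/metrics") -> str:
    """Download the text exposition from the (port-forwarded) Collector endpoint."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def metric_names(exposition_text: str) -> set:
    """Return the distinct sample names found in a text exposition payload."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines such as "# TYPE", "# HELP" and "# EOF".
        if not line or line.startswith("#"):
            continue
        # A sample line looks like: name{label="value"} 1.0 [timestamp]
        name = line.split("{", 1)[0].split(" ", 1)[0]
        names.add(name)
    return names
```

Calling metric_names(fetch_exposition()) lists what the endpoint exposes. Because the exporter namespace is set to acs in the configuration above, the returned names are expected to start with acs_.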
Prometheus Remote Write Exporter mode
- This method is suitable for scenarios with large data volumes or unstable scraping, where the Collector writes directly to the Prometheus instance.
- The configuration is more complex because you need to configure the data write path.
Architecture diagram
Data write path preparation
In this mode, the Collector writes data directly to the Prometheus instance. You must first obtain the Prometheus endpoint and authentication information.
- Obtain the endpoint
  Log on to the CloudMonitor console. In the left-side navigation pane, choose and find the Prometheus instance that corresponds to your cluster. The Prometheus instance ID and name typically match the ACK cluster ID and name. Click the name of the target instance. On the Settings page, find the Remote Write URL and copy the internal network address for later use.
- Obtain authentication credentials
  Choose one of the following methods:
  - V2 Prometheus instances support a password-free policy that allows password-free writes from within the cluster's virtual private cloud (VPC).
  - Create a RAM user for writing metric data, grant the AliyunPrometheusMetricWriteAccess system policy to the RAM user, and then obtain its AccessKey pair. The AccessKey ID is used as the username and the AccessKey secret is used as the password for writing data.
Deployment configuration example
The Collector deployment configuration is the same as that in the Prometheus Exporter mode. For more information, see the deployment configuration example for the Prometheus Exporter mode.
Collector configuration example
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-config
  namespace: <app-namespace>
data:
  otel-config.yaml: |
    extensions:
      zpages:
        endpoint: localhost:55679
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    processors:
      batch:
      memory_limiter:
        # 75% of maximum memory up to 2 GB
        limit_mib: 1536
        # 25% of limit up to 2 GB
        spike_limit_mib: 512
        check_interval: 5s
      resource:
        attributes:
          - key: process.runtime.description
            action: delete
          - key: process.command_args
            action: delete
          - key: telemetry.distro.version
            action: delete
          - key: telemetry.sdk.name
            action: delete
          - key: telemetry.sdk.version
            action: delete
          - key: service.instance.id
            action: delete
          - key: process.runtime.name
            action: delete
          - key: process.runtime.description
            action: delete
          - key: process.pid
            action: delete
          - key: process.executable.path
            action: delete
          - key: process.command.args
            action: delete
          - key: os.description
            action: delete
          - key: instance
            action: delete
          - key: container.id
            action: delete
    connectors:
      spanmetrics:
        histogram:
          explicit:
            buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10]
        dimensions:
          - name: http.method
            default: "GET"
          - name: http.response.status_code
          - name: http.route
          # Custom attribute
          - name: user.id
        metrics_flush_interval: 15s
        exclude_dimensions:
        metrics_expiration: 3m
        events:
          enabled: true
          dimensions:
            - name: default
              default: "GET"
    exporters:
      debug:
        verbosity: detailed
      prometheusremotewrite:
        # Replace this with the internal network address of the Prometheus Remote Write endpoint.
        endpoint: http://<Endpoint>/api/v3/write
        namespace: "acs"
        resource_to_telemetry_conversion:
          enabled: true
        timeout: 10s
        headers:
          Prometheus-Remote-Write-Version: "0.1.0"
          # This header is required if the password-free policy is not enabled.
          Authorization: Basic <base64-encoded-username-password>
        external_labels:
          data-mode: metrics
    service:
      pipelines:
        logs:
          receivers: [otlp]
          exporters: [debug]
        traces:
          receivers: [otlp]
          processors: [resource]
          exporters: [spanmetrics]
        metrics:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [prometheusremotewrite]
        metrics/2:
          receivers: [spanmetrics]
          exporters: [prometheusremotewrite]
      extensions: [zpages]
- Replace the Prometheus Remote Write endpoint URL in the configuration with the URL you obtained.
- To generate the value for base64-encoded-username-password, run the following command, where AK is the AccessKey ID and SK is the AccessKey secret of the RAM user:
  echo -n 'AK:SK' | base64
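If you generate the configuration from a script instead of the shell, the same header value can be built programmatically. The following Python sketch shows one way to do this. The function name basic_auth_header is hypothetical, and the arguments stand for the AccessKey ID and AccessKey secret of the RAM user described earlier.

```python
import base64


def basic_auth_header(access_key_id: str, access_key_secret: str) -> str:
    """Build the value of the Authorization header for HTTP Basic authentication."""
    # Basic authentication encodes "username:password" with Base64.
    credentials = f"{access_key_id}:{access_key_secret}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")
```

The returned string is the complete header value, so it replaces the whole "Basic <base64-encoded-username-password>" placeholder. Avoid committing the result to version control, because Base64 is an encoding and not encryption.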