Application Real-Time Monitoring Service: Use Container Monitoring Pro Edition

Last Updated: Mar 11, 2026

Running a self-managed Prometheus agent inside your Kubernetes cluster consumes cluster resources (3 CPU cores and 4 GB of memory by default) and limits metric retention to 7 days. Container Monitoring Pro Edition replaces the in-cluster agent with a fully managed collector backed by a 99.95% SLA, extends metric retention to 90 days, and provides comprehensive prebuilt dashboards.

Important

After you enable Pro Edition, you cannot downgrade to Basic Edition.

Why upgrade to Pro Edition

Capability | Basic Edition | Pro Edition
Metric retention for basic container cluster metrics | 7 days | 90 days
Prometheus collector | Self-managed agent in your cluster (3 CPU cores, 4 GB memory by default) | Fully managed agent: no in-cluster resource cost, production-level SLA of 99.95%
Dashboards | Basic monitoring dashboards | Comprehensive monitoring dashboards

Supported cluster types

Pro Edition supports the following Container Service for Kubernetes (ACK) cluster types:

  • ACK managed Pro cluster

  • ACK Lingjun cluster

  • ACK dedicated cluster

Billing

Container Monitoring Pro Edition has two billing components.

Cluster scale fee

This fee is based on Observability Capacity Unit (OCU) usage, which is calculated from your cluster node count. The OCU is a billing unit introduced by Alibaba Cloud Native Observability; OCU usage is calculated automatically from hourly resource usage.

Item | Detail
Conversion | Every 10 cluster nodes = 1 OCU (rounded up)
Unit price | 0.023 USD per OCU per hour
Billing method | Pay-as-you-go
Billing cycle | Hourly, aggregated into a daily charge

How it works: Each hour, the system records the maximum number of nodes in your cluster and converts that number to OCUs. At the end of each day, it sums the hourly OCU values and multiplies by the unit price.

Example: For a cluster with 35 nodes:

  • Hourly OCU = ceil(35 / 10) = 4 OCUs

  • Daily cost = 4 OCUs x 24 hours x 0.023 USD = 2.208 USD, or about 2.21 USD/day
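
The hourly conversion and daily summation described above can be sketched in a few lines of Python. This is only an estimate helper built from the figures in this section; the function names are hypothetical, and how the billing system rounds the final charge is not stated here, so the sketch leaves the total unrounded.

```python
NODES_PER_OCU = 10        # conversion rule from this section
USD_PER_OCU_HOUR = 0.023  # unit price from this section


def hourly_ocus(max_nodes: int) -> int:
    """Convert one hour's maximum node count to OCUs, rounding up."""
    # Integer ceiling division avoids floating-point edge cases.
    return (max_nodes + NODES_PER_OCU - 1) // NODES_PER_OCU


def daily_cluster_scale_fee(hourly_max_nodes: list[int]) -> float:
    """Sum the hourly OCU values for one day and multiply by the unit price."""
    total_ocus = sum(hourly_ocus(n) for n in hourly_max_nodes)
    return total_ocus * USD_PER_OCU_HOUR


# A steady 35-node cluster, as in the example above.
print(round(daily_cluster_scale_fee([35] * 24), 2))  # 2.21

# A cluster that scales from 35 to 52 nodes for half of the day.
print(round(daily_cluster_scale_fee([35] * 12 + [52] * 12), 2))  # 2.76
```

Because the conversion rounds up every hour, a cluster with 31 nodes is charged the same cluster scale fee as one with 40 nodes.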

Prometheus instance fee

Prometheus instance fees are billed separately. For details, see Prometheus instance billing.

Prerequisites

Before you enable Pro Edition, complete the following steps:

  1. Activate Managed Service for Prometheus with one of these pay-as-you-go billing modes:

  2. Activate Container Monitoring Pro Edition.

Enable Pro Edition

Choose the method that matches your situation:

Select Pro Edition during integration

  1. Go to the Integration Center page and select Kubernetes Cluster Monitoring.

  2. In the Kubernetes Cluster Monitoring panel, select the cluster to integrate, choose Container Monitoring Pro Edition, and click OK.

Upgrade from Basic Edition to Pro Edition

  1. Go to the Integration Management page and choose Integrated Environments > Container Service.

  2. Find your cluster and click Upgrade to Pro Edition in the Actions column. In the dialog box, click OK.

Supported dashboards

All dashboards are automatically available after you enable Pro Edition.

Category | Dashboard
Monitoring overview | Cluster monitoring overview
Monitoring overview | Cluster namespace dashboard
Cluster core components | ACK Pro API server
Cluster core components | ACK Pro ETCD
Cluster core components | ACK Pro Scheduler
Cluster core components | ACK Pro Cloud Controller Manager
Cluster core components | ACK Pro Kube Controller Manager
Node monitoring | Node pool overview
Node monitoring | Cluster node monitoring details
Application monitoring | StatefulSet monitoring
Application monitoring | Deployment monitoring
Application monitoring | DaemonSet application monitoring
Application monitoring | Cluster Pod monitoring
Network monitoring | CoreDNS component monitoring
Network monitoring | Cluster Ingress traffic monitoring
Storage monitoring | CSI storage component monitoring-cluster level
Storage monitoring | CSI storage component monitoring-node level
Storage monitoring | Pod IO Monitoring (Pod Level)
Storage monitoring | Frontend Storage IO Monitoring (Cluster Level)
GPU monitoring | Cluster GPU monitoring-cluster level
GPU monitoring | Cluster GPU monitoring-node level
GPU monitoring | Cluster GPU monitoring-application Pod dimension
Cost analysis/Resource optimization | Resource profile
Others | Backend Storage IO Monitoring (Cluster Level)
Others | k8s-reclaimed-resource
Others | Cluster Prometheus self-monitoring
Others | Virtual Node(ECI) Overview

Default alert rules

The following alert rules are active by default after you enable Pro Edition.

Node alerts

Alert rule | Alert template
Node CPU usage greater than 75% | Node {{ $labels.instance }} CPU usage greater than 75%, current CPU usage {{ printf "%.2f" $value }}%
Node CPU usage greater than 85% | Node {{ $labels.instance }} CPU usage greater than 85%, current CPU usage {{ printf "%.2f" $value }}%
Node memory usage greater than 75% | Node {{ $labels.instance }} memory usage greater than 75%, current memory usage {{ printf "%.2f" $value }}%
Node memory usage greater than 85% | Node {{ $labels.instance }} memory usage greater than 85%, current memory usage {{ printf "%.2f" $value }}%
Node anomalies | Node {{$labels.node}} has been in unavailable status for more than 10 minutes
Disk usage greater than 95% | Node {{ $labels.instance }} disk {{ $labels.device }} usage exceeds 95%, current disk usage {{ printf "%.2f" $value }}%
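
To show what a notification produced from one of these templates might look like, the following Python sketch substitutes sample values into the placeholders. It is a toy renderer written for illustration, not the engine the service uses: it handles only the three placeholder forms that appear in these tables ({{ $labels.<name> }}, {{ $value }}, and {{ printf "<format>" $value }}), and the function name, label value, and metric value are hypothetical.

```python
import re


def render_alert_template(template: str, labels: dict[str, str], value: float) -> str:
    """Fill in the placeholder forms used by the default alert templates."""

    def substitute(match: re.Match) -> str:
        expr = match.group(1).strip()
        if expr.startswith("$labels."):
            # Look up the label name; an unknown label renders as empty here.
            return labels.get(expr[len("$labels."):], "")
        if expr == "$value":
            return f"{value:g}"
        fmt = re.fullmatch(r'printf\s+"([^"]*)"\s+\$value', expr)
        if fmt:
            # printf-style verbs such as %.2f also work with Python's % operator.
            return fmt.group(1) % value
        return match.group(0)  # leave anything unrecognized untouched

    return re.sub(r"\{\{(.*?)\}\}", substitute, template)


template = (
    "Node {{ $labels.instance }} CPU usage greater than 85%, "
    'current CPU usage {{ printf "%.2f" $value }}%'
)
# Sample label and value, invented for this example.
print(render_alert_template(template, {"instance": "node-a"}, 91.237))
# Node node-a CPU usage greater than 85%, current CPU usage 91.24%
```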

Workload alerts

Alert rule | Alert template
Deployment Pod availability less than 50% | Namespace: {{$labels.namespace}} / Deployment: {{$labels.deployment}} Pod availability less than 50%, current unavailable Pod count {{ $value }}
Job execution failed | Namespace: {{$labels.namespace}}/Job: {{$labels.job_name}} execution failed
Pod startup timeout failure | Namespace: {{$labels.namespace}}/Pod: {{$labels.pod_name}} has not started successfully for more than 15 minutes, waiting reason {{$labels.reason}}
Pod status abnormal | Namespace: {{$labels.namespace}}/Pod: {{$labels.pod_name}} has been in {{$labels.phase}} status for more than 10 minutes
Pod frequent restart | Namespace: {{$labels.namespace}}/Pod: {{$labels.pod_name}} restarted more than {{ $labels.metrics_params_value}} times within {{$labels.metrics_params_time}} minutes, current restart count {{ $value }}
Container CPU usage exceeds 85% | Namespace: {{$labels.namespace}} / Pod: {{$labels.pod_name}} / Container: {{$labels.container}} CPU usage greater than 85%, current value {{ printf "%.2f" $value }}%
Container CPU usage exceeds 75% | Namespace: {{$labels.namespace}} / Pod: {{$labels.pod_name}} / Container: {{$labels.container}} CPU usage greater than 75%, current value {{ printf "%.2f" $value }}%
Container memory usage exceeds 75% | Namespace: {{$labels.namespace}} / Pod: {{$labels.pod_name}} / Container: {{$labels.container}} memory usage greater than 75%, current value {{ printf "%.2f" $value }}%
Container memory usage exceeds 85% | Namespace: {{$labels.namespace}} / Pod: {{$labels.pod_name}} / Container: {{$labels.container}} memory usage greater than 85%, current value {{ printf "%.2f" $value }}%