Cloud Monitor: GPU monitoring

Last Updated: Jun 03, 2026

CloudMonitor collects GPU metrics from ECS instances through the CloudMonitor agent. You can create alert rules to receive notifications when metrics breach configured thresholds.

Prerequisites

Metrics

GPU metrics are available at the GPU, instance, and application group levels. The following table lists the available metrics.

| Metric | Unit | Metric name | Dimensions |
| --- | --- | --- | --- |
| (Agent) GPU decoder utilization | % | gpu_decoder_utilization | userId, instanceId, gpuId |
| (Agent) GPU encoder utilization | % | gpu_encoder_utilization | userId, instanceId, gpuId |
| (Agent) GPU temperature | °C | gpu_temperature | userId, instanceId, gpuId |
| (Agent) GPU utilization | % | gpu_utilization | userId, instanceId, gpuId |
| (Agent) GPU memory free space | Byte | gpu_memory_freespace | userId, instanceId, gpuId |
| (Agent) GPU memory free utilization | % | gpu_memory_free_utilization | userId, instanceId, gpuId |
| (Agent) GPU memory used space | Byte | gpu_memory_usedspace | userId, instanceId, gpuId |
| (Agent) GPU memory utilization | % | gpu_memory_utilization | userId, instanceId, gpuId |
| (Agent) GPU power draw | W | gpu_power_readings_power_draw | userId, instanceId, gpuId |
| (Agent) Instance-level decoder utilization | % | instance_gpu_decoder_utilization | userId, instanceId |
| (Agent) Instance-level encoder utilization | % | instance_gpu_encoder_utilization | userId, instanceId |
| (Agent) Instance-level GPU temperature | °C | instance_gpu_temperature | userId, instanceId |
| (Agent) Instance-level GPU utilization | % | instance_gpu_utilization | userId, instanceId |
| (Agent) Instance-level GPU memory free space | Byte | instance_gpu_memory_freespace | userId, instanceId |
| (Agent) Instance-level GPU memory free utilization | % | instance_gpu_memory_free_utilization | userId, instanceId |
| (Agent) Instance-level GPU memory used space | Byte | instance_gpu_memory_usedspace | userId, instanceId |
| (Agent) Instance-level GPU memory utilization | % | instance_gpu_memory_utilization | userId, instanceId |
| (Agent) Instance-level GPU power draw | W | instance_gpu_power_readings_power_draw | userId, instanceId |
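The metric list shows that per-GPU metrics take three dimensions (userId, instanceId, gpuId), while instance-level metrics take two (userId, instanceId). The following Python sketch illustrates that rule by choosing the dimension keys from the metric name prefix. The helper names `required_dimensions` and `build_dimensions` are hypothetical. The JSON-array serialization is an assumption for illustration, not a confirmed request format of the Cloud Monitor API.

```python
import json

# Dimension keys per metric level, as listed in the metrics table.
GPU_LEVEL_DIMENSIONS = ("userId", "instanceId", "gpuId")
INSTANCE_LEVEL_DIMENSIONS = ("userId", "instanceId")


def required_dimensions(metric_name: str) -> tuple:
    """Return the dimension keys a GPU metric expects, based on its name prefix."""
    if metric_name.startswith("instance_gpu_"):
        return INSTANCE_LEVEL_DIMENSIONS
    if metric_name.startswith("gpu_"):
        return GPU_LEVEL_DIMENSIONS
    raise ValueError(f"not a GPU metric name: {metric_name}")


def build_dimensions(metric_name: str, **values: str) -> str:
    """Serialize the dimension values for one metric as a JSON array string.

    The output shape (a one-element JSON array) is an illustrative assumption.
    """
    keys = required_dimensions(metric_name)
    missing = [key for key in keys if key not in values]
    if missing:
        raise ValueError(f"{metric_name} is missing dimensions: {missing}")
    # Keep only the keys this metric level uses; extra values are ignored.
    return json.dumps([{key: values[key] for key in keys}])
```

For example, `build_dimensions("gpu_utilization", userId="u", instanceId="i-abc", gpuId="0")` includes gpuId, whereas the same call for `instance_gpu_utilization` drops it.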

Note

For instance-level metrics such as instance_gpu_decoder_utilization and instance_gpu_temperature, each statistic is computed across all GPUs on the instance:

  • Average: The average value across all GPUs on the instance. For example, two GPUs with values 'a' and 'b' yield (a + b) / 2.

  • Maximum: The highest value among all GPUs on the instance. For example, two GPUs with values 'a' and 'b' yield max(a, b).

  • Minimum: The lowest value among all GPUs on the instance. For example, two GPUs with values 'a' and 'b' yield min(a, b).
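The three statistics in the note can be sketched in Python as follows. The function name `aggregate_instance_metric` and the sample readings are hypothetical. The sketch only mirrors the arithmetic described above and is not the agent's actual implementation.

```python
from statistics import mean


def aggregate_instance_metric(per_gpu_values):
    """Combine one reading per GPU into instance-level statistics."""
    values = list(per_gpu_values)
    if not values:
        raise ValueError("at least one GPU reading is required")
    return {
        "Average": mean(values),  # (a + b) / 2 for two GPUs
        "Maximum": max(values),   # max(a, b)
        "Minimum": min(values),   # min(a, b)
    }


# Hypothetical temperatures (°C) for an instance with two GPUs.
stats = aggregate_instance_metric([60.0, 70.0])
print(stats)  # {'Average': 65.0, 'Maximum': 70.0, 'Minimum': 60.0}
```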

View GPU monitoring data

  1. Log on to the Cloud Monitor console.

  2. In the left-side navigation pane, choose Cloud Resource Monitoring > Host Monitoring.

  3. On the Host Monitoring page, click the target instance's name, or click View Charts in its Actions column.

  4. Click the GPU monitoring tab.

    The GPU monitoring tab displays GPU monitoring charts for the host.

    GPU metrics also support alerting. For more information, see Create an alert rule for a host and View alerts.

Related documents