Hologres provides metrics for monitoring AI nodes (GPU) and AI model services. Use them to track resource usage, invocation volume, and latency, and to configure alerts based on the data.
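As an illustration of the kind of alert rule you might configure over these metrics, the sketch below fires only when a utilization metric stays above a threshold for several consecutive sampling points, which avoids alerting on brief spikes. This is a minimal, hypothetical example; the function name, threshold, and sample format are assumptions, not a Hologres or CloudMonitor API.

```python
# Hypothetical alert-rule sketch: trigger when a usage metric (e.g. AI
# Resource Group GPU Usage, in percent) stays at or above `threshold`
# for `consecutive` sampling points in a row. Not a Hologres API.
def should_alert(samples, threshold=85.0, consecutive=3):
    """samples: list of utilization percentages, oldest first."""
    run = 0
    for value in samples:
        run = run + 1 if value >= threshold else 0
        if run >= consecutive:
            return True
    return False

print(should_alert([40, 90, 92, 95]))  # sustained high usage -> True
print(should_alert([40, 90, 70, 95]))  # isolated spikes only -> False
```

Requiring several consecutive breaches is a common way to trade alert latency for fewer false positives; the threshold and window should match your own capacity planning.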
Metric categories
Two categories of metrics are available:
- AI Resource Metrics — Monitor overall AI node usage.
- AI Model Metrics — Monitor resource usage and invocation statistics for each deployed model. Use these metrics to evaluate whether each model uses its allocated resources efficiently.
For information about deploying models, see AI Models and Deployment.
AI Resource metrics
These metrics are supported on General-purpose, Compute group type, and Primary/standby instances.
Data appears in this category only after you purchase AI resources and start using them. For details, see AI Resource Pricing and Purchase.
| Metric | Description |
|---|---|
| AI Resource Group CPU Usage | Total CPU usage of the AI node |
| AI Resource Group Memory Usage | Total memory usage of the AI node |
| AI Resource Group GPU Usage | Total GPU usage of the AI node |
AI Model metrics
Data appears in this category only after you deploy a model and invoke it through AI Function. See AI Models and Deployment for deployment details and AI Function for invocation details.
Resource usage
These metrics show how much of the AI node's resources a specific model consumes.
| Metric | Description |
|---|---|
| AI Model Service CPU Usage | CPU usage for this model |
| AI Model Service Memory Usage | Memory usage for this model |
| AI Model Service GPU Usage | GPU usage for this model |
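To judge whether a model service uses its share of the AI node efficiently, you can relate a per-model metric from the table above to the corresponding node-level total from the AI Resource metrics. The helper below is a hypothetical sketch of that comparison; the function name and sample values are illustrative only.

```python
# Hypothetical sketch: express one model service's GPU usage as a
# fraction of the AI node's total GPU usage. The same arithmetic
# applies to the CPU and memory metric pairs.
def usage_share(model_usage, node_usage):
    """Both arguments in the same unit (e.g. percent or cores)."""
    if node_usage <= 0:
        raise ValueError("node usage must be positive")
    return model_usage / node_usage

# Example: a model consuming 30 units of an 75-unit node total.
share = usage_share(model_usage=30.0, node_usage=75.0)
print(f"{share:.0%}")
```

A model whose share is persistently far below its allocation may be over-provisioned; one near 100% of the node may need more resources.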
Invocation statistics
These metrics show the volume (measured in queries per second, QPS), success rate, latency, and traffic of invocations made through AI Function for a specific model.
| Metric | Description |
|---|---|
| AI Model Service Invocation QPS | QPS of model invocations through AI Function |
| AI Model Service Successful Invocation QPS | Successful QPS of model invocations through AI Function |
| AI Model Service Failed Invocation QPS | Failed QPS of model invocations through AI Function |
| AI Model Service Average Response Time | Average latency of model invocations through AI Function |
| AI Model Service Maximum Response Time | Maximum latency of model invocations through AI Function |
| AI Model Service Inbound Traffic | Inbound traffic from model invocations through AI Function |
| AI Model Service Outbound Traffic | Outbound traffic from model invocations through AI Function |
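The three QPS metrics above are related: successful and failed invocation QPS together make up the total invocation QPS, so a success rate can be derived from any two of them. The sketch below shows that arithmetic; the function name and the convention of treating zero traffic as 100% success are assumptions for illustration.

```python
# Hypothetical sketch: derive a success rate from the Successful and
# Failed Invocation QPS metrics (their sum is the total invocation QPS).
def success_rate(success_qps, failed_qps):
    total = success_qps + failed_qps
    # Convention chosen here: no traffic counts as fully successful.
    return 1.0 if total == 0 else success_qps / total

print(success_rate(98.0, 2.0))  # 98 successful of 100 total -> 0.98
```

A rate derived this way is often a better alerting signal than the raw Failed Invocation QPS, since it stays meaningful as overall traffic rises and falls.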