Hologres provides metrics for monitoring AI nodes (GPU) and AI model services. Use them to track resource usage, invocation volume, and latency, and to configure alerts based on the data.
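As an illustration of the kind of alert rule you might configure over these metrics, the sketch below fires only when a utilization metric stays above a threshold for several consecutive sampling points, which avoids alerting on brief spikes. This is a minimal, hypothetical example; the function name, threshold, and sample format are assumptions, not a Hologres or CloudMonitor API.

```python
# Hypothetical alert-rule sketch: trigger when a usage metric (e.g. AI
# Resource Group GPU Usage, in percent) stays at or above `threshold`
# for `consecutive` sampling points in a row. Not a Hologres API.
def should_alert(samples, threshold=85.0, consecutive=3):
    """samples: list of utilization percentages, oldest first."""
    run = 0
    for value in samples:
        run = run + 1 if value >= threshold else 0
        if run >= consecutive:
            return True
    return False

print(should_alert([40, 90, 92, 95]))  # sustained high usage -> True
print(should_alert([40, 90, 70, 95]))  # isolated spikes only -> False
```

Requiring several consecutive breaches is a common way to trade alert latency for fewer false positives; the threshold and window should match your own capacity planning.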
Metric categories
Two categories of metrics are available:
- AI Resource Metrics — Monitor overall AI node usage.
- AI Model Metrics — Monitor resource usage and invocation statistics for each deployed model. Use these metrics to evaluate whether each model uses its allocated resources efficiently.
For information about deploying models, see AI Models and Deployment.
AI Resource metrics
These metrics are supported on General-purpose, Compute group type, and Primary/standby instances.
Data appears in this category only after you purchase AI resources and start using them. For details, see AI Resource Pricing and Purchase.
| Metric | Description |
|---|---|
| AI Resource Group CPU Usage | Total CPU usage of the AI node |
| AI Resource Group Memory Usage | Total memory usage of the AI node |
| AI Resource Group GPU Usage | Total GPU usage of the AI node |
AI Model metrics
Data appears in this category only after you deploy a model and invoke it through AI Function. See AI Models and Deployment for deployment details and AI Function for invocation details.
Resource usage
These metrics show how much of the AI node's resources a specific model consumes.
| Metric | Description |
|---|---|
| AI Model Service CPU Usage | CPU usage for this model |
| AI Model Service Memory Usage | Memory usage for this model |
| AI Model Service GPU Usage | GPU usage for this model |
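To judge whether a model service uses its share of the AI node efficiently, you can relate a per-model metric from the table above to the corresponding node-level total from the AI Resource metrics. The helper below is a hypothetical sketch of that comparison; the function name and sample values are illustrative only.

```python
# Hypothetical sketch: express one model service's GPU usage as a
# fraction of the AI node's total GPU usage. The same arithmetic
# applies to the CPU and memory metric pairs.
def usage_share(model_usage, node_usage):
    """Both arguments in the same unit (e.g. percent or cores)."""
    if node_usage <= 0:
        raise ValueError("node usage must be positive")
    return model_usage / node_usage

# Example: a model consuming 30 units of an 75-unit node total.
share = usage_share(model_usage=30.0, node_usage=75.0)
print(f"{share:.0%}")
```

A model whose share is persistently far below its allocation may be over-provisioned; one near 100% of the node may need more resources.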
Invocation statistics
These metrics show the volume (measured in queries per second, QPS), success rate, latency, and traffic of invocations made through AI Function for a specific model.
| Metric | Description |
|---|---|
| AI Model Service Invocation QPS | QPS of model invocations through AI Function |
| AI Model Service Successful Invocation QPS | Successful QPS of model invocations through AI Function |
| AI Model Service Failed Invocation QPS | Failed QPS of model invocations through AI Function |
| AI Model Service Average Response Time | Average latency of model invocations through AI Function |
| AI Model Service Maximum Response Time | Maximum latency of model invocations through AI Function |
| AI Model Service Inbound Traffic | Inbound traffic from model invocations through AI Function |
| AI Model Service Outbound Traffic | Outbound traffic from model invocations through AI Function |
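The three QPS metrics above are related: successful and failed invocation QPS together make up the total invocation QPS, so a success rate can be derived from any two of them. The sketch below shows that arithmetic; the function name and the convention of treating zero traffic as 100% success are assumptions for illustration.

```python
# Hypothetical sketch: derive a success rate from the Successful and
# Failed Invocation QPS metrics (their sum is the total invocation QPS).
def success_rate(success_qps, failed_qps):
    total = success_qps + failed_qps
    # Convention chosen here: no traffic counts as fully successful.
    return 1.0 if total == 0 else success_qps / total

print(success_rate(98.0, 2.0))  # 98 successful of 100 total -> 0.98
```

A rate derived this way is often a better alerting signal than the raw Failed Invocation QPS, since it stays meaningful as overall traffic rises and falls.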