This topic describes the common large language model (LLM) metrics that you can use to customize Grafana dashboards.
Common labels
Dimension description | Dimension key | Example |
Service name | service | llm-rag-demo |
Service PID | pid | ggxw4lnjuz@0cb8619bb54**** |
Server IP address | serverIp | 127.0.0.1 |
Interface | rpc | query |
Application source | source | |
Request metrics
The request metrics cover the protocols and invocation types that the instrumentation supports, such as provided services and dependent services. For more information, see Application monitoring metrics.
Metric description | Metric name | Measurement | Collection interval (Unit: seconds) | Unit | Dimension |
Total number of requests | arms_$callType_requests_count | Gauge | 15 | None | Different dimensions are applicable to different service access types. For more information, see Application monitoring metrics. |
Number of error requests | arms_$callType_requests_error_count | Gauge | 15 | None | |
Total request duration | arms_$callType_requests_seconds | Gauge | 15 | Seconds | |
Number of slow requests | arms_$callType_requests_slow_count | Gauge | 15 | None | |
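The `$callType` placeholder in the metric names above stands for a concrete call type. As a minimal sketch, the following Python helper expands the placeholder and derives an error rate and an average duration from the values of one collection interval. The call type `http` and the sample numbers are assumptions for illustration only; see Application monitoring metrics for the call types that actually apply to your service.

```python
from string import Template

# Metric name templates copied from the request metrics table.
REQUEST_METRICS = {
    "count": Template("arms_${callType}_requests_count"),
    "errors": Template("arms_${callType}_requests_error_count"),
    "seconds": Template("arms_${callType}_requests_seconds"),
    "slow": Template("arms_${callType}_requests_slow_count"),
}


def metric_names(call_type: str) -> dict:
    """Expand $callType into concrete metric names."""
    return {
        key: tpl.substitute(callType=call_type)
        for key, tpl in REQUEST_METRICS.items()
    }


def summarize(count: float, errors: float, seconds: float) -> dict:
    """Derive the error rate and average duration for one interval."""
    if count == 0:
        # No requests in the interval: avoid dividing by zero.
        return {"error_rate": 0.0, "avg_seconds": 0.0}
    return {"error_rate": errors / count, "avg_seconds": seconds / count}


names = metric_names("http")  # "http" is an assumed call type
print(names["count"])

stats = summarize(count=200, errors=5, seconds=30.0)  # made-up sample values
print(stats)
```

Dividing the total request duration by the total number of requests gives an average duration per request, which is often more useful on a dashboard panel than either raw value.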
LLM metrics
In addition to the common labels, the following labels may also be used: modelName, spanKind, usageType.
Dimension description | Dimension key | Example | Remarks |
Model name | modelName | | None |
Operation type | spanKind | LLM, CHAIN, or EMBEDDING. For more information, see Trace fields for LLM applications. | None |
Usage type | usageType | | Available only for token-related metrics |
Operation types
Metric description | Metric name | Measurement | Collection interval (Unit: minutes) | Unit | Dimension |
Number of requests for invoking an LLM | genai_calls_count | Gauge | 1 | None | |
Response duration for invoking an LLM | genai_calls_duration_seconds | Gauge | 1 | Seconds | |
Number of LLM invocation errors | genai_calls_error_count | Gauge | 1 | None | |
Number of slow LLM invocations | genai_calls_slow_count | Gauge | 1 | None | |
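To filter these metrics by the labels described earlier, a dashboard query needs a label selector. The sketch below builds a PromQL-style selector in Python and wraps it in a per-model sum. It assumes that the metrics are queried through a Prometheus-compatible data source; the helper names `selector` and `sum_by` and the model name `example-model` are hypothetical.

```python
def selector(metric: str, **labels: str) -> str:
    """Build a PromQL selector such as metric{key="value"}."""
    if not labels:
        return metric
    parts = []
    for key, value in sorted(labels.items()):
        # Escape backslashes and double quotes inside label values.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{key}="{escaped}"')
    return metric + "{" + ",".join(parts) + "}"


def sum_by(label: str, expr: str) -> str:
    """Wrap an expression in a PromQL sum aggregation grouped by one label."""
    return f"sum by ({label}) ({expr})"


calls = selector(
    "genai_calls_count",
    service="llm-rag-demo",     # example value from the common labels table
    spanKind="LLM",
    modelName="example-model",  # hypothetical model name
)
print(calls)
print(sum_by("modelName", calls))
```

Sorting the labels keeps the generated query text stable, which makes it easier to compare or deduplicate panel definitions.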
LLM performance
Metric description | Metric name | Measurement | Collection interval (Unit: minutes) | Unit | Dimension |
Time to first token (TTFT) for an LLM | genai_llm_first_token_seconds | Gauge | 1 | Seconds | |
LLM usage
Metric description | Metric name | Measurement | Collection interval (Unit: minutes) | Unit | Dimension |
Number of tokens used | genai_llm_usage_tokens | Gauge | 1 | None | |
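Because the usageType label applies only to token-related metrics, a common use is to break down token consumption by usage type. The following sketch sums hypothetical genai_llm_usage_tokens data points per label value. The usageType values `input` and `output`, the model name, and the token counts are all assumptions for illustration, not values taken from this topic.

```python
from collections import defaultdict

# Each sample mimics one collected data point: (labels, value).
# The label values and token counts below are hypothetical.
samples = [
    ({"modelName": "example-model", "usageType": "input"}, 1200),
    ({"modelName": "example-model", "usageType": "output"}, 300),
    ({"modelName": "example-model", "usageType": "input"}, 800),
]


def tokens_by_label(points, label):
    """Sum token counts per distinct value of the given label."""
    totals = defaultdict(int)
    for labels, value in points:
        # Data points without the label are grouped under "unknown".
        totals[labels.get(label, "unknown")] += value
    return dict(totals)


totals = tokens_by_label(samples, "usageType")
print(totals)
```

The same function can group by modelName instead, which gives a per-model view of token consumption.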