This topic describes the metrics supported by kube-apiserver, provides usage notes for the dashboards of kube-apiserver, and suggests how to troubleshoot common metric anomalies.
Metrics
Metrics can indicate the status and parameter settings of a component. The following table describes the metrics supported by kube-apiserver.
Metric | Type | Description |
apiserver_request_duration_seconds_bucket | Histogram | The latency between a request sent from a client and a response returned by kube-apiserver. This metric displays the response latency of kube-apiserver when handling different types of requests. Requests are classified based on verbs, groups, versions, resources, subresources, scopes, components, and clients. Histogram buckets: |
apiserver_request_total | Counter | The numbers of different types of requests received by kube-apiserver. Requests are classified based on verbs, groups, versions, resources, scopes, components, HTTP content types, HTTP codes, and clients. |
apiserver_request_no_resourceversion_list_total | Counter | The number of LIST requests that do not include the ResourceVersion parameter. Requests are classified based on groups, versions, resources, scopes, and clients. This metric is used to check whether an excessive number of LIST requests of the quorum read type are sent to kube-apiserver. This can help optimize client behavior. |
apiserver_current_inflight_requests | Gauge | The number of requests that are being processed by kube-apiserver. The requests are classified into ReadOnly and Mutating requests. |
apiserver_dropped_requests_total | Counter | The number of requests dropped due to throttling. A request is dropped if the HTTP status code 429 (Too Many Requests) is returned. |
apiserver_admission_controller_admission_duration_seconds_bucket | Histogram | The admission controller latency. The histogram is identified by the admission controller name, operation (CREATE, UPDATE, or CONNECT), API resource, operation type (validate or admit), and request denial (true or false). Buckets: 0.005, 0.025, 0.1, 0.5, and 2.5. Unit: seconds. |
apiserver_admission_webhook_admission_duration_seconds_bucket | Histogram | The admission webhook latency. The histogram is identified by the admission webhook name, operation (CREATE, UPDATE, or CONNECT), API resource, operation type (validate or admit), and request denial (true or false). Buckets: 0.005, 0.025, 0.1, 0.5, and 2.5. Unit: seconds. |
apiserver_admission_webhook_admission_duration_seconds_count | Counter | The number of requests processed by the admission webhook. The histogram is identified by the admission controller name, operation (CREATE, UPDATE, or CONNECT), API resource, operation type (validate or admit), and request denial (true or false). |
cpu_utilization_core | Gauge | The CPU usage. Unit: vCores. |
cpu_utilization_ratio | Gauge | CPU utilization = Number of used vCores/Total number of vCores. Unit: %. |
memory_utilization_byte | Gauge | The memory usage. Unit: bytes. |
memory_utilization_ratio | Gauge | Memory utilization = Amount of used memory/Total amount of memory. Unit: %. |
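up | Gauge | The availability of kube-apiserver. A value of 1 indicates that kube-apiserver is available, and a value of 0 indicates that it is unavailable. |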
Usage notes for dashboards
Dashboards are generated based on metrics and Prometheus Query Language (PromQL). The following sections describe the kube-apiserver dashboards for key metrics, cluster-level summary, resource analysis, QPS and latency, admission controller and webhook, and client summary.
In most cases, these dashboards are used in the following sequence:
View the key metrics dashboards to quickly check cluster performance statistics. For more information, see Key metrics.
View the cluster-level summary dashboards to analyze the response latency of kube-apiserver, the number of requests that are being processed by kube-apiserver, and whether request throttling is triggered. For more information, see Cluster-level summary.
View the resource analysis dashboards to check the resource usage of the managed components. For more information, see Resource analysis.
View the QPS and latency dashboards to analyze the QPS and response time in multiple dimensions. For more information, see QPS and latency.
View the Admission controller and webhook dashboards to analyze the QPS and response time of the admission controller and webhook. For more information, see Admission controller and webhook.
View the client summary dashboards to analyze the client QPS in multiple dimensions. For more information, see Client summary.
Filters
Multiple filters are displayed above the dashboards. You can use the following filters to filter requests sent to kube-apiserver based on verbs and resources, modify the quantile, and change the PromQL sampling interval.
To filter requests by verb or resource, use the verb or resource filter. To change the quantile, use the quantile filter. For example, if you select 0.9, the latency dashboards display the P90 value, which is the value that 90% of the samples of a metric do not exceed. A value of 0.9 (P90) reduces the impact of long-tail samples, which make up only a small portion of all samples. A value of 0.99 (P99) reflects the long-tail samples.
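The effect of the quantile filter can be illustrated with a short Python sketch that estimates a quantile from cumulative histogram buckets by interpolating linearly inside the bucket that contains the target rank, which is the approach Prometheus documents for histogram_quantile. The function name, bucket bounds, and counts are made up for illustration and are not taken from kube-apiserver.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets is a list of (upper_bound_seconds, cumulative_count) pairs
    sorted by upper bound and ending with the +Inf bucket.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # The rank falls in the +Inf bucket (or an empty one):
                # report the highest finite bound seen so far.
                return prev_bound
            # Interpolate linearly inside the bucket that holds the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Illustrative latency histogram: most requests are fast, a few are slow.
sample_buckets = [(0.1, 50), (0.5, 80), (1.0, 95), (5.0, 99), (math.inf, 100)]
p90 = estimate_quantile(0.9, sample_buckets)
p99 = estimate_quantile(0.99, sample_buckets)
print(f"P90={p90:.3f}s P99={p99:.3f}s")
```

With these sample buckets, P90 stays below 1 second while P99 is pulled up to the 5-second bucket by the few slow requests, which is why P99 is the more sensitive choice when you investigate long-tail latency.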
The following filters are used to set the time period and update interval.
Key metrics
Observability
Feature
Metric
PromQL
Description
API QPS
sum(irate(apiserver_request_total[$interval]))
The QPS of kube-apiserver.
Read Request Success Rate
sum(irate(apiserver_request_total{code=~"20.*",verb=~"GET|LIST"}[$interval]))/sum(irate(apiserver_request_total{verb=~"GET|LIST"}[$interval]))
The success rate of read requests sent to kube-apiserver.
Write request success rate
sum(irate(apiserver_request_total{code=~"20.*",verb!~"GET|LIST|WATCH|CONNECT"}[$interval]))/sum(irate(apiserver_request_total{verb!~"GET|LIST|WATCH|CONNECT"}[$interval]))
The success rate of write requests sent to kube-apiserver.
Number of read requests processed
sum(apiserver_current_inflight_requests{request_kind="readOnly"})
The number of read requests that are being processed by kube-apiserver.
Number of write requests processed
sum(apiserver_current_inflight_requests{request_kind="mutating"})
The number of write requests that are being processed by kube-apiserver.
Request Limit Rate
sum(irate(apiserver_dropped_requests_total[$interval]))
The number of requests dropped per second.
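The success-rate expressions above divide the per-second rate of requests with 20x status codes by the per-second rate of all matching requests. The following Python sketch reproduces that arithmetic for read requests on hand-written counter samples. It simplifies irate to the difference between the last two samples of a counter, and the sample data and function names are hypothetical.

```python
import re


def irate(samples):
    """Per-second rate from the last two (timestamp, value) counter samples."""
    if len(samples) < 2:
        return 0.0  # simplification: Prometheus returns no result here
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    if t1 <= t0:
        return 0.0
    delta = v1 - v0
    if delta < 0:
        # The counter was reset, so count from zero.
        delta = v1
    return delta / (t1 - t0)


def read_success_rate(series):
    """Share of GET and LIST requests answered with a 20x status code.

    series maps (verb, code) label pairs to lists of counter samples.
    """
    ok = total = 0.0
    for (verb, code), samples in series.items():
        if verb not in ("GET", "LIST"):
            continue
        rate = irate(samples)
        total += rate
        # PromQL regex matchers are fully anchored, hence fullmatch.
        if re.fullmatch(r"20.*", code):
            ok += rate
    return ok / total if total else None


# Hypothetical samples taken 15 seconds apart.
sample_series = {
    ("GET", "200"): [(0, 1000), (15, 1270)],
    ("LIST", "200"): [(0, 200), (15, 230)],
    ("GET", "404"): [(0, 10), (15, 25)],
    ("LIST", "429"): [(0, 0), (15, 15)],
    ("POST", "201"): [(0, 50), (15, 80)],  # a write request, ignored here
}
print(read_success_rate(sample_series))
```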
Cluster-level summary
Observability
Feature
Metric
PromQL
Description
GET read request delay P[0.9]
histogram_quantile($quantile, sum(irate(apiserver_request_duration_seconds_bucket{verb="GET",resource!="",subresource!~"log|proxy"}[$interval])) by (pod, verb, resource, subresource, scope, le))
The response latency to GET requests based on the following dimensions: API server pods, verbs (GET), resources (such as ConfigMaps, pods, and leases), and scopes (such as scopes to namespaces or clusters).
LIST read request delay P[0.9]
histogram_quantile($quantile, sum(irate(apiserver_request_duration_seconds_bucket{verb="LIST"}[$interval])) by (pod_name, verb, resource, scope, le))
The response latency to LIST requests based on the following dimensions: API server pods, verbs (LIST), resources (such as ConfigMaps, pods, and leases), and scopes (such as scopes to namespaces or clusters).
Write request delay P[0.9]
histogram_quantile($quantile, sum(irate(apiserver_request_duration_seconds_bucket{verb!~"GET|WATCH|LIST|CONNECT"}[$interval])) by (cluster, pod_name, verb, resource, scope, le))
The response latency to mutating requests based on the following dimensions: API server pods, verbs (such as POST, PUT, PATCH, and DELETE), resources (such as ConfigMaps, pods, and leases), and scopes (such as scopes to namespaces or clusters).
Number of read requests processed
apiserver_current_inflight_requests{request_kind="readOnly"}
The number of read requests that are being processed by kube-apiserver.
Number of write requests processed
apiserver_current_inflight_requests{request_kind="mutating"}
The number of write requests that are being processed by kube-apiserver.
Request Limit Rate
sum(irate(apiserver_dropped_requests_total{request_kind="readOnly"}[$interval])) by (name)
sum(irate(apiserver_dropped_requests_total{request_kind="mutating"}[$interval])) by (name)
Whether kube-apiserver triggers request throttling. No data or 0 indicates that request throttling is not triggered.
Resource analysis
Observability
Feature
Metric
PromQL
Description
Memory Usage
memory_utilization_byte{container="kube-apiserver"}
The memory usage of kube-apiserver. Unit: bytes.
CPU Usage
cpu_utilization_core{container="kube-apiserver"}*1000
The CPU usage of kube-apiserver. Unit: millicores.
Memory Usage
memory_utilization_ratio{container="kube-apiserver"}
The memory utilization of kube-apiserver. Unit: %.
CPU Usage
cpu_utilization_ratio{container="kube-apiserver"}
The CPU utilization of kube-apiserver. Unit: %.
Number of resource objects
max by(resource)(apiserver_storage_objects)
max by(resource)(etcd_object_counts)
The number of each type of resource object that is managed by Kubernetes. The metric name varies based on the Kubernetes version used by your ACK cluster:
If your ACK cluster uses Kubernetes 1.22 or later, the metric name is apiserver_storage_objects.
If your ACK cluster uses a Kubernetes version earlier than 1.22, the metric name is etcd_object_counts.
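If you build the object-count query in a script, the version rule can be expressed as a small helper. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that Kubernetes 1.22 itself reports apiserver_storage_objects.

```python
def object_count_metric(version):
    """Return the object-count metric name for a version such as '1.24.6'."""
    # Keep only the major and minor parts, for example '1' and '24'.
    major, minor = (int(part) for part in version.lstrip("v").split(".")[:2])
    if (major, minor) >= (1, 22):
        return "apiserver_storage_objects"
    return "etcd_object_counts"


def object_count_query(version):
    """Build the PromQL expression used by the resource object panel."""
    return f"max by(resource)({object_count_metric(version)})"


print(object_count_query("1.24.6"))
print(object_count_query("v1.20.11"))
```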
QPS and latency
Observability
Feature
Metric
PromQL
Description
Analyze QPS [All] P[0.9] by Verb dimension
sum(irate(apiserver_request_total{verb=~"$verb"}[$interval]))by(verb)
The QPS calculated based on verbs.
Analyze QPS [All] P[0.9] by Verb Resource dimension
sum(irate(apiserver_request_total{verb=~"$verb",resource=~"$resource"}[$interval]))by(verb,resource)
The QPS calculated based on verbs and resources.
Analyze request latency by Verb dimension [All] P[0.9]
histogram_quantile($quantile, sum(irate(apiserver_request_duration_seconds_bucket{verb=~"$verb", verb!~"WATCH|CONNECT",resource!=""}[$interval])) by (le,verb))
The response latency calculated based on verbs.
Analyze request latency by Verb Resource dimension [All] P[0.9]
histogram_quantile($quantile, sum(irate(apiserver_request_duration_seconds_bucket{verb=~"$verb", verb!~"WATCH|CONNECT", resource=~"$resource",resource!=""}[$interval])) by (le,verb,resource))
The response latency calculated based on verbs and resources.
Read request QPS [5m] for non-2xx return values
sum(irate(apiserver_request_total{verb=~"GET|LIST",resource=~"$resource",code!~"2.*"}[$interval])) by (verb,resource,code)
The QPS of read requests that are answered with status codes other than 2xx.
QPS [5m] for write requests with non-2xx return values
sum(irate(apiserver_request_total{verb!~"GET|LIST|WATCH",verb=~"$verb",resource=~"$resource",code!~"2.*"}[$interval])) by (verb,resource,code)
The QPS of write requests that are answered with status codes other than 2xx.
Apiserver to Etcd request latency [5m]
histogram_quantile($quantile, sum(irate(etcd_request_duration_seconds_bucket[$interval])) by (le,operation,type,instance))
The latency of requests from kube-apiserver to etcd.
Admission controller and webhook
Observability
Feature
Metric
PromQL
Description
Admission controller delay [admit]
histogram_quantile($quantile, sum by(operation, name, le, type, rejected) (irate(apiserver_admission_controller_admission_duration_seconds_bucket{type="admit"}[$interval])) )
Statistics about the admit type admission controller, the operations performed, whether the operations are denied, and the duration of the operations.
Buckets: 0.005, 0.025, 0.1, 0.5, and 2.5. Unit: seconds.
Admission Controller Delay [validate]
histogram_quantile($quantile, sum by(operation, name, le, type, rejected) (irate(apiserver_admission_controller_admission_duration_seconds_bucket{type="validate"}[$interval])) )
Statistics about the validate type admission controller, the operations performed, whether the operations are denied, and the duration of the operations.
Buckets: 0.005, 0.025, 0.1, 0.5, and 2.5. Unit: seconds.
Admission Webhook delay [admit]
histogram_quantile($quantile, sum by(operation, name, le, type, rejected) (irate(apiserver_admission_webhook_admission_duration_seconds_bucket{type="admit"}[$interval])) )
Statistics about the admit type admission webhook, the operations performed, whether the operations are denied, and the duration of the operations.
Buckets: 0.005, 0.025, 0.1, 0.5, and 2.5. Unit: seconds.
Admission Webhook Delay [validating]
histogram_quantile($quantile, sum by(operation, name, le, type, rejected) (irate(apiserver_admission_webhook_admission_duration_seconds_bucket{type="validating"}[$interval])) )
Statistics about the validate type admission webhook, the operations performed, whether the operations are denied, and the duration of the operations.
Buckets: 0.005, 0.025, 0.1, 0.5, and 2.5. Unit: seconds.
Admission Webhook Request QPS
sum(irate(apiserver_admission_webhook_admission_duration_seconds_count[$interval]))by(name,operation,type,rejected)
The QPS of the admission webhook.
Client summary
Observability
Feature
Metric
PromQL
Description
Analyze QPS by Client dimension
sum(irate(apiserver_request_total{client!=""}[$interval])) by (client)
The QPS statistics based on clients. This can help you analyze the clients that access kube-apiserver and the relevant QPS values.
Analyze QPS by Verb Resource Client dimension
sum(irate(apiserver_request_total{client!="",verb=~"$verb", resource=~"$resource"}[$interval]))by(verb,resource,client)
The QPS statistics based on verbs, resources, and clients.
Analyze LIST request QPS by Verb Resource Client dimension (no resourceVersion field)
sum(irate(apiserver_request_no_resourceversion_list_total[$interval]))by(resource,client)
The QPS of LIST requests that do not specify the resourceVersion field, grouped by resource and client.
You can analyze and optimize the LIST operations performed by clients based on the LIST requests sent to kube-apiserver and the LIST requests that retrieve data from etcd.
Common metric anomalies
Success rate of read/write requests
Normal | Abnormal | Description | Suggestion |
The values of Read Request Success Rate and Write request success rate are close to 100%. | The values of Read Request Success Rate and Write request success rate are low. For example, the success rates are lower than 90%. | A large number of requests are answered with status codes other than 200. | |
Latency of GET/LIST requests and latency of write requests
Normal | Abnormal | Description | Suggestion |
The values of GET read request delay P[0.9], LIST read request delay P[0.9], and Write request delay P[0.9] vary based on the amount of cluster resources and the cluster size. Therefore, no specific thresholds can be used to identify anomalies. All cases are acceptable if your workloads are not adversely affected. For example, if the number of clients that access a specific type of resource increases, the latency of LIST requests increases. In most cases, GET read request delay P[0.9] and Write request delay P[0.9] are shorter than 1 second, and LIST read request delay P[0.9] is shorter than 5 seconds. | | Check whether the response latency increases because of the admission webhook or the increase in clients that access the resources. | |
Number of in-flight read/write requests and dropped requests
Normal | Abnormal | Description | Suggestion |
In most cases, if the values of Number of read requests processed and Number of write requests processed are less than 100 and Request Limit Rate is 0, no anomaly occurs. | | The request queue is full. Check whether the issue is caused by temporary request spikes or the admission webhook. If the number of pending requests exceeds the length of the queue, kube-apiserver triggers request throttling and Request Limit Rate exceeds 0. As a result, the stability of the cluster is affected. | |
Memory/CPU usage
Normal | Abnormal | Description | Suggestion |
The value of Memory Usage is lower than 80% and the value of CPU Usage is lower than 90%. | The values of Memory Usage and CPU Usage exceed 90%. | | |
Admission webhook latency
Normal | Abnormal | Description | Suggestion |
The value of Admission Webhook Delay is shorter than 0.5 seconds. | The value of Admission Webhook Delay remains above 0.5 seconds. | If the admission webhook cannot respond promptly, the response latency of kube-apiserver increases. Check whether the admission webhook works as expected. | Analyze the admission webhook log and check whether the admission webhook works as expected. If you no longer need the admission webhook, uninstall it. |
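The thresholds in the preceding tables can be collected into a simple check. The following Python sketch is one possible way to flag readings that fall in the abnormal ranges. The reading names are hypothetical, the thresholds are the example values given in this topic, and the check looks at a single reading rather than a sustained trend.

```python
# Hypothetical reading names mapped to (is_abnormal, message) rules.
# Thresholds are the example values from the anomaly tables in this topic.
RULES = {
    "read_success_rate": (lambda v: v < 0.90,
                          "read request success rate is below 90%"),
    "write_success_rate": (lambda v: v < 0.90,
                           "write request success rate is below 90%"),
    "dropped_requests_per_second": (lambda v: v > 0,
                                    "request throttling is triggered"),
    "memory_utilization_percent": (lambda v: v > 90,
                                   "memory usage exceeds 90%"),
    "cpu_utilization_percent": (lambda v: v > 90,
                                "CPU usage exceeds 90%"),
    "webhook_latency_seconds": (lambda v: v > 0.5,
                                "admission webhook latency is above 0.5 seconds"),
}


def find_anomalies(readings):
    """Return a message for each reading that falls in an abnormal range."""
    findings = []
    for name, (is_abnormal, message) in RULES.items():
        value = readings.get(name)
        if value is not None and is_abnormal(value):
            findings.append(message)
    return findings


# Illustrative readings, not real cluster data.
sample_readings = {
    "read_success_rate": 0.97,
    "write_success_rate": 0.85,
    "dropped_requests_per_second": 0.0,
    "webhook_latency_seconds": 0.8,
}
for finding in find_anomalies(sample_readings):
    print(finding)
```

Request latency is deliberately left out of the rules because, as noted above, no fixed threshold applies to it across clusters of different sizes.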