This topic describes the basic metrics that Managed Service for Prometheus provides for container clusters.
Managed Service for Prometheus charges fees based on the amount of written observable data or the number of data reports. The metrics are classified into two types: basic metrics and custom metrics. Custom metrics refer to non-basic metrics. Basic metrics are free of charge. You are charged for custom metrics starting from January 6, 2020.
Managed Service for Prometheus will change the basic metrics provided for container clusters from 00:00:00 on November 12, 2024 (UTC+8). The following tables describe the new basic metrics.
By default, only the metrics listed in this topic are collected as basic metrics for container clusters.
Metrics that fall outside the scope of this topic are considered custom metrics and are subject to charges. For more information, see Billing overview.
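The split between basic and custom metrics described above amounts to a membership check against the lists in this topic. The following Python sketch illustrates the idea with a small excerpt of the listed names. The `BASIC_METRICS` mapping and the `classify_metric` helper are hypothetical, keying the check by job name is an assumption based on how this topic groups metrics, and actual charges are determined by the service rather than by a check like this one.

```python
# Excerpt of the basic metrics listed in this topic, keyed by job name.
# Illustrative subset only; it is not the full list.
BASIC_METRICS = {
    "_arms/kubelet/cadvisor": {
        "container_cpu_usage_seconds_total",
        "container_memory_working_set_bytes",
        "DCGM_FI_DEV_GPU_UTIL",
        "up",
    },
    "apiserver": {
        "apiserver_request_total",
        "apiserver_current_inflight_requests",
    },
}


def classify_metric(job: str, name: str) -> str:
    """Return "basic" if the metric is listed for the job, otherwise "custom"."""
    return "basic" if name in BASIC_METRICS.get(job, set()) else "custom"


if __name__ == "__main__":
    samples = [
        ("_arms/kubelet/cadvisor", "container_cpu_usage_seconds_total"),
        ("apiserver", "apiserver_request_total"),
        ("apiserver", "my_app_orders_total"),  # hypothetical custom metric
    ]
    for job, name in samples:
        print(f"{job} {name}: {classify_metric(job, name)}")
```

A check of this kind can help estimate which series in a scrape configuration may be billed as custom metrics before you enable them.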
cAdvisor (job name: _arms/kubelet/cadvisor)
Metric | Description |
container_cpu_usage_seconds_total | The total CPU time consumed by the container in seconds. |
container_fs_usage_bytes | The number of bytes used by the container file system. |
container_memory_cache | The memory cache size of the container in bytes. |
container_memory_usage_bytes | The amount of memory used by the container in bytes. |
container_memory_working_set_bytes | The memory working set size (WSS) of the container in bytes. |
container_network_receive_bytes_total | The total network traffic received by the container in bytes. |
container_network_transmit_bytes_total | The total network traffic transmitted by the container in bytes. |
container_scrape_error | The number of container metric scraping errors. |
DCGM_CUSTOM_CONTAINER_CP_ALLOCATED | The ratio of the GPU computing power allocated to the container to the total computing power of the GPU. The value ranges from 0 to 1. In exclusive GPU mode or in shared GPU mode in which the container requests only GPU memory, the value of this metric is 0, which indicates that the allocation of GPU computing power is unlimited. For example, if a GPU provides a total of 100 compute units (CUs) of GPU computing power and allocates 30 CUs to a container, the ratio of the GPU computing power allocated to the container is calculated by using the following formula: 30/100 = 0.3. |
DCGM_CUSTOM_CONTAINER_MEM_ALLOCATED | The amount of GPU memory allocated to the container. |
DCGM_CUSTOM_DEV_FB_ALLOCATED | The ratio of the allocated GPU memory to the total memory of the GPU. The value ranges from 0 to 1. |
DCGM_CUSTOM_DEV_FB_TOTAL | The total memory of the GPU. |
DCGM_CUSTOM_DEV_HEALTH | The health status of the GPU. |
DCGM_CUSTOM_PROCESS_DECODE_UTIL | The decoder utilization of GPU threads. |
DCGM_CUSTOM_PROCESS_ENCODE_UTIL | The encoder utilization of GPU threads. |
DCGM_CUSTOM_PROCESS_MEM_COPY_UTIL | The memory copy utilization of GPU threads. |
DCGM_CUSTOM_PROCESS_MEM_USED | The amount of GPU memory used by GPU threads. |
DCGM_CUSTOM_PROCESS_SM_UTIL | The streaming multiprocessor (SM) utilization of GPU threads. |
DCGM_CUSTOM_PROF_MEM_BANDWIDTH_USED | The GPU memory bandwidth used. |
DCGM_CUSTOM_PROF_TENS_TFPS_USED | The tensor core utilization. |
DCGM_FI_DEV_DEC_UTIL | The decoder utilization. |
DCGM_FI_DEV_ENC_UTIL | The encoder utilization. |
DCGM_FI_DEV_FB_FREE | The amount of free frame buffer memory. |
DCGM_FI_DEV_FB_USED | The amount of used frame buffer memory. The value of this metric is the same as the value of Memory-Usage returned by the nvidia-smi command. |
DCGM_FI_DEV_GPU_TEMP | The GPU temperature. |
DCGM_FI_DEV_GPU_UTIL | The GPU utilization within a cycle of 1 second or 1/6 second. The cycle varies based on the GPU model. A cycle is a period of time during which one or more kernel functions remain active. This metric only indicates that one or more kernel functions are occupying GPU resources. The metric does not display detailed GPU usage information. |
DCGM_FI_DEV_MEM_CLOCK | The memory clock speed. |
DCGM_FI_DEV_MEM_COPY_UTIL | The memory bandwidth utilization. For example, the maximum memory bandwidth of NVIDIA V100 is 900 GB/s. If the memory bandwidth used is 450 GB/s, the memory bandwidth utilization is 50%. |
DCGM_FI_DEV_POWER_USAGE | The power usage. |
DCGM_FI_DEV_SM_CLOCK | The SM clock speed. |
DCGM_FI_DEV_TOTAL_ENERGY_CONSUMPTION | The total energy consumed since the driver was last loaded. |
DCGM_FI_DEV_XID_ERRORS | The last XID error that occurred within a period of time. |
DCGM_FI_PROF_DRAM_ACTIVE | The fraction of cycles during which the device memory interface is active sending or receiving data. The value is an average value within a time interval rather than an instantaneous value. A larger value of this metric indicates higher device memory utilization. If the value is 1 (100%), a DRAM command is executed every cycle within the entire interval. In practice, the peak value of the metric is approximately 0.8 (80%). If the value of this metric is 0.2 (20%), 20% of the cycles within the time interval are spent reading from or writing to device memory. |
DCGM_FI_PROF_NVLINK_RX_BYTES | The TX rate of NVLink and the RX rate of NVLink. The bytes transmitted or received exclude the header. The value is an average value within a time interval rather than an instantaneous value. For example, if 1 GB of data is transmitted within 1 second, the TX rate is 1 GB/s regardless of whether the transmission occurs at a consistent rate or in bursts. Theoretically, the maximum NVLink Gen2 bandwidth is 25 GB/s per direction per link. |
DCGM_FI_PROF_NVLINK_TX_BYTES | The total number of bytes sent through NVLink. |
DCGM_FI_PROF_PCIE_RX_BYTES | The TX rate of PCIe and the RX rate of PCIe. The bytes transmitted or received include both the header and payload. The value is an average value within a time interval rather than an instantaneous value. For example, if 1 GB of data is transmitted within 1 second, the TX rate is 1 GB/s regardless of whether the transmission occurs at a consistent rate or in bursts. Theoretically, the maximum PCIe Gen3 bandwidth is 985 MB/s per lane. |
DCGM_FI_PROF_PCIE_TX_BYTES | The TX rate of PCIe and the RX rate of PCIe. The bytes transmitted or received include both the header and payload. The value is an average value within a time interval rather than an instantaneous value. For example, if 1 GB of data is transmitted within 1 second, the TX rate is 1 GB/s regardless of whether the transmission occurs at a consistent rate or in bursts. Theoretically, the maximum PCIe Gen3 bandwidth is 985 MB/s per lane. |
DCGM_FI_PROF_PIPE_TENSOR_ACTIVE | The cycle fraction for the Tensor (HMMA/IMMA) pipe being in the Active state. The value is an average value within a time interval rather than an instantaneous value. A larger value of this metric indicates higher tensor core utilization. If the value is 1 (100%), a Tensor instruction is issued every cycle within the entire interval. One instruction completes in two cycles. If the value of this metric is 0.2 (20%), one of the following conditions may exist: The tensor core utilization of 20% of the SMs within the time interval is 100%. The tensor core utilization of all SMs within the time interval is 20%. The tensor core utilization of all SMs within 20% of the time interval is 100%. Other conditions. |
DCGM_FI_PROF_SM_ACTIVE | The ratio of cycles during which at least one warp on an SM remains active. The value is an average of all SMs. The value does not vary with the number of warps included in the thread block. When a warp is scheduled and resources are allocated to the warp, the warp is considered active. An active warp may be computing or in a non-computing state, for example, waiting for memory requests. If the value of this metric drops below 0.5, the GPU utilization is low. To ensure high GPU utilization, make sure that the value is greater than 0.8. Assume that a GPU has N SMs. If a kernel function that uses N thread blocks runs on all SMs throughout a time interval, the value of this metric is 1 (100%). If a kernel function that uses N/5 thread blocks runs throughout a time interval, the value of this metric is 0.2. If a kernel function that uses N thread blocks runs during 20% of the cycles within a time interval, the value of this metric is 0.2. |
machine_cpu_cores | The number of CPU cores on the machine. |
machine_memory_bytes | The machine memory in bytes. |
node_exporter_build_info | The build information about the node exporter. |
nvidia_gpu_duty_cycle | The percentage of time over the past sample period during which the NVIDIA GPU was occupied. |
nvidia_gpu_memory_total_bytes | The total memory of the NVIDIA GPU in bytes. |
nvidia_gpu_memory_used_bytes | The memory used by the NVIDIA GPU in bytes. |
nvidia_gpu_num_devices | The number of NVIDIA GPUs. |
nvidia_gpu_power_usage_milliwatts | The power consumption of the NVIDIA GPU in milliwatts. |
nvidia_gpu_temperature_celsius | The temperature of the NVIDIA GPU in °C. |
rdma_service_monitor_local_ack_timeout_err | The number of timeout errors that occurred in the remote direct memory access (RDMA) network. |
rdma_service_monitor_out_of_seq | The number of out-of-order packets in the RDMA network. |
rdma_service_monitor_packet_seq_err | The number of out-of-order packet errors in the RDMA network. |
rdma_service_monitor_rx_bytes | The throughput received over the RDMA network in bytes. |
rdma_service_monitor_rx_packets | The number of packets received over the RDMA network. |
rdma_service_monitor_tx_bytes | The throughput sent over the RDMA network in bytes. |
rdma_service_monitor_tx_packets | The number of packets sent over the RDMA network. |
up | The connectivity of metric collection. |
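The worked examples in the DCGM rows above (30/100 = 0.3 for DCGM_CUSTOM_CONTAINER_CP_ALLOCATED, 450 GB/s out of 900 GB/s = 50% for DCGM_FI_DEV_MEM_COPY_UTIL, and 1 GB transferred in 1 second for the NVLink and PCIe rates) can be reproduced with a minimal Python sketch. The function names are hypothetical and exist only to make the arithmetic explicit; they are not part of DCGM or Managed Service for Prometheus.

```python
def allocated_ratio(allocated_units: float, total_units: float) -> float:
    """Share of a GPU's computing power allocated to a container (0 to 1)."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return allocated_units / total_units


def bandwidth_utilization(used_gb_per_s: float, max_gb_per_s: float) -> float:
    """Memory bandwidth utilization as a percentage of the maximum bandwidth."""
    if max_gb_per_s <= 0:
        raise ValueError("max_gb_per_s must be positive")
    return used_gb_per_s / max_gb_per_s * 100


def average_rate(bytes_transferred: float, interval_seconds: float) -> float:
    """Average transfer rate over an interval, in bytes per second.

    The result is the same whether the data moved at a steady pace or in bursts,
    because only the total and the interval length are used.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return bytes_transferred / interval_seconds


if __name__ == "__main__":
    # 30 of 100 compute units allocated to a container.
    print(allocated_ratio(30, 100))
    # 450 GB/s used out of the 900 GB/s maximum of an NVIDIA V100.
    print(bandwidth_utilization(450, 900))
    # 1 GB (taken here as 10**9 bytes) transferred within 1 second.
    print(average_rate(10**9, 1.0))
```

Because the rate metrics are interval averages, a short burst and a steady stream that move the same number of bytes in the same interval produce the same value.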
ACK ControlPlane APIServer (control plane components of ACK Pro clusters: APIServer, etcd, Scheduler, Kube Controller Manager, and Cloud Controller Manager; control plane component of ACK dedicated clusters: APIServer) (job name: apiserver)
Metric | Description |
aggregator_discovery_aggregation_count_total | The count of discovery aggregations performed by the aggregator. |
aggregator_openapi_v2_regeneration_count | The number of regenerations based on OpenAPI 2.0. |
aggregator_openapi_v2_regeneration_duration | The amount of time consumed for regenerations based on OpenAPI 2.0. |
aggregator_unavailable_apiservice | The APIServices that are unavailable to the aggregator. |
aggregator_unavailable_apiservice_count | The count of APIServices that are unavailable to the aggregator. |
aggregator_unavailable_apiservice_total | The total number of APIServices that are unavailable to the aggregator. |
aliyun_prometheus_agent_append_duration_seconds | The time spent by the Prometheus agent appending data, in seconds. |
aliyun_prometheus_agent_job_discovery_status | The job status that is discovered by the Prometheus agent. |
aliyun_prometheus_agent_scrapes_by_target_total | The total number of target scrapes performed by the Prometheus agent. |
aliyun_prometheus_agent_target_info | The information about targets scraped by the Prometheus agent. |
apiextensions_apiserver_validation_ratcheting_seconds_bucket | The distribution of incremental time intervals for validation in seconds in the APIServer. |
apiextensions_apiserver_validation_ratcheting_seconds_count | The count of incremental time intervals for validation in seconds in the APIServer. |
apiextensions_apiserver_validation_ratcheting_seconds_sum | The sum of incremental time intervals for validation in seconds in the APIServer. |
apiextensions_openapi_v2_regeneration_count | The number of API extension regenerations based on OpenAPI 2.0. |
apiextensions_openapi_v3_regeneration_count | The number of API extension regenerations based on OpenAPI 3.0. |
apiserver_accepted_listall_requests_total | The total number of ListAll requests accepted by the APIServer. |
apiserver_admission_controller_admission_duration_seconds_bucket | The distribution of APIServer admission controller durations in seconds. |
apiserver_admission_controller_admission_duration_seconds_count | The count of APIServer admission controller durations in seconds. |
apiserver_admission_controller_admission_duration_seconds_sum | The sum of APIServer admission controller durations in seconds. |
apiserver_admission_step_admission_duration_seconds_bucket | The distribution of APIServer admission step durations in seconds. |
apiserver_admission_step_admission_duration_seconds_count | The count of APIServer admission step durations in seconds. |
apiserver_admission_step_admission_duration_seconds_sum | The sum of APIServer admission step durations in seconds. |
apiserver_admission_step_admission_duration_seconds_summary | The summary of APIServer admission step durations in seconds. |
apiserver_admission_step_admission_duration_seconds_summary_count | The summary count of APIServer admission step durations in seconds. |
apiserver_admission_step_admission_duration_seconds_summary_sum | The summary total of APIServer admission step durations in seconds. |
apiserver_admission_webhook_admission_duration_seconds_bucket | The distribution of APIServer admission webhook durations in seconds. |
apiserver_admission_webhook_admission_duration_seconds_count | The count of APIServer admission webhook durations in seconds. |
apiserver_admission_webhook_admission_duration_seconds_sum | The sum of APIServer admission webhook durations in seconds. |
apiserver_admission_webhook_fail_open_count | The count of times that the APIServer admission webhook is configured as fail open. |
apiserver_admission_webhook_rejection_count | The count of requests rejected by the APIServer admission webhook. |
apiserver_admission_webhook_request_total | The total number of requests to the APIServer admission webhook. |
apiserver_audit_error_total | The total number of APIServer audit errors. |
apiserver_audit_event_total | The total number of APIServer audit events. |
apiserver_audit_level_total | The total number of APIServer audit levels. |
apiserver_audit_requests_rejected_total | The total number of rejected APIServer requests. |
apiserver_authorization_decisions_total | The total number of authorization decisions made by the APIServer. |
apiserver_cache_list_fetched_objects_total | The total number of objects obtained by the APIServer cache list. |
apiserver_cache_list_returned_objects_total | The total number of objects returned by the APIServer cache list. |
apiserver_cache_list_total | The total number of operations performed by the APIServer cache list. |
apiserver_cacher_received_events | The number of events received by the APIServer cache. |
apiserver_cacher_sended_events_latency_milliseconds_bucket | The distribution of APIServer event sending latencies in milliseconds. |
apiserver_cacher_sended_events_latency_milliseconds_count | The count of APIServer event sending latencies in milliseconds. |
apiserver_cacher_sended_events_latency_milliseconds_sum | The total of APIServer event sending latencies in milliseconds. |
apiserver_cacher_watcher_channel_length | The watcher channel length of the APIServer cache. |
apiserver_cel_compilation_duration_seconds_bucket | The distribution of APIServer Common Expression Language (CEL) compilation latencies in seconds. |
apiserver_cel_compilation_duration_seconds_count | The count of APIServer CEL compilations. |
apiserver_cel_compilation_duration_seconds_sum | The total time consumed for APIServer CEL compilations in seconds. |
apiserver_cel_evaluation_duration_seconds_bucket | The distribution of APIServer CEL evaluation latencies in seconds. |
apiserver_cel_evaluation_duration_seconds_count | The count of APIServer CEL evaluations. |
apiserver_cel_evaluation_duration_seconds_sum | The total of APIServer CEL evaluation latencies in seconds. |
apiserver_client_certificate_expiration_seconds_bucket | The distribution of remaining seconds until APIServer client certificate expiration. |
apiserver_client_certificate_expiration_seconds_count | The count of remaining seconds until APIServer client certificate expiration. |
apiserver_client_certificate_expiration_seconds_sum | The total remaining seconds until APIServer client certificate expiration. |
apiserver_clusterip_repair_ip_errors_total | The total number of ClusterIP errors detected by the repair loop of the APIServer. |
apiserver_clusterip_repair_reconcile_errors_total | The total number of reconcile errors in the ClusterIP repair loop of the APIServer. |
apiserver_conversion_webhook_duration_seconds_bucket | The distribution of APIServer conversion webhook latencies in seconds. |
apiserver_conversion_webhook_duration_seconds_count | The count of APIServer conversion webhook calls. |
apiserver_conversion_webhook_duration_seconds_sum | The total of APIServer conversion webhook latencies in seconds. |
apiserver_conversion_webhook_request_total | The total number of APIServer conversion webhook requests. |
apiserver_crd_conversion_webhook_duration_seconds_bucket | The distribution of APIServer Custom Resource Definition (CRD) conversion webhook latencies in seconds. |
apiserver_crd_conversion_webhook_duration_seconds_count | The count of APIServer CRD conversion webhook calls. |
apiserver_crd_conversion_webhook_duration_seconds_sum | The total of APIServer CRD conversion webhook latencies in seconds. |
apiserver_crd_webhook_conversion_duration_seconds_bucket | The distribution of APIServer CRD webhook conversion latencies in seconds. |
apiserver_crd_webhook_conversion_duration_seconds_count | The count of APIServer CRD webhook conversions. |
apiserver_crd_webhook_conversion_duration_seconds_sum | The total of APIServer CRD webhook conversion latencies in seconds. |
apiserver_created_watchers | The number of watchers created by the APIServer. |
apiserver_current_inflight_requests | The number of requests that are being processed by the APIServer. |
apiserver_current_inqueue_requests | The maximum number of queued requests in the APIServer. |
apiserver_dropped_requests_total | The total number of requests dropped by the APIServer. |
apiserver_encryption_config_controller_automatic_reload_failures_total | The number of times that the encryption configuration controller of the APIServer failed to be automatically reloaded. |
apiserver_encryption_config_controller_automatic_reload_success_total | The number of times that the encryption configuration controller of the APIServer was automatically reloaded. |
apiserver_envelope_encryption_dek_cache_fill_percent | The percentage of APIServer envelope encryption Data Encryption Key (DEK) cache filled. |
apiserver_error_watchers | The number of watchers in the Error state in the APIServer. |
apiserver_flowcontrol_current_executing_requests | The number of requests being processed by APIServer rate limiting. |
apiserver_flowcontrol_current_executing_seats | The number of seats occupied by APIServer rate limiting. |
apiserver_flowcontrol_current_inqueue_requests | The number of requests pending in APIServer rate limiting (API Priority and Fairness, APF) queues. |
apiserver_flowcontrol_current_inqueue_seats | The number of seats pending in APIServer rate limiting queues. |
apiserver_flowcontrol_current_limit_seats | The number of seats limited by APIServer rate limiting. |
apiserver_flowcontrol_current_r | The current R value of APIServer rate limiting. |
apiserver_flowcontrol_demand_seats_average | The average number of seats requested by APIServer rate limiting. |
apiserver_flowcontrol_demand_seats_bucket | The distribution of seats requested by APIServer rate limiting. |
apiserver_flowcontrol_demand_seats_count | The count of seats requested by APIServer rate limiting. |
apiserver_flowcontrol_demand_seats_high_watermark | The high watermark of seats requested by APIServer rate limiting. |
apiserver_flowcontrol_demand_seats_smoothed | The smoothed value of seats requested by APIServer rate limiting. |
apiserver_flowcontrol_demand_seats_stdev | The standard deviation of seats requested by APIServer rate limiting. |
apiserver_flowcontrol_demand_seats_sum | The sum of seats requested by APIServer rate limiting. |
apiserver_flowcontrol_dispatch_r | The scheduling R value of APIServer rate limiting. |
apiserver_flowcontrol_dispatched_requests_total | The total number of requests scheduled by APIServer rate limiting. |
apiserver_flowcontrol_latest_s | The latest S value of APIServer rate limiting. |
apiserver_flowcontrol_lower_limit_seats | The lower bound of seats in APIServer rate limiting. |
apiserver_flowcontrol_next_discounted_s_bounds | The next discounted S value bounds of APIServer rate limiting. |
apiserver_flowcontrol_next_s_bounds | The next S value bounds of APIServer rate limiting. |
apiserver_flowcontrol_nominal_limit_seats | The nominal upper bound of seats in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_request_count_samples_bucket | The distribution of priority level request samples in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_request_count_samples_count | The count of priority level request samples in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_request_count_samples_sum | The sum of priority level request samples in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_request_count_watermarks_bucket | The distribution of watermark levels for priority level request samples in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_request_count_watermarks_count | The count of watermark levels for priority level request samples in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_request_count_watermarks_sum | The sum of watermark levels for priority level request samples in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_request_utilization_bucket | The distribution of request utilization samples by priority level in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_request_utilization_count | The count of request utilization samples by priority level in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_request_utilization_sum | The sum of request utilization by priority level in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_seat_count_samples_bucket | The distribution of seat samples for priority level in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_seat_count_samples_count | The count of seat samples for priority level in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_seat_count_samples_sum | The sum of seat samples for priority level in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_seat_count_watermarks_bucket | The distribution of watermark levels for seat samples in APIServer rate limiting by priority level. |
apiserver_flowcontrol_priority_level_seat_count_watermarks_count | The count of watermark levels for seat samples in APIServer rate limiting by priority level. |
apiserver_flowcontrol_priority_level_seat_count_watermarks_sum | The sum of watermark levels for seat samples in APIServer rate limiting by priority level. |
apiserver_flowcontrol_priority_level_seat_utilization_bucket | The distribution of seat utilization samples by priority level in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_seat_utilization_count | The count of seat utilization samples by priority level in APIServer rate limiting. |
apiserver_flowcontrol_priority_level_seat_utilization_sum | The sum of seat utilization by priority level in APIServer rate limiting. |
apiserver_flowcontrol_read_vs_write_current_requests_bucket | The distribution of current read/write requests in APIServer rate limiting. |
apiserver_flowcontrol_read_vs_write_current_requests_count | The count of current read/write requests in APIServer rate limiting. |
apiserver_flowcontrol_read_vs_write_current_requests_sum | The sum of current read/write requests in APIServer rate limiting. |
apiserver_flowcontrol_read_vs_write_request_count_samples_bucket | The distribution of read/write request count samples in APIServer rate limiting. |
apiserver_flowcontrol_read_vs_write_request_count_samples_count | The count of read/write request count samples in APIServer rate limiting. |
apiserver_flowcontrol_read_vs_write_request_count_samples_sum | The sum of read/write request count samples in APIServer rate limiting. |
apiserver_flowcontrol_read_vs_write_request_count_watermarks_bucket | The distribution of read/write request count watermarks in APIServer rate limiting. |
apiserver_flowcontrol_read_vs_write_request_count_watermarks_count | The count of read/write request count watermarks in APIServer rate limiting. |
apiserver_flowcontrol_read_vs_write_request_count_watermarks_sum | The sum of read/write request count watermarks in APIServer rate limiting. |
apiserver_flowcontrol_rejected_requests_total | The total number of requests rejected by APIServer rate limiting. |
apiserver_flowcontrol_request_concurrency_in_use | The count of concurrent requests in APIServer rate limiting. |
apiserver_flowcontrol_request_concurrency_limit | The concurrent request limit in APIServer rate limiting. |
apiserver_flowcontrol_request_dispatch_no_accommodation_total | The total number of requests that could not be accommodated by the scheduling of APIServer rate limiting. |
apiserver_flowcontrol_request_execution_seconds_bucket | The distribution of request latencies in seconds in APIServer rate limiting. |
apiserver_flowcontrol_request_execution_seconds_count | The count of request latencies in seconds in APIServer rate limiting. |
apiserver_flowcontrol_request_execution_seconds_sum | The sum of request latencies in seconds in APIServer rate limiting. |
apiserver_flowcontrol_request_queue_length_after_enqueue_bucket | The distribution of request queue lengths after enqueuing in APIServer rate limiting. |
apiserver_flowcontrol_request_queue_length_after_enqueue_count | The count of request queue lengths after enqueuing in APIServer rate limiting. |
apiserver_flowcontrol_request_queue_length_after_enqueue_sum | The sum of request queue lengths after enqueuing in APIServer rate limiting. |
apiserver_flowcontrol_request_wait_duration_seconds_bucket | The distribution of request waiting durations in seconds in APIServer rate limiting. |
apiserver_flowcontrol_request_wait_duration_seconds_count | The count of request waiting durations in seconds in APIServer rate limiting. |
apiserver_flowcontrol_request_wait_duration_seconds_sum | The sum of request waiting durations in seconds in APIServer rate limiting. |
apiserver_flowcontrol_seat_fair_frac | The fair share ratios determined by the APIServer during the last borrowing adjustment period. |
apiserver_flowcontrol_target_seats | The target number of seats in APIServer rate limiting. |
apiserver_flowcontrol_upper_limit_seats | The upper bound of seats in APIServer rate limiting. |
apiserver_flowcontrol_watch_count_samples_bucket | The distribution of watch count samples in APIServer rate limiting. |
apiserver_flowcontrol_watch_count_samples_count | The count of watch count samples in APIServer rate limiting. |
apiserver_flowcontrol_watch_count_samples_sum | The sum of watch count samples in APIServer rate limiting. |
apiserver_flowcontrol_work_estimated_seats_bucket | The distribution of estimated seats in APIServer rate limiting. |
apiserver_flowcontrol_work_estimated_seats_count | The count of estimated seats in APIServer rate limiting. |
apiserver_flowcontrol_work_estimated_seats_sum | The sum of estimated seats in APIServer rate limiting. |
apiserver_init_events_total | The total number of initialization events in the APIServer. |
apiserver_kube_aggregator_x509_insecure_sha1_total | The number of requests using insecure Secure Hash Algorithm 1 (SHA1) signatures. |
apiserver_kube_aggregator_x509_missing_san_total | The total number of x509 certificates missing Subject Alternative Names (SANs) in APIServer kube-aggregator. |
apiserver_longrunning_gauge | The gauge of long-running requests in the APIServer. |
apiserver_longrunning_requests | The number of long-running requests in the APIServer. |
apiserver_nodeport_repair_reconcile_errors_total | The total number of reconcile errors in the NodePort repair loop of the APIServer. |
apiserver_realtime_watchers | The number of real-time watchers in the APIServer. |
apiserver_registered_watchers | The number of registered watchers in the APIServer. |
apiserver_request_aborts_total | The total number of requests aborted by the APIServer. |
apiserver_request_body_size_bytes_bucket | The distribution of APIServer request body sizes in bytes. |
apiserver_request_body_size_bytes_count | The count of APIServer request body sizes in bytes. |
apiserver_request_body_size_bytes_sum | The sum of APIServer request body sizes in bytes. |
apiserver_request_count | The number of APIServer requests. |
apiserver_request_duration_seconds_bucket | The distribution of APIServer request latencies in seconds. |
apiserver_request_duration_seconds_count | The count of APIServer request latencies in seconds. |
apiserver_request_duration_seconds_sum | The sum of APIServer request latencies in seconds. |
apiserver_request_filter_duration_seconds_bucket | The distribution of request filter latencies in seconds. |
apiserver_request_filter_duration_seconds_count | The count of request filter latencies in seconds. |
apiserver_request_filter_duration_seconds_sum | The sum of request filter latencies in seconds. |
apiserver_request_latencies_summary | The summary of APIServer request latencies. |
apiserver_request_no_resourceversion_list_total | The total number of unversioned LIST requests. |
apiserver_request_post_timeout_total | The total number of timed out POST requests. |
apiserver_request_sli_duration_seconds_bucket | The distribution of Service Level Indicator (SLI) request latencies in seconds. |
apiserver_request_sli_duration_seconds_count | The count of SLI request latencies in seconds. |
apiserver_request_sli_duration_seconds_sum | The sum of SLI request latencies in seconds. |
apiserver_request_slo_duration_seconds_bucket | The distribution of Service Level Objective (SLO) request latencies in seconds. |
apiserver_request_slo_duration_seconds_count | The count of SLO request latencies in seconds. |
apiserver_request_slo_duration_seconds_sum | The sum of SLO request latencies in seconds. |
apiserver_request_terminations_total | The total number of terminated API requests. |
apiserver_request_timestamp_comparison_time_bucket | The distribution of time spent in timestamp comparison of API requests. |
apiserver_request_timestamp_comparison_time_count | The count of API request samples for timestamp comparison. |
apiserver_request_timestamp_comparison_time_sum | The sum of time spent in timestamp comparison of API requests. |
apiserver_request_total | The total number of API requests. |
apiserver_requested_deprecated_apis | The count of APIServer requests for deprecated APIs. |
apiserver_response_sizes_bucket | The distribution of response body sizes of API requests. |
apiserver_response_sizes_count | The count of response body sizes of API requests. |
apiserver_response_sizes_sum | The sum of response body sizes of API requests. |
apiserver_selfrequest_total | The total number of APIServer self-requests. |
apiserver_storage_data_key_generation_duration_seconds_bucket | The distribution of time consumed by the APIServer to generate data keys in seconds. |
apiserver_storage_data_key_generation_duration_seconds_count | The count of time consumed by the APIServer to generate data keys in seconds. |
apiserver_storage_data_key_generation_duration_seconds_sum | The sum of time consumed by the APIServer to generate data keys in seconds. |
apiserver_storage_data_key_generation_failures_total | The total number of data key generation failures. |
apiserver_storage_db_total_size_in_bytes | The total size of APIServer databases in bytes. |
apiserver_storage_decode_errors_total | The total number of decoding errors in the APIServer. |
apiserver_storage_envelope_transformation_cache_misses_total | The total number of envelope transformation cache misses in the APIServer. |
apiserver_storage_events_received_total | The total number of events received by the APIServer. |
apiserver_storage_list_evaluated_objects_total | The total number of evaluated objects in the APIServer storage list. |
apiserver_storage_list_fetched_objects_total | The total number of objects obtained by the APIServer storage list. |
apiserver_storage_list_returned_objects_total | The total number of objects returned by the APIServer storage list. |
apiserver_storage_list_total | The total number of operations performed by the APIServer storage list. |
apiserver_storage_objects | The number of objects stored in the APIServer. |
apiserver_storage_size_bytes | The total size of objects stored in the APIServer. |
apiserver_terminated_watchers_total | The total number of watchers terminated by the APIServer. |
apiserver_tls_handshake_errors_total | The total number of requests with Transport Layer Security (TLS) handshake errors in the APIServer. |
apiserver_too_large_resourceversion_errors | The total number of requests whose resource version is too large in the APIServer. |
apiserver_watch_cache_events_dispatched_total | The total number of events dispatched by the APIServer watch cache. |
apiserver_watch_cache_events_received_total | The total number of events received by the APIServer watch cache. |
apiserver_watch_cache_initializations_total | The total number of watch cache initializations observed by the APIServer. |
apiserver_watch_cache_read_wait_seconds_bucket | The distribution of watch cache read waiting durations in seconds observed by the APIServer. |
apiserver_watch_cache_read_wait_seconds_count | The count of watch cache read waiting durations in seconds observed by the APIServer. |
apiserver_watch_cache_read_wait_seconds_sum | The sum of watch cache read waiting durations in seconds observed by the APIServer. |
apiserver_watch_cache_watch_cache_initializations_total | The total number of watch cache initializations observed by the APIServer. |
apiserver_watch_events_sizes_bucket | The distribution of sizes of events observed by the APIServer. |
apiserver_watch_events_sizes_count | The count of sizes of events observed by the APIServer. |
apiserver_watch_events_sizes_sum | The sum of sizes of events observed by the APIServer. |
apiserver_watch_events_total | The total number of events observed by the APIServer. |
apiserver_webhooks_x509_insecure_sha1_total | The number of requests using insecure SHA1 signatures. |
apiserver_webhooks_x509_missing_san_total | The total number of missing SANs in APIServer webhooks. |
authenticated_user_requests | The total number of authenticated user requests. |
authentication_attempts | The number of authentication attempts. |
authentication_duration_seconds_bucket | The distribution of authentication durations in seconds. |
authentication_duration_seconds_count | The count of authentication durations in seconds. |
authentication_duration_seconds_sum | The sum of authentication durations in seconds. |
authentication_token_cache_active_fetch_count | The count of active fetches for the authentication token cache. |
authentication_token_cache_fetch_total | The total number of times the authentication token was retrieved from the cache. |
authentication_token_cache_request_duration_seconds_bucket | The distribution of request durations in seconds for authentication token cache. |
authentication_token_cache_request_duration_seconds_count | The count of request durations in seconds for authentication token cache. |
authentication_token_cache_request_duration_seconds_sum | The sum of request durations in seconds for authentication token cache. |
authentication_token_cache_request_total | The total number of requests for authentication token cache. |
authorization_attempts_total | The total number of authorization attempts. |
authorization_duration_seconds_bucket | The distribution of authorization durations in seconds. |
authorization_duration_seconds_count | The count of authorization durations in seconds. |
authorization_duration_seconds_sum | The sum of authorization durations in seconds. |
cardinality_enforcement_unexpected_categorizations_total | The total number of unexpected categorizations during cardinality enforcement. |
count | The count details. |
cpu_utilization_core | The CPU utilization of the core. |
disabled_metric_total | The total number of disabled metrics. |
disabled_metrics_total | The total number of disabled metrics. |
etcd_bookmark_counts | The number of ETCD bookmarks. |
etcd_db_total_size_in_bytes | The total size of ETCD databases in bytes. |
etcd_lease_object_counts_bucket | The distribution of objects attached to a single ETCD lease. |
etcd_lease_object_counts_count | The count of objects attached to a single ETCD lease. |
etcd_lease_object_counts_sum | The sum of objects attached to a single ETCD lease. |
etcd_object_counts | The number of ETCD objects. |
etcd_request_duration_seconds_bucket | The distribution of ETCD request latencies in seconds. |
etcd_request_duration_seconds_count | The count of ETCD request latencies in seconds. |
etcd_request_duration_seconds_sum | The sum of ETCD request latencies in seconds. |
etcd_request_errors_total | The total number of failed ETCD requests. |
etcd_requests_total | The total number of ETCD requests. |
etcd_watcher_channel_length | The channel length of the ETCD watcher. |
etcd_watcher_received_events | The number of events received by the ETCD watcher. |
etcd_watcher_sended_events_latency_milliseconds_bucket | The distribution of event sending latencies of the ETCD watcher in milliseconds. |
etcd_watcher_sended_events_latency_milliseconds_count | The count of event sending latencies of the ETCD watcher in milliseconds. |
etcd_watcher_sended_events_latency_milliseconds_sum | The sum of event sending latencies of the ETCD watcher in milliseconds. |
field_validation_request_duration_seconds_bucket | The distribution of field validation request latencies in seconds. |
field_validation_request_duration_seconds_count | The count of field validation request latencies in seconds. |
field_validation_request_duration_seconds_sum | The sum of field validation request latencies in seconds. |
get_token_count | The number of obtained tokens. |
get_token_fail_count | The number of token obtaining failures. |
go_cgo_go_to_c_calls_calls_total | The total number of C function calls made by cgo. |
go_cpu_classes_gc_mark_assist_cpu_seconds_total | The total CPU seconds spent on garbage collection (GC) mark assistance by Go. |
go_cpu_classes_gc_mark_dedicated_cpu_seconds_total | The total CPU seconds spent on dedicated GC marking by Go. |
go_cpu_classes_gc_mark_idle_cpu_seconds_total | The total CPU seconds spent on idle GC marking by Go. |
go_cpu_classes_gc_pause_cpu_seconds_total | The total CPU seconds spent on GC pauses by Go. |
go_cpu_classes_gc_total_cpu_seconds_total | The total CPU seconds spent on GC by Go. |
go_cpu_classes_idle_cpu_seconds_total | The total CPU idle time in Go. |
go_cpu_classes_scavenge_assist_cpu_seconds_total | The total CPU seconds spent by Go on returning unused memory to the operating system in response to memory pressure. |
go_cpu_classes_scavenge_background_cpu_seconds_total | The total CPU seconds spent by Go on background tasks that return unused memory to the operating system. |
go_cpu_classes_scavenge_total_cpu_seconds_total | The total CPU seconds spent by Go on returning unused memory to the operating system. |
go_cpu_classes_total_cpu_seconds_total | The total CPU seconds. |
go_cpu_classes_user_cpu_seconds_total | The user CPU time. |
go_gc_cycles_automatic_gc_cycles_total | The total number of automatic GC cycles. |
go_gc_cycles_forced_gc_cycles_total | The total number of forced GC cycles. |
go_gc_cycles_total_gc_cycles_total | The total number of GC cycles. |
go_gc_duration_seconds | The GC pause time in seconds. |
go_gc_duration_seconds_count | The count of GC pause time in seconds. |
go_gc_duration_seconds_sum | The sum of GC pause time in seconds. |
go_gc_gogc_percent | The Go GC target percentage. |
go_gc_gomemlimit_bytes | The GC memory limit in bytes. |
go_gc_heap_allocs_by_size_bytes_bucket | The distribution of allocated heap memory sizes in bytes. |
go_gc_heap_allocs_by_size_bytes_count | The count of allocated heap memory sizes in bytes. |
go_gc_heap_allocs_by_size_bytes_sum | The sum of allocated heap memory sizes in bytes. |
go_gc_heap_allocs_by_size_bytes_total_bucket | The distribution of all allocated heap memory sizes in bytes. |
go_gc_heap_allocs_by_size_bytes_total_count | The count of all allocated heap memory sizes in bytes. |
go_gc_heap_allocs_by_size_bytes_total_sum | The sum of all allocated heap memory sizes in bytes. |
go_gc_heap_allocs_bytes_total | The total number of bytes allocated on the heap. |
go_gc_heap_allocs_objects_total | The total number of objects allocated on the heap. |
go_gc_heap_frees_by_size_bytes_bucket | The distribution of released heap memory sizes in bytes. |
go_gc_heap_frees_by_size_bytes_count | The count of released heap memory sizes in bytes. |
go_gc_heap_frees_by_size_bytes_sum | The sum of released heap memory sizes in bytes. |
go_gc_heap_frees_by_size_bytes_total_bucket | The distribution of all released heap memory sizes in bytes. |
go_gc_heap_frees_by_size_bytes_total_count | The count of all released heap memory sizes in bytes. |
go_gc_heap_frees_by_size_bytes_total_sum | The sum of all released heap memory sizes in bytes. |
go_gc_heap_frees_bytes_total | The total number of bytes released from the heap. |
go_gc_heap_frees_objects_total | The total number of objects released from the heap. |
go_gc_heap_goal_bytes | The expected heap size in bytes. |
go_gc_heap_live_bytes | The heap memory occupied by live objects in bytes. |
go_gc_heap_objects_objects | The number of objects that occupy the heap memory. |
go_gc_heap_tiny_allocs_objects_total | The total number of tiny object allocations. |
go_gc_limiter_last_enabled_gc_cycle | The GC cycle in which the GC CPU limiter was last enabled. |
go_gc_pauses_seconds_bucket | The distribution of GC pause durations. |
go_gc_pauses_seconds_count | The count of GC pause durations. |
go_gc_pauses_seconds_sum | The sum of GC pause durations. |
go_gc_pauses_seconds_total_bucket | The distribution of all GC pause durations. |
go_gc_pauses_seconds_total_count | The count of all GC pause durations. |
go_gc_pauses_seconds_total_sum | The sum of all GC pause durations. |
go_gc_scan_globals_bytes | The number of bytes scanned in global variables. |
go_gc_scan_heap_bytes | The number of bytes scanned in the heap. |
go_gc_scan_stack_bytes | The number of bytes scanned in the stack. |
go_gc_scan_total_bytes | The total number of scanned bytes. |
go_gc_stack_starting_size_bytes | The initial stack size in bytes. |
go_godebug_non_default_behavior_execerrdot_events_total | The count of non-default behavior debug events related to the execerrdot debug setting. |
go_godebug_non_default_behavior_gocachehash_events_total | The count of non-default behavior debug events related to the gocachehash debug setting. |
go_godebug_non_default_behavior_gocachetest_events_total | The count of non-default behavior debug events related to the gocachetest debug setting. |
go_godebug_non_default_behavior_gocacheverify_events_total | The count of non-default behavior debug events related to the gocacheverify debug setting. |
go_godebug_non_default_behavior_gotypesalias_events_total | The count of non-default behavior debug events related to the gotypesalias debug setting. |
go_godebug_non_default_behavior_http2client_events_total | The count of non-default behavior debug events related to the http2client debug setting. |
go_godebug_non_default_behavior_http2server_events_total | The count of non-default behavior debug events related to the http2server debug setting. |
go_godebug_non_default_behavior_httplaxcontentlength_events_total | The count of non-default behavior debug events related to the httplaxcontentlength debug setting. |
go_godebug_non_default_behavior_httpmuxgo121_events_total | The count of non-default behavior debug events related to the httpmuxgo121 debug setting. |
go_godebug_non_default_behavior_installgoroot_events_total | The count of non-default behavior debug events related to the installgoroot debug setting. |
go_godebug_non_default_behavior_jstmpllitinterp_events_total | The count of non-default behavior debug events related to the jstmpllitinterp debug setting. |
go_godebug_non_default_behavior_multipartmaxheaders_events_total | The count of non-default behavior debug events related to the multipartmaxheaders debug setting. |
go_godebug_non_default_behavior_multipartmaxparts_events_total | The count of non-default behavior debug events related to the multipartmaxparts debug setting. |
go_godebug_non_default_behavior_multipathtcp_events_total | The count of non-default behavior debug events related to the multipathtcp debug setting. |
go_godebug_non_default_behavior_panicnil_events_total | The count of non-default behavior debug events related to the panicnil debug setting. |
go_godebug_non_default_behavior_randautoseed_events_total | The count of non-default behavior debug events related to the randautoseed debug setting. |
go_godebug_non_default_behavior_tarinsecurepath_events_total | The count of non-default behavior debug events related to the tarinsecurepath debug setting. |
go_godebug_non_default_behavior_tls10server_events_total | The count of non-default behavior debug events related to the tls10server debug setting. |
go_godebug_non_default_behavior_tlsmaxrsasize_events_total | The count of non-default behavior debug events related to the tlsmaxrsasize debug setting. |
go_godebug_non_default_behavior_tlsrsakex_events_total | The count of non-default behavior debug events related to the tlsrsakex debug setting. |
go_godebug_non_default_behavior_tlsunsafeekm_events_total | The count of non-default behavior debug events related to the tlsunsafeekm debug setting. |
go_godebug_non_default_behavior_x509sha1_events_total | The count of non-default behavior debug events related to the x509sha1 debug setting. |
go_godebug_non_default_behavior_x509usefallbackroots_events_total | The count of non-default behavior debug events related to the x509usefallbackroots debug setting. |
go_godebug_non_default_behavior_x509usepolicies_events_total | The count of non-default behavior debug events related to the x509usepolicies debug setting. |
go_godebug_non_default_behavior_zipinsecurepath_events_total | The count of non-default behavior debug events related to the zipinsecurepath debug setting. |
go_goroutines | The number of goroutines. |
go_info | The Go environment information, such as the Go version. |
go_memory_classes_heap_free_bytes | The amount of idle heap memory in bytes. |
go_memory_classes_heap_objects_bytes | The amount of heap memory occupied by objects in bytes. |
go_memory_classes_heap_released_bytes | The amount of heap memory released in bytes. |
go_memory_classes_heap_stacks_bytes | The amount of memory reserved for the stack in bytes. |
go_memory_classes_heap_unused_bytes | The amount of heap memory not used in bytes. |
go_memory_classes_metadata_mcache_free_bytes | The amount of idle memory in mcache in bytes. |
go_memory_classes_metadata_mcache_inuse_bytes | The amount of memory in use in mcache in bytes. |
go_memory_classes_metadata_mspan_free_bytes | The amount of idle memory in mspan in bytes. |
go_memory_classes_metadata_mspan_inuse_bytes | The amount of memory in use in mspan in bytes. |
go_memory_classes_metadata_other_bytes | The amount of memory occupied by other metadata in bytes. |
go_memory_classes_os_stacks_bytes | The amount of memory reserved for the operating system stack in bytes. |
go_memory_classes_other_bytes | The amount of memory used for other purposes in bytes. |
go_memory_classes_profiling_buckets_bytes | The bytes used by profiling buckets. |
go_memory_classes_total_bytes | The total memory in bytes. |
go_memstats_alloc_bytes | The amount of memory allocated in bytes. |
go_memstats_alloc_bytes_total | The cumulative amount of memory allocated in bytes. |
go_memstats_buck_hash_sys_bytes | The amount of memory used by the profiling bucket hash table in bytes. |
go_memstats_frees_total | The total number of releases. |
go_memstats_gc_cpu_fraction | The fraction of available CPU time used by GC since the program started. The value ranges from 0 to 1. |
go_memstats_gc_sys_bytes | The amount of memory used by GC in the operating system in bytes. |
go_memstats_heap_alloc_bytes | The amount of heap memory allocated in bytes. |
go_memstats_heap_idle_bytes | The amount of idle heap memory in bytes. |
go_memstats_heap_inuse_bytes | The amount of heap memory in use in bytes. |
go_memstats_heap_objects | The number of objects allocated on the heap. |
go_memstats_heap_released_bytes | The amount of heap memory released in bytes. |
go_memstats_heap_sys_bytes | The amount of memory allocated to the heap by the operating system in bytes. |
go_memstats_last_gc_time_seconds | The time of the last GC, in seconds since the Unix epoch. |
go_memstats_lookups_total | The total number of lookups. |
go_memstats_mallocs_total | The total number of allocations. |
go_memstats_mcache_inuse_bytes | The amount of memory in use in mcache in bytes. |
go_memstats_mcache_sys_bytes | The amount of memory allocated to mcache by the operating system in bytes. |
go_memstats_mspan_inuse_bytes | The amount of memory in use in mspan in bytes. |
go_memstats_mspan_sys_bytes | The amount of memory allocated to mspan by the operating system in bytes. |
go_memstats_next_gc_bytes | The heap size at which the next GC is triggered, in bytes. |
go_memstats_other_sys_bytes | The amount of memory allocated for other purposes by the operating system in bytes. |
go_memstats_stack_inuse_bytes | The amount of stack memory in use in bytes. |
go_memstats_stack_sys_bytes | The amount of memory allocated to the stack by the operating system in bytes. |
go_memstats_sys_bytes | The total memory allocated by the operating system in bytes. |
go_sched_gomaxprocs_threads | The number of threads determined by GOMAXPROCS. |
go_sched_goroutines_goroutines | The number of goroutines. |
go_sched_latencies_seconds_bucket | The distribution of scheduling latencies in seconds. |
go_sched_latencies_seconds_count | The count of scheduling latencies in seconds. |
go_sched_latencies_seconds_sum | The sum of scheduling latencies in seconds. |
go_sched_pauses_stopping_gc_seconds_bucket | The distribution of stop-the-world GC pause durations in seconds. |
go_sched_pauses_stopping_gc_seconds_count | The count of stop-the-world GC pause durations in seconds. |
go_sched_pauses_stopping_gc_seconds_sum | The sum of stop-the-world GC pause durations in seconds. |
go_sched_pauses_stopping_other_seconds_bucket | The distribution of non-GC stop-the-world stopping latencies in seconds. |
go_sched_pauses_stopping_other_seconds_count | The count of non-GC stop-the-world stopping latencies in seconds. |
go_sched_pauses_stopping_other_seconds_sum | The sum of non-GC stop-the-world stopping latencies in seconds. |
go_sched_pauses_total_gc_seconds_bucket | The distribution of all GC pause durations in seconds. |
go_sched_pauses_total_gc_seconds_count | The count of all GC pause durations in seconds. |
go_sched_pauses_total_gc_seconds_sum | The sum of all GC pause durations in seconds. |
go_sched_pauses_total_other_seconds_bucket | The distribution of all non-GC stop-the-world pause durations in seconds. |
go_sched_pauses_total_other_seconds_count | The count of all non-GC stop-the-world pause durations in seconds. |
go_sched_pauses_total_other_seconds_sum | The sum of all non-GC stop-the-world pause durations in seconds. |
go_sync_mutex_wait_total_seconds_total | The total waiting duration for Mutex locks in seconds. |
go_threads | The number of Go threads. |
grpc_client_handled_total | The total number of requests handled by the gRPC client. |
grpc_client_msg_received_total | The total number of messages received by the gRPC client. |
grpc_client_msg_sent_total | The total number of messages sent by the gRPC client. |
grpc_client_started_total | The total number of RPCs started by the gRPC client. |
hidden_metric_total | The total number of hidden metrics. |
hidden_metrics_total | The total number of hidden metrics. |
http_request_duration_microseconds | The HTTP request latency in microseconds. |
http_request_size_bytes | The HTTP request size in bytes. |
http_requests_total | The total number of HTTP requests. |
http_response_size_bytes | The HTTP response body size in bytes. |
job | The job name. |
job_instance_mode | The job instance mode. |
kube_apiserver_clusterip_allocator_allocated_ips | Kubernetes APIServer: The number of allocated cluster IP addresses. |
kube_apiserver_clusterip_allocator_allocation_errors_total | Kubernetes APIServer: The total number of errors that occurred in cluster IP address allocations. |
kube_apiserver_clusterip_allocator_allocation_total | Kubernetes APIServer: The total number of cluster IP address allocations. |
kube_apiserver_clusterip_allocator_available_ips | Kubernetes APIServer: The number of available cluster IP addresses. |
kube_apiserver_nodeport_allocator_allocated_ports | Kubernetes APIServer: The number of allocated node ports. |
kube_apiserver_nodeport_allocator_allocation_errors_total | Kubernetes APIServer: The total number of errors that occurred in node port allocations. |
kube_apiserver_nodeport_allocator_allocation_total | Kubernetes APIServer: The total number of node port allocations. |
kube_apiserver_nodeport_allocator_available_ports | Kubernetes APIServer: The number of available node ports. |
kube_apiserver_pod_logs_backend_tls_failure_total | Kubernetes APIServer: The total number of pod/log requests that failed due to TLS verification errors. |
kube_apiserver_pod_logs_insecure_backend_total | Kubernetes APIServer: The total number of insecure pod/log requests. |
kube_apiserver_pod_logs_pods_logs_backend_tls_failure_total | Kubernetes APIServer: The total number of pod/log requests that failed due to TLS verification errors. |
kube_apiserver_pod_logs_pods_logs_insecure_backend_total | Kubernetes APIServer: The total number of insecure pod/log requests. |
kubelet_container_log_filesystem_used_bytes | Kubelet: The space of the file system used by container logs in bytes. |
kubelet_node_name | Kubelet: The node name. |
kubelet_pleg_relist_duration_seconds_bucket | Kubelet: The distribution of PLEG relisting durations in seconds. |
kubelet_pod_worker_duration_seconds_bucket | Kubelet: The distribution of durations for a pod worker to sync a single pod in seconds. |
kubelet_volume_stats_available_bytes | Kubelet: The number of available bytes in the volume. |
kubelet_volume_stats_capacity_bytes | Kubelet: The volume capacity in bytes. |
kubelet_volume_stats_inodes | Kubelet: The total number of inodes in the volume. |
kubelet_volume_stats_inodes_free | Kubelet: The number of free inodes in the volume. |
kubelet_volume_stats_inodes_used | Kubelet: The number of used inodes in the volume. |
kubelet_volume_stats_used_bytes | Kubelet: The number of used bytes in the volume. |
kubernetes_build_info | The Kubernetes build information. |
kubernetes_feature_enabled | Indicates whether each Kubernetes feature gate is enabled. |
last_list_all_response_size_in_bytes | The total size of all response bodies in the recent list in bytes. |
memory_utilization_byte | The used memory in bytes. |
node_authorizer_graph_actions_duration_seconds_bucket | Node authorizer: The distribution of graph operation durations in seconds. |
node_authorizer_graph_actions_duration_seconds_count | Node authorizer: The count of graph operation durations in seconds. |
node_authorizer_graph_actions_duration_seconds_sum | Node authorizer: The sum of graph operation durations in seconds. |
pod_security_evaluations_total | The total number of pod security evaluations. |
pod_security_exemptions_total | The total number of pod security exemptions. |
process_cpu_seconds_total | The total process CPU seconds. |
process_max_fds | The maximum number of file descriptors for the process. |
process_open_fds | The number of file descriptors opened by the process. |
process_resident_memory_bytes | The resident memory size of the process in bytes. |
process_start_time_seconds | The start time of the process, in seconds since the Unix epoch. |
process_virtual_memory_bytes | The number of virtual memory bytes for the process. |
process_virtual_memory_max_bytes | The maximum number of virtual memory bytes for the process. |
registered_metric_total | The total number of registered metrics. |
registered_metrics_total | The total number of registered metrics. |
rest_client_exec_plugin_certificate_rotation_age_bucket | REST client plug-in: The distribution of certificate rotation ages in seconds. |
rest_client_exec_plugin_certificate_rotation_age_count | REST client plug-in: The count of certificate rotation ages in seconds. |
rest_client_exec_plugin_certificate_rotation_age_sum | REST client plug-in: The sum of certificate rotation ages in seconds. |
rest_client_exec_plugin_ttl_seconds | REST client plug-in: The time to live (TTL) of the certificate in seconds. |
rest_client_request_duration_seconds_bucket | The distribution of REST client request durations in seconds. |
rest_client_request_duration_seconds_count | The count of REST client request durations in seconds. |
rest_client_request_duration_seconds_sum | The sum of REST client request durations in seconds. |
rest_client_request_latency_seconds_bucket | The distribution of REST client request latencies in seconds. |
rest_client_request_size_bytes_bucket | The distribution of REST client request-body sizes in bytes. |
rest_client_request_size_bytes_count | The count of REST client request-body sizes in bytes. |
rest_client_request_size_bytes_sum | The sum of REST client request-body sizes in bytes. |
rest_client_requests_total | The number of REST client requests. |
rest_client_response_size_bytes_bucket | The distribution of REST client response-body sizes in bytes. |
rest_client_response_size_bytes_count | The count of REST client response-body sizes in bytes. |
rest_client_response_size_bytes_sum | The sum of REST client response-body sizes in bytes. |
rest_client_transport_cache_entries | The number of transport entries of the REST client. |
rest_client_transport_create_calls_total | The total number of transport creation calls of the REST client. |
scheduler_pending_pods | Scheduler: The number of pods to be scheduled. |
scheduler_pod_scheduling_attempts_bucket | Scheduler: The distribution of pod scheduling attempts. |
scheduler_scheduler_cache_size | The scheduler cache size. |
scrape_duration_seconds | The scrape duration in seconds. |
scrape_samples_post_metric_relabeling | The number of scraped samples after metric relabeling. |
scrape_samples_scraped | The number of scraped samples. |
scrape_series_added | The number of new series added during the scrape. |
serviceaccount_invalid_legacy_auto_token_uses_total | The total number of uses of invalid legacy automatic service account tokens. |
serviceaccount_legacy_auto_token_uses_total | The total number of uses of legacy automatic service account tokens. |
serviceaccount_legacy_manual_token_uses_total | The total number of uses of legacy manual service account tokens. |
serviceaccount_legacy_tokens_total | The total number of legacy service account tokens. |
serviceaccount_stale_tokens_total | The total number of stale service account tokens. |
serviceaccount_valid_tokens_total | The total number of valid service account tokens. |
ssh_tunnel_open_count | The number of opened Secure Shell (SSH) tunnels. |
ssh_tunnel_open_fail_count | The number of SSH tunnels that failed to be opened. |
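up | The reachability of the scrape target. A value of 1 indicates that the scrape succeeded, and a value of 0 indicates that the scrape failed. |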
watch_cache_capacity | The capacity of the watch cache. |
watch_cache_capacity_decrease_total | The total number of watch cache capacity decreases. |
watch_cache_capacity_increase_total | The total number of watch cache capacity increases. |
workqueue_adds_total | The total number of additions to the work queue. |
workqueue_depth | The work queue depth. |
workqueue_longest_running_processor_seconds | The longest running processor time in the work queue in seconds. |
workqueue_queue_duration_seconds_bucket | The distribution of queueing durations in the work queue in seconds. |
workqueue_queue_duration_seconds_count | The count of queueing durations in the work queue in seconds. |
workqueue_queue_duration_seconds_sum | The sum of queueing durations in the work queue in seconds. |
workqueue_retries_total | The total number of retries in the work queue. |
workqueue_unfinished_work_seconds | The duration of unfinished work in the work queue in seconds. |
workqueue_work_duration_seconds_bucket | The distribution of work durations in the work queue in seconds. |
workqueue_work_duration_seconds_count | The count of work durations in the work queue in seconds. |
workqueue_work_duration_seconds_sum | The sum of work durations in the work queue in seconds. |
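Many metrics in the preceding table come in _bucket, _count, and _sum triples, which follow the Prometheus histogram convention: buckets are cumulative counts keyed by an upper bound. The following Python code is a minimal sketch of how a mean and an approximate quantile can be derived from such a triple. The function names and the sample numbers are hypothetical and are not values reported by any cluster. The interpolation is similar in spirit to the PromQL histogram_quantile function but is not a reimplementation of it.

```python
import math
from typing import List, Tuple


def mean_from_histogram(total_sum: float, total_count: float) -> float:
    """Average observation: the _sum value divided by the _count value."""
    return total_sum / total_count if total_count else 0.0


def quantile_from_buckets(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative (upper_bound, count) pairs.

    Assumes the last bucket has an upper bound of +Inf and that
    observations are spread evenly inside each bucket.
    """
    ordered = sorted(buckets)
    total = ordered[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in ordered:
        if count >= rank:
            # Cannot interpolate inside an empty or unbounded bucket.
            if math.isinf(bound) or count == prev_count:
                return prev_bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical samples for a *_seconds histogram (not real cluster data).
sample_buckets = [(0.1, 50.0), (0.5, 90.0), (1.0, 100.0), (math.inf, 100.0)]
sample_sum, sample_count = 30.0, 100.0

mean_seconds = mean_from_histogram(sample_sum, sample_count)
p95_seconds = quantile_from_buckets(0.95, sample_buckets)
print(f"mean={mean_seconds:.3f}s p95={p95_seconds:.3f}s")
```

In practice, counters such as _sum and _count only ever increase, so the inputs are usually the increase over a time window rather than the raw values.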
Node Exporter (job name: node-exporter)
Metric | Description |
ALERTS | The alerts. |
ALERTS_FOR_STATE | The time at which each alert became active, as a Unix timestamp in seconds. |
aliyun_prometheus_agent_append_duration_seconds | The duration of the Prometheus agent append operations in seconds. |
aliyun_prometheus_agent_job_discovery_status | The discovery status of the Prometheus agent collection jobs. |
aliyun_prometheus_agent_scrapes_by_target_total | The total number of scrapes by the Prometheus agent per target. |
aliyun_prometheus_agent_target_info | The target information of the Prometheus agent. |
count | The Go-specific count details. |
go_gc_duration_seconds | The Go GC pause duration in seconds. |
go_gc_duration_seconds_count | The count of Go GC pauses. |
go_gc_duration_seconds_sum | The total Go GC pause duration in seconds. |
go_goroutines | The number of goroutines. |
go_info | The Go-specific information. |
go_memstats_alloc_bytes | The amount of memory allocated in bytes. |
go_memstats_alloc_bytes_total | The cumulative amount of memory allocated in bytes. |
go_memstats_buck_hash_sys_bytes | The amount of memory used by the profiling bucket hash table in bytes. |
go_memstats_frees_total | The total number of releases. |
go_memstats_gc_cpu_fraction | The fraction of available CPU time used by GC since the program started. The value ranges from 0 to 1. |
go_memstats_gc_sys_bytes | The amount of memory used by GC in the operating system in bytes. |
go_memstats_heap_alloc_bytes | The amount of heap memory allocated in bytes. |
go_memstats_heap_idle_bytes | The amount of idle heap memory in bytes. |
go_memstats_heap_inuse_bytes | The amount of heap memory in use in bytes. |
go_memstats_heap_objects | The number of objects allocated on the heap. |
go_memstats_heap_released_bytes | The amount of heap memory released in bytes. |
go_memstats_heap_sys_bytes | The amount of memory allocated to the heap by the operating system in bytes. |
go_memstats_last_gc_time_seconds | The time of the last GC, in seconds since the Unix epoch. |
go_memstats_lookups_total | The total number of lookups. |
go_memstats_mallocs_total | The total number of allocations. |
go_memstats_mcache_inuse_bytes | The amount of memory in use in mcache in bytes. |
go_memstats_mcache_sys_bytes | The amount of memory allocated to mcache by the operating system in bytes. |
go_memstats_mspan_inuse_bytes | The amount of memory in use in mspan in bytes. |
go_memstats_mspan_sys_bytes | The amount of memory allocated to mspan by the operating system in bytes. |
go_memstats_next_gc_bytes | The heap size at which the next GC is triggered, in bytes. |
go_memstats_other_sys_bytes | The amount of memory allocated for other purposes by the operating system in bytes. |
go_memstats_stack_inuse_bytes | The amount of stack memory in use in bytes. |
go_memstats_stack_sys_bytes | The amount of memory allocated to the stack by the operating system in bytes. |
go_memstats_sys_bytes | The total memory allocated by the operating system in bytes. |
go_threads | The number of threads. |
instance | The instance. |
instance_device | The instance device. |
job | The job name. |
k8s_node_cpu_utilization | The CPU utilization of Kubernetes nodes. |
k8s_node_disk_utilization | The disk usage of Kubernetes nodes. |
k8s_node_memory_utilization | The memory usage of Kubernetes nodes. |
node_arp_entries | The number of Address Resolution Protocol (ARP) entries on the node. |
node_boot_time_seconds | The time at which the node booted, as a UNIX timestamp in seconds. |
node_context_switches_total | The total number of context switches on the node. |
node_cooling_device_cur_state | The current state of the cooling device of the node. |
node_cooling_device_max_state | The maximum state of the cooling device of the node. |
node_cpu_core_throttles_total | The total number of CPU core throttling events on the node. |
node_cpu_frequency_max_hertz | The maximum CPU frequency of the node in Hz. |
node_cpu_frequency_min_hertz | The minimum CPU frequency of the node in Hz. |
node_cpu_guest_seconds_total | The total time that the node CPUs spent running guests (virtual machines) in seconds. |
node_cpu_package_throttles_total | The total number of CPU package throttling events on the node. |
node_cpu_scaling_frequency_hertz | The dynamic CPU frequency of the node in Hz. |
node_cpu_scaling_frequency_max_hertz | The maximum dynamic CPU frequency of the node in Hz. |
node_cpu_scaling_frequency_min_hertz | The minimum dynamic CPU frequency of the node in Hz. |
node_cpu_scaling_governor | The dynamic CPU governor of the node. |
node_cpu_seconds_total | The total CPU time consumed on the node. |
node_disk_device_mapper_info | The DeviceMapper information of the node. |
node_disk_discard_time_seconds_total | The total disk discard time of the node in seconds. |
node_disk_discarded_sectors_total | The total number of sectors discarded on the disk of the node. |
node_disk_discards_completed_total | The total number of completed disk discards of the node. |
node_disk_discards_merged_total | The total number of merged disk discards of the node. |
node_disk_filesystem_info | The file system information of the node. |
node_disk_flush_requests_time_seconds_total | The total flush request duration of the node in seconds. |
node_disk_flush_requests_total | The total number of flush requests of the node. |
node_disk_info | The node disk information. |
node_disk_io_now | The number of disk I/O operations currently in progress on the node. |
node_disk_io_time_seconds_total | The total disk I/O duration of the node in seconds. |
node_disk_io_time_weighted_seconds_total | The total weighted disk I/O time of the node in seconds. |
node_disk_read_bytes_total | The total number of bytes read from the disk of the node. |
node_disk_read_time_seconds_total | The total disk read time of the node in seconds. |
node_disk_reads_completed_total | The total number of completed disk reads of the node. |
node_disk_reads_merged_total | The total number of merged disk reads of the node. |
node_disk_write_time_seconds_total | The total disk write time of the node in seconds. |
node_disk_writes_completed_total | The total number of completed disk writes of the node. |
node_disk_writes_merged_total | The total number of merged disk writes of the node. |
node_disk_written_bytes_total | The total number of bytes written to the disk of the node. |
node_dmi_info | The Desktop Management Interface (DMI) information of the node. |
node_edac_correctable_errors_total | The total number of correctable memory errors of the node. |
node_edac_csrow_correctable_errors_total | The total number of correctable memory errors in chip-select rows of the node. |
node_edac_csrow_uncorrectable_errors_total | The total number of uncorrectable memory errors in chip-select rows of the node. |
node_edac_uncorrectable_errors_total | The total number of uncorrectable memory errors of the node. |
node_entropy_available_bits | The number of bits of available entropy of the node. |
node_entropy_pool_size_bits | The number of bits of the entropy pool of the node. |
node_exporter_build_info | The build information of the node exporter. |
node_filefd_allocated | The number of allocated file descriptors of the node. |
node_filefd_maximum | The maximum number of file descriptors of the node. |
node_filesystem_avail_bytes | The amount of space in the file system of the node that is available to non-root users, in bytes. |
node_filesystem_device_error | Indicates whether an error occurred while statistics were collected for the device of the file system of the node. |
node_filesystem_files | The total number of file nodes (inodes) in the file system of the node. |
node_filesystem_files_free | The number of free file nodes (inodes) in the file system of the node. |
node_filesystem_free_bytes | The amount of free space in the file system of the node in bytes. |
node_filesystem_readonly | The read-only state of the file system of the node. |
node_filesystem_size_bytes | The total size of the file system of the node in bytes. |
node_forks_total | The total number of process forks of the node. |
node_infiniband_excessive_buffer_overrun_errors_total | The total number of InfiniBand excessive buffer overflow errors on the node. |
node_infiniband_info | The InfiniBand information of the node. |
node_infiniband_link_downed_total | The total number of InfiniBand link down events on the node. |
node_infiniband_link_error_recovery_total | The total number of InfiniBand link error recoveries on the node. |
node_infiniband_local_link_integrity_errors_total | The total number of InfiniBand local link integrity errors of the node. |
node_infiniband_multicast_packets_received_total | The total number of InfiniBand multicast packets received on the node. |
node_infiniband_multicast_packets_transmitted_total | The total number of InfiniBand multicast packets sent from the node. |
node_infiniband_physical_state_id | The physical state ID of the InfiniBand port on the node. |
node_infiniband_port_constraint_errors_received_total | The total number of InfiniBand port constraint errors received on the node. |
node_infiniband_port_constraint_errors_transmitted_total | The total number of InfiniBand port constraint errors sent from the node. |
node_infiniband_port_data_received_bytes_total | The total bytes of data received by the InfiniBand port of the node. |
node_infiniband_port_data_transmitted_bytes_total | The total bytes of data sent by the InfiniBand port of the node. |
node_infiniband_port_discards_transmitted_total | The total number of outbound packets discarded by the InfiniBand port of the node. |
node_infiniband_port_errors_received_total | The total number of errors received on the InfiniBand port of the node. |
node_infiniband_port_packets_received_total | The total number of packets received by the InfiniBand port of the node. |
node_infiniband_port_packets_transmitted_total | The total number of packets sent by the InfiniBand port of the node. |
node_infiniband_port_receive_remote_physical_errors_total | The total number of remote physical errors received on the InfiniBand port of the node. |
node_infiniband_port_receive_switch_relay_errors_total | The total number of switch relay errors received on the InfiniBand port of the node. |
node_infiniband_port_transmit_wait_total | The total number of send waits on the InfiniBand port of the node. |
node_infiniband_rate_bytes_per_second | The InfiniBand port rate in bytes per second on the node. |
node_infiniband_state_id | The state ID of the InfiniBand port of the node. |
node_infiniband_symbol_error_total | The total number of InfiniBand symbol errors of the node. |
node_infiniband_unicast_packets_received_total | The total number of unicast packets received on the InfiniBand port of the node. |
node_infiniband_unicast_packets_transmitted_total | The total number of unicast packets sent by the InfiniBand port of the node. |
node_infiniband_vl15_dropped_total | The total VL15 discards on the node InfiniBand port. |
node_intr_total | The total interrupts on the node. |
node_load1 | The 1-minute load on the node. |
node_load15 | The 15-minute load on the node. |
node_load5 | The 5-minute load on the node. |
node_memory_Active_anon_bytes | The size of anonymous active memory on the node in bytes. |
node_memory_Active_bytes | The size of active memory on the node in bytes. |
node_memory_Active_file_bytes | The size of active file memory on the node (in bytes). |
node_memory_AnonHugePages_bytes | The size of anonymous huge pages on the node (in bytes). |
node_memory_AnonPages_bytes | The size of anonymous pages on the node (in bytes). |
node_memory_Bounce_bytes | The size of bounce pages on the node (in bytes). |
node_memory_Buffers_bytes | The size of buffers memory on the node (in bytes). |
node_memory_Cached_bytes | The size of cached memory on the node (in bytes). |
node_memory_CmaFree_bytes | The size of Contiguous Memory Allocator (CMA) free memory on the node (in bytes). |
node_memory_CmaTotal_bytes | The total size of CMA memory on the node (in bytes). |
node_memory_CommitLimit_bytes | The commit limit of memory on the node (in bytes). |
node_memory_Committed_AS_bytes | The committed address space of memory on the node (in bytes). |
node_memory_DirectMap1G_bytes | The size of 1 GB direct map memory on the node (in bytes). |
node_memory_DirectMap2M_bytes | The size of 2 MB direct map memory on the node (in bytes). |
node_memory_DirectMap4k_bytes | The size of 4 KB direct map memory on the node (in bytes). |
node_memory_Dirty_bytes | The size of dirty memory on the node (in bytes). |
node_memory_DupText_bytes | The size of duplicate text memory on the node (in bytes). |
node_memory_FileHugePages_bytes | The size of file huge pages memory on the node (in bytes). |
node_memory_FilePmdMapped_bytes | The size of file-backed memory mapped into user space with huge pages on the node (in bytes). |
node_memory_HardwareCorrupted_bytes | The size of hardware corrupted memory on the node (in bytes). |
node_memory_HugePages_Free | The number of free huge pages on the node. |
node_memory_HugePages_Rsvd | The number of reserved huge pages on the node. |
node_memory_HugePages_Surp | The number of surplus huge pages on the node. |
node_memory_HugePages_Total | The total number of huge pages on the node. |
node_memory_Hugepagesize_bytes | The size of huge pages on the node (in bytes). |
node_memory_Hugetlb_bytes | The size of Hugetlb memory on the node (in bytes). |
node_memory_Inactive_anon_bytes | The size of inactive anonymous memory on the node (in bytes). |
node_memory_Inactive_bytes | The size of inactive memory on the node (in bytes). |
node_memory_Inactive_file_bytes | The size of inactive file memory on the node (in bytes). |
node_memory_KernelStack_bytes | The size of KernelStack memory on the node (in bytes). |
node_memory_KReclaimable_bytes | The size of KReclaimable memory on the node (in bytes). |
node_memory_Mapped_bytes | The size of mapped memory on the node (in bytes). |
node_memory_MemAvailable_bytes | The size of available memory on the node (in bytes). |
node_memory_MemFree_bytes | The size of free memory on the node (in bytes). |
node_memory_MemTotal_bytes | The total size of memory on the node (in bytes). |
node_memory_MemZeroed_bytes | The size of zeroed memory on the node (in bytes). |
node_memory_Mlocked_bytes | The size of locked memory on the node (in bytes). |
node_memory_NFS_Unstable_bytes | The size of unstable NFS memory on the node (in bytes). |
node_memory_PageTables_bytes | The size of page table memory on the node (in bytes). |
node_memory_Percpu_bytes | The size of per-CPU memory on the node (in bytes). |
node_memory_Shmem_bytes | The size of shared memory on the node (in bytes). |
node_memory_ShmemHugePages_bytes | The size of shared huge pages memory on the node (in bytes). |
node_memory_ShmemPmdMapped_bytes | The size of shared memory page middle directory (PMD) mapping on the node (in bytes). |
node_memory_Slab_bytes | The size of Slab memory on the node (in bytes). |
node_memory_SReclaimable_bytes | The size of SReclaimable memory on the node (in bytes). |
node_memory_SUnreclaim_bytes | The size of SUnreclaim memory on the node (in bytes). |
node_memory_SwapCached_bytes | The size of cached swap space on the node (in bytes). |
node_memory_SwapFree_bytes | The size of free swap space on the node (in bytes). |
node_memory_SwapTotal_bytes | The total size of swap space on the node (in bytes). |
node_memory_Unevictable_bytes | The size of unevictable memory on the node (in bytes). |
node_memory_VmallocChunk_bytes | The size of vmallocChunk memory on the node (in bytes). |
node_memory_VmallocTotal_bytes | The total size of vmalloc memory on the node (in bytes). |
node_memory_VmallocUsed_bytes | The size of used vmalloc memory on the node (in bytes). |
node_memory_Writeback_bytes | The size of writeback memory on the node (in bytes). |
node_memory_WritebackTmp_bytes | The size of temporary writeback memory on the node (in bytes). |
node_netstat_Icmp_InErrors | The number of Internet Control Message Protocol (ICMP) receive errors on the node. |
node_netstat_Icmp_InMsgs | The number of received ICMP messages. |
node_netstat_Icmp_OutMsgs | The number of sent ICMP messages. |
node_netstat_Icmp6_InErrors | The number of ICMPv6 receive errors. |
node_netstat_Icmp6_InMsgs | The number of ICMPv6 messages received. |
node_netstat_Icmp6_OutMsgs | The number of ICMPv6 messages sent. |
node_netstat_Ip_Forwarding | The status of IP forwarding. |
node_netstat_Ip6_InOctets | The number of bytes received over IPv6. |
node_netstat_Ip6_OutOctets | The number of bytes sent over IPv6. |
node_netstat_IpExt_InOctets | The number of bytes received for IP extended statistics. |
node_netstat_IpExt_OutOctets | The number of bytes sent for IP extended statistics. |
node_netstat_Tcp_ActiveOpens | The number of active TCP connections opened. |
node_netstat_Tcp_CurrEstab | The current number of established TCP connections. |
node_netstat_Tcp_InErrs | The number of TCP receive errors. |
node_netstat_Tcp_InSegs | The number of TCP segments received. |
node_netstat_Tcp_OutRsts | The number of TCP resets sent. |
node_netstat_Tcp_OutSegs | The number of TCP segments sent. |
node_netstat_Tcp_PassiveOpens | The number of passive TCP connections opened. |
node_netstat_Tcp_RetransSegs | The number of TCP segments retransmitted. |
node_netstat_TcpExt_ListenDrops | The number of TCP connections dropped from the listen queue. |
node_netstat_TcpExt_ListenOverflows | The number of times the listen queue overflowed. |
node_netstat_TcpExt_SyncookiesFailed | The number of received SYN cookies that failed validation. |
node_netstat_TcpExt_SyncookiesRecv | The number of SYN cookies received. |
node_netstat_TcpExt_SyncookiesSent | The number of SYN cookies sent. |
node_netstat_TcpExt_TCPOFOQueue | The number of packets queued in the TCP out-of-order (OFO) queue. |
node_netstat_TcpExt_TCPSynRetrans | The number of TCP SYN retransmissions. |
node_netstat_TcpExt_TCPTimeouts | The number of TCP timeouts. |
node_netstat_Udp_InDatagrams | The number of UDP datagrams received. |
node_netstat_Udp_InErrors | The number of UDP receive errors. |
node_netstat_Udp_NoPorts | The number of UDP packets with unreachable destination ports. |
node_netstat_Udp_OutDatagrams | The number of UDP datagrams sent. |
node_netstat_Udp_RcvbufErrors | The number of UDP receive buffer errors. |
node_netstat_Udp_SndbufErrors | The number of UDP send buffer errors. |
node_netstat_Udp6_InDatagrams | The number of IPv6 UDP datagrams received. |
node_netstat_Udp6_InErrors | The number of IPv6 UDP receive errors. |
node_netstat_Udp6_NoPorts | The number of IPv6 UDP packets with unreachable destination ports. |
node_netstat_Udp6_OutDatagrams | The number of IPv6 UDP datagrams sent. |
node_netstat_Udp6_RcvbufErrors | The number of IPv6 UDP receive buffer errors. |
node_netstat_Udp6_SndbufErrors | The number of IPv6 UDP send buffer errors. |
node_netstat_UdpLite_InErrors | The number of UDP Lite receive errors. |
node_netstat_UdpLite6_InErrors | The number of IPv6 UDP Lite receive errors. |
node_network_address_assign_type | The assignment type of the network address. |
node_network_carrier | The carrier state of the network interface. |
node_network_carrier_changes_total | The total number of network carrier state changes. |
node_network_carrier_down_changes_total | The total number of network carrier changes to the down state. |
node_network_carrier_up_changes_total | The total number of network carrier changes to the up state. |
node_network_device_id | The device ID of the network interface. |
node_network_dormant | The status of network dormancy. |
node_network_flags | The network flags. |
node_network_iface_id | The network interface ID. |
node_network_iface_link | The link state of the network interface. |
node_network_iface_link_mode | The link mode of the network interface. |
node_network_info | The information about the network interface. |
node_network_mtu_bytes | The maximum transmission unit size in bytes on the network. |
node_network_name_assign_type | The assignment type of the network name. |
node_network_net_dev_group | The network device group to which the network device belongs. |
node_network_protocol_type | The network protocol type. |
node_network_receive_bytes_total | The total number of bytes received cumulatively. |
node_network_receive_compressed_total | The total number of compressed packets received. |
node_network_receive_drop_total | The total number of packets dropped while receiving. |
node_network_receive_errs_total | The total number of receive errors. |
node_network_receive_fifo_total | The total number of first-in, first-out (FIFO) buffer errors while receiving. |
node_network_receive_frame_total | The total number of frame alignment errors while receiving. |
node_network_receive_multicast_total | The total number of multicast packets received. |
node_network_receive_nohandler_total | The total number of receptions without a handler. |
node_network_receive_packets_total | The total number of packets received. |
node_network_speed_bytes | The network speed in bytes per second. |
node_network_transmit_bytes_total | The total number of bytes sent cumulatively. |
node_network_transmit_carrier_total | The total number of carrier errors while sending, such as loss of the link carrier signal. |
node_network_transmit_colls_total | The total number of transmission collisions. |
node_network_transmit_compressed_total | The total number of compressed packets sent. |
node_network_transmit_drop_total | The total number of packets sent but dropped. |
node_network_transmit_errs_total | The total number of send errors. |
node_network_transmit_fifo_total | The total number of FIFO buffer errors while sending. |
node_network_transmit_packets_total | The total number of packets sent. |
node_network_transmit_queue_length | The length of the send queue. |
node_network_up | Indicates whether the network interface is enabled. |
node_nf_conntrack_entries | The number of entries in the connection tracking table. |
node_nf_conntrack_entries_limit | The limit of entries in the connection tracking table. |
node_nf_conntrack_stat_drop | The number of packets dropped due to connection tracking failures. |
node_nf_conntrack_stat_early_drop | The early drop count for connection tracking. |
node_nf_conntrack_stat_found | The number of successful lookups for connection tracking. |
node_nf_conntrack_stat_ignore | The ignore count for connection tracking. |
node_nf_conntrack_stat_insert | The insert count for connection tracking. |
node_nf_conntrack_stat_insert_failed | The insert failure count for connection tracking. |
node_nf_conntrack_stat_invalid | The invalid count for connection tracking. |
node_nf_conntrack_stat_search_restart | The search restart count for connection tracking. |
node_nfs_connections_total | The total number of NFS connections. |
node_nfs_packets_total | The total number of NFS packets. |
node_nfs_requests_total | The total number of NFS requests. |
node_nfs_rpc_authentication_refreshes_total | The total number of NFS Remote Procedure Call (RPC) authentication refreshes. |
node_nfs_rpc_retransmissions_total | The total number of NFS RPC retransmissions. |
node_nfs_rpcs_total | The total number of NFS RPCs. |
node_nfsd_connections_total | The total number of connections to the NFS server. |
node_nfsd_disk_bytes_read_total | The total number of bytes read from the disk by the NFS server. |
node_nfsd_disk_bytes_written_total | The total number of bytes written to the disk by the NFS server. |
node_nfsd_file_handles_stale_total | The total number of stale file handles on the NFS server. |
node_nfsd_packets_total | The total number of packets processed by the NFS server. |
node_nfsd_read_ahead_cache_not_found_total | The total number of times the read-ahead cache of the NFS server was not found. |
node_nfsd_read_ahead_cache_size_blocks | The size of the read-ahead cache of the NFS server in blocks. |
node_nfsd_reply_cache_hits_total | The total number of hits in the NFS server reply cache. |
node_nfsd_reply_cache_misses_total | The total number of misses in the NFS server reply cache. |
node_nfsd_reply_cache_nocache_total | The total number of no-cache situations in the NFS server reply cache. |
node_nfsd_requests_total | The total number of requests to the NFS server. |
node_nfsd_rpc_errors_total | The total number of RPC errors on the NFS server. |
node_nfsd_server_rpcs_total | The total number of RPCs processed by the NFS server. |
node_nfsd_server_threads | The number of threads on the NFS server. |
node_nvme_info | The information about Non-Volatile Memory Express (NVMe). |
node_os_info | The information about the operating system. |
node_os_version | The version of the operating system. |
node_pressure_cpu_waiting_seconds_total | The total seconds the CPU has spent waiting under pressure. |
node_pressure_io_stalled_seconds_total | The total seconds the I/O has been stalled under pressure. |
node_pressure_io_waiting_seconds_total | The total seconds the I/O has spent waiting under pressure. |
node_pressure_memory_stalled_seconds_total | The total seconds memory has been stalled under pressure. |
node_pressure_memory_waiting_seconds_total | The total seconds memory has spent waiting under pressure. |
node_processes_max_processes | The maximum number of processes. |
node_processes_max_threads | The maximum number of threads. |
node_processes_pids | The number of process IDs. |
node_processes_state | The distribution of process states. |
node_processes_threads | The number of threads. |
node_procs_blocked | The number of blocked processes. |
node_procs_running | The number of running processes. |
node_schedstat_running_seconds_total | The total time that the CPUs spent running processes in seconds. |
node_schedstat_timeslices_total | The total number of time slices executed by the CPUs. |
node_schedstat_waiting_seconds_total | The total time that processes spent waiting for a CPU in seconds. |
node_scrape_collector_duration_seconds | The duration of the scrape collector in seconds. |
node_scrape_collector_success | Indicates whether the collector succeeded. |
node_selinux_enabled | Indicates whether Security-Enhanced Linux (SELinux) is enabled. |
node_sockstat_FRAG_inuse | The number of FRAG sockets in use. |
node_sockstat_FRAG_memory | The amount of memory occupied by FRAG sockets. |
node_sockstat_FRAG6_inuse | The number of FRAG6 sockets in use. |
node_sockstat_FRAG6_memory | The amount of memory occupied by FRAG6 sockets. |
node_sockstat_RAW_inuse | The number of RAW sockets in use. |
node_sockstat_RAW6_inuse | The number of RAW6 sockets in use. |
node_sockstat_sockets_used | The total number of sockets in use. |
node_sockstat_TCP_alloc | The number of TCP sockets allocated. |
node_sockstat_TCP_inuse | The number of TCP sockets in use. |
node_sockstat_TCP_mem | The amount of memory used by TCP sockets. |
node_sockstat_TCP_mem_bytes | The number of bytes of memory used by TCP sockets. |
node_sockstat_TCP_orphan | The number of orphaned TCP sockets. |
node_sockstat_TCP_tw | The number of TCP sockets in the TIME_WAIT state. |
node_sockstat_TCP6_inuse | The number of TCP6 sockets in use. |
node_sockstat_UDP_inuse | The number of UDP sockets in use. |
node_sockstat_UDP_mem | The amount of memory used by UDP sockets. |
node_sockstat_UDP_mem_bytes | The number of bytes of memory used by UDP sockets. |
node_sockstat_UDP6_inuse | The number of IPv6 UDP sockets in use. |
node_sockstat_UDPLITE_inuse | The number of UDP-Lite sockets in use. |
node_sockstat_UDPLITE6_inuse | The number of UDP-Lite6 sockets in use. |
node_softnet_backlog_len | The length of the soft interrupt queue. |
node_softnet_cpu_collision_total | The total number of CPU collisions in soft interrupts. |
node_softnet_dropped_total | The total number of soft interrupts dropped. |
node_softnet_flow_limit_count_total | The total number of flow limit counts in soft interrupts. |
node_softnet_processed_total | The total number of soft interrupts processed. |
node_softnet_received_rps_total | The total number of times that a CPU was woken up to process packets through Receive Packet Steering (RPS). |
node_softnet_times_squeezed_total | The total number of times soft interrupts were squeezed. |
node_textfile_scrape_error | Indicates whether an error occurred while a text file was opened or read. |
node_thermal_zone_temp | The temperature of the thermal zone. |
node_time_clocksource_available_info | The available clock source information. |
node_time_clocksource_current_info | The information about the current clock source. |
node_time_seconds | The current system time, as a UNIX timestamp in seconds. |
node_time_zone_offset_seconds | The time zone offset in seconds. |
node_timex_estimated_error_seconds | The estimated time error in seconds. |
node_timex_frequency_adjustment_ratio | The frequency adjustment ratio of the system clock. |
node_timex_loop_time_constant | The time adjustment loop constant. |
node_timex_maxerror_seconds | The maximum error in seconds. |
node_timex_offset_seconds | The time offset in seconds. |
node_timex_pps_calibration_total | The total number of pulse per second (PPS) calibrations. |
node_timex_pps_error_total | The total number of PPS errors. |
node_timex_pps_frequency_hertz | The PPS frequency in Hz. |
node_timex_pps_jitter_seconds | The PPS jitter in seconds. |
node_timex_pps_jitter_total | The cumulative PPS jitter. |
node_timex_pps_shift_seconds | The PPS offset in seconds. |
node_timex_pps_stability_exceeded_total | The number of times PPS stability exceeded limits. |
node_timex_pps_stability_hertz | The PPS stability frequency in Hz. |
node_timex_status | The status of clock time adjustments. |
node_timex_sync_status | The synchronization status of the clock. |
node_timex_tai_offset_seconds | The International Atomic Time (TAI) offset in seconds. |
node_timex_tick_seconds | The tick interval of the clock in seconds. |
node_udp_queues | The statistics of UDP queues. |
node_uname_info | The system information (uname). |
node_vmstat_oom_kill | The number of out-of-memory (OOM) kills in VM statistics. |
node_vmstat_pgfault | The number of page faults in VM statistics. |
node_vmstat_pgmajfault | The number of major page faults in VM statistics. |
node_vmstat_pgpgin | The number of page ins in VM statistics. |
node_vmstat_pgpgout | The number of page outs in VM statistics. |
node_vmstat_pswpin | The number of swap page ins in VM statistics. |
node_vmstat_pswpout | The number of swap page outs in VM statistics. |
node_xfs_allocation_btree_compares_total | The total number of B-tree comparisons for XFS allocation. |
node_xfs_allocation_btree_lookups_total | The total number of B-tree lookups for XFS allocation. |
node_xfs_allocation_btree_records_deleted_total | The total number of B-tree records deleted for XFS allocation. |
node_xfs_allocation_btree_records_inserted_total | The total number of B-tree records inserted for XFS allocation. |
node_xfs_block_map_btree_compares_total | The total number of B-tree comparisons for XFS block mapping. |
node_xfs_block_map_btree_lookups_total | The total number of B-tree lookups for XFS block mapping. |
node_xfs_block_map_btree_records_deleted_total | The total number of B-tree records deleted for XFS block mapping. |
node_xfs_block_map_btree_records_inserted_total | The total number of B-tree records inserted for XFS block mapping. |
node_xfs_block_mapping_extent_list_compares_total | The total number of extent list comparisons for XFS block mapping. |
node_xfs_block_mapping_extent_list_deletions_total | The total number of extent list deletions for XFS block mapping. |
node_xfs_block_mapping_extent_list_insertions_total | The total number of extent list insertions for XFS block mapping. |
node_xfs_block_mapping_extent_list_lookups_total | The total number of extent list lookups for XFS block mapping. |
node_xfs_block_mapping_reads_total | The total number of reads for XFS block mapping. |
node_xfs_block_mapping_unmaps_total | The total number of unmappings for XFS block mapping. |
node_xfs_block_mapping_writes_total | The total number of writes for XFS block mapping. |
node_xfs_directory_operation_create_total | The total number of directory creation operations in XFS. |
node_xfs_directory_operation_getdents_total | The total number of directory entry retrieval operations in XFS. |
node_xfs_directory_operation_lookup_total | The total number of directory lookup operations in XFS. |
node_xfs_directory_operation_remove_total | The total number of directory removal operations in XFS. |
node_xfs_extent_allocation_blocks_allocated_total | The total number of blocks allocated in XFS. |
node_xfs_extent_allocation_blocks_freed_total | The total number of blocks freed in XFS. |
node_xfs_extent_allocation_extents_allocated_total | The total number of extents allocated in XFS. |
node_xfs_extent_allocation_extents_freed_total | The total number of extents freed in XFS. |
node_xfs_inode_operation_attempts_total | The total number of attempts at inode operations in XFS. |
node_xfs_inode_operation_attribute_changes_total | The total number of attribute change operations on inodes in XFS. |
node_xfs_inode_operation_duplicates_total | The total number of duplicate operations on inodes in XFS. |
node_xfs_inode_operation_found_total | The total number of hits in inode operations in XFS. |
node_xfs_inode_operation_missed_total | The total number of misses in inode operations in XFS. |
node_xfs_inode_operation_reclaims_total | The total number of reclaim operations on inodes in XFS. |
node_xfs_inode_operation_recycled_total | The total number of reuse operations on inodes in XFS. |
node_xfs_read_calls_total | The total number of read calls in XFS. |
node_xfs_vnode_active_total | The total number of active vnodes in XFS. |
node_xfs_vnode_allocate_total | The total number of vnode allocations in XFS. |
node_xfs_vnode_get_total | The total number of vnode retrievals in XFS. |
node_xfs_vnode_hold_total | The total number of vnodes held in XFS. |
node_xfs_vnode_reclaim_total | The total number of vnodes reclaimed in XFS. |
node_xfs_vnode_release_total | The total number of vnodes released in XFS. |
node_xfs_vnode_remove_total | The total number of vnodes removed in XFS. |
node_xfs_write_calls_total | The total number of write calls in XFS. |
process_cpu_seconds_total | The total user and system CPU time consumed by the process in seconds. |
process_max_fds | The maximum number of file descriptors for the process. |
process_open_fds | The number of file descriptors opened by the process. |
process_resident_memory_bytes | The resident memory size of the process in bytes. |
process_start_time_seconds | The time at which the process started, as a UNIX timestamp in seconds. |
process_virtual_memory_bytes | The number of virtual memory bytes for the process. |
process_virtual_memory_max_bytes | The maximum number of virtual memory bytes for the process. |
promhttp_metric_handler_errors_total | The total number of errors from the Prometheus HTTP metric handler. |
promhttp_metric_handler_requests_in_flight | The current number of requests being handled by the Prometheus HTTP metric handler. |
promhttp_metric_handler_requests_total | The total number of requests handled by the Prometheus HTTP metric handler. |
scrape_duration_seconds | The scrape duration in seconds. |
scrape_samples_post_metric_relabeling | The number of scraped samples after metric relabeling. |
scrape_samples_scraped | The number of scraped samples. |
scrape_series_added | The number of new series added during the scrape. |
up | Indicates whether the scrape target is reachable. Valid values: 1 (reachable) and 0 (unreachable). |
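As an illustration of how two metrics in the preceding table can be combined, the following Python sketch derives file descriptor utilization from process_open_fds and process_max_fds. The function name and the sample values are hypothetical and are not part of Managed Service for Prometheus.

```python
def fd_utilization(open_fds: float, max_fds: float) -> float:
    """Return the fraction of the file descriptor limit that is in use."""
    if max_fds <= 0:
        raise ValueError("max_fds must be positive")
    return open_fds / max_fds


# Hypothetical sample values for process_open_fds and process_max_fds.
ratio = fd_utilization(open_fds=256, max_fds=1024)
print(f"File descriptor utilization: {ratio:.2%}")
```

A value close to 1 suggests that the process is approaching its file descriptor limit.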
kube-state-metrics (job name: _kube-state-metrics)
Metric | Description |
kube_configmap_info | The information about the ConfigMap. |
kube_cronjob_annotations | The annotations of the Kubernetes CronJob. |
kube_cronjob_created | The creation time of the Kubernetes CronJob. |
kube_cronjob_info | The information about the Kubernetes CronJob. |
kube_cronjob_labels | The labels of the Kubernetes CronJob. |
kube_cronjob_metadata_resource_version | The metadata resource version of the Kubernetes CronJob. |
kube_cronjob_next_schedule_time | The next schedule time of the Kubernetes CronJob. |
kube_cronjob_spec_failed_job_history_limit | The failed job history limit of the Kubernetes CronJob. |
kube_cronjob_spec_starting_deadline_seconds | The starting deadline seconds of the Kubernetes CronJob. |
kube_cronjob_spec_successful_job_history_limit | The successful job history limit of the Kubernetes CronJob. |
kube_cronjob_spec_suspend | The suspend status of the Kubernetes CronJob. |
kube_cronjob_status_active | The number of active jobs of the Kubernetes CronJob. |
kube_cronjob_status_last_schedule_time | The last schedule time of the Kubernetes CronJob. |
kube_cronjob_status_last_successful_time | The last successful execution time of the Kubernetes CronJob. |
kube_daemonset_created | The creation time of the Kubernetes DaemonSet. |
kube_daemonset_status_current_number_scheduled | The current number of scheduled nodes for the Kubernetes DaemonSet. |
kube_daemonset_status_desired_number_scheduled | The desired number of scheduled nodes for the Kubernetes DaemonSet. |
kube_daemonset_status_number_available | The number of available nodes in the Kubernetes DaemonSet. |
kube_daemonset_status_number_misscheduled | The number of nodes that run a Pod of the Kubernetes DaemonSet but are not supposed to run it. |
kube_daemonset_status_number_ready | The number of ready nodes in the Kubernetes DaemonSet. |
kube_daemonset_status_number_unavailable | The number of unavailable nodes in the Kubernetes DaemonSet. |
kube_daemonset_status_updated_number_scheduled | The number of nodes that run the updated Pod of the Kubernetes DaemonSet. |
kube_daemonset_updated_number_scheduled | The number of nodes that run the updated Pod of the Kubernetes DaemonSet. |
kube_deployment_created | The creation time of the Kubernetes Deployment. |
kube_deployment_labels | The labels of the Kubernetes Deployment. |
kube_deployment_metadata_generation | The metadata generation of the Kubernetes Deployment. |
kube_deployment_spec_replicas | The number of replicas specified in the Kubernetes Deployment. |
kube_deployment_spec_strategy_rollingupdate_max_unavailable | The maximum number of unavailable pods during rolling update of the Kubernetes Deployment. |
kube_deployment_status_observed_generation | The observed generation of the Kubernetes Deployment. |
kube_deployment_status_replicas | The total number of replicas in the Kubernetes Deployment. |
kube_deployment_status_replicas_available | The number of available replicas in the Kubernetes Deployment. |
kube_deployment_status_replicas_ready | The number of ready replicas in the Kubernetes Deployment. |
kube_deployment_status_replicas_unavailable | The number of unavailable replicas in the Kubernetes Deployment. |
kube_deployment_status_replicas_updated | The number of updated replicas in the Kubernetes Deployment. |
kube_horizontalpodautoscaler_info | The information about the Kubernetes HorizontalPodAutoscaler. |
kube_horizontalpodautoscaler_labels | The labels of the Kubernetes HorizontalPodAutoscaler. |
kube_horizontalpodautoscaler_metadata_generation | The metadata generation of the Kubernetes HorizontalPodAutoscaler. |
kube_horizontalpodautoscaler_spec_max_replicas | The maximum number of replicas specified in the Kubernetes HorizontalPodAutoscaler. |
kube_horizontalpodautoscaler_spec_min_replicas | The minimum number of replicas specified in the Kubernetes HorizontalPodAutoscaler. |
kube_horizontalpodautoscaler_spec_target_metric | The target metrics of the Kubernetes HorizontalPodAutoscaler. |
kube_horizontalpodautoscaler_status_condition | The status conditions of the Kubernetes HorizontalPodAutoscaler. |
kube_horizontalpodautoscaler_status_current_replicas | The current number of replicas in the Kubernetes HorizontalPodAutoscaler. |
kube_horizontalpodautoscaler_status_desired_replicas | The desired number of replicas in the Kubernetes HorizontalPodAutoscaler. |
kube_hpa_labels | The labels of the Kubernetes HorizontalPodAutoscaler. |
kube_hpa_metadata_generation | The metadata generation of the Kubernetes HorizontalPodAutoscaler. |
kube_hpa_spec_max_replicas | The maximum number of replicas specified in the Kubernetes HorizontalPodAutoscaler. |
kube_hpa_spec_min_replicas | The minimum number of replicas specified in the Kubernetes HorizontalPodAutoscaler. |
kube_hpa_spec_target_metric | The target metrics of the Kubernetes HorizontalPodAutoscaler. |
kube_hpa_status_condition | The status conditions of the Kubernetes HorizontalPodAutoscaler. |
kube_hpa_status_current_replicas | The current number of replicas in the Kubernetes HorizontalPodAutoscaler. |
kube_hpa_status_desired_replicas | The desired number of replicas in the Kubernetes HorizontalPodAutoscaler. |
kube_ingress_info | The information about the Ingress. |
kube_job_created | The creation time of the Job. |
kube_job_failed | Indicates whether the Job has failed its execution. |
kube_job_info | The information about the Job. |
kube_job_spec_completions | The desired number of successfully finished Pods for the Job. |
kube_job_status_active | The number of actively running Pods of the Job. |
kube_job_status_failed | The number of failed Pods of the Job. |
kube_job_status_succeeded | The number of succeeded Pods of the Job. |
kube_namespace_created | The creation time of the namespace. |
kube_namespace_labels | The labels of the namespace. |
kube_namespace_status_phase | The phase of the namespace status. |
kube_node_info | The information about the node. |
kube_node_labels | The labels of the node. |
kube_node_spec_taint | The taint configurations of the node. |
kube_node_spec_unschedulable | The unschedulable flag of the node. |
kube_node_status_allocatable | The allocatable resources of the node. |
kube_node_status_allocatable_cpu_cores | The allocatable CPU cores of the node. |
kube_node_status_allocatable_memory_bytes | The allocatable memory bytes of the node. |
kube_node_status_allocatable_pods | The allocatable number of Pods on the node. |
kube_node_status_capacity | The capacity of the node. |
kube_node_status_capacity_cpu_cores | The CPU capacity of the node in cores. |
kube_node_status_capacity_memory_bytes | The memory capacity of the node in bytes. |
kube_node_status_capacity_pods | The maximum number of Pods that the node can hold. |
kube_node_status_condition | The status conditions of the node. |
kube_persistentvolume_status_phase | The phase of the PersistentVolume (PV) status. |
kube_persistentvolumeclaim_info | The information about the PersistentVolumeClaim (PVC). |
kube_persistentvolumeclaim_resource_requests_storage_bytes | The storage resource request of the PVC. |
kube_persistentvolumeclaim_status_phase | The phase of the PVC status. |
kube_pod_completion_time | The completion time of the Pod. |
kube_pod_container_info | The information about the Pod container. |
kube_pod_container_resource_limits | The resource limit of the Pod container. |
kube_pod_container_resource_limits_cpu_cores | The CPU core limit of the Pod container. |
kube_pod_container_resource_limits_memory_bytes | The memory byte limit of the Pod container. |
kube_pod_container_resource_requests | The resource requests of the Pod container. |
kube_pod_container_resource_requests_cpu_cores | The CPU core requests of the Pod container. |
kube_pod_container_resource_requests_memory_bytes | The memory byte requests of the Pod container. |
kube_pod_container_status_last_terminated_reason | The last termination reason of the Pod container. |
kube_pod_container_status_ready | The ready status of the Pod container. |
kube_pod_container_status_restarts_total | The total number of restarts for the Pod container. |
kube_pod_container_status_running | The running status of the Pod container. |
kube_pod_container_status_terminated | The terminated status of the Pod container. |
kube_pod_container_status_terminated_reason | The termination reason of the Pod container. |
kube_pod_container_status_waiting | The waiting status of the Pod container. |
kube_pod_container_status_waiting_reason | The waiting reason of the Pod container. |
kube_pod_created | The creation time of the Pod. |
kube_pod_deletion_timestamp | The deletion timestamp of the Pod. |
kube_pod_info | The information about the Pod. |
kube_pod_labels | The labels of the Pod. |
kube_pod_owner | The owner of the Pod. |
kube_pod_start_time | The start time of the Pod. |
kube_pod_status_container_ready_time | The time when the containers of the Pod became ready. |
kube_pod_status_initialized_time | The initialization completion time of the Pod status. |
kube_pod_status_phase | The phase of the Pod status. |
kube_pod_status_ready | The ready status of the Pod. |
kube_pod_status_ready_time | The ready time of the Pod. |
kube_pod_status_reason | The reason for the Pod status. |
kube_pod_status_scheduled_time | The scheduling time of the Pod. |
kube_pod_status_unschedulable | The unschedulable flag of the Pod. |
kube_replicaset_owner | The owner of the ReplicaSet. |
kube_replicaset_status_ready_replicas | The number of ready replicas in the ReplicaSet. |
kube_resource_relationship | The relationships between resources. |
kube_resourcequota | The resource quota. |
kube_resourcequota_created | The creation time of the resource quota. |
kube_secret_info | The information about the secret. |
kube_service_info | The information about the service. |
kube_service_spec_type | The type specification of the service. |
kube_service_status_load_balancer_ingress | The load balancer ingress information of the service status. |
kube_statefulset_created | The creation time of the StatefulSet. |
kube_statefulset_metadata_generation | The metadata generation of the StatefulSet. |
kube_statefulset_replicas | The number of replicas in the StatefulSet. |
kube_statefulset_status_replicas | The number of replicas reported in the status of the StatefulSet. |
kube_statefulset_status_replicas_available | The number of available replicas reported in the status of the StatefulSet. |
kube_statefulset_status_replicas_ready | The number of ready replicas reported in the status of the StatefulSet. |
kube_statefulset_status_replicas_updated | The number of updated replicas reported in the status of the StatefulSet. |
process_cpu_seconds_total | The total number of CPU seconds used by the process. |
process_resident_memory_bytes | The resident memory size of the process in bytes. |
rest_client_requests_total | The number of REST client requests. |
up | Indicates whether the scrape target is reachable. Valid values: 1 (reachable) and 0 (unreachable). |
workqueue_adds_total | The total number of additions to the work queue. |
workqueue_depth | The work queue depth. |
workqueue_queue_duration_seconds_bucket | The distribution of queue duration in seconds for the work queue. |
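The replica metrics in the preceding table are typically compared with each other. The following Python sketch shows one way to flag Deployments whose available replicas fall below the specified replicas, based on the values of kube_deployment_spec_replicas and kube_deployment_status_replicas_available. The function name, the dictionary layout, and the sample data are hypothetical and serve only as an illustration.

```python
def under_replicated(spec_replicas: dict, available_replicas: dict) -> list:
    """Return the sorted names of Deployments with fewer available replicas than specified.

    spec_replicas:      Deployment name -> value of kube_deployment_spec_replicas
    available_replicas: Deployment name -> value of kube_deployment_status_replicas_available
    A Deployment that is missing from available_replicas is treated as having 0.
    """
    return sorted(
        name
        for name, desired in spec_replicas.items()
        if available_replicas.get(name, 0) < desired
    )


# Hypothetical sample values keyed by Deployment name.
spec = {"web": 3, "api": 2, "worker": 1}
available = {"web": 3, "api": 1}
print(under_replicated(spec, available))
```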
kube-events (job name: _arms/kube-event)
Metric | Description |
aliyun_prometheus_agent_append_duration_seconds | The duration of the Prometheus agent append operations in seconds. |
aliyun_prometheus_agent_job_discovery_status | The discovery status of the Prometheus agent collection jobs. |
aliyun_prometheus_agent_scrape_custom_error | The number of custom collection errors of the Prometheus agent. |
aliyun_prometheus_agent_scrapes_by_target_total | The total number of scrapes by the Prometheus agent per target. |
aliyun_prometheus_agent_target_info | The target information of the Prometheus agent. |
eventer_events_error_total | The total number of event processing errors. |
eventer_events_normal_total | The total number of normal events. |
eventer_events_warning_total | The total number of warning events. |
eventer_exporter_duration_milliseconds_count | The count of samples for exporter duration in milliseconds. |
eventer_exporter_duration_milliseconds_sum | The sum of exporter duration in milliseconds. |
eventer_manager_last_time_seconds | The last operation time of the event manager in seconds. |
eventer_scraper_duration_milliseconds_count | The count of scraper duration in milliseconds. |
eventer_scraper_duration_milliseconds_sum | The sum of scraper duration in milliseconds. |
eventer_scraper_events_total_number | The total number of events scraped. |
eventer_scraper_last_time_seconds | The last execution time of the scraper in seconds. |
go_gc_duration_seconds | The Go GC pause duration in seconds. |
go_gc_duration_seconds_count | The number of Go GC pauses observed. |
go_gc_duration_seconds_sum | The total Go GC pause duration in seconds. |
go_goroutines | The number of goroutines. |
go_info | The Go-specific information. |
go_memstats_alloc_bytes | The amount of memory allocated in bytes. |
go_memstats_alloc_bytes_total | The cumulative amount of memory allocated in bytes. |
go_memstats_buck_hash_sys_bytes | The amount of memory used by the profiling bucket hash table in bytes. |
go_memstats_frees_total | The total number of heap object frees. |
go_memstats_gc_cpu_fraction | The fraction of the available CPU time of the program that is used by GC since the program started. The value ranges from 0 to 1. |
go_memstats_gc_sys_bytes | The amount of memory used for GC system metadata in bytes. |
go_memstats_heap_alloc_bytes | The amount of heap memory allocated in bytes. |
go_memstats_heap_idle_bytes | The amount of idle heap memory in bytes. |
go_memstats_heap_inuse_bytes | The amount of heap memory in use in bytes. |
go_memstats_heap_objects | The number of objects allocated on the heap. |
go_memstats_heap_released_bytes | The amount of heap memory released in bytes. |
go_memstats_heap_sys_bytes | The amount of memory allocated to the heap by the operating system in bytes. |
go_memstats_last_gc_time_seconds | The time of the last GC, in seconds since the Unix epoch. |
go_memstats_lookups_total | The total number of lookups. |
go_memstats_mallocs_total | The total number of allocations. |
go_memstats_mcache_inuse_bytes | The amount of memory in use in mcache in bytes. |
go_memstats_mcache_sys_bytes | The amount of memory allocated to mcache by the operating system in bytes. |
go_memstats_mspan_inuse_bytes | The amount of memory in use in mspan in bytes. |
go_memstats_mspan_sys_bytes | The amount of memory allocated to mspan by the operating system in bytes. |
go_memstats_next_gc_bytes | The heap size, in bytes, at which the next GC is triggered. |
go_memstats_other_sys_bytes | The amount of memory allocated for other purposes by the operating system in bytes. |
go_memstats_stack_inuse_bytes | The amount of stack memory in use in bytes. |
go_memstats_stack_sys_bytes | The amount of memory allocated to the stack by the operating system in bytes. |
go_memstats_sys_bytes | The total memory allocated by the operating system in bytes. |
go_threads | The number of threads. |
process_cpu_seconds_total | The total user and system CPU time consumed by the process in seconds. |
process_max_fds | The maximum number of file descriptors for the process. |
process_open_fds | The number of file descriptors opened by the process. |
process_resident_memory_bytes | The resident memory size of the process in bytes. |
process_start_time_seconds | The start time of the process, in seconds since the Unix epoch. |
process_virtual_memory_bytes | The number of virtual memory bytes for the process. |
process_virtual_memory_max_bytes | The maximum number of virtual memory bytes for the process. |
promhttp_metric_handler_requests_in_flight | The current number of requests being handled by the Prometheus HTTP metric handler. |
promhttp_metric_handler_requests_total | The total number of requests handled by the Prometheus HTTP metric handler. |
scrape_duration_seconds | The scrape duration in seconds. |
scrape_samples_post_metric_relabeling | The number of scraped samples after metric relabeling. |
scrape_samples_scraped | The number of scraped samples. |
scrape_series_added | The number of new series added during the scrape. |
up | Indicates whether the scrape target is reachable. Valid values: 1 (reachable) and 0 (unreachable). |
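Metrics that are exposed as a _sum and _count pair, such as eventer_exporter_duration_milliseconds_sum and eventer_exporter_duration_milliseconds_count in the preceding table, can be combined to obtain an average over a time window. The following Python sketch illustrates the arithmetic by using two hypothetical scrapes. The function name and the sample values are assumptions, and counter resets are not handled.

```python
from typing import Optional


def average_duration_ms(sum_start: float, count_start: float,
                        sum_end: float, count_end: float) -> Optional[float]:
    """Return the average duration between two scrapes, or None if nothing was observed.

    The average is the increase of the _sum series divided by the increase
    of the _count series over the same window.
    """
    delta_count = count_end - count_start
    if delta_count <= 0:
        # No new observations in the window (or a counter reset, which this sketch ignores).
        return None
    return (sum_end - sum_start) / delta_count


# Hypothetical values of the _sum and _count series at two scrape times.
print(average_duration_ms(sum_start=1200.0, count_start=40, sum_end=1500.0, count_end=50))
```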
CoreDNS (job name: arms-ack-coredns)
Metric | Description |
aliyun_prometheus_agent_append_duration_seconds | The duration of the Prometheus agent append operations in seconds. |
aliyun_prometheus_agent_job_discovery_status | The discovery status of the Prometheus agent collection jobs. |
aliyun_prometheus_agent_scrape_custom_error | The number of custom collection errors of the Prometheus agent. |
aliyun_prometheus_agent_scrapes_by_target_total | The total number of scrapes by the Prometheus agent per target. |
aliyun_prometheus_agent_target_info | The target information of the Prometheus agent. |
coredns_autopath_success_count_total | The total number of successful automatic path resolutions in CoreDNS. |
coredns_autopath_success_total | The total number of successful automatic path resolutions in CoreDNS. |
coredns_build_info | The build information of CoreDNS. |
coredns_cache_drops_total | The total number of cache drops in CoreDNS. |
coredns_cache_entries | The number of cache entries in CoreDNS. |
coredns_cache_evictions_total | The total number of cache evictions in CoreDNS. |
coredns_cache_hits_total | The total number of cache hits in CoreDNS. |
coredns_cache_misses_total | The total number of cache misses in CoreDNS. |
coredns_cache_requests_total | The total number of cache requests in CoreDNS. |
coredns_cache_size | The size of the cache in CoreDNS. |
coredns_dns_do_requests_total | The total number of DNS requests that have the DNSSEC OK (DO) bit set in CoreDNS. |
coredns_dns_request_count_total | The total count of DNS requests in CoreDNS. |
coredns_dns_request_duration_seconds_bucket | The histogram buckets of DNS request durations in seconds in CoreDNS. |
coredns_dns_request_duration_seconds_count | The count of DNS request durations in seconds in CoreDNS. |
coredns_dns_request_duration_seconds_sum | The sum of DNS request durations in seconds in CoreDNS. |
coredns_dns_request_size_bytes_bucket | The histogram buckets of DNS request sizes in bytes in CoreDNS. |
coredns_dns_request_size_bytes_count | The count of DNS request sizes in bytes in CoreDNS. |
coredns_dns_request_size_bytes_sum | The sum of DNS request sizes in bytes in CoreDNS. |
coredns_dns_request_type_count_total | The total count of DNS request types in CoreDNS. |
coredns_dns_requests_total | The total number of DNS requests in CoreDNS. |
coredns_dns_response_rcode_count_total | The total count of DNS response codes in CoreDNS. |
coredns_dns_response_size_bytes_bucket | The histogram buckets of DNS response sizes in bytes in CoreDNS. |
coredns_dns_response_size_bytes_count | The count of DNS response sizes in bytes in CoreDNS. |
coredns_dns_response_size_bytes_sum | The sum of DNS response sizes in bytes in CoreDNS. |
coredns_dns_responses_total | The total number of DNS responses in CoreDNS. |
coredns_forward_conn_cache_hits_total | The total number of cache hits for forwarded connections in CoreDNS. |
coredns_forward_conn_cache_misses_total | The total number of cache misses for forwarded connections in CoreDNS. |
coredns_forward_healthcheck_broken_total | The total number of times that the health checks of all upstreams failed in CoreDNS. |
coredns_forward_healthcheck_failure_count_total | The total count of health check failures for forwarded connections in CoreDNS. |
coredns_forward_healthcheck_failures_total | The total number of health check failures for forwarded connections in CoreDNS. |
coredns_forward_max_concurrent_rejects_total | The total number of forwarded queries rejected because the maximum number of concurrent queries was reached in CoreDNS. |
coredns_forward_request_count_total | The total count of forwarded requests in CoreDNS. |
coredns_forward_request_duration_seconds_bucket | The histogram buckets of forwarded request durations in seconds in CoreDNS. |
coredns_forward_request_duration_seconds_count | The count of forwarded request durations in seconds in CoreDNS. |
coredns_forward_request_duration_seconds_sum | The sum of forwarded request durations in seconds in CoreDNS. |
coredns_forward_requests_total | The total number of forwarded requests in CoreDNS. |
coredns_forward_response_rcode_count_total | The total count of forwarded response codes in CoreDNS. |
coredns_forward_responses_total | The total number of forwarded responses in CoreDNS. |
coredns_forward_sockets_open | The number of open sockets for forwarded connections in CoreDNS. |
coredns_health_request_duration_seconds_bucket | The histogram buckets of health check request durations in seconds in CoreDNS. |
coredns_health_request_duration_seconds_count | The count of health check request durations in seconds in CoreDNS. |
coredns_health_request_duration_seconds_sum | The sum of health check request durations in seconds in CoreDNS. |
coredns_health_request_failures_total | The total number of health check request failures in CoreDNS. |
coredns_hosts_entries | The number of host entries in CoreDNS. |
coredns_hosts_reload_timestamp_seconds | The timestamp of the last host reload in CoreDNS in seconds. |
coredns_kubernetes_dns_programming_duration_seconds_bucket | The histogram buckets of Kubernetes DNS programming durations in seconds in CoreDNS. |
coredns_kubernetes_dns_programming_duration_seconds_count | The count of Kubernetes DNS programming durations in seconds in CoreDNS. |
coredns_kubernetes_dns_programming_duration_seconds_sum | The sum of Kubernetes DNS programming durations in seconds in CoreDNS. |
coredns_local_localhost_requests_total | The total number of localhost requests in CoreDNS. |
coredns_panic_count_total | The total number of panics in CoreDNS. |
coredns_panics_total | The total count of panics in CoreDNS. |
coredns_plugin_enabled | The enabling status of CoreDNS plugins. |
coredns_reload_failed_total | The total number of reload failures in CoreDNS. |
coredns_reload_version_info | The version information of CoreDNS reloads. |
coredns_template_matches_total | The total number of template matches in CoreDNS. |
go_gc_duration_seconds | The Go GC pause duration in seconds. |
go_gc_duration_seconds_count | The number of Go GC pauses observed. |
go_gc_duration_seconds_sum | The total Go GC pause duration in seconds. |
go_goroutines | The number of goroutines. |
go_info | The Go-specific information. |
go_memstats_alloc_bytes | The amount of memory allocated in bytes. |
go_memstats_alloc_bytes_total | The cumulative amount of memory allocated in bytes. |
go_memstats_buck_hash_sys_bytes | The amount of memory used by the profiling bucket hash table in bytes. |
go_memstats_frees_total | The total number of heap object frees. |
go_memstats_gc_cpu_fraction | The fraction of the available CPU time of the program that is used by GC since the program started. The value ranges from 0 to 1. |
go_memstats_gc_sys_bytes | The amount of memory used for GC system metadata in bytes. |
go_memstats_heap_alloc_bytes | The amount of heap memory allocated in bytes. |
go_memstats_heap_idle_bytes | The amount of idle heap memory in bytes. |
go_memstats_heap_inuse_bytes | The amount of heap memory in use in bytes. |
go_memstats_heap_objects | The number of objects allocated on the heap. |
go_memstats_heap_released_bytes | The amount of heap memory released in bytes. |
go_memstats_heap_sys_bytes | The amount of memory allocated to the heap by the operating system in bytes. |
go_memstats_last_gc_time_seconds | The time of the last GC, in seconds since the Unix epoch. |
go_memstats_lookups_total | The total number of lookups. |
go_memstats_mallocs_total | The total number of allocations. |
go_memstats_mcache_inuse_bytes | The amount of memory in use in mcache in bytes. |
go_memstats_mcache_sys_bytes | The amount of memory allocated to mcache by the operating system in bytes. |
go_memstats_mspan_inuse_bytes | The amount of memory in use in mspan in bytes. |
go_memstats_mspan_sys_bytes | The amount of memory allocated to mspan by the operating system in bytes. |
go_memstats_next_gc_bytes | The heap size, in bytes, at which the next GC is triggered. |
go_memstats_other_sys_bytes | The amount of memory allocated for other purposes by the operating system in bytes. |
go_memstats_stack_inuse_bytes | The amount of stack memory in use in bytes. |
go_memstats_stack_sys_bytes | The amount of memory allocated to the stack by the operating system in bytes. |
go_memstats_sys_bytes | The total memory allocated by the operating system in bytes. |
go_threads | The number of threads. |
process_cpu_seconds_total | The total user and system CPU time consumed by the process in seconds. |
process_max_fds | The maximum number of file descriptors for the process. |
process_open_fds | The number of file descriptors opened by the process. |
process_resident_memory_bytes | The resident memory size of the process in bytes. |
process_start_time_seconds | The start time of the process, in seconds since the Unix epoch. |
process_virtual_memory_bytes | The number of virtual memory bytes for the process. |
process_virtual_memory_max_bytes | The maximum number of virtual memory bytes for the process. |
scrape_duration_seconds | The scrape duration in seconds. |
scrape_samples_post_metric_relabeling | The number of scraped samples after metric relabeling. |
scrape_samples_scraped | The number of scraped samples. |
scrape_series_added | The number of new series added during the scrape. |
up | Indicates whether the scrape target is reachable. Valid values: 1 (reachable) and 0 (unreachable). |
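The _bucket series in the preceding table, such as coredns_dns_request_duration_seconds_bucket, follow the Prometheus histogram convention: each series holds the cumulative number of observations that are less than or equal to an upper bound. The following Python sketch illustrates how a quantile can be estimated from such buckets by linear interpolation, which is similar in spirit to the PromQL histogram_quantile function. It is a simplified illustration, not the Prometheus implementation, and the function name and the sample bucket values are hypothetical.

```python
import math


def estimate_quantile(q: float, buckets: list) -> float:
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    buckets must be sorted by upper bound, and the last bound must be +Inf.
    The lower bound of the first bucket is assumed to be 0.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need at least two buckets, ending with +Inf")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # No finite upper bound (or an empty bucket) to interpolate within.
                return prev_bound
            # Interpolate linearly inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical cumulative bucket counts: (upper bound in seconds, count).
sample = [(0.25, 20), (0.5, 60), (1.0, 100), (math.inf, 100)]
print(estimate_quantile(0.5, sample))
print(estimate_quantile(0.95, sample))
```

The estimate is only as precise as the bucket boundaries allow, because observations are assumed to be evenly distributed within each bucket.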
CSI clusters (job name: k8s-csi-cluster-pv)
Metric | Description |
alibaba_cloud_storage_operator_build_info | The build information about the Alibaba Cloud storage operator. |
aliyun_prometheus_agent_append_duration_seconds | The duration of the Prometheus agent append operations in seconds. |
aliyun_prometheus_agent_job_discovery_status | The discovery status of the Prometheus agent collection jobs. |
aliyun_prometheus_agent_scrape_custom_error | The number of custom collection errors of the Prometheus agent. |
aliyun_prometheus_agent_scrapes_by_target_total | The total number of scrapes by the Prometheus agent per target. |
aliyun_prometheus_agent_target_info | The target information of the Prometheus agent. |
cluster_pv_detail_num_total | The total number of detailed PV information in the cluster. |
cluster_pv_status_num_total | The total number of PV states in the cluster. |
cluster_pvc_detail_num_total | The total number of detailed PVC information in the cluster. |
cluster_pvc_status_num_total | The total number of PVC states in the cluster. |
cluster_scrape_collector_duration_seconds | The duration of the cluster scrape collector in seconds. |
cluster_scrape_collector_success | The number of successful scrapes by the cluster collector. |
scrape_duration_seconds | The scrape duration in seconds. |
scrape_samples_post_metric_relabeling | The number of scraped samples after metric relabeling. |
scrape_samples_scraped | The number of scraped samples. |
scrape_series_added | The number of new series added during the scrape. |
up | Indicates whether the scrape target is reachable. Valid values: 1 (reachable) and 0 (unreachable). |
CSI nodes (job name: k8s-csi-node-pv)
Metric | Description |
alibaba_cloud_csi_driver_build_info | The build information about the Container Storage Interface (CSI) driver. |
aliyun_prometheus_agent_append_duration_seconds | The duration of the Prometheus agent append operations in seconds. |
aliyun_prometheus_agent_job_discovery_status | The discovery status of the Prometheus agent collection jobs. |
aliyun_prometheus_agent_scrape_custom_error | The number of custom collection errors of the Prometheus agent. |
aliyun_prometheus_agent_scrapes_by_target_total | The total number of scrapes by the Prometheus agent per target. |
aliyun_prometheus_agent_target_info | The target information of the Prometheus agent. |
cluster_scrape_collector_duration_seconds | The duration of the cluster scrape collector in seconds. |
cluster_scrape_collector_success | The number of successful scrapes by the cluster collector. |
container_fs_available_bytes | The available bytes of the container file system. |
container_fs_inodes_free | The number of available inodes in the container file system. |
container_fs_inodes_total | The total number of inodes in the container file system. |
container_fs_inodes_used | The number of used inodes in the container file system. |
container_fs_limit_bytes | The limit of bytes in the container file system. |
container_fs_usage_bytes | The used bytes in the container file system. |
ephemeral_storage_pod_available_bytes | The available bytes of the ephemeral storage of the Pod. |
ephemeral_storage_pod_inodes_free | The number of available inodes in the ephemeral storage of the Pod. |
ephemeral_storage_pod_inodes_total | The total number of inodes in the ephemeral storage of the Pod. |
ephemeral_storage_pod_inodes_used | The number of used inodes in the ephemeral storage of the Pod. |
ephemeral_storage_pod_limit_bytes | The limit of the ephemeral storage of the Pod in bytes. |
ephemeral_storage_pod_usage_bytes | The used bytes of the ephemeral storage of the Pod. |
node_volume_backend_posix_access_total_counter | The total counter for Portable Operating System Interface (POSIX) access to the node volume backend. |
node_volume_backend_posix_getattr_total_counter | The total counter for POSIX getattr calls to the node volume backend. |
node_volume_backend_posix_getmode_total_counter | The total counter for POSIX getmode operations to the node volume backend. |
node_volume_backend_posix_link_total_counter | The total counter for POSIX link operations to the node volume backend. |
node_volume_backend_posix_lookup_total_counter | The total counter for POSIX lookup operations to the node volume backend. |
node_volume_backend_posix_mknod_total_counter | The total counter for POSIX mknod operations to the node volume backend. |
node_volume_backend_posix_readdir_total_counter | The total counter for POSIX readdir operations to the node volume backend. |
node_volume_backend_posix_readlink_total_counter | The total counter for POSIX readlink operations to the node volume backend. |
node_volume_backend_posix_remove_total_counter | The total counter for POSIX remove operations to the node volume backend. |
node_volume_backend_posix_rename_total_counter | The total counter for POSIX rename operations to the node volume backend. |
node_volume_backend_posix_setattr_total_counter | The total counter for POSIX setattr operations to the node volume backend. |
node_volume_backend_posix_statfs_total_counter | The total counter for POSIX statfs operations to the node volume backend. |
node_volume_backend_read_bytes_total_counter | The total counter for bytes read from the node volume backend. |
node_volume_backend_read_completed_total_counter | The total number of completed read requests to the node volume backend. |
node_volume_backend_read_time_milliseconds_total_counter | The total milliseconds spent on reads to the node volume backend. |
node_volume_backend_write_bytes_total_counter | The total number of bytes written to the node volume backend. |
node_volume_backend_write_completed_total_counter | The total number of completed write requests to the node volume backend. |
node_volume_backend_write_time_milliseconds_total_counter | The total milliseconds spent on writes to the node volume backend. |
node_volume_capacity_bytes_available | The available capacity of the node volume in bytes. |
node_volume_capacity_bytes_available_counter | The available capacity of the node volume in bytes (counter). |
node_volume_capacity_bytes_total | The total capacity of the node volume in bytes. |
node_volume_capacity_bytes_total_counter | The total capacity of the node volume in bytes (counter). |
node_volume_capacity_bytes_used | The used capacity of the node volume in bytes. |
node_volume_capacity_bytes_used_counter | The used capacity of the node volume in bytes (counter). |
node_volume_hot_spot_head_file_top | The top hot spot files in the node volume. |
node_volume_hot_spot_read_file_top | The top files read in the node volume hot spots. |
node_volume_hot_spot_write_file_top | The top files written in the node volume hot spots. |
node_volume_inode_bytes_available_counter | The counter for available inode bytes in the node volume. |
node_volume_inode_bytes_total_counter | The counter for total inode bytes in the node volume. |
node_volume_inode_bytes_used_counter | The counter for used inode bytes in the node volume. |
node_volume_inodes_available | The number of available inodes in the node volume. |
node_volume_inodes_total | The total number of inodes in the node volume. |
node_volume_inodes_used | The number of used inodes in the node volume. |
node_volume_io_now | The current I/O count in the node volume. |
node_volume_io_time_seconds_total | The total seconds spent on I/O in the node volume. |
node_volume_oss_delete_object_total_counter | The total counter for Object Storage Service (OSS) object deletions in the node volume. |
node_volume_oss_get_object_total_counter | The total counter for OSS object gets in the node volume. |
node_volume_oss_head_object_total_counter | The total counter for OSS object HEADs (metadata queries) in the node volume. |
node_volume_oss_post_object_total_counter | The total counter for OSS object POSTs in the node volume. |
node_volume_oss_put_object_total_counter | The total counter for OSS object PUTs in the node volume. |
node_volume_posix_access_total_counter | The total counter for POSIX accesses in the node volume. |
node_volume_posix_chmod_total_counter | The total counter for POSIX chmod operations in the node volume. |
node_volume_posix_chown_total_counter | The total counter for POSIX chown operations in the node volume. |
node_volume_posix_create_total_counter | The total counter for POSIX creations in the node volume. |
node_volume_posix_flush_total_counter | The total counter for POSIX flushes in the node volume. |
node_volume_posix_fsync_total_counter | The total counter for POSIX fsyncs in the node volume. |
node_volume_posix_mkdir_total_counter | The total counter for POSIX mkdir operations in the node volume. |
node_volume_posix_open_total_counter | The total counter for POSIX opens in the node volume. |
node_volume_posix_opendir_total_counter | The total counter for POSIX opendir operations in the node volume. |
node_volume_posix_read_total_counter | The total counter for POSIX reads in the node volume. |
node_volume_posix_readdir_total_counter | The total counter for POSIX readdir operations in the node volume. |
node_volume_posix_release_total_counter | The total counter for POSIX releases in the node volume. |
node_volume_posix_rename_total_counter | The total counter for POSIX renames in the node volume. |
node_volume_posix_rmdir_total_counter | The total counter for POSIX rmdir operations in the node volume. |
node_volume_posix_truncate_total_counter | The total counter for POSIX truncate operations in the node volume. |
node_volume_posix_write_total_counter | The total counter for POSIX writes in the node volume. |
node_volume_read_bytes_total | The total number of bytes read from the node volume. |
node_volume_read_bytes_total_counter | The total number of bytes read from the node volume (counter). |
node_volume_read_completed_total | The total number of completed read requests to the node volume. |
node_volume_read_completed_total_counter | The total number of completed read requests to the node volume (counter). |
node_volume_read_merged_total | The total number of merged read operations in the node volume. |
node_volume_read_queue_time_milliseconds_total | The total milliseconds that reads spent in the queue of the node volume. |
node_volume_read_rtt_time_milliseconds_total | The total milliseconds spent on read round-trip time in the node volume. |
node_volume_read_sent_bytes_total | The total number of bytes sent during reads in the node volume. |
node_volume_read_time_milliseconds_total | The total milliseconds spent on reads in the node volume. |
node_volume_read_time_milliseconds_total_counter | The total milliseconds spent on reads in the node volume (counter). |
node_volume_read_timeouts_total | The total number of read timeouts in the node volume. |
node_volume_read_transmissions_total | The total number of read transmissions in the node volume. |
node_volume_vg_free_bytes | The free bytes in the volume group (VG) of the node volume. |
node_volume_vg_size_bytes | The total bytes in the VG of the node volume. |
node_volume_write_bytes_total | The total number of bytes written to the node volume. |
node_volume_write_bytes_total_counter | The total number of bytes written to the node volume (counter). |
node_volume_write_completed_total | The total number of completed write requests to the node volume. |
node_volume_write_completed_total_counter | The total number of completed write requests to the node volume (counter). |
node_volume_write_merged_total | The total number of merged write operations in the node volume. |
node_volume_write_queue_time_milliseconds_total | The total milliseconds that writes spent in the queue of the node volume. |
node_volume_write_recv_bytes_total | The total number of bytes received during writes in the node volume. |
node_volume_write_rtt_time_milliseconds_total | The total milliseconds spent on write round-trip time in the node volume. |
node_volume_write_time_milliseconds_total | The total milliseconds spent on writes in the node volume. |
node_volume_write_time_milliseconds_total_counter | The total milliseconds spent on writes in the node volume (counter). |
node_volume_write_timeouts_total | The total number of write timeouts in the node volume. |
node_volume_write_transmissions_total | The total number of write transmissions in the node volume. |
scrape_duration_seconds | The scrape duration in seconds. |
scrape_samples_post_metric_relabeling | The number of scraped samples after metric relabeling. |
scrape_samples_scraped | The number of scraped samples. |
scrape_series_added | The number of new series added during the scrape. |
up | The connectivity of metric collection. |
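The node volume metrics above are usually combined rather than read individually. The following Python snippet is a minimal sketch of two such derivations: a usage ratio from node_volume_capacity_bytes_used and node_volume_capacity_bytes_total, and an average read latency from two samples of node_volume_read_time_milliseconds_total and node_volume_read_completed_total. It assumes that the `_total` metrics are cumulative counters that only increase between samples, which the table implies but does not state. The function names are hypothetical.

```python
from typing import Optional


def volume_usage_ratio(used_bytes: float, total_bytes: float) -> float:
    """Return the fraction of the volume capacity in use.

    Inputs correspond to node_volume_capacity_bytes_used and
    node_volume_capacity_bytes_total. Returns 0.0 if the total is not positive.
    """
    if total_bytes <= 0:
        return 0.0
    return used_bytes / total_bytes


def average_latency_ms(
    time_ms_start: float,
    time_ms_end: float,
    completed_start: float,
    completed_end: float,
) -> Optional[float]:
    """Return the average milliseconds per completed request between two samples.

    Assumption: both inputs are cumulative counters. Returns None if no
    request completed in the window or if a counter went backwards
    (for example, after a reset).
    """
    delta_ops = completed_end - completed_start
    delta_ms = time_ms_end - time_ms_start
    if delta_ops <= 0 or delta_ms < 0:
        return None
    return delta_ms / delta_ops


if __name__ == "__main__":
    # 25 GiB used out of 100 GiB.
    print(volume_usage_ratio(25 * 2**30, 100 * 2**30))
    # 600 ms spent on 300 reads between the two samples.
    print(average_latency_ms(1000, 1600, 200, 500))
```

The same delta-based pattern applies to the write metrics, such as node_volume_write_time_milliseconds_total and node_volume_write_completed_total.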
GPU-Exporter (job name: gpu-exporter)
Metric | Description |
DCGM_CUSTOM_ALLOCATE_MODE | The mode in which the node runs. A value of 0 indicates that no GPU Pods are running on the node. A value of 1 indicates that the GPU Pods on the current node run in an exclusive GPU mode. A value of 2 indicates that the GPU Pods on the current node run in a shared GPU mode. |
DCGM_CUSTOM_CONTAINER_CP_ALLOCATED | The ratio of the GPU computing power allocated to the container to the total computing power of the GPU. The value ranges from 0 to 1. In exclusive GPU mode or in shared GPU mode in which the container requests only GPU memory, the value of this metric is 0, which indicates that the allocation of GPU computing power is unlimited. For example, if a GPU provides a total of 100 compute units (CUs) of GPU computing power and allocates 30 CUs to a container, the ratio of the GPU computing power allocated to the container is calculated by using the following formula: 30/100 = 0.3. |
DCGM_CUSTOM_CONTAINER_MEM_ALLOCATED | The amount of GPU memory allocated to the container. |
DCGM_CUSTOM_DEV_FB_ALLOCATED | The ratio of the allocated GPU memory to the total memory of the GPU. The value ranges from 0 to 1. |
DCGM_CUSTOM_DEV_FB_TOTAL | The total memory of the GPU. |
DCGM_CUSTOM_ILLEGAL_PROCESS_DECODE_UTIL | The decoder utilization of illegal processes. |
DCGM_CUSTOM_ILLEGAL_PROCESS_ENCODE_UTIL | The encoder utilization of illegal processes. |
DCGM_CUSTOM_ILLEGAL_PROCESS_MEM_COPY_UTIL | The memory copy utilization of illegal processes. |
DCGM_CUSTOM_ILLEGAL_PROCESS_MEM_USED | The memory used by illegal processes. |
DCGM_CUSTOM_ILLEGAL_PROCESS_SM_UTIL | The SM utilization of illegal processes. |
DCGM_CUSTOM_PROCESS_DECODE_UTIL | The decoder utilization of GPU threads. |
DCGM_CUSTOM_PROCESS_ENCODE_UTIL | The encoder utilization of GPU threads. |
DCGM_CUSTOM_PROCESS_MEM_COPY_UTIL | The memory copy utilization of GPU threads. |
DCGM_CUSTOM_PROCESS_MEM_USED | The amount of GPU memory used by GPU threads. |
DCGM_CUSTOM_PROCESS_SM_UTIL | The SM utilization of GPU threads. |
DCGM_FI_DEV_APP_MEM_CLOCK | The memory application clock speed. |
DCGM_FI_DEV_APP_SM_CLOCK | The SM application clock speed. |
DCGM_FI_DEV_BAR1_FREE | The amount of free Base Address Register 1 (BAR1) memory. |
DCGM_FI_DEV_BAR1_TOTAL | The total size of device BAR1 memory. |
DCGM_FI_DEV_BAR1_USED | The amount of used BAR1 memory. |
DCGM_FI_DEV_BOARD_LIMIT_VIOLATION | The time of the violation due to board limitations. |
DCGM_FI_DEV_CLOCK_THROTTLE_REASONS | The reasons for clock throttling. |
DCGM_FI_DEV_COUNT | The number of devices. |
DCGM_FI_DEV_DEC_UTIL | The decoder utilization. |
DCGM_FI_DEV_ENC_UTIL | The encoder utilization. |
DCGM_FI_DEV_FB_FREE | The amount of free frame buffer memory. |
DCGM_FI_DEV_FB_USED | The amount of used frame buffer memory. The value of this metric is the same as the value of Memory-Usage returned by the nvidia-smi command. |
DCGM_FI_DEV_GPU_TEMP | The GPU temperature. |
DCGM_FI_DEV_GPU_UTIL | The GPU utilization within a cycle of 1 second or 1/6 second. The cycle varies based on the GPU model. A cycle is a period of time during which one or more kernel functions remain active. This metric only indicates that one or more kernel functions are occupying GPU resources. The metric does not display detailed GPU usage information. |
DCGM_FI_DEV_LOW_UTIL_VIOLATION | The time of the violation due to low utilization. |
DCGM_FI_DEV_MEM_CLOCK | The memory clock speed. |
DCGM_FI_DEV_MEM_COPY_UTIL | The memory bandwidth utilization. For example, the maximum memory bandwidth of NVIDIA V100 is 900 GB/s. If the memory bandwidth used is 450 GB/s, the memory bandwidth utilization is 50%. |
DCGM_FI_DEV_MEMORY_TEMP | The memory temperature. |
DCGM_FI_DEV_NVLINK_BANDWIDTH_TOTAL | The total NVLink bandwidth. |
DCGM_FI_DEV_PCIE_REPLAY_COUNTER | The PCIe replay counter. |
DCGM_FI_DEV_POWER_USAGE | The power usage. |
DCGM_FI_DEV_POWER_VIOLATION | The time of the violation due to power limitations. |
DCGM_FI_DEV_PSTATE | The power state (P-state) of the device. |
DCGM_FI_DEV_RELIABILITY_VIOLATION | The time of the violation due to board reliability. |
DCGM_FI_DEV_RETIRED_DBE | The number of pages retired due to double bit errors. |
DCGM_FI_DEV_RETIRED_PENDING | The number of pages to be retired. These pages are marked as unavailable due to errors in the GPU memory. |
DCGM_FI_DEV_RETIRED_SBE | The number of pages retired due to single bit errors. |
DCGM_FI_DEV_SM_CLOCK | The SM clock speed. |
DCGM_FI_DEV_SYNC_BOOST_VIOLATION | The time of the violation due to sync boost limitations. |
DCGM_FI_DEV_THERMAL_VIOLATION | The time of the violation due to thermal limitations. |
DCGM_FI_DEV_TOTAL_ENERGY_CONSUMPTION | The total energy consumed since the driver was last loaded. |
DCGM_FI_DEV_VIDEO_CLOCK | The video clock speed. |
DCGM_FI_DEV_XID_ERRORS | The last XID error that occurred within a period of time. |
DCGM_FI_PROF_DRAM_ACTIVE | The fraction of cycles during which the device memory interface is active sending or receiving data. The value is an average value within a time interval rather than an instantaneous value. A larger value of this metric indicates higher device memory utilization. If the value is 1 (100%), a DRAM command is executed every cycle within the entire interval. In practice, the peak value of the metric reaches about 0.8 (80%). If the value of this metric is 0.2 (20%), 20% of the cycles within the time interval are spent reading from or writing to device memory. |
DCGM_FI_PROF_GR_ENGINE_ACTIVE | The percentage of time that the Graphics or Compute engines were active within a time interval. The value indicates the average across all Graphics and Compute engines. A Graphics or Compute engine is considered active when a Graphics or Compute context is bound to a thread and the Graphics or Compute context is in a busy state. |
DCGM_FI_PROF_NVLINK_RX_BYTES | The TX rate of NVLink and the RX rate of NVLink. The bytes transmitted or received exclude the header. The value is an average value within a time interval rather than an instantaneous value. For example, if 1 GB of data is transmitted within 1 second, the TX rate is 1 GB/s regardless of whether the transmission occurs at a consistent rate or in bursts. Theoretically, the maximum NVLink Gen2 bandwidth is 25 GB/s per direction per link. |
DCGM_FI_PROF_NVLINK_TX_BYTES | The total number of bytes sent through NVLink. |
DCGM_FI_PROF_PCIE_RX_BYTES | The TX rate of PCIe and the RX rate of PCIe. The bytes transmitted or received include both the header and payload. The value is an average value within a time interval rather than an instantaneous value. For example, if 1 GB of data is transmitted within 1 second, the TX rate is 1 GB/s regardless of whether the transmission occurs at a consistent rate or in bursts. Theoretically, the maximum PCIe Gen3 bandwidth is 985 MB/s per lane. |
DCGM_FI_PROF_PCIE_TX_BYTES | The TX rate of PCIe and the RX rate of PCIe. The bytes transmitted or received include both the header and payload. The value is an average value within a time interval rather than an instantaneous value. For example, if 1 GB of data is transmitted within 1 second, the TX rate is 1 GB/s regardless of whether the transmission occurs at a consistent rate or in bursts. Theoretically, the maximum PCIe Gen3 bandwidth is 985 MB/s per lane. |
DCGM_FI_PROF_PIPE_FP16_ACTIVE | The fraction of cycles during which the FP16 (half-precision) pipeline was active. The value is an average value within a time interval rather than an instantaneous value. A higher value indicates higher utilization of the FP16 cores. A value of 1 (100%) means that an FP16 instruction was executed every two cycles throughout the entire time interval (for example, on Volta-type cards). If the value of this metric is 0.2 (20%), one of the following conditions may exist: The FP16 core utilization of 20% of the SMs within the time interval is 100%. The FP16 core utilization of all SMs within the time interval is 20%. The FP16 core utilization of all SMs within 20% of the time interval is 100%. Other conditions. |
DCGM_FI_PROF_PIPE_FP32_ACTIVE | The fraction of cycles during which the FMA (Fused Multiply-Add) pipeline was active. The FMA operations include both FP32 (single-precision) and integer operations. The value is an average value within a time interval rather than an instantaneous value. A higher value indicates higher utilization of the FP32 cores. A value of 1 (100%) means that an FP32 instruction was executed every two cycles throughout the entire time interval (for example, on Volta-type cards). If the value of this metric is 0.2 (20%), one of the following conditions may exist: The FP32 core utilization of 20% of the SMs within the time interval is 100%. The FP32 core utilization of all SMs within the time interval is 20%. The FP32 core utilization of all SMs within 20% of the time interval is 100%. Other conditions. |
DCGM_FI_PROF_PIPE_FP64_ACTIVE | The fraction of cycles during which the FP64 (double-precision) pipeline was active. The value is an average value within a time interval rather than an instantaneous value. A higher value indicates higher utilization of the FP64 cores. A value of 1 (100%) means that an FP64 instruction was executed every four cycles throughout the entire time interval (for example, on Volta-type cards). If the value of this metric is 0.2 (20%), one of the following conditions may exist: The FP64 core utilization of 20% of the SMs within the time interval is 100%. The FP64 core utilization of all SMs within the time interval is 20%. The FP64 core utilization of all SMs within 20% of the time interval is 100%. Other conditions. |
DCGM_FI_PROF_PIPE_TENSOR_ACTIVE | The cycle fraction for the Tensor (HMMA/IMMA) pipe being in the Active state. The value is an average value within a time interval rather than an instantaneous value. A larger value of this metric indicates higher tensor core utilization. If the value is 1 (100%), a Tensor instruction is issued every cycle within the entire interval. One instruction completes in two cycles. If the value of this metric is 0.2 (20%), one of the following conditions may exist: The tensor core utilization of 20% of the SMs within the time interval is 100%. The tensor core utilization of all SMs within the time interval is 20%. The tensor core utilization of all SMs within 20% of the time interval is 100%. Other conditions. |
DCGM_FI_PROF_SM_ACTIVE | The ratio of cycles during which at least one warp on an SM is active. The value is an average across all SMs and does not vary with the number of warps in each thread block. A warp is considered active after it is scheduled and resources are allocated to it, regardless of whether it is computing or in a non-computing state, such as waiting for memory requests. A value below 0.5 indicates low GPU utilization. To ensure high GPU utilization, keep the value above 0.8. For example, assume that a GPU has N SMs. If a kernel function uses N thread blocks that run on all SMs throughout a time interval, the value of this metric is 1 (100%). If a kernel function uses N/5 thread blocks throughout a time interval, the value of this metric is 0.2. If a kernel function uses N thread blocks during 20% of a time interval, the value of this metric is 0.2. |
DCGM_FI_PROF_SM_OCCUPANCY | The ratio of warps resident on an SM to the maximum number of warps that can reside on that SM, averaged over all SMs within a time interval. A higher occupancy does not necessarily indicate higher GPU utilization. A higher occupancy indicates more effective GPU utilization only in workloads where GPU memory bandwidth is the limiting factor (see DCGM_FI_PROF_DRAM_ACTIVE). |
nvidia_gpu_allocated_num_devices | The number of allocated GPU devices. Warning: This metric will be deprecated in the future. |
nvidia_gpu_memory_allocated_bytes | The full memory of GPU devices. Warning: This metric will be deprecated in the future and replaced by DCGM_CUSTOM_DEV_FB_ALLOCATED. |
nvidia_gpu_sharing_memory | The memory allocated for GPU sharing. Warning: This metric will be deprecated in the future and replaced by DCGM_CUSTOM_DEV_FB_ALLOCATED. |
up | The connectivity of metric collection. |
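Several GPU-Exporter descriptions above contain small calculations and value mappings. The following Python snippet is a minimal sketch that restates them: the DCGM_CUSTOM_ALLOCATE_MODE values, the 30/100 = 0.3 ratio from DCGM_CUSTOM_CONTAINER_CP_ALLOCATED, the 450/900 = 50% memory bandwidth example from DCGM_FI_DEV_MEM_COPY_UTIL, and the 0.5 and 0.8 thresholds from DCGM_FI_PROF_SM_ACTIVE. The function names and the "low", "moderate", and "high" labels are hypothetical and are not part of any exporter.

```python
# Value mapping taken from the DCGM_CUSTOM_ALLOCATE_MODE description.
ALLOCATE_MODES = {
    0: "no GPU Pods running",
    1: "exclusive GPU mode",
    2: "shared GPU mode",
}


def describe_allocate_mode(value: float) -> str:
    """Translate a DCGM_CUSTOM_ALLOCATE_MODE sample into text."""
    return ALLOCATE_MODES.get(int(value), "unknown")


def allocated_compute_ratio(allocated_cus: float, total_cus: float) -> float:
    """Ratio of compute units allocated to a container to the GPU total."""
    return allocated_cus / total_cus


def memory_bandwidth_utilization(used_gb_s: float, max_gb_s: float) -> float:
    """Fraction of the maximum memory bandwidth that is in use."""
    return used_gb_s / max_gb_s


def classify_sm_active(value: float) -> str:
    """Classify a DCGM_FI_PROF_SM_ACTIVE sample by the thresholds in the table.

    The labels are illustrative: below 0.5 is described as low utilization,
    and above 0.8 is the recommended target.
    """
    if value < 0.5:
        return "low"
    if value > 0.8:
        return "high"
    return "moderate"


if __name__ == "__main__":
    print(describe_allocate_mode(2))
    print(allocated_compute_ratio(30, 100))
    print(memory_bandwidth_utilization(450, 900))
    print(classify_sm_active(0.65))
```

Note that a DCGM_CUSTOM_CONTAINER_CP_ALLOCATED value of 0 means that computing power is not limited, so it should not be read as a ratio of zero.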
Cost-Exporter (job name: alibaba-cloud-cost-exporter)
Metric | Description |
deducted_by_cash_coupons | The amount deducted by cash coupons for the current instance. |
deducted_by_prepaid_card | The prepaid card discount amount for the current instance. |
invoice_discount | The discount amount for the current instance. |
list_price | The unit price for the current instance. |
node_current_price | The actual price of the current node. |
node_payAsYouGo_price | The pay-as-you-go price of the current node. |
node_payByPeriod_price | The subscription price of the current node. |
node_spot_price | The spot price of the current node. |
outstanding_amount | The outstanding amount for the current instance. |
payent_amount | The cash payment amount for the current instance. |
pretax_amount | The payable amount for the current instance. |
pretax_gross_amount | The original amount for the current instance. |
usage | The resource usage for the current instance. |
up | The connectivity of metric collection. |
Ingress (job name: arms-ack-ingress)
Metric | Description |
aliyun_prometheus_agent_append_duration_seconds | The duration of the Prometheus agent append operations in seconds. |
aliyun_prometheus_agent_job_discovery_status | The discovery status of the Prometheus agent collection jobs. |
aliyun_prometheus_agent_scrape_custom_error | The number of custom collection errors of the Prometheus agent. |
aliyun_prometheus_agent_scrapes_by_target_total | The total number of scrapes by the Prometheus agent per target. |
aliyun_prometheus_agent_target_info | The target information of the Prometheus agent. |
go_cgo_go_to_c_calls_calls_total | The total number of C function calls made by cgo. |
go_gc_cycles_automatic_gc_cycles_total | The total number of automatic GC cycles. |
go_gc_cycles_forced_gc_cycles_total | The total number of forced GC cycles. |
go_gc_cycles_total_gc_cycles_total | The total number of GC cycles. |
go_gc_duration_seconds | The Go GC pause duration in seconds. |
go_gc_duration_seconds_count | The number of observed Go GC pauses. |
go_gc_duration_seconds_sum | The total Go GC pause duration in seconds. |
go_gc_heap_allocs_by_size_bytes_total_bucket | The distribution of Go GC heap allocations classified by size in bytes. |
go_gc_heap_allocs_by_size_bytes_total_count | The count of Go GC heap allocations classified by size in bytes. |
go_gc_heap_allocs_by_size_bytes_total_sum | The sum of Go GC heap allocations classified by size in bytes. |
go_gc_heap_allocs_bytes_total | The total bytes allocated in the Go GC heap. |
go_gc_heap_allocs_objects_total | The total objects allocated in the Go GC heap. |
go_gc_heap_frees_by_size_bytes_total_bucket | The distribution of Go GC heap releases classified by size in bytes. |
go_gc_heap_frees_by_size_bytes_total_count | The count of Go GC heap releases classified by size in bytes. |
go_gc_heap_frees_by_size_bytes_total_sum | The sum of Go GC heap releases classified by size in bytes. |
go_gc_heap_frees_bytes_total | The total bytes released in the Go GC heap. |
go_gc_heap_frees_objects_total | The total objects released in the Go GC heap. |
go_gc_heap_goal_bytes | The target size of the Go GC heap in bytes. |
go_gc_heap_objects_objects | The number of objects in the Go GC heap. |
go_gc_heap_tiny_allocs_objects_total | The total number of small object allocations in the Go GC. |
go_gc_limiter_last_enabled_gc_cycle | The last enabled GC cycle. |
go_gc_pauses_seconds_total_bucket | The distribution of Go GC pause time in seconds. |
go_gc_pauses_seconds_total_count | The count of Go GC pause time in seconds. |
go_gc_pauses_seconds_total_sum | The sum of Go GC pause time in seconds. |
go_gc_stack_starting_size_bytes | The starting size of the Go GC stack in bytes. |
go_goroutines | The number of goroutines. |
go_info | The Go-specific information. |
go_memory_classes_heap_free_bytes | The amount of idle heap memory in bytes. |
go_memory_classes_heap_objects_bytes | The amount of heap memory occupied by objects in bytes. |
go_memory_classes_heap_released_bytes | The amount of heap memory released in bytes. |
go_memory_classes_heap_stacks_bytes | The amount of memory reserved for the stack in bytes. |
go_memory_classes_heap_unused_bytes | The amount of heap memory not used in bytes. |
go_memory_classes_metadata_mcache_free_bytes | The amount of idle memory in mcache in bytes. |
go_memory_classes_metadata_mcache_inuse_bytes | The amount of memory in use in mcache in bytes. |
go_memory_classes_metadata_mspan_free_bytes | The amount of idle memory in mspan in bytes. |
go_memory_classes_metadata_mspan_inuse_bytes | The amount of memory in use in mspan in bytes. |
go_memory_classes_metadata_other_bytes | The amount of memory occupied by other metadata in bytes. |
go_memory_classes_os_stacks_bytes | The amount of memory reserved for the operating system stack in bytes. |
go_memory_classes_other_bytes | The amount of memory used for other purposes in bytes. |
go_memory_classes_profiling_buckets_bytes | The bytes used by profiling buckets. |
go_memory_classes_total_bytes | The total memory in bytes. |
go_memstats_alloc_bytes | The amount of memory allocated in bytes. |
go_memstats_alloc_bytes_total | The cumulative amount of memory allocated in bytes. |
go_memstats_buck_hash_sys_bytes | The amount of memory used by the profiling bucket hash table in bytes. |
go_memstats_frees_total | The total number of releases. |
go_memstats_gc_cpu_fraction | The fraction of the available CPU time used by GC since the program started. The value ranges from 0 to 1. |
go_memstats_gc_sys_bytes | The amount of memory used by GC in the operating system in bytes. |
go_memstats_heap_alloc_bytes | The amount of heap memory allocated in bytes. |
go_memstats_heap_idle_bytes | The amount of idle heap memory in bytes. |
go_memstats_heap_inuse_bytes | The amount of heap memory in use in bytes. |
go_memstats_heap_objects | The number of objects allocated on the heap. |
go_memstats_heap_released_bytes | The amount of heap memory released in bytes. |
go_memstats_heap_sys_bytes | The amount of memory allocated to the heap by the operating system in bytes. |
go_memstats_last_gc_time_seconds | The time of the last GC, in seconds since the Unix epoch. |
go_memstats_lookups_total | The total number of lookups. |
go_memstats_mallocs_total | The total number of allocations. |
go_memstats_mcache_inuse_bytes | The amount of memory in use in mcache in bytes. |
go_memstats_mcache_sys_bytes | The amount of memory allocated to mcache by the operating system in bytes. |
go_memstats_mspan_inuse_bytes | The amount of memory in use in mspan in bytes. |
go_memstats_mspan_sys_bytes | The amount of memory allocated to mspan by the operating system in bytes. |
go_memstats_next_gc_bytes | The heap size in bytes at which the next GC is triggered. |
go_memstats_other_sys_bytes | The amount of memory allocated for other purposes by the operating system in bytes. |
go_memstats_stack_inuse_bytes | The amount of stack memory in use in bytes. |
go_memstats_stack_sys_bytes | The amount of memory allocated to the stack by the operating system in bytes. |
go_memstats_sys_bytes | The total memory allocated by the operating system in bytes. |
go_sched_gomaxprocs_threads | The maximum parallelism of the Go scheduler in threads. |
go_sched_goroutines_goroutines | The current number of goroutines in the Go scheduler. |
go_sched_latencies_seconds_bucket | The distribution of Go scheduling latencies in seconds. |
go_sched_latencies_seconds_count | The count of Go scheduling latencies in seconds. |
go_sched_latencies_seconds_sum | The sum of Go scheduling latencies in seconds. |
go_threads | The number of Go threads. |
nginx_ingress_controller_admission_config_size | The size of the NGINX Ingress controller Admission Config. |
nginx_ingress_controller_admission_render_duration | The rendering duration of the NGINX Ingress controller Admission Config. |
nginx_ingress_controller_admission_render_ingresses | The number of Ingresses rendered by the NGINX Ingress controller. |
nginx_ingress_controller_admission_roundtrip_duration | The round-trip processing duration of the NGINX Ingress controller. |
nginx_ingress_controller_admission_tested_duration | The testing duration of the NGINX Ingress controller. |
nginx_ingress_controller_admission_tested_ingresses | The number of Ingresses tested by the NGINX Ingress controller. |
nginx_ingress_controller_build_info | The build information of the NGINX Ingress controller. |
nginx_ingress_controller_bytes_sent_bucket | The distribution of total bytes sent by the NGINX Ingress controller. |
nginx_ingress_controller_bytes_sent_count | The count of total bytes sent by the NGINX Ingress controller. |
nginx_ingress_controller_bytes_sent_sum | The sum of total bytes sent by the NGINX Ingress controller. |
nginx_ingress_controller_check_errors | The number of check errors in the NGINX Ingress controller. |
nginx_ingress_controller_check_success | The number of successful checks in the NGINX Ingress controller. |
nginx_ingress_controller_config_hash | The configuration hash of the NGINX Ingress controller. |
nginx_ingress_controller_config_last_reload_successful | The success status of the last configuration reload in the NGINX Ingress controller. |
nginx_ingress_controller_config_last_reload_successful_timestamp_seconds | The timestamp of the last successful configuration reload in the NGINX Ingress controller in seconds. |
nginx_ingress_controller_connect_duration_seconds_bucket | The distribution of connection durations in the NGINX Ingress controller in seconds. |
nginx_ingress_controller_connect_duration_seconds_count | The count of connection durations in the NGINX Ingress controller in seconds. |
nginx_ingress_controller_connect_duration_seconds_sum | The sum of connection durations in the NGINX Ingress controller in seconds. |
nginx_ingress_controller_errors | The number of errors in the NGINX Ingress controller. |
nginx_ingress_controller_header_duration_seconds_bucket | The distribution of header processing durations in the NGINX Ingress controller in seconds. |
nginx_ingress_controller_header_duration_seconds_count | The count of header processing durations in the NGINX Ingress controller in seconds. |
nginx_ingress_controller_header_duration_seconds_sum | The sum of header processing durations in the NGINX Ingress controller in seconds. |
nginx_ingress_controller_ingress_upstream_latency_seconds | The upstream latency in the NGINX Ingress controller in seconds. |
nginx_ingress_controller_ingress_upstream_latency_seconds_count | The count of upstream latencies in the NGINX Ingress controller. |
nginx_ingress_controller_ingress_upstream_latency_seconds_sum | The sum of upstream latencies in the NGINX Ingress controller. |
nginx_ingress_controller_leader_election_status | The leader election status of the NGINX Ingress controller. |
nginx_ingress_controller_nginx_process_connections | The number of connections in the nginx process of the NGINX Ingress controller. |
nginx_ingress_controller_nginx_process_connections_total | The total number of connections in the nginx process of the NGINX Ingress controller. |
nginx_ingress_controller_nginx_process_cpu_seconds_total | The total CPU time in seconds consumed by the nginx process in the NGINX Ingress controller. |
nginx_ingress_controller_nginx_process_num_procs | The number of nginx processes in the NGINX Ingress controller. |
nginx_ingress_controller_nginx_process_oldest_start_time_seconds | The start time of the oldest nginx process in the NGINX Ingress controller, in seconds since the Unix epoch. |
nginx_ingress_controller_nginx_process_read_bytes_total | The total number of bytes read by the nginx process in the NGINX Ingress controller. |
nginx_ingress_controller_nginx_process_requests_total | The total number of requests processed by the nginx process in the NGINX Ingress controller. |
nginx_ingress_controller_nginx_process_resident_memory_bytes | The resident memory size in bytes of the nginx process in the NGINX Ingress controller. |
nginx_ingress_controller_nginx_process_virtual_memory_bytes | The virtual memory size in bytes of the nginx process in the NGINX Ingress controller. |
nginx_ingress_controller_nginx_process_write_bytes_total | The total number of bytes written by the nginx process in the NGINX Ingress controller. |
nginx_ingress_controller_orphan_ingress | The number of orphaned Ingresses in the NGINX Ingress controller. |
nginx_ingress_controller_request_duration_seconds_bucket | The distribution of request durations in the NGINX Ingress controller in seconds. |
nginx_ingress_controller_request_duration_seconds_count | The count of request durations in the NGINX Ingress controller in seconds. |
nginx_ingress_controller_request_duration_seconds_sum | The sum of request durations in the NGINX Ingress controller in seconds. |
nginx_ingress_controller_request_size_bucket | The distribution of request sizes in the NGINX Ingress controller. |
nginx_ingress_controller_request_size_count | The count of request sizes in the NGINX Ingress controller. |
nginx_ingress_controller_request_size_sum | The sum of request sizes in the NGINX Ingress controller. |
nginx_ingress_controller_requests | The total number of requests in the NGINX Ingress controller. |
nginx_ingress_controller_response_duration_seconds_bucket | The distribution of response durations in the NGINX Ingress controller in seconds. |
nginx_ingress_controller_response_duration_seconds_count | The count of response durations in the NGINX Ingress controller in seconds. |
nginx_ingress_controller_response_duration_seconds_sum | The sum of response durations in the NGINX Ingress controller in seconds. |
nginx_ingress_controller_response_size_bucket | The distribution of response sizes in the NGINX Ingress controller. |
nginx_ingress_controller_response_size_count | The count of response sizes in the NGINX Ingress controller. |
nginx_ingress_controller_response_size_sum | The sum of response sizes in the NGINX Ingress controller. |
nginx_ingress_controller_ssl_certificate_info | The SSL certificate information in the NGINX Ingress controller. |
nginx_ingress_controller_ssl_expire_time_seconds | The expiration time of the SSL certificate in the NGINX Ingress controller, expressed as a UNIX timestamp in seconds. |
nginx_ingress_controller_success | The cumulative number of successful reload operations of the NGINX Ingress controller. |
scrape_duration_seconds | The scrape duration in seconds. |
scrape_samples_post_metric_relabeling | The number of scraped samples after metric relabeling. |
scrape_samples_scraped | The number of scraped samples. |
scrape_series_added | The number of new series added during the scrape. |
up | The connectivity of metric collection. |
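Many metrics in these tables come in `_bucket`, `_count`, and `_sum` triples, which together describe one histogram. The following Python sketch shows one way such a triple could be interpreted: the mean is the sum divided by the count, and a quantile can be estimated by linear interpolation over the cumulative buckets, which is the same idea as the PromQL `histogram_quantile()` function. The function names and the sample values are hypothetical and are not taken from a real cluster.

```python
from math import inf


def histogram_average(sum_value, count_value):
    """Return the mean observation (_sum / _count), or None if nothing was observed."""
    if count_value == 0:
        return None
    return sum_value / count_value


def histogram_quantile(q, buckets):
    """Estimate quantile q (0..1) from cumulative (upper_bound, count) pairs.

    Each pair mirrors one _bucket series: the count of observations that are
    less than or equal to the upper bound (the "le" label).
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return None
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if bound == inf:
                # The quantile lies above the highest finite bound.
                return prev_bound
            if count == prev_count:
                return bound
            # Interpolate linearly inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical sample of nginx_ingress_controller_request_duration_seconds.
sample_buckets = [(0.1, 50), (0.5, 90), (1.0, 98), (inf, 100)]
sample_sum = 23.0
sample_count = 100

average_seconds = histogram_average(sample_sum, sample_count)
p95_seconds = histogram_quantile(0.95, sample_buckets)
print(average_seconds, p95_seconds)
```

In practice, the bucket counters only grow over time, so the inputs would typically be the increase of each series over a time window rather than the raw values.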
Koordinator (job name: kube-system, koordlet-metrics-podmonitor, or koord-manager-metrics-service)
Metric | Description |
aliyun_prometheus_agent_append_duration_seconds | The duration of the Prometheus agent append operations in seconds. |
aliyun_prometheus_agent_scrapes_by_target_total | The total number of scrapes by the Prometheus agent per target. |
aliyun_prometheus_agent_target_info | The target information of the Prometheus agent. |
koord_manager_recommender_recommendation_workload_target | The resource specifications that the resource profiling feature recommends for a workload. |
koordlet_container_resource_limits | The resource limits of the container. |
koordlet_container_resource_requests | The resource requests of the container. |
koordlet_node_priority_resource_reclaimable | The reclaimable resources of the node by priority. |
koordlet_node_resource_allocatable | The allocatable resources of the node. |
scrape_duration_seconds | The scrape duration in seconds. |
scrape_samples_post_metric_relabeling | The number of scraped samples after metric relabeling. |
scrape_samples_scraped | The number of scraped samples. |
scrape_series_added | The number of new series added during the scrape. |
slo_manager_recommender_recommendation_workload_target | The resource specifications that are recommended based on the workload by the resource profiling feature. This metric is discontinued. |
up | The connectivity of metric collection. |
ETCD (job name: etcd)
Metric | Description |
aliyun_prometheus_agent_append_duration_seconds | The duration of the Prometheus agent append operations in seconds. |
aliyun_prometheus_agent_job_discovery_status | The discovery status of the Prometheus agent collection jobs. |
aliyun_prometheus_agent_scrape_custom_error | The number of custom collection errors of the Prometheus agent. |
aliyun_prometheus_agent_scrapes_by_target_total | The total number of scrapes by the Prometheus agent per target. |
aliyun_prometheus_agent_target_info | The target information of the Prometheus agent. |
cpu_utilization_core | The CPU core utilization. |
etcd_cluster_version | The version of the cluster. |
etcd_debugging_auth_revision | The authentication revision number for ETCD debugging. |
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_bucket | The distribution of ETCD debugging disk backend commit rebalance duration in seconds. |
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_count | The count of ETCD debugging disk backend commit rebalance duration in seconds. |
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_sum | The sum of ETCD debugging disk backend commit rebalance duration in seconds. |
etcd_debugging_disk_backend_commit_spill_duration_seconds_bucket | The distribution of ETCD debugging disk backend commit spill duration in seconds. |
etcd_debugging_disk_backend_commit_spill_duration_seconds_count | The count of ETCD debugging disk backend commit spill duration in seconds. |
etcd_debugging_disk_backend_commit_spill_duration_seconds_sum | The sum of ETCD debugging disk backend commit spill duration in seconds. |
etcd_debugging_disk_backend_commit_write_duration_seconds_bucket | The distribution of ETCD debugging disk backend commit write duration in seconds. |
etcd_debugging_disk_backend_commit_write_duration_seconds_count | The count of ETCD debugging disk backend commit write duration in seconds. |
etcd_debugging_disk_backend_commit_write_duration_seconds_sum | The sum of ETCD debugging disk backend commit write duration in seconds. |
etcd_debugging_lease_granted_total | The total number of lease grants in ETCD debugging. |
etcd_debugging_lease_renewed_total | The total number of lease renewals in ETCD debugging. |
etcd_debugging_lease_revoked_total | The total number of lease revocations in ETCD debugging. |
etcd_debugging_lease_ttl_total_bucket | The distribution of lease TTLs in ETCD debugging. |
etcd_debugging_lease_ttl_total_count | The count of lease TTLs in ETCD debugging. |
etcd_debugging_lease_ttl_total_sum | The sum of lease TTLs in ETCD debugging. |
etcd_debugging_mvcc_compact_revision | The compaction revision number for ETCD debugging MVCC. |
etcd_debugging_mvcc_current_revision | The current revision number for ETCD debugging MVCC. |
etcd_debugging_mvcc_db_compaction_keys_total | The total number of keys compacted in the ETCD debugging MVCC database. |
etcd_debugging_mvcc_db_compaction_last | The last compaction time for the ETCD debugging MVCC database. |
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket | The distribution of MVCC database compaction pause durations in milliseconds for ETCD debugging. |
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_count | The count of MVCC database compaction pause durations in milliseconds for ETCD debugging. |
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_sum | The sum of MVCC database compaction pause durations in milliseconds for ETCD debugging. |
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket | The distribution of MVCC database compaction total durations in milliseconds for ETCD debugging. |
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_count | The count of MVCC database compaction total durations in milliseconds for ETCD debugging. |
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_sum | The sum of MVCC database compaction total durations in milliseconds for ETCD debugging. |
etcd_debugging_mvcc_db_total_size_in_bytes | The total size of the MVCC database in bytes for ETCD debugging. |
etcd_debugging_mvcc_delete_total | The total number of delete operations in ETCD debugging MVCC. |
etcd_debugging_mvcc_events_total | The total number of events in ETCD debugging. |
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket | The distribution of MVCC index compaction pause durations in milliseconds for ETCD debugging. |
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_count | The count of MVCC index compaction pause durations in milliseconds for ETCD debugging. |
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_sum | The sum of MVCC index compaction pause durations in milliseconds for ETCD debugging. |
etcd_debugging_mvcc_keys_total | The total number of keys in ETCD debugging MVCC. |
etcd_debugging_mvcc_pending_events_total | The total number of pending events in ETCD debugging MVCC. |
etcd_debugging_mvcc_put_total | The total number of put operations in ETCD debugging MVCC. |
etcd_debugging_mvcc_range_total | The total number of range queries in ETCD debugging MVCC. |
etcd_debugging_mvcc_slow_watcher_total | The total number of slow watchers in ETCD debugging. |
etcd_debugging_mvcc_total_put_size_in_bytes | The total size of MVCC puts in bytes for ETCD debugging. |
etcd_debugging_mvcc_txn_total | The total number of MVCC transactions in ETCD debugging. |
etcd_debugging_mvcc_watch_stream_total | The total number of watch streams in ETCD debugging. |
etcd_debugging_mvcc_watcher_total | The total number of watchers in ETCD debugging. |
etcd_debugging_server_lease_expired_total | The total number of expired leases in ETCD debugging. |
etcd_debugging_snap_save_marshalling_duration_seconds_bucket | The distribution of snapshot save marshalling durations in seconds for ETCD debugging. |
etcd_debugging_snap_save_marshalling_duration_seconds_count | The count of snapshot save marshalling durations in seconds for ETCD debugging. |
etcd_debugging_snap_save_marshalling_duration_seconds_sum | The sum of snapshot save marshalling durations in seconds for ETCD debugging. |
etcd_debugging_snap_save_total_duration_seconds_bucket | The distribution of snapshot save durations in seconds for ETCD debugging. |
etcd_debugging_snap_save_total_duration_seconds_count | The count of snapshot save durations in seconds for ETCD debugging. |
etcd_debugging_snap_save_total_duration_seconds_sum | The sum of snapshot save durations in seconds for ETCD debugging. |
etcd_debugging_store_expires_total | The total number of expired items in ETCD debugging storage. |
etcd_debugging_store_reads_total | The total number of reads in ETCD debugging storage. |
etcd_debugging_store_watch_requests_total | The total number of watch requests in ETCD debugging storage. |
etcd_debugging_store_watchers | The total number of watchers in ETCD debugging storage. |
etcd_debugging_store_writes_total | The total number of writes in ETCD debugging storage. |
etcd_disk_backend_commit_duration_seconds_bucket | The distribution of disk backend commit durations in seconds for ETCD. |
etcd_disk_backend_commit_duration_seconds_count | The count of disk backend commit durations in seconds for ETCD. |
etcd_disk_backend_commit_duration_seconds_sum | The sum of disk backend commit durations in seconds for ETCD. |
etcd_disk_backend_defrag_duration_seconds_bucket | The distribution of disk backend defragmentation durations in seconds for ETCD. |
etcd_disk_backend_defrag_duration_seconds_count | The count of disk backend defragmentation durations in seconds for ETCD. |
etcd_disk_backend_defrag_duration_seconds_sum | The sum of disk backend defragmentation durations in seconds for ETCD. |
etcd_disk_backend_snapshot_duration_seconds_bucket | The distribution of disk backend snapshot durations in seconds for ETCD. |
etcd_disk_backend_snapshot_duration_seconds_count | The count of disk backend snapshot durations in seconds for ETCD. |
etcd_disk_backend_snapshot_duration_seconds_sum | The sum of disk backend snapshot durations in seconds for ETCD. |
etcd_disk_defrag_inflight | The number of ongoing disk defragmentations in ETCD. |
etcd_disk_wal_fsync_duration_seconds_bucket | The distribution of WAL fsync durations in seconds for ETCD disk. |
etcd_disk_wal_fsync_duration_seconds_count | The count of WAL fsync durations in seconds for ETCD disk. |
etcd_disk_wal_fsync_duration_seconds_sum | The sum of WAL fsync durations in seconds for ETCD disk. |
etcd_disk_wal_write_bytes_total | The total number of bytes written to the WAL in ETCD disk. |
etcd_grpc_proxy_cache_hits_total | The total number of cache hits in the ETCD gRPC proxy. |
etcd_grpc_proxy_cache_keys_total | The total number of cache keys in the ETCD gRPC proxy. |
etcd_grpc_proxy_cache_misses_total | The total number of cache misses in the ETCD gRPC proxy. |
etcd_grpc_proxy_events_coalescing_total | The total number of event coalescings in the ETCD gRPC proxy. |
etcd_grpc_proxy_watchers_coalescing_total | The total number of watcher coalescings in the ETCD gRPC proxy. |
etcd_mvcc_db_open_read_transactions | The number of open read transactions in the ETCD MVCC database. |
etcd_mvcc_db_total_size_in_bytes | The total size of the MVCC database in bytes for ETCD. |
etcd_mvcc_db_total_size_in_use_in_bytes | The total size in use of the MVCC database in bytes for ETCD. |
etcd_mvcc_delete_total | The total number of deletes in ETCD MVCC. |
etcd_mvcc_hash_duration_seconds_bucket | The distribution of MVCC hash durations in seconds for ETCD. |
etcd_mvcc_hash_duration_seconds_count | The count of MVCC hash durations in seconds for ETCD. |
etcd_mvcc_hash_duration_seconds_sum | The sum of MVCC hash durations in seconds for ETCD. |
etcd_mvcc_hash_rev_duration_seconds_bucket | The distribution of MVCC hash revision durations in seconds for ETCD. |
etcd_mvcc_hash_rev_duration_seconds_count | The count of MVCC hash revision durations in seconds for ETCD. |
etcd_mvcc_hash_rev_duration_seconds_sum | The sum of MVCC hash revision durations in seconds for ETCD. |
etcd_mvcc_put_total | The total number of put operations in ETCD MVCC. |
etcd_mvcc_range_total | The total number of range queries in ETCD MVCC. |
etcd_mvcc_txn_total | The total number of MVCC transactions in ETCD. |
etcd_network_active_peers | The number of active peers in the ETCD network. |
etcd_network_client_grpc_received_bytes_total | The total number of bytes received by the ETCD network client via gRPC. |
etcd_network_client_grpc_sent_bytes_total | The total number of bytes sent by the ETCD network client via gRPC. |
etcd_network_disconnected_peers_total | The total number of disconnected peers in the ETCD network. |
etcd_network_peer_received_bytes_total | The total number of bytes received by the ETCD network peer. |
etcd_network_peer_received_failures_total | The total number of receive failures in the ETCD network peer. |
etcd_network_peer_round_trip_time_seconds_bucket | The distribution of round trip times for the ETCD network peer in seconds. |
etcd_network_peer_round_trip_time_seconds_count | The count of round trip times for the ETCD network peer in seconds. |
etcd_network_peer_round_trip_time_seconds_sum | The sum of round trip times for the ETCD network peer in seconds. |
etcd_network_peer_sent_bytes_total | The total number of bytes sent by the ETCD network peer. |
etcd_network_peer_sent_failures_total | The total number of send failures by the ETCD network peer. |
etcd_network_server_stream_failures_total | The total number of stream failures in the ETCD network server. |
etcd_network_snapshot_receive_inflights_total | The number of concurrent snapshot receive requests in the ETCD network. |
etcd_network_snapshot_receive_success | The number of successful snapshot receives in the ETCD network. |
etcd_network_snapshot_receive_total_duration_seconds_bucket | The distribution of snapshot receive durations in seconds for the ETCD network. |
etcd_network_snapshot_receive_total_duration_seconds_count | The count of snapshot receive durations in seconds for the ETCD network. |
etcd_network_snapshot_receive_total_duration_seconds_sum | The sum of snapshot receive durations in seconds for the ETCD network. |
etcd_network_snapshot_send_inflights_total | The number of concurrent snapshot send requests in the ETCD network. |
etcd_network_snapshot_send_success | The number of successful snapshot sends in the ETCD network. |
etcd_network_snapshot_send_total_duration_seconds_bucket | The distribution of snapshot send durations in seconds for the ETCD network. |
etcd_network_snapshot_send_total_duration_seconds_count | The count of snapshot send durations in seconds for the ETCD network. |
etcd_network_snapshot_send_total_duration_seconds_sum | The sum of snapshot send durations in seconds for the ETCD network. |
etcd_server_apply_duration_seconds_bucket | The distribution of apply durations in seconds for the ETCD server. |
etcd_server_apply_duration_seconds_count | The count of apply durations in seconds for the ETCD server. |
etcd_server_apply_duration_seconds_sum | The sum of apply durations in seconds for the ETCD server. |
etcd_server_client_requests_total | The total number of client requests to the ETCD server. |
etcd_server_go_version | The Go version of the ETCD server. |
etcd_server_has_leader | Indicates whether a leader exists in the ETCD server. |
etcd_server_health_failures | The number of health check failures in the ETCD server. |
etcd_server_health_success | The number of successful health checks in the ETCD server. |
etcd_server_heartbeat_send_failures_total | The total number of heartbeat send failures in the ETCD server. |
etcd_server_id | The ID of the ETCD server. |
etcd_server_is_leader | Indicates whether the ETCD server is a leader. |
etcd_server_is_learner | Indicates whether the ETCD server is a learner. |
etcd_server_leader_changes_seen_total | The total number of leader changes witnessed by the ETCD server. |
etcd_server_learner_promote_successes | The number of successful learner promotions in the ETCD server. |
etcd_server_proposals_applied_total | The total number of applied proposals in the ETCD server. |
etcd_server_proposals_committed_total | The total number of committed proposals in the ETCD server. |
etcd_server_proposals_failed_total | The total number of failed proposals in the ETCD server. |
etcd_server_proposals_pending | The current number of pending proposals in the ETCD server. |
etcd_server_quota_backend_bytes | The backend storage quota in bytes for the ETCD server. |
etcd_server_read_indexes_failed_total | The total number of read index failures in the ETCD server. |
etcd_server_slow_apply_total | The total number of slow apply operations in the ETCD server. |
etcd_server_slow_read_indexes_total | The total number of slow read indexes in the ETCD server. |
etcd_server_snapshot_apply_in_progress_total | The total number of snapshots being applied in the ETCD server. |
etcd_server_version | The version of the ETCD server. |
etcd_snap_db_fsync_duration_seconds_bucket | The distribution of ETCD snapshot database fsync durations in seconds. |
etcd_snap_db_fsync_duration_seconds_count | The count of ETCD snapshot database fsync durations in seconds. |
etcd_snap_db_fsync_duration_seconds_sum | The sum of ETCD snapshot database fsync durations in seconds. |
etcd_snap_db_save_total_duration_seconds_bucket | The distribution of ETCD snapshot database save durations in seconds. |
etcd_snap_db_save_total_duration_seconds_count | The count of ETCD snapshot database save durations in seconds. |
etcd_snap_db_save_total_duration_seconds_sum | The sum of ETCD snapshot database save durations in seconds. |
etcd_snap_fsync_duration_seconds_bucket | The distribution of ETCD snapshot fsync durations in seconds. |
etcd_snap_fsync_duration_seconds_count | The count of ETCD snapshot fsync durations in seconds. |
etcd_snap_fsync_duration_seconds_sum | The sum of ETCD snapshot fsync durations in seconds. |
go_gc_duration_seconds | The Go GC pause duration in seconds. |
go_gc_duration_seconds_count | The count of Go GC pauses. |
go_gc_duration_seconds_sum | The total Go GC pause duration in seconds. |
go_goroutines | The number of goroutines. |
go_info | The Go-specific information. |
go_memstats_alloc_bytes | The amount of memory allocated in bytes. |
go_memstats_alloc_bytes_total | The cumulative amount of memory allocated in bytes. |
go_memstats_buck_hash_sys_bytes | The amount of memory used by the profiling bucket hash table in bytes. |
go_memstats_frees_total | The total number of heap object frees. |
go_memstats_gc_cpu_fraction | The fraction of the available CPU time used by GC since the process started. The value ranges from 0 to 1. |
go_memstats_gc_sys_bytes | The amount of memory used for GC metadata in bytes. |
go_memstats_heap_alloc_bytes | The amount of heap memory allocated in bytes. |
go_memstats_heap_idle_bytes | The amount of idle heap memory in bytes. |
go_memstats_heap_inuse_bytes | The amount of heap memory in use in bytes. |
go_memstats_heap_objects | The number of objects allocated on the heap. |
go_memstats_heap_released_bytes | The amount of heap memory released in bytes. |
go_memstats_heap_sys_bytes | The amount of memory allocated to the heap by the operating system in bytes. |
go_memstats_last_gc_time_seconds | The time of the last GC, expressed as a UNIX timestamp in seconds. |
go_memstats_lookups_total | The total number of pointer lookups. |
go_memstats_mallocs_total | The total number of allocations. |
go_memstats_mcache_inuse_bytes | The amount of memory in use in mcache in bytes. |
go_memstats_mcache_sys_bytes | The amount of memory allocated to mcache by the operating system in bytes. |
go_memstats_mspan_inuse_bytes | The amount of memory in use in mspan in bytes. |
go_memstats_mspan_sys_bytes | The amount of memory allocated to mspan by the operating system in bytes. |
go_memstats_next_gc_bytes | The heap size in bytes at which the next GC is triggered. |
go_memstats_other_sys_bytes | The amount of memory allocated for other purposes by the operating system in bytes. |
go_memstats_stack_inuse_bytes | The amount of stack memory in use in bytes. |
go_memstats_stack_sys_bytes | The amount of memory allocated to the stack by the operating system in bytes. |
go_memstats_sys_bytes | The total memory allocated by the operating system in bytes. |
go_threads | The number of threads. |
grpc_server_handled_total | The total number of RPCs completed on the gRPC server, regardless of success or failure. |
grpc_server_msg_received_total | The total number of RPC stream messages received by the gRPC server. |
grpc_server_msg_sent_total | The total number of RPC stream messages sent by the gRPC server. |
grpc_server_started_total | The total number of RPCs started on the gRPC server. |
memory_utilization_byte | The memory usage in bytes. |
os_fd_limit | The file descriptor limit of the operating system. |
os_fd_used | The number of file descriptors used by the operating system. |
process_cpu_seconds_total | The total number of CPU seconds used by the process. |
process_max_fds | The maximum number of file descriptors for the process. |
process_open_fds | The number of file descriptors opened by the process. |
process_resident_memory_bytes | The resident memory size of the process in bytes. |
process_start_time_seconds | The start time of the process, expressed as a UNIX timestamp in seconds. |
process_virtual_memory_bytes | The virtual memory size of the process in bytes. |
process_virtual_memory_max_bytes | The maximum amount of virtual memory available to the process in bytes. |
promhttp_metric_handler_requests_in_flight | The current number of requests being handled by the Prometheus HTTP metric handler. |
promhttp_metric_handler_requests_total | The total number of requests handled by the Prometheus HTTP metric handler. |
scrape_duration_seconds | The scrape duration in seconds. |
scrape_samples_post_metric_relabeling | The number of scraped samples after metric relabeling. |
scrape_samples_scraped | The number of scraped samples. |
scrape_series_added | The number of new series added during the scrape. |
up | The connectivity of metric collection. |
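Several of the ETCD metrics above are most useful in combination. As a minimal sketch, the following Python code relates the database size metrics to the backend quota: dividing etcd_mvcc_db_total_size_in_bytes by etcd_server_quota_backend_bytes gives the share of the quota that is consumed, and subtracting etcd_mvcc_db_total_size_in_use_in_bytes from etcd_mvcc_db_total_size_in_bytes gives the space that is allocated but not logically in use. The function names and the sample values are hypothetical.

```python
def backend_quota_usage(db_total_size_bytes, quota_backend_bytes):
    """Return the fraction of the backend quota consumed by the database.

    Returns None when the quota is not a positive number.
    """
    if quota_backend_bytes <= 0:
        return None
    return db_total_size_bytes / quota_backend_bytes


def unused_allocated_bytes(db_total_size_bytes, db_size_in_use_bytes):
    """Return the bytes allocated to the database that are not logically in use."""
    return max(db_total_size_bytes - db_size_in_use_bytes, 0)


GIB = 1024 ** 3

# Hypothetical sample values for one ETCD member.
quota_backend_bytes = 8 * GIB   # etcd_server_quota_backend_bytes
db_total_size_bytes = 2 * GIB   # etcd_mvcc_db_total_size_in_bytes
db_size_in_use_bytes = 3 * GIB // 2  # etcd_mvcc_db_total_size_in_use_in_bytes

usage = backend_quota_usage(db_total_size_bytes, quota_backend_bytes)
unused = unused_allocated_bytes(db_total_size_bytes, db_size_in_use_bytes)
print(f"quota usage: {usage:.0%}, allocated but unused: {unused} bytes")
```

A large gap between the two size metrics usually suggests that defragmentation could free space, whereas a usage value close to 1 means that the member is approaching its backend quota.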
Scheduler (job name: ack-scheduler)
Metric | Description |
aggregator_discovery_aggregation_count_total | The count of discovery aggregations performed by the aggregator. |
aliyun_prometheus_agent_append_duration_seconds | The duration of the Prometheus agent append operations in seconds. |
aliyun_prometheus_agent_job_discovery_status | The discovery status of the Prometheus agent collection jobs. |
aliyun_prometheus_agent_scrape_custom_error | The number of custom collection errors of the Prometheus agent. |
aliyun_prometheus_agent_scrapes_by_target_total | The total number of scrapes by the Prometheus agent per target. |
aliyun_prometheus_agent_target_info | The target information of the Prometheus agent. |
apiserver_audit_event_total | The total number of APIServer audit events. |
apiserver_audit_requests_rejected_total | The total number of APIServer audit request rejections. |
apiserver_client_certificate_expiration_seconds_bucket | The distribution of remaining seconds until APIServer client certificate expiration. |
apiserver_client_certificate_expiration_seconds_count | The count of remaining seconds until APIServer client certificate expiration. |
apiserver_client_certificate_expiration_seconds_sum | The sum of remaining seconds until APIServer client certificate expiration. |
apiserver_delegated_authn_request_duration_seconds_bucket | The distribution of delegated authentication request durations in seconds for the APIServer. |
apiserver_delegated_authn_request_duration_seconds_count | The count of delegated authentication request durations in seconds for the APIServer. |
apiserver_delegated_authn_request_duration_seconds_sum | The sum of delegated authentication request durations in seconds for the APIServer. |
apiserver_delegated_authn_request_total | The total number of delegated authentication requests for the APIServer. |
apiserver_delegated_authz_request_duration_seconds_bucket | The distribution of delegated authorization request durations in seconds for the APIServer. |
apiserver_delegated_authz_request_duration_seconds_count | The count of delegated authorization request durations in seconds for the APIServer. |
apiserver_delegated_authz_request_duration_seconds_sum | The sum of delegated authorization request durations in seconds for the APIServer. |
apiserver_delegated_authz_request_total | The total number of delegated authorization requests to the API server. |
apiserver_encryption_config_controller_automatic_reload_failures_total | The total number of automatic reload failures for the APIServer encryption configuration controller. |
apiserver_encryption_config_controller_automatic_reload_success_total | The total number of successful automatic reloads for the APIServer encryption configuration controller. |
apiserver_envelope_encryption_dek_cache_fill_percent | The percentage of the envelope encryption data encryption key (DEK) cache that is filled for the APIServer. |
apiserver_storage_data_key_generation_duration_seconds_bucket | The distribution of data key generation durations in seconds for the APIServer storage. |
apiserver_storage_data_key_generation_duration_seconds_count | The count of data key generation durations in seconds for the APIServer storage. |
apiserver_storage_data_key_generation_duration_seconds_sum | The sum of data key generation durations in seconds for the APIServer storage. |
apiserver_storage_data_key_generation_failures_total | The total number of data key generation failures for the APIServer storage. |
apiserver_storage_envelope_transformation_cache_misses_total | The total number of envelope transformation cache misses for the APIServer storage. |
apiserver_webhooks_x509_insecure_sha1_total | The total count of insecure SHA1 usage in X509 certificates for APIServer Webhooks. |
apiserver_webhooks_x509_missing_san_total | The total count of missing SANs in X509 certificates for APIServer Webhooks. |
authenticated_user_requests | The number of authenticated user requests. |
authentication_attempts | The number of authentication attempts. |
authentication_duration_seconds_bucket | The distribution of authentication durations in seconds. |
authentication_duration_seconds_count | The count of authentication durations in seconds. |
authentication_duration_seconds_sum | The sum of authentication durations in seconds. |
authentication_token_cache_active_fetch_count | The count of active fetches for the authentication token cache. |
authentication_token_cache_fetch_total | The total number of fetches for the authentication token cache. |
authentication_token_cache_request_duration_seconds_bucket | The distribution of request durations in seconds for the authentication token cache. |
authentication_token_cache_request_duration_seconds_count | The count of request durations in seconds for the authentication token cache. |
authentication_token_cache_request_duration_seconds_sum | The sum of request durations in seconds for the authentication token cache. |
authentication_token_cache_request_total | The total number of requests for the authentication token cache. |
authorization_attempts_total | The total number of authorization attempts. |
authorization_duration_seconds_bucket | The distribution of authorization durations in seconds. |
authorization_duration_seconds_count | The count of authorization durations in seconds. |
authorization_duration_seconds_sum | The sum of authorization durations in seconds. |
cardinality_enforcement_unexpected_categorizations_total | The total number of unexpected categorizations during cardinality enforcement. |
cpu_utilization_core | The CPU core utilization. |
disabled_metric_total | The total number of disabled metrics. |
disabled_metrics_total | The total number of disabled metrics. |
go_cgo_go_to_c_calls_calls_total | The total number of Go to C calls via cgo. |
go_cpu_classes_gc_mark_assist_cpu_seconds_total | The total number of CPU seconds for GC mark assist. |
go_cpu_classes_gc_mark_dedicated_cpu_seconds_total | The total number of dedicated CPU seconds for GC marking in Go. |
go_cpu_classes_gc_mark_idle_cpu_seconds_total | The total number of idle CPU seconds used for GC marking in Go. |
go_cpu_classes_gc_pause_cpu_seconds_total | The total number of CPU seconds for GC pauses in Go. |
go_cpu_classes_gc_total_cpu_seconds_total | The total number of CPU seconds for all GC activities in Go. |
go_cpu_classes_idle_cpu_seconds_total | The total number of idle CPU seconds in Go. |
go_cpu_classes_scavenge_assist_cpu_seconds_total | The total number of CPU seconds for GC scavenging assist. |
go_cpu_classes_scavenge_background_cpu_seconds_total | The total number of CPU seconds for background GC scavenging. |
go_cpu_classes_scavenge_total_cpu_seconds_total | The total CPU seconds for scavenge in Go CPU classes. |
go_cpu_classes_total_cpu_seconds_total | The total CPU seconds summed across all Go CPU classes. |
go_cpu_classes_user_cpu_seconds_total | The total user CPU seconds summed across Go CPU classes. |
go_gc_cycles_automatic_gc_cycles_total | The total number of automatic GC cycles in Go. |
go_gc_cycles_forced_gc_cycles_total | The total number of forced GC cycles in Go. |
go_gc_cycles_total_gc_cycles_total | The total number of GC cycles in Go. |
go_gc_duration_seconds | The duration of Go GC in seconds. |
go_gc_duration_seconds_count | The count of Go GC durations in seconds. |
go_gc_duration_seconds_sum | The sum of Go GC pause durations in seconds. |
go_gc_gogc_percent | The heap size target percentage (GOGC) configured for Go GC. |
go_gc_gomemlimit_bytes | The Go runtime memory limit (GOMEMLIMIT) in bytes. |
go_gc_heap_allocs_by_size_bytes_bucket | The distribution of heap allocations by size in bytes for Go GC. |
go_gc_heap_allocs_by_size_bytes_count | The count of heap allocations by size in bytes for Go GC. |
go_gc_heap_allocs_by_size_bytes_sum | The sum of heap allocations by size in bytes for Go GC. |
go_gc_heap_allocs_by_size_bytes_total_bucket | The distribution of total heap allocations by size in bytes for Go GC. |
go_gc_heap_allocs_by_size_bytes_total_count | The count of total heap allocations by size in bytes for Go GC. |
go_gc_heap_allocs_by_size_bytes_total_sum | The sum of total heap allocations by size in bytes for Go GC. |
go_gc_heap_allocs_bytes_total | The total bytes allocated in the Go GC heap. |
go_gc_heap_allocs_objects_total | The total number of objects allocated on the heap for Go GC. |
go_gc_heap_frees_by_size_bytes_bucket | The distribution of heap releases by size in bytes for Go GC. |
go_gc_heap_frees_by_size_bytes_count | The count of heap releases by size in bytes for Go GC. |
go_gc_heap_frees_by_size_bytes_sum | The sum of heap releases by size in bytes for Go GC. |
go_gc_heap_frees_by_size_bytes_total_bucket | The distribution of total heap releases by size in bytes for Go GC. |
go_gc_heap_frees_by_size_bytes_total_count | The count of total heap releases by size in bytes for Go GC. |
go_gc_heap_frees_by_size_bytes_total_sum | The sum of total heap releases by size in bytes for Go GC. |
go_gc_heap_frees_bytes_total | The total bytes released in the Go GC heap. |
go_gc_heap_frees_objects_total | The total number of objects freed from the heap for Go GC. |
go_gc_heap_goal_bytes | The target heap size in bytes for Go GC. |
go_gc_heap_live_bytes | The live heap size in bytes for Go GC. |
go_gc_heap_objects_objects | The number of objects in the heap for Go GC. |
go_gc_heap_tiny_allocs_objects_total | The total number of tiny object allocations in the heap for Go GC. |
go_gc_limiter_last_enabled_gc_cycle | The last enabled GC cycle for the Go GC limiter. |
go_gc_pauses_seconds_bucket | The distribution of GC pause durations in seconds. |
go_gc_pauses_seconds_count | The count of GC pause durations in seconds. |
go_gc_pauses_seconds_sum | The sum of GC pause durations in seconds. |
go_gc_pauses_seconds_total_bucket | The distribution of total GC pause durations in seconds. |
go_gc_pauses_seconds_total_count | The count of total GC pause durations in seconds. |
go_gc_pauses_seconds_total_sum | The sum of total GC pause durations in seconds. |
go_gc_scan_globals_bytes | The number of global bytes scanned during Go GC. |
go_gc_scan_heap_bytes | The number of heap bytes scanned during Go GC. |
go_gc_scan_stack_bytes | The number of stack bytes scanned during Go GC. |
go_gc_scan_total_bytes | The total number of bytes scanned during Go GC. |
go_gc_stack_starting_size_bytes | The initial stack size of new goroutines in bytes. |
go_godebug_non_default_behavior_execerrdot_events_total | The total number of non-default behavior events triggered by the GODEBUG setting execerrdot. |
go_godebug_non_default_behavior_gocachehash_events_total | The total number of non-default behavior events triggered by the GODEBUG setting gocachehash. |
go_godebug_non_default_behavior_gocachetest_events_total | The total number of non-default behavior events triggered by the GODEBUG setting gocachetest. |
go_godebug_non_default_behavior_gocacheverify_events_total | The total number of non-default behavior events triggered by the GODEBUG setting gocacheverify. |
go_godebug_non_default_behavior_gotypesalias_events_total | The total number of non-default behavior events triggered by the GODEBUG setting gotypesalias. |
go_godebug_non_default_behavior_http2client_events_total | The total number of non-default behavior events triggered by the GODEBUG setting http2client. |
go_godebug_non_default_behavior_http2server_events_total | The total number of non-default behavior events triggered by the GODEBUG setting http2server. |
go_godebug_non_default_behavior_httplaxcontentlength_events_total | The total number of non-default behavior events triggered by the GODEBUG setting httplaxcontentlength. |
go_godebug_non_default_behavior_httpmuxgo121_events_total | The total number of non-default behavior events triggered by the GODEBUG setting httpmuxgo121. |
go_godebug_non_default_behavior_installgoroot_events_total | The total number of non-default behavior events triggered by the GODEBUG setting installgoroot. |
go_godebug_non_default_behavior_jstmpllitinterp_events_total | The total number of non-default behavior events triggered by the GODEBUG setting jstmpllitinterp. |
go_godebug_non_default_behavior_multipartmaxheaders_events_total | The total number of non-default behavior events triggered by the GODEBUG setting multipartmaxheaders. |
go_godebug_non_default_behavior_multipartmaxparts_events_total | The total number of non-default behavior events triggered by the GODEBUG setting multipartmaxparts. |
go_godebug_non_default_behavior_multipathtcp_events_total | The total number of non-default behavior events triggered by the GODEBUG setting multipathtcp. |
go_godebug_non_default_behavior_panicnil_events_total | The total number of non-default behavior events triggered by the GODEBUG setting panicnil. |
go_godebug_non_default_behavior_randautoseed_events_total | The total number of non-default behavior events triggered by the GODEBUG setting randautoseed. |
go_godebug_non_default_behavior_tarinsecurepath_events_total | The total number of non-default behavior events triggered by the GODEBUG setting tarinsecurepath. |
go_godebug_non_default_behavior_tls10server_events_total | The total number of non-default behavior events triggered by the GODEBUG setting tls10server. |
go_godebug_non_default_behavior_tlsmaxrsasize_events_total | The total number of non-default behavior events triggered by the GODEBUG setting tlsmaxrsasize. |
go_godebug_non_default_behavior_tlsrsakex_events_total | The total number of non-default behavior events triggered by the GODEBUG setting tlsrsakex. |
go_godebug_non_default_behavior_tlsunsafeekm_events_total | The total number of non-default behavior events triggered by the GODEBUG setting tlsunsafeekm. |
go_godebug_non_default_behavior_x509sha1_events_total | The total number of non-default behavior events triggered by the GODEBUG setting x509sha1. |
go_godebug_non_default_behavior_x509usefallbackroots_events_total | The total number of non-default behavior events triggered by the GODEBUG setting x509usefallbackroots. |
go_godebug_non_default_behavior_x509usepolicies_events_total | The total number of non-default behavior events triggered by the GODEBUG setting x509usepolicies. |
go_godebug_non_default_behavior_zipinsecurepath_events_total | The total number of non-default behavior events triggered by the GODEBUG setting zipinsecurepath. |
go_goroutines | The number of goroutines that currently exist. |
go_info | Information about the Go environment, such as the Go version. |
go_memory_classes_heap_free_bytes | The free bytes in the heap. |
go_memory_classes_heap_objects_bytes | The bytes used by heap objects. |
go_memory_classes_heap_released_bytes | The released bytes in the heap for memory classes. |
go_memory_classes_heap_stacks_bytes | The bytes used by stacks. |
go_memory_classes_heap_unused_bytes | The unused bytes in the heap. |
go_memory_classes_metadata_mcache_free_bytes | The free bytes in metadata mcache. |
go_memory_classes_metadata_mcache_inuse_bytes | The in-use bytes in metadata mcache. |
go_memory_classes_metadata_mspan_free_bytes | The free bytes in metadata mspan. |
go_memory_classes_metadata_mspan_inuse_bytes | The in-use bytes in metadata mspan. |
go_memory_classes_metadata_other_bytes | The other bytes in metadata. |
go_memory_classes_os_stacks_bytes | The bytes used by OS stacks in memory classes. |
go_memory_classes_other_bytes | The memory in bytes used for miscellaneous runtime purposes that are not covered by the other memory classes. |
go_memory_classes_profiling_buckets_bytes | The bytes used by profiling buckets. |
go_memory_classes_total_bytes | The total memory mapped by the Go runtime in bytes. |
go_memstats_alloc_bytes | The number of bytes allocated and still in use. |
go_memstats_alloc_bytes_total | The total number of bytes allocated, including bytes that have since been freed. |
go_memstats_buck_hash_sys_bytes | The number of bytes used by the profiling bucket hash table. |
go_memstats_frees_total | The total number of frees. |
go_memstats_gc_cpu_fraction | The fraction of CPU time spent in GC. |
go_memstats_gc_sys_bytes | The number of bytes used for garbage collection system metadata. |
go_memstats_heap_alloc_bytes | The number of heap bytes allocated and still in use. |
go_memstats_heap_idle_bytes | The number of heap bytes waiting to be used. |
go_memstats_heap_inuse_bytes | The number of heap bytes in use. |
go_memstats_heap_objects | The number of allocated objects on the heap. |
go_memstats_heap_released_bytes | The number of heap bytes released to the operating system. |
go_memstats_heap_sys_bytes | The number of heap bytes obtained from the operating system. |
go_memstats_last_gc_time_seconds | The time of the last garbage collection, as a UNIX timestamp in seconds. |
go_memstats_lookups_total | The total number of pointer lookups. |
go_memstats_mallocs_total | The total number of heap object allocations. |
go_memstats_mcache_inuse_bytes | The amount of memory in use in mcache in bytes. |
go_memstats_mcache_sys_bytes | The amount of memory allocated to mcache by the operating system in bytes. |
go_memstats_mspan_inuse_bytes | The amount of memory in use in mspan in bytes. |
go_memstats_mspan_sys_bytes | The amount of memory allocated to mspan by the operating system in bytes. |
go_memstats_next_gc_bytes | The heap size in bytes at which the next garbage collection will take place. |
go_memstats_other_sys_bytes | The number of bytes used for other system allocations. |
go_memstats_stack_inuse_bytes | The amount of stack memory in use in bytes. |
go_memstats_stack_sys_bytes | The amount of stack memory obtained from the operating system in bytes. |
go_memstats_sys_bytes | The total amount of memory obtained from the operating system in bytes. |
go_sched_gomaxprocs_threads | The number of threads determined by GOMAXPROCS. |
go_sched_goroutines_goroutines | The number of goroutines. |
go_sched_latencies_seconds_bucket | The distribution of Go scheduling latencies in seconds. |
go_sched_latencies_seconds_count | The count of Go scheduling latencies in seconds. |
go_sched_latencies_seconds_sum | The sum of Go scheduling latencies in seconds. |
go_sched_pauses_stopping_gc_seconds_bucket | The distribution of GC-related stop-the-world stopping latencies in seconds. |
go_sched_pauses_stopping_gc_seconds_count | The count of GC-related stop-the-world stopping latencies. |
go_sched_pauses_stopping_gc_seconds_sum | The sum of GC-related stop-the-world stopping latencies in seconds. |
go_sched_pauses_stopping_other_seconds_bucket | The distribution of non-GC-related stop-the-world stopping latencies in seconds. |
go_sched_pauses_stopping_other_seconds_count | The count of non-GC-related stop-the-world stopping latencies. |
go_sched_pauses_stopping_other_seconds_sum | The sum of non-GC-related stop-the-world stopping latencies in seconds. |
go_sched_pauses_total_gc_seconds_bucket | The distribution of GC-related stop-the-world pause latencies in seconds. |
go_sched_pauses_total_gc_seconds_count | The count of GC-related stop-the-world pause latencies. |
go_sched_pauses_total_gc_seconds_sum | The sum of GC-related stop-the-world pause latencies in seconds. |
go_sched_pauses_total_other_seconds_bucket | The distribution of non-GC-related stop-the-world pause latencies in seconds. |
go_sched_pauses_total_other_seconds_count | The count of non-GC-related stop-the-world pause latencies. |
go_sched_pauses_total_other_seconds_sum | The sum of non-GC-related stop-the-world pause latencies in seconds. |
go_sync_mutex_wait_total_seconds_total | The approximate cumulative time in seconds that goroutines have spent blocked on a mutex. |
go_threads | The number of OS threads created. |
hidden_metric_total | The total number of hidden metrics. |
hidden_metrics_total | The total number of hidden metrics. |
kubernetes_build_info | The Kubernetes build information. |
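kubernetes_feature_enabled | Indicates whether a Kubernetes feature gate is enabled. |
leader_election_master_status | Indicates whether the reporting instance is the leader in leader election. The value 1 indicates the leader, and the value 0 indicates a backup. |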
memory_utilization_byte | The used memory in bytes. |
process_cpu_seconds_total | The total CPU seconds of the process. |
process_max_fds | The maximum number of file descriptors for the process. |
process_open_fds | The number of file descriptors opened by the process. |
process_resident_memory_bytes | The resident memory size of the process in bytes. |
process_start_time_seconds | The start time of the process, as a UNIX timestamp in seconds. |
process_virtual_memory_bytes | The virtual memory size of the process in bytes. |
process_virtual_memory_max_bytes | The maximum amount of virtual memory available to the process in bytes. |
registered_metric_total | The total number of registered metrics. |
registered_metrics_total | The total number of registered metrics. |
rest_client_exec_plugin_certificate_rotation_age_bucket | The distribution of certificate rotation age for REST client exec plugin. |
rest_client_exec_plugin_certificate_rotation_age_count | The count of certificate rotation age for REST client exec plugin. |
rest_client_exec_plugin_certificate_rotation_age_sum | The sum of certificate rotation age for REST client exec plugin. |
rest_client_rate_limiter_duration_seconds_bucket | The distribution of rate limiter durations for REST client. |
rest_client_rate_limiter_duration_seconds_count | The count of rate limiter durations for REST client. |
rest_client_rate_limiter_duration_seconds_sum | The sum of rate limiter durations for REST client. |
rest_client_request_duration_seconds_bucket | The distribution of request durations in seconds for REST client. |
rest_client_request_duration_seconds_count | The count of request durations in seconds for REST client. |
rest_client_request_duration_seconds_sum | The sum of request durations in seconds for REST client. |
rest_client_request_retries_total | The total number of request retries for REST client. |
rest_client_request_size_bytes_bucket | The distribution of request sizes in bytes for REST client. |
rest_client_request_size_bytes_count | The count of request sizes in bytes for REST client. |
rest_client_request_size_bytes_sum | The sum of request sizes in bytes for REST client. |
rest_client_requests_total | The total number of requests for REST client. |
rest_client_response_size_bytes_bucket | The distribution of response sizes in bytes for REST client. |
rest_client_response_size_bytes_count | The count of response sizes in bytes for REST client. |
rest_client_response_size_bytes_sum | The sum of response sizes in bytes for REST client. |
rest_client_transport_cache_entries | The number of transport cache entries for REST client. |
rest_client_transport_create_calls_total | The total number of transport create calls for REST client. |
scheduler_binding_duration_seconds_bucket | The distribution of binding durations in seconds for the scheduler. |
scheduler_binding_duration_seconds_count | The count of binding durations in seconds for the scheduler. |
scheduler_binding_duration_seconds_sum | The sum of binding durations in seconds for the scheduler. |
scheduler_e2e_scheduling_duration_seconds_bucket | The distribution of end-to-end scheduling durations for the scheduler. |
scheduler_e2e_scheduling_duration_seconds_count | The count of end-to-end scheduling durations for the scheduler. |
scheduler_e2e_scheduling_duration_seconds_sum | The sum of end-to-end scheduling durations for the scheduler. |
scheduler_framework_extension_point_duration_seconds_bucket | The distribution of extension point durations for the scheduler framework. |
scheduler_framework_extension_point_duration_seconds_count | The count of extension point durations for the scheduler framework. |
scheduler_framework_extension_point_duration_seconds_sum | The sum of extension point durations for the scheduler framework. |
scheduler_goroutines | The number of goroutines for the scheduler. |
scheduler_pending_pods | The number of pending pods for the scheduler. |
scheduler_plugin_evaluation_total | The total number of plugin evaluations for the scheduler. |
scheduler_plugin_execution_duration_seconds_bucket | The distribution of execution durations in seconds for the scheduler plugins. |
scheduler_plugin_execution_duration_seconds_count | The count of execution durations in seconds for the scheduler plugins. |
scheduler_plugin_execution_duration_seconds_sum | The sum of execution durations in seconds for the scheduler plugins. |
scheduler_pod_preemption_victims_bucket | The distribution of preemption victims for the scheduler. |
scheduler_pod_preemption_victims_count | The count of preemption victims for the scheduler. |
scheduler_pod_preemption_victims_sum | The sum of preemption victims for the scheduler. |
scheduler_pod_scheduling_attempts_bucket | The distribution of pod scheduling attempts for the scheduler. |
scheduler_pod_scheduling_attempts_count | The count of pod scheduling attempts for the scheduler. |
scheduler_pod_scheduling_attempts_sum | The sum of pod scheduling attempts for the scheduler. |
scheduler_pod_scheduling_duration_seconds_bucket | The distribution of pod scheduling durations in seconds for the scheduler. |
scheduler_pod_scheduling_duration_seconds_count | The count of pod scheduling durations in seconds for the scheduler. |
scheduler_pod_scheduling_duration_seconds_sum | The sum of pod scheduling durations in seconds for the scheduler. |
scheduler_pod_scheduling_sli_duration_seconds_bucket | The distribution of SLI durations for pod scheduling. |
scheduler_pod_scheduling_sli_duration_seconds_count | The count of SLI durations for pod scheduling. |
scheduler_pod_scheduling_sli_duration_seconds_sum | The sum of SLI durations for pod scheduling. |
scheduler_preemption_attempts_total | The total number of preemption attempts for the scheduler. |
scheduler_preemption_victims_bucket | The distribution of preemption victims for the scheduler. |
scheduler_preemption_victims_count | The count of preemption victims for the scheduler. |
scheduler_preemption_victims_sum | The sum of preemption victims for the scheduler. |
scheduler_queue_incoming_pods_total | The total number of incoming pods for the scheduler. |
scheduler_schedule_attempts_total | The total number of scheduling attempts for the scheduler. |
scheduler_scheduler_cache_size | The scheduler cache size. |
scheduler_scheduler_goroutines | The number of goroutines for the scheduler. |
scheduler_scheduling_algorithm_duration_seconds_bucket | The distribution of scheduling algorithm durations in seconds. |
scheduler_scheduling_algorithm_duration_seconds_count | The count of scheduling algorithm durations in seconds. |
scheduler_scheduling_algorithm_duration_seconds_sum | The sum of scheduling algorithm durations in seconds. |
scheduler_scheduling_algorithm_predicate_evaluation_seconds_bucket | The distribution of predicate evaluation seconds for the scheduling algorithm. |
scheduler_scheduling_algorithm_predicate_evaluation_seconds_count | The count of predicate evaluation seconds for the scheduling algorithm. |
scheduler_scheduling_algorithm_predicate_evaluation_seconds_sum | The sum of predicate evaluation seconds for the scheduling algorithm. |
scheduler_scheduling_algorithm_preemption_evaluation_seconds_bucket | The distribution of preemption evaluation seconds for the scheduling algorithm. |
scheduler_scheduling_algorithm_preemption_evaluation_seconds_count | The count of preemption evaluation seconds for the scheduling algorithm. |
scheduler_scheduling_algorithm_preemption_evaluation_seconds_sum | The sum of preemption evaluation seconds for the scheduling algorithm. |
scheduler_scheduling_algorithm_priority_evaluation_seconds_bucket | The distribution of priority evaluation durations in seconds for the scheduling algorithm. |
scheduler_scheduling_algorithm_priority_evaluation_seconds_count | The count of priority evaluation durations in seconds for the scheduling algorithm. |
scheduler_scheduling_algorithm_priority_evaluation_seconds_sum | The sum of priority evaluation durations in seconds for the scheduling algorithm. |
scheduler_scheduling_attempt_duration_seconds_bucket | The distribution of scheduling attempt durations. |
scheduler_scheduling_attempt_duration_seconds_count | The count of scheduling attempt durations. |
scheduler_scheduling_attempt_duration_seconds_sum | The sum of scheduling attempt durations. |
scheduler_scheduling_duration_seconds | The distribution of scheduling durations in seconds. |
scheduler_scheduling_duration_seconds_count | The count of scheduling durations in seconds. |
scheduler_scheduling_duration_seconds_sum | The sum of scheduling durations in seconds. |
scheduler_total_preemption_attempts | The total number of preemption attempts by the scheduler. |
scheduler_unschedulable_pods | The number of pods that the scheduler cannot schedule. |
scheduler_volume_scheduling_duration_seconds_bucket | The distribution of volume scheduling durations in seconds. |
scheduler_volume_scheduling_duration_seconds_count | The count of volume scheduling durations in seconds. |
scheduler_volume_scheduling_duration_seconds_sum | The sum of volume scheduling durations in seconds. |
scheduler_volume_scheduling_stage_error_total | The number of errors that are returned during volume scheduling. |
scrape_duration_seconds | The scrape duration in seconds. |
scrape_samples_post_metric_relabeling | The number of scraped samples after metric relabeling. |
scrape_samples_scraped | The number of scraped samples. |
scrape_series_added | The number of new series added during the scrape. |
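up | Indicates whether the scrape target is reachable. The value 1 indicates that the scrape succeeded, and the value 0 indicates that the scrape failed. |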
workqueue_adds_total | The total number of additions to the work queue. |
workqueue_depth | The work queue depth. |
workqueue_longest_running_processor_seconds | The number of seconds that the longest-running processor of the work queue has been running. |
workqueue_queue_duration_seconds_bucket | The distribution of queue durations in seconds for the work queue. |
workqueue_queue_duration_seconds_count | The count of queue durations in seconds for the work queue. |
workqueue_queue_duration_seconds_sum | The sum of queue durations in seconds for the work queue. |
workqueue_retries_total | The total number of retries in the work queue. |
workqueue_unfinished_work_seconds | The number of seconds of in-progress work in the work queue that has not yet been observed by workqueue_work_duration_seconds. |
workqueue_work_duration_seconds_bucket | The distribution of work durations for the work queue. |
workqueue_work_duration_seconds_count | The count of work durations for the work queue. |
workqueue_work_duration_seconds_sum | The sum of work durations for the work queue. |
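
Many metrics in the preceding table come in groups of three with the suffixes _bucket, _count, and _sum. The following Python sketch illustrates one way these series can be combined: the average is _sum divided by _count, and a quantile can be estimated from the cumulative _bucket counts by linear interpolation, similar in spirit to the PromQL histogram_quantile function. The function names and the sample values are hypothetical and are not taken from a real cluster. In practice, the inputs are usually increases over a time window rather than raw counter values.

```python
import math


def average(sum_value: float, count_value: float) -> float:
    """Mean observation: <name>_sum divided by <name>_count."""
    return sum_value / count_value if count_value else math.nan


def estimate_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate quantile q from cumulative (upper_bound, count) buckets.

    Interpolates linearly inside the bucket that contains the target rank.
    The lower bound of the first bucket is assumed to be 0.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]  # the +Inf bucket holds the same value as <name>_count
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            # Cannot interpolate into the +Inf bucket or an empty bucket.
            if math.isinf(bound) or count == prev_count:
                return prev_bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical samples for workqueue_work_duration_seconds_{bucket,sum,count}.
sample_buckets = [(0.005, 2), (0.01, 6), (0.025, 9), (math.inf, 10)]
sample_sum, sample_count = 0.09, 10

mean_seconds = average(sample_sum, sample_count)
median_seconds = estimate_quantile(0.5, sample_buckets)
p99_seconds = estimate_quantile(0.99, sample_buckets)
print(mean_seconds, median_seconds, p99_seconds)
```

The estimate is only as precise as the bucket boundaries allow. A quantile that falls into the +Inf bucket is reported as the highest finite upper bound.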