When you use Prometheus Service of Application Real-Time Monitoring Service (ARMS), you are charged based on the number of data entries reported for billable metrics. Metrics are classified into two types: basic metrics and custom metrics. Basic metrics are free of charge. Custom metrics are all metrics other than basic metrics, and they have been billed since January 6, 2020.
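As a rough illustration of the basic/custom split described above, the following Python sketch tallies reported data entries by metric type. The `BASIC_METRICS` set, the `count_billable_entries` helper, the sample metric `my_app_orders_total`, and the entry counts are all hypothetical. The sketch assumes that a metric counts as basic when its name appears in a basic-metric table, and it does not model prices or the actual metering logic of ARMS.

```python
from collections import Counter

# Hypothetical subset of basic metric names, copied from the tables in this topic.
BASIC_METRICS = {"up", "scrape_duration_seconds", "go_goroutines"}


def count_billable_entries(reported):
    """Return (basic_entries, custom_entries) for (metric_name, entries) pairs.

    Assumption: any metric whose name is not in BASIC_METRICS is a custom metric.
    """
    totals = Counter()
    for name, entries in reported:
        kind = "basic" if name in BASIC_METRICS else "custom"
        totals[kind] += entries
    return totals["basic"], totals["custom"]


# Made-up sample data: two basic metrics and one custom metric.
reported = [("up", 120), ("go_goroutines", 60), ("my_app_orders_total", 300)]
basic, custom = count_billable_entries(reported)
print(f"basic (free): {basic}, custom (billable): {custom}")
```

Under these assumptions, only the entries counted as custom would contribute to the bill.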
The following tables describe the basic metrics of Kubernetes clusters that are supported by Prometheus Service.
- The following table describes the job that collects data about the status of a Prometheus instance and the basic metrics that are collected.

| Job name | Metric name | Description |
| --- | --- | --- |
| _arms-prom/kubelet/1 | promhttp_metric_handler_requests_in_flight | - |
| | go_memstats_mallocs_total | A counter value that shows the number of allocated heap objects. You can call the rate() function to calculate the allocation rate of heap objects. |
| | go_memstats_lookups_total | A counter value that shows the number of dereferenced pointers. You can call the rate() function to calculate the dereferencing rate of pointers. |
| | go_memstats_last_gc_time_seconds | The timestamp when the last garbage collection (GC) was completed. |
| | go_memstats_heap_sys_bytes | The number of memory bytes allocated for the heap from the operating system, including the virtual address space that is reserved but not used. |
| | go_memstats_heap_released_bytes | The number of bytes in free spans that have been returned to the operating system. |
| | go_memstats_heap_objects | The number of objects allocated on the heap. The value changes as GC runs and new objects are allocated. |
| | go_memstats_heap_inuse_bytes | The number of bytes occupied by the spans in use. |
| | go_memstats_heap_idle_bytes | The number of memory bytes occupied by idle spans. |
| | go_memstats_heap_alloc_bytes | The number of memory bytes allocated for heap objects, including not only all reachable heap objects, but also the unreachable objects that are not removed during GC. |
| | go_memstats_gc_sys_bytes | The amount of memory occupied by GC metadata. |
| | go_memstats_gc_cpu_fraction | The percentage of CPU time consumed by GC since the program was started. |
| | go_memstats_frees_total | A counter value that shows the number of removed heap objects. You can call the rate() function to calculate the removal rate of heap objects. You can use the go_memstats_mallocs_total - go_memstats_frees_total formula to calculate the number of surviving heap objects. |
| | go_memstats_buck_hash_sys_bytes | The amount of memory occupied by the hash tables used for profiling. |
| | go_memstats_alloc_bytes_total | The value of the metric increases as objects are allocated in the heap, but does not decrease when objects are removed. As with other Prometheus counters, you can call the rate() function to query the memory consumption rate. |
| | go_memstats_alloc_bytes | The number of memory bytes allocated for heap objects, including not only all reachable heap objects, but also the unreachable objects that are not removed during GC. |
| | scrape_duration_seconds | - |
| | go_info | The information about the Go version. The value is obtained by calling the runtime.Version() function. |
| | go_goroutines | The value is obtained by calling the runtime.NumGoroutine() function based on the sched scheduler structure and the global allglen variable. All fields in the sched structure may concurrently change. Therefore, the system checks whether the value is less than 1. If the value is less than 1, 1 is returned. |
| | scrape_samples_post_metric_relabeling | - |
| | go_gc_duration_seconds_sum | - |
| | go_gc_duration_seconds_count | - |
| | blackbox_exporter_config_last_reload_successful | - |
| | blackbox_exporter_config_last_reload_success_timestamp_seconds | - |
| | scrape_samples_scraped | - |
| | blackbox_exporter_build_info | - |
| | arms_prometheus_target_scrapes_sample_out_of_order_total | - |
| | arms_prometheus_target_scrapes_sample_out_of_bounds_total | - |
| | arms_prometheus_target_scrapes_sample_duplicate_timestamp_total | - |
| | scrape_series_added | - |
| | arms_prometheus_target_scrapes_exceeded_sample_limit_total | - |
| | arms_prometheus_target_scrapes_cache_flush_forced_total | - |
| | arms_prometheus_target_scrape_pools_total | - |
| | statsd_metric_mapper_cache_gets_total | - |
| | statsd_metric_mapper_cache_hits_total | - |
| | statsd_metric_mapper_cache_length | - |
| | arms_prometheus_target_scrape_pools_failed_total | - |
| | up | - |
| | arms_prometheus_target_scrape_pool_reloads_total | - |
| | arms_prometheus_target_scrape_pool_reloads_failed_total | - |

- The following table describes the job that collects data about an API server and the basic metrics that are collected.

| Job name | Metric name |
| --- | --- |
| apiserver | apiserver_request_duration_seconds_bucket (obsolete by default) |
| | apiserver_admission_controller_admission_duration_seconds_bucket |
| | apiserver_request_total |
| | rest_client_requests_total |
| | apiserver_admission_webhook_admission_duration_seconds_bucket |
| | apiserver_current_inflight_requests |
| | up |
| | apiserver_admission_webhook_admission_duration_seconds_count |
| | scrape_samples_post_metric_relabeling |
| | scrape_samples_scraped |
| | scrape_series_added |
| | scrape_duration_seconds |

- The following table describes the job that collects data from Ingresses and the basic metrics that are collected.

| Job name | Metric name | Description |
| --- | --- | --- |
| arms-ack-ingress | nginx_ingress_controller_request_duration_seconds_bucket | - |
| | nginx_ingress_controller_response_duration_seconds_bucket (obsolete by default) | - |
| | nginx_ingress_controller_response_size_bucket (obsolete by default) | - |
| | nginx_ingress_controller_request_size_bucket | - |
| | nginx_ingress_controller_bytes_sent_bucket | - |
| | go_gc_duration_seconds | The value is obtained by calling the debug.ReadGCStats() function. When the function is called, the PauseQuantiles field of the GCStats structure is set to a length of 5. The function then returns the minimum, 25th percentile, 50th percentile, 75th percentile, and maximum of the GC pause time. Then, the Prometheus Go client creates a summary metric based on the returned pause-time quantiles and the NumGC and PauseTotal variables. |
| | nginx_ingress_controller_nginx_process_connections | - |
| | nginx_ingress_controller_request_duration_seconds_sum | - |
| | nginx_ingress_controller_request_duration_seconds_count (obsolete by default) | - |
| | nginx_ingress_controller_bytes_sent_sum | - |
| | nginx_ingress_controller_request_size_sum | - |
| | nginx_ingress_controller_response_duration_seconds_count | - |
| | nginx_ingress_controller_response_duration_seconds_sum (obsolete by default) | - |
| | nginx_ingress_controller_response_size_count (obsolete by default) | - |
| | nginx_ingress_controller_bytes_sent_count | - |
| | nginx_ingress_controller_response_size_sum | - |
| | nginx_ingress_controller_request_size_count | - |
| | promhttp_metric_handler_requests_total | - |
| | nginx_ingress_controller_nginx_process_connections_total | - |
| | go_memstats_mcache_sys_bytes | The amount of memory allocated from the operating system for the mcache structure. |
| | go_memstats_lookups_total | A counter value that shows the number of dereferenced pointers. You can call the rate() function to calculate the dereferencing rate of pointers. |
| | go_threads | The value is obtained by calling the runtime.ThreadCreateProfile() function based on the global allm variable. |
| | go_memstats_sys_bytes | The number of memory bytes that Go has obtained from the system. |
| | go_memstats_last_gc_time_seconds | The timestamp when the last GC was completed. |
| | go_memstats_heap_sys_bytes | The number of memory bytes allocated for the heap from the operating system, including the virtual address space that is reserved but not used. |
| | go_memstats_heap_objects | The number of objects allocated on the heap. The value changes as GC runs and new objects are allocated. |
| | go_memstats_heap_inuse_bytes | The number of bytes occupied by the spans in use. |
| | go_memstats_heap_idle_bytes | The number of memory bytes occupied by idle spans. |
| | go_memstats_heap_alloc_bytes | The number of memory bytes allocated for heap objects, including not only all reachable heap objects, but also the unreachable objects that are not removed during GC. |
| | go_memstats_gc_sys_bytes | The amount of memory occupied by GC metadata. |
| | promhttp_metric_handler_requests_in_flight | - |
| | go_memstats_stack_sys_bytes | The number of stack memory bytes obtained from the operating system. The value is the value of the go_memstats_stack_inuse_bytes metric plus the size of the OS thread stack. |
| | go_memstats_stack_inuse_bytes | The size of the used memory on a stack memory span on which at least one stack object is allocated. |
| | go_memstats_gc_cpu_fraction | The percentage of CPU time consumed by GC since the program was started. |
| | go_memstats_frees_total | A counter value that shows the number of removed heap objects. You can call the rate() function to calculate the removal rate of heap objects. You can use the go_memstats_mallocs_total - go_memstats_frees_total formula to calculate the number of surviving heap objects. |
| | go_memstats_buck_hash_sys_bytes | The amount of memory occupied by the hash tables used for profiling. |
| | go_memstats_alloc_bytes_total | The value of the metric increases as objects are allocated in the heap, but does not decrease when objects are removed. As with other Prometheus counters, you can call the rate() function to query the memory consumption rate. |
| | go_memstats_alloc_bytes | The number of memory bytes allocated for heap objects, including not only all reachable heap objects, but also the unreachable objects that are not removed during GC. |
| | nginx_ingress_controller_nginx_process_num_procs | - |
| | go_info | The information about the Go version. The value is obtained by calling the runtime.Version() function. |
| | go_memstats_mallocs_total | A counter value that shows the number of allocated heap objects. You can call the rate() function to calculate the allocation rate of heap objects. |
| | go_memstats_other_sys_bytes | The size of the memory used for other runtime allocations. |
| | go_goroutines | The value is obtained by calling the runtime.NumGoroutine() function based on the sched scheduler structure and the global allglen variable. All fields in the sched structure may concurrently change. Therefore, the system checks whether the value is less than 1. If the value is less than 1, 1 is returned. |
| | scrape_samples_post_metric_relabeling | - |
| | scrape_samples_scraped | - |
| | process_virtual_memory_max_bytes | - |
| | process_virtual_memory_bytes | The virtual set size (VSS). The value indicates all allocated memory, including the memory that is allocated but not used, and the memory that is shared and swapped out. |
| | scrape_duration_seconds | - |
| | go_memstats_heap_released_bytes | The number of bytes in free spans that have been returned to the operating system. |
| | go_gc_duration_seconds_sum | - |
| | go_memstats_next_gc_bytes | The heap memory size during the next GC cycle. GC guarantees that the value is no less than the value of the go_memstats_heap_alloc_bytes metric. |
| | go_gc_duration_seconds_count | - |
| | nginx_ingress_controller_config_hash | - |
| | nginx_ingress_controller_config_last_reload_successful | - |
| | nginx_ingress_controller_config_last_reload_successful_timestamp_seconds | - |
| | nginx_ingress_controller_ingress_upstream_latency_seconds_count | - |
| | nginx_ingress_controller_ingress_upstream_latency_seconds_sum | - |
| | process_start_time_seconds | The value is obtained based on the start_time parameter, which specifies the time when a process starts. Unit: jiffy. The data comes from the /proc/stat file. You can divide the value of the start_time parameter by USER_HZ to calculate the value, which is measured in seconds. |
| | nginx_ingress_controller_nginx_process_cpu_seconds_total | - |
| | scrape_series_added | - |
| | nginx_ingress_controller_nginx_process_oldest_start_time_seconds | - |
| | nginx_ingress_controller_nginx_process_read_bytes_total | - |
| | nginx_ingress_controller_nginx_process_requests_total | - |
| | nginx_ingress_controller_nginx_process_resident_memory_bytes | - |
| | nginx_ingress_controller_nginx_process_virtual_memory_bytes | - |
| | nginx_ingress_controller_nginx_process_write_bytes_total | - |
| | nginx_ingress_controller_requests | - |
| | go_memstats_mcache_inuse_bytes | The amount of memory used by the mcache structure. |
| | nginx_ingress_controller_success | - |
| | process_resident_memory_bytes | The resident set size (RSS). The value indicates the actual memory used by processes, including the shared memory. Memory that is allocated but not used and memory that is swapped out are not included. |
| | process_open_fds | The value is obtained by calculating the total number of files in the /proc/PID/fd directory. It shows the total number of regular files, sockets, and pseudo-terminals opened by Go processes. |
| | process_max_fds | The value is obtained by reading the value of the Max Open Files row in the /proc/{PID}/limits file. The value is a soft limit. The soft limit is the value that the kernel uses to limit the resources. The hard limit is the maximum value of the soft limit. |
| | process_cpu_seconds_total | The value is obtained based on the utime parameter (the number of ticks executed by the Go process in user mode) and the stime parameter (the number of ticks executed by the Go process in kernel mode, for example during system calls). Unit of the parameters: jiffy, which measures the tick time between two system timer interruptions. The value of the process_cpu_seconds_total metric is the sum of utime and stime divided by USER_HZ. In other words, the total number of program ticks divided by the tick rate (ticks per second, in Hz) is the total time, in seconds, that the operating system has been running the process. |
| | go_memstats_mspan_sys_bytes | The amount of memory allocated from the operating system for the mspan structure. |
| | up | - |
| | go_memstats_mspan_inuse_bytes | The amount of memory used by the mspan structure. |
| | nginx_ingress_controller_ssl_expire_time_seconds | - |
| | nginx_ingress_controller_leader_election_status | - |

- The following table describes the job that collects data from CoreDNS and the basic metrics that are collected.

| Job name | Metric name | Description |
| --- | --- | --- |
| arms-ack-coredns | coredns_forward_request_duration_seconds_bucket | - |
| | coredns_dns_request_size_bytes_bucket | - |
| | coredns_dns_response_size_bytes_bucket | - |
| | coredns_kubernetes_dns_programming_duration_seconds_bucket | - |
| | coredns_dns_request_duration_seconds_bucket | - |
| | coredns_plugin_enabled | - |
| | coredns_health_request_duration_seconds_bucket | - |
| | go_gc_duration_seconds | The value is obtained by calling the debug.ReadGCStats() function. When the function is called, the PauseQuantiles field of the GCStats structure is set to a length of 5. The function then returns the minimum, 25th percentile, 50th percentile, 75th percentile, and maximum of the GC pause time. Then, the Prometheus Go client creates a summary metric based on the returned pause-time quantiles and the NumGC and PauseTotal variables. |
| | coredns_forward_responses_total | - |
| | coredns_forward_request_duration_seconds_sum | - |
| | coredns_forward_request_duration_seconds_count | - |
| | coredns_dns_requests_total | - |
| | coredns_forward_conn_cache_misses_total | - |
| | coredns_dns_responses_total | - |
| | coredns_cache_entries | - |
| | coredns_cache_hits_total | - |
| | coredns_forward_conn_cache_hits_total | - |
| | coredns_forward_requests_total | - |
| | coredns_dns_request_size_bytes_sum | - |
| | coredns_dns_response_size_bytes_count | - |
| | coredns_dns_response_size_bytes_sum | - |
| | coredns_dns_request_size_bytes_count | - |
| | scrape_duration_seconds | - |
| | scrape_samples_scraped | - |
| | scrape_series_added | - |
| | up | - |
| | scrape_samples_post_metric_relabeling | - |
| | go_memstats_lookups_total | A counter value that shows the number of dereferenced pointers. You can call the rate() function to calculate the dereferencing rate of pointers. |
| | go_memstats_last_gc_time_seconds | The timestamp when the last GC was completed. |
| | go_memstats_heap_sys_bytes | The number of memory bytes allocated for the heap from the operating system, including the virtual address space that is reserved but not used. |
| | coredns_build_info | - |
| | go_memstats_heap_released_bytes | The number of bytes in free spans that have been returned to the operating system. |
| | go_memstats_heap_objects | The number of objects allocated on the heap. The value changes as GC runs and new objects are allocated. |
| | go_memstats_heap_inuse_bytes | The number of bytes occupied by the spans in use. |
| | go_memstats_heap_idle_bytes | The number of memory bytes occupied by idle spans. |
| | go_memstats_heap_alloc_bytes | The number of memory bytes allocated for heap objects, including not only all reachable heap objects, but also the unreachable objects that are not removed during GC. |
| | go_memstats_gc_sys_bytes | The amount of memory occupied by GC metadata. |
| | go_memstats_sys_bytes | The number of memory bytes that Go has obtained from the system. |
| | go_memstats_stack_sys_bytes | The number of stack memory bytes obtained from the operating system. The value is the value of the go_memstats_stack_inuse_bytes metric plus the size of the OS thread stack. |
| | go_memstats_mallocs_total | A counter value that shows the number of allocated heap objects. You can call the rate() function to calculate the allocation rate of heap objects. |
| | go_memstats_gc_cpu_fraction | The percentage of CPU time consumed by GC since the program was started. |
| | go_memstats_stack_inuse_bytes | The size of the used memory on a stack memory span on which at least one stack object is allocated. |
| | go_memstats_frees_total | A counter value that shows the number of removed heap objects. You can call the rate() function to calculate the removal rate of heap objects. You can use the go_memstats_mallocs_total - go_memstats_frees_total formula to calculate the number of surviving heap objects. |
| | go_memstats_buck_hash_sys_bytes | The amount of memory occupied by the hash tables used for profiling. |
| | go_memstats_alloc_bytes_total | The value of the metric increases as objects are allocated in the heap, but does not decrease when objects are removed. As with other Prometheus counters, you can call the rate() function to query the memory consumption rate. |
| | go_memstats_alloc_bytes | The number of memory bytes allocated for heap objects. The value is the same as the value of the go_memstats_heap_alloc_bytes metric. The heap objects include not only all reachable heap objects, but also the unreachable objects that are not removed during GC. |
| | coredns_cache_misses_total | - |
| | go_memstats_other_sys_bytes | The size of the memory used for other runtime allocations. |
| | go_memstats_mcache_inuse_bytes | The amount of memory used by the mcache structure. |
| | go_goroutines | The value is obtained by calling the runtime.NumGoroutine() function based on the sched scheduler structure and the global allglen variable. All fields in the sched structure may concurrently change. Therefore, the system checks whether the value is less than 1. If the value is less than 1, 1 is returned. |
| | process_virtual_memory_max_bytes | - |
| | process_virtual_memory_bytes | The VSS. The value indicates all allocated memory, including the memory that is allocated but not used, and the memory that is shared and swapped out. |
| | go_gc_duration_seconds_sum | - |
| | go_gc_duration_seconds_count | - |
| | go_memstats_next_gc_bytes | The heap memory size during the next GC cycle. GC guarantees that the value is no less than the value of the go_memstats_heap_alloc_bytes metric. |
| | coredns_dns_request_duration_seconds_count | - |
| | coredns_reload_failed_total | - |
| | coredns_panics_total | - |
| | coredns_local_localhost_requests_total | - |
| | coredns_kubernetes_dns_programming_duration_seconds_sum | - |
| | coredns_kubernetes_dns_programming_duration_seconds_count | - |
| | coredns_dns_request_duration_seconds_sum | - |
| | coredns_hosts_reload_timestamp_seconds | - |
| | coredns_health_request_failures_total | - |
| | process_start_time_seconds | The value is obtained based on the start_time parameter, which specifies the time when a process starts. Unit: jiffy. The data comes from the /proc/stat file. You can divide the value of the start_time parameter by USER_HZ to calculate the value, which is measured in seconds. |
| | process_resident_memory_bytes | The RSS. The value indicates the actual memory used by processes, including the shared memory. Memory that is allocated but not used and memory that is swapped out are not included. |
| | process_open_fds | The value is obtained by calculating the total number of files in the /proc/PID/fd directory. It shows the total number of regular files, sockets, and pseudo-terminals opened by Go processes. |
| | process_max_fds | The value is obtained by reading the value of the Max Open Files row in the /proc/{PID}/limits file. The value is a soft limit. The soft limit is the value that the kernel uses to limit the resources. The hard limit is the maximum value of the soft limit. |
| | process_cpu_seconds_total | The value is obtained based on the utime parameter (the number of ticks executed by the Go process in user mode) and the stime parameter (the number of ticks executed by the Go process in kernel mode, for example during system calls). Unit of the parameters: jiffy, which measures the tick time between two system timer interruptions. The value of the process_cpu_seconds_total metric is the sum of utime and stime divided by USER_HZ. In other words, the total number of program ticks divided by the tick rate (ticks per second, in Hz) is the total time, in seconds, that the operating system has been running the process. |
| | coredns_health_request_duration_seconds_sum | - |
| | coredns_health_request_duration_seconds_count | - |
| | go_memstats_mspan_sys_bytes | The amount of memory allocated from the operating system for the mspan structure. |
| | coredns_forward_max_concurrent_rejects_total | - |
| | coredns_forward_healthcheck_broken_total | - |
| | go_memstats_mcache_sys_bytes | The amount of memory allocated from the operating system for the mcache structure. |
| | go_memstats_mspan_inuse_bytes | The amount of memory used by the mspan structure. |
| | go_threads | The value is obtained by calling the runtime.ThreadCreateProfile() function based on the global allm variable. |
| | go_info | The information about the Go version. The value is obtained by calling the runtime.Version() function. |

- The following table describes the job that collects data from kube-state-metrics and the basic metrics that are collected.

| Job name | Metric name |
| --- | --- |
| _kube-state-metrics | kube_pod_container_status_waiting_reason |
| | kube_pod_status_phase |
| | kube_pod_container_status_last_terminated_reason |
| | kube_pod_container_status_terminated_reason |
| | kube_pod_status_ready |
| | kube_node_status_condition |
| | kube_pod_container_status_running |
| | kube_pod_container_status_restarts_total |
| | kube_pod_container_info |
| | kube_pod_container_status_waiting |
| | kube_pod_container_status_terminated |
| | kube_pod_labels |
| | kube_pod_owner |
| | kube_pod_info |
| | kube_pod_container_resource_limits |
| | kube_persistentvolume_status_phase |
| | kube_pod_container_resource_requests_memory_bytes |
| | kube_pod_container_resource_requests_cpu_cores |
| | kube_pod_container_resource_limits_memory_bytes |
| | kube_node_status_capacity |
| | kube_service_info |
| | kube_pod_container_resource_limits_cpu_cores |
| | kube_deployment_status_replicas_updated |
| | kube_deployment_status_replicas_unavailable |
| | kube_deployment_spec_replicas |
| | kube_deployment_created |
| | kube_deployment_metadata_generation |
| | kube_deployment_status_replicas |
| | kube_deployment_labels |
| | kube_deployment_status_observed_generation |
| | kube_deployment_status_replicas_available |
| | kube_deployment_spec_strategy_rollingupdate_max_unavailable |
| | kube_daemonset_status_desired_number_scheduled |
| | kube_daemonset_updated_number_scheduled |
| | kube_daemonset_status_number_ready |
| | kube_daemonset_status_number_misscheduled |
| | kube_daemonset_status_number_available |
| | kube_daemonset_status_current_number_scheduled |
| | kube_daemonset_created |
| | kube_node_status_allocatable_cpu_cores |
| | kube_node_status_capacity_memory_bytes |
| | kube_node_spec_unschedulable |
| | kube_node_status_allocatable_memory_bytes |
| | kube_node_labels |
| | kube_node_info |
| | kube_namespace_labels |
| | kube_node_status_capacity_cpu_cores |
| | kube_node_status_capacity_pods |
| | kube_node_status_allocatable_pods |
| | kube_node_spec_taint |
| | kube_statefulset_status_replicas |
| | kube_statefulset_replicas |
| | kube_statefulset_created |
| | up |
| | scrape_samples_scraped |
| | scrape_duration_seconds |
| | scrape_samples_post_metric_relabeling |
| | scrape_series_added |

- The following table describes the job that collects data from kubelet and the basic metrics that are collected.

| Job name | Metric name | Description |
| --- | --- | --- |
| _arms/kubelet/metric | rest_client_request_duration_seconds_bucket | - |
| | apiserver_client_certificate_expiration_seconds_bucket | - |
| | kubelet_pod_worker_duration_seconds_bucket | - |
| | kubelet_pleg_relist_duration_seconds_bucket | - |
| | workqueue_queue_duration_seconds_bucket | - |
| | rest_client_requests_total | - |
| | go_gc_duration_seconds | The value is obtained by calling the debug.ReadGCStats() function. When the function is called, the PauseQuantiles field of the GCStats structure is set to a length of 5. The function then returns the minimum, 25th percentile, 50th percentile, 75th percentile, and maximum of the GC pause time. Then, the Prometheus Go client creates a summary metric based on the returned pause-time quantiles and the NumGC and PauseTotal variables. |
| | process_cpu_seconds_total | The value is obtained based on the utime parameter (the number of ticks executed by the Go process in user mode) and the stime parameter (the number of ticks executed by the Go process in kernel mode, for example during system calls). Unit of the parameters: jiffy, which measures the tick time between two system timer interruptions. The value of the process_cpu_seconds_total metric is the sum of utime and stime divided by USER_HZ. In other words, the total number of program ticks divided by the tick rate (ticks per second, in Hz) is the total time, in seconds, that the operating system has been running the process. |
| | process_resident_memory_bytes | The RSS. The value indicates the actual memory used by processes, including the shared memory. Memory that is allocated but not used and memory that is swapped out are not included. |
| | kubernetes_build_info | - |
| | kubelet_node_name | - |
| | kubelet_certificate_manager_client_ttl_seconds | - |
| | kubelet_certificate_manager_client_expiration_renew_errors | - |
| | scrape_duration_seconds | - |
| | go_goroutines | The value is obtained by calling the runtime.NumGoroutine() function based on the sched scheduler structure and the global allglen variable. All fields in the sched structure may concurrently change. Therefore, the system checks whether the value is less than 1. If the value is less than 1, 1 is returned. |
| | scrape_samples_post_metric_relabeling | - |
| | scrape_samples_scraped | - |
| | scrape_series_added | - |
| | up | - |
| | apiserver_client_certificate_expiration_seconds_count | - |
| | workqueue_adds_total | - |
| | workqueue_depth | - |

- The following table describes the job that collects data from cAdvisor and the basic metrics that are collected.

| Job name | Metric name |
| --- | --- |
| _arms/kubelet/cadvisor | container_memory_failures_total (obsolete by default) |
| | container_memory_rss |
| | container_spec_memory_limit_bytes |
| | container_memory_failcnt |
| | container_memory_cache |
| | container_memory_swap |
| | container_memory_usage_bytes |
| | container_memory_max_usage_bytes |
| | container_cpu_load_average_10s |
| | container_fs_reads_total (obsolete by default) |
| | container_fs_writes_total (obsolete by default) |
| | container_network_transmit_errors_total |
| | container_network_receive_bytes_total |
| | container_network_transmit_packets_total |
| | container_network_receive_errors_total |
| | container_memory_working_set_bytes |
| | container_cpu_usage_seconds_total |
| | container_fs_reads_bytes_total |
| | container_fs_writes_bytes_total |
| | container_spec_cpu_quota |
| | container_cpu_cfs_periods_total |
| | container_cpu_cfs_throttled_periods_total |
| | container_cpu_cfs_throttled_seconds_total |
| | container_fs_inodes_free |
| | container_fs_io_time_seconds_total |
| | container_fs_io_time_weighted_seconds_total |
| | container_fs_limit_bytes |
| | container_tasks_state (obsolete by default) |
| | container_fs_read_seconds_total (obsolete by default) |
| | container_fs_write_seconds_total (obsolete by default) |
| | container_fs_usage_bytes |
| | container_fs_inodes_total |
| | container_fs_io_current |
| | scrape_duration_seconds |
| | scrape_samples_scraped |
| | machine_cpu_cores |
| | machine_memory_bytes |
| | scrape_samples_post_metric_relabeling |
| | scrape_series_added |
| | up |
| _arms-prom/kube-apiserver/cadvisor | scrape_duration_seconds |
| | up |
| | scrape_samples_scraped |
| | scrape_samples_post_metric_relabeling |
| | scrape_series_added |

- The following table describes the job that collects data from the Container Service for Kubernetes (ACK) scheduler and the basic metrics that are collected.

| Job name | Metric name |
| --- | --- |
| ack-scheduler | rest_client_request_duration_seconds_bucket |
| | scheduler_pod_scheduling_attempts_bucket |
| | rest_client_requests_total |
| | scheduler_pending_pods |
| | scheduler_scheduler_cache_size |
| | up |

- The following table describes the job that collects data from etcd and the basic metrics that are collected.

| Job name | Metric name |
| --- | --- |
| etcd | etcd_disk_backend_commit_duration_seconds_bucket |
| | up |
| | etcd_server_has_leader |
| | etcd_debugging_mvcc_keys_total |
| | etcd_debugging_mvcc_db_total_size_in_bytes |
| | etcd_server_leader_changes_seen_total |

- The following table describes the job that collects data from nodes and the basic metrics that are collected.

| Job name | Metric name | Description |
| --- | --- | --- |
| node-exporter | node_filesystem_size_bytes | - |
| | node_filesystem_readonly | - |
| | node_filesystem_free_bytes | - |
| | node_filesystem_avail_bytes | - |
| | node_cpu_seconds_total | - |
| | node_network_receive_bytes_total | - |
| | node_network_receive_errs_total | - |
| | node_network_transmit_bytes_total | - |
| | node_network_receive_packets_total | - |
| | node_network_transmit_drop_total | - |
| | node_network_transmit_errs_total | - |
| | node_network_up | - |
| | node_network_transmit_packets_total | - |
| | node_network_receive_drop_total | - |
| | go_gc_duration_seconds | The value is obtained by calling the debug.ReadGCStats() function. When the function is called, the PauseQuantiles field of the GCStats structure is set to a length of 5. The function then returns the minimum, 25th percentile, 50th percentile, 75th percentile, and maximum of the GC pause time. Then, the Prometheus Go client creates a summary metric based on the returned pause-time quantiles and the NumGC and PauseTotal variables. |
| | node_load5 | - |
| | node_filefd_allocated | - |
| | node_exporter_build_info | - |
| | node_disk_written_bytes_total | - |
| | node_disk_writes_completed_total | - |
| | node_disk_write_time_seconds_total | - |
| | node_nf_conntrack_entries | - |
| | node_nf_conntrack_entries_limit | - |
| | node_processes_max_processes | - |
| | node_processes_pids | - |
| | node_sockstat_TCP_alloc | - |
| | node_sockstat_TCP_inuse | - |
| | node_sockstat_TCP_tw | - |
| | node_timex_offset_seconds | - |
| | node_timex_sync_status | - |
| | node_uname_info | - |
| | node_vmstat_pgfault | - |
| | node_vmstat_pgmajfault | - |
| | node_vmstat_pgpgin | - |
| | node_vmstat_pgpgout | - |
| | node_disk_reads_completed_total | - |
| | node_disk_read_time_seconds_total | - |
| | process_cpu_seconds_total | The value is obtained based on the utime parameter (the number of ticks executed by the Go process in user mode) and the stime parameter (the number of ticks executed by the Go process in kernel mode, for example during system calls). Unit of the parameters: jiffy, which measures the tick time between two system timer interruptions. The value of the process_cpu_seconds_total metric is the sum of utime and stime divided by USER_HZ. In other words, the total number of program ticks divided by the tick rate (ticks per second, in Hz) is the total time, in seconds, that the operating system has been running the process. |
| | node_disk_read_bytes_total | - |
| | node_disk_io_time_weighted_seconds_total | - |
| | node_disk_io_time_seconds_total | - |
| | node_disk_io_now | - |
| | node_context_switches_total | - |
| | node_boot_time_seconds | - |
| | process_resident_memory_bytes | The RSS. The value indicates the actual memory used by processes, including the shared memory. Memory that is allocated but not used and memory that is swapped out are not included. |
| | node_intr_total | - |
| | node_load1 | - |
| | go_goroutines | The value is obtained by calling the runtime.NumGoroutine() function based on the sched scheduler structure and the global allglen variable. All fields in the sched structure may concurrently change. Therefore, the system checks whether the value is less than 1. If the value is less than 1, 1 is returned. |
| | scrape_duration_seconds | - |
| | node_load15 | - |
| | scrape_samples_post_metric_relabeling | - |
| | node_netstat_Tcp_PassiveOpens | - |
| | scrape_samples_scraped | - |
| | node_netstat_Tcp_CurrEstab | - |
| | scrape_series_added | - |
| | node_netstat_Tcp_ActiveOpens | - |
| | node_memory_MemTotal_bytes | - |
| | node_memory_MemFree_bytes | - |
| | node_memory_MemAvailable_bytes | - |
| | node_memory_Cached_bytes | - |
| | up | - |
| | node_memory_Buffers_bytes | - |

- The following table describes the job that collects data from GPUs and the basic metrics that are collected.
Job name | Metric name | Description |
---|---|---|
gpu-exporter | go_gc_duration_seconds | The value is obtained by calling the debug.ReadGCStats() function. When the function is called, the PauseQuantiles field of the GCStats structure is set to a length of 5. The function then returns the minimum, the 25th, 50th, and 75th percentiles, and the maximum of the GC pause time. Then, the Prometheus Go client creates a summary metric based on the returned percentiles of the GC pause time and the NumGC and PauseTotal variables. |
gpu-exporter | promhttp_metric_handler_requests_total | - |
gpu-exporter | scrape_series_added | - |
gpu-exporter | up | - |
gpu-exporter | scrape_duration_seconds | - |
gpu-exporter | scrape_samples_scraped | - |
gpu-exporter | scrape_samples_post_metric_relabeling | - |
gpu-exporter | go_memstats_mcache_inuse_bytes | The amount of memory used by the mcache structure. |
gpu-exporter | process_virtual_memory_max_bytes | - |
gpu-exporter | process_virtual_memory_bytes | The virtual set size (VSS). The value indicates all allocated memory, including the memory that is allocated but not used, and the memory that is shared and swapped out. |
gpu-exporter | process_start_time_seconds | The value is obtained based on the start_time parameter, which specifies the time when a process starts. Unit: jiffy. The data comes from the /proc/{PID}/stat file. You can divide the value of the start_time parameter by USER_HZ to calculate the value, which is measured in seconds. |
gpu-exporter | go_memstats_next_gc_bytes | The target heap size of the next GC cycle. The garbage collector aims to keep this value no less than the value of the go_memstats_heap_alloc_bytes metric. |
gpu-exporter | go_memstats_heap_objects | The number of objects allocated on the heap. The value changes as GC runs and new objects are allocated. |
gpu-exporter | process_resident_memory_bytes | The resident set size (RSS). The value indicates the actual memory used by processes, including the shared memory. The memory that is allocated but not used and the memory that is swapped out are not included. |
gpu-exporter | process_open_fds | The value is obtained by calculating the total number of files in the /proc/{PID}/fd directory. It shows the total number of regular files, sockets, and pseudo-terminals opened by Go processes. |
gpu-exporter | process_max_fds | The value is obtained by reading the value of the Max Open Files row in the /proc/{PID}/limits file. The value is a soft limit, which is the value that the kernel uses to limit the resources. The hard limit is the maximum value of the soft limit. |
gpu-exporter | go_memstats_other_sys_bytes | The size of the memory used for other runtime allocations. |
gpu-exporter | go_gc_duration_seconds_count | - |
gpu-exporter | go_memstats_heap_alloc_bytes | The number of memory bytes allocated for heap objects, including not only all reachable heap objects, but also the unreachable objects that are not removed during GC. |
gpu-exporter | process_cpu_seconds_total | The value is obtained based on the utime parameter (the number of ticks executed by the Go process in user mode) and the stime parameter (the number of ticks executed by the Go process in kernel mode, for example, during system calls). Unit of the parameters: jiffy, which is the interval between two system timer interrupts. The value of the process_cpu_seconds_total metric is the sum of utime and stime divided by USER_HZ. In other words, the total number of ticks divided by the tick rate (ticks per second, in Hz) is the total time in seconds that the operating system has been running the process. |
gpu-exporter | nvidia_gpu_temperature_celsius (obsolete by default) | - |
gpu-exporter | go_memstats_stack_inuse_bytes | The size of the used memory on a stack memory span on which at least one stack object is allocated. |
gpu-exporter | nvidia_gpu_power_usage_milliwatts (obsolete by default) | - |
gpu-exporter | nvidia_gpu_num_devices (obsolete by default) | - |
gpu-exporter | nvidia_gpu_memory_used_bytes (obsolete by default) | - |
gpu-exporter | nvidia_gpu_memory_total_bytes (obsolete by default) | - |
gpu-exporter | go_memstats_stack_sys_bytes | The number of stack memory bytes obtained from the operating system. The value is the value of the go_memstats_stack_inuse_bytes metric plus the size of the OS thread stack. |
gpu-exporter | nvidia_gpu_memory_allocated_bytes (obsolete by default) | - |
gpu-exporter | nvidia_gpu_duty_cycle (obsolete by default) | - |
gpu-exporter | nvidia_gpu_allocated_num_devices (obsolete by default) | - |
gpu-exporter | promhttp_metric_handler_requests_in_flight | - |
gpu-exporter | go_memstats_sys_bytes | The number of memory bytes that Go has obtained from the system. |
gpu-exporter | go_memstats_gc_sys_bytes | The amount of memory occupied by GC metadata. |
gpu-exporter | go_memstats_gc_cpu_fraction | The fraction of the available CPU time consumed by GC since the program was started. |
gpu-exporter | go_memstats_heap_released_bytes | The number of bytes in free spans that have been returned to the operating system. |
gpu-exporter | go_memstats_frees_total | A counter value that shows the number of removed heap objects. You can call the rate() function to calculate the removal rate of heap objects. You can use the go_memstats_mallocs_total - go_memstats_frees_total formula to calculate the number of surviving heap objects. |
gpu-exporter | go_threads | The value is obtained by calling the runtime.ThreadCreateProfile() function based on the global allm variable. |
gpu-exporter | go_memstats_mspan_sys_bytes | The amount of memory allocated from the operating system for the mspan structure. |
gpu-exporter | go_memstats_buck_hash_sys_bytes | The amount of memory occupied by the hash tables used for profiling. |
gpu-exporter | go_memstats_alloc_bytes_total | The value of the metric increases as objects are allocated in the heap, but does not decrease when objects are removed. As with other Prometheus counters, you can call the rate() function to query the memory consumption rate. |
gpu-exporter | go_memstats_heap_sys_bytes | The number of memory bytes allocated for the heap from the operating system, including the virtual address space that is reserved but not used. |
gpu-exporter | go_memstats_mspan_inuse_bytes | The amount of memory used by the mspan structure. |
gpu-exporter | go_memstats_alloc_bytes | The number of memory bytes allocated for heap objects. The value is the same as the value of the go_memstats_heap_alloc_bytes metric. The heap objects include not only all reachable heap objects, but also the unreachable objects that are not removed during GC. |
gpu-exporter | go_info | The information about the Go version. The value is obtained by calling the runtime.Version() function. |
gpu-exporter | go_memstats_last_gc_time_seconds | The timestamp when the last GC was completed. |
gpu-exporter | go_memstats_heap_inuse_bytes | The number of bytes occupied by the spans in use. |
gpu-exporter | go_memstats_mcache_sys_bytes | The amount of memory allocated from the operating system for the mcache structure. |
gpu-exporter | go_memstats_lookups_total | A counter value that shows the number of dereferenced pointers. You can call the rate() function to calculate the dereferencing rate of pointers. |
gpu-exporter | go_memstats_mallocs_total | A counter value that shows the number of allocated heap objects. You can call the rate() function to calculate the allocation rate of heap objects. |
gpu-exporter | go_gc_duration_seconds_sum | - |
gpu-exporter | go_goroutines | The value is obtained by calling the runtime.NumGoroutine() function based on the sched scheduler structure and the global allglen variable. All fields in the sched structure may change concurrently. Therefore, the system checks whether the value is less than 1. If the value is less than 1, 1 is returned. |
gpu-exporter | go_memstats_heap_idle_bytes | The number of memory bytes occupied by idle spans. |
- The following table describes the job that collects data from persistent volumes (PVs) and the basic metrics that are collected.
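Job name | Metric name |
---|---|
k8s-csi-cluster-pv | cluster_pvc_detail_num_total |
k8s-csi-cluster-pv | cluster_pv_detail_num_total |
k8s-csi-cluster-pv | cluster_pv_status_num_total |
k8s-csi-cluster-pv | cluster_scrape_collector_success |
k8s-csi-cluster-pv | cluster_scrape_collector_duration_seconds |
k8s-csi-cluster-pv | alibaba_cloud_storage_operator_build_info |
k8s-csi-cluster-pv | cluster_pvc_status_num_total |
k8s-csi-cluster-pv | scrape_duration_seconds |
k8s-csi-cluster-pv | scrape_samples_post_metric_relabeling |
k8s-csi-cluster-pv | scrape_samples_scraped |
k8s-csi-cluster-pv | scrape_series_added |
k8s-csi-cluster-pv | up |
k8s-csi-node-pv | cluster_scrape_collector_duration_seconds |
k8s-csi-node-pv | cluster_scrape_collector_success |
k8s-csi-node-pv | alibaba_cloud_csi_driver_build_info |
k8s-csi-node-pv | up |
k8s-csi-node-pv | scrape_series_added |
k8s-csi-node-pv | scrape_samples_post_metric_relabeling |
k8s-csi-node-pv | scrape_samples_scraped |
k8s-csi-node-pv | scrape_duration_seconds |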
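
Several descriptions in the Kubernetes tables above reduce to simple arithmetic: tick counts divided by USER_HZ for process_cpu_seconds_total and process_start_time_seconds, a per-second rate over a counter, and the go_memstats_mallocs_total - go_memstats_frees_total difference. The following Python sketch illustrates that arithmetic only. All function names and the sample stat line are hypothetical and are not part of Prometheus Service or any exporter. The `simple_rate` helper is a deliberate simplification: unlike the PromQL rate() function, it does not handle counter resets or extrapolation.

```python
# Hypothetical sample of a /proc/<pid>/stat line (values invented for illustration).
SAMPLE_STAT_LINE = (
    "1234 (my proc) S 1 1234 1234 0 -1 4194304 100 0 0 0 "
    "250 150 0 0 20 0 1 0 5000 1000000 200"
)


def parse_proc_stat(stat_line: str) -> dict:
    """Extract utime, stime, and starttime (all in ticks) from a stat line."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so cut at the last ")" before splitting the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field n sits at index n - 3.
    return {
        "utime": int(rest[11]),      # field 14: ticks spent in user mode
        "stime": int(rest[12]),      # field 15: ticks spent in kernel mode
        "starttime": int(rest[19]),  # field 22: ticks after boot at process start
    }


def ticks_to_seconds(ticks: int, user_hz: int) -> float:
    """Convert a tick count to seconds by dividing by the tick rate (USER_HZ)."""
    return ticks / user_hz


def cpu_seconds_total(stat: dict, user_hz: int) -> float:
    """(utime + stime) / USER_HZ, as described for process_cpu_seconds_total."""
    return ticks_to_seconds(stat["utime"] + stat["stime"], user_hz)


def simple_rate(first: tuple, last: tuple) -> float:
    """Per-second increase between two (timestamp_seconds, counter_value) samples.

    Assumes the counter did not reset between the two samples.
    """
    (t0, v0), (t1, v1) = first, last
    return (v1 - v0) / (t1 - t0)


def surviving_heap_objects(mallocs_total: int, frees_total: int) -> int:
    """go_memstats_mallocs_total - go_memstats_frees_total."""
    return mallocs_total - frees_total


if __name__ == "__main__":
    stat = parse_proc_stat(SAMPLE_STAT_LINE)
    # USER_HZ is assumed to be 100 here; it varies by system.
    print("CPU seconds:", cpu_seconds_total(stat, user_hz=100))
    print("Start, seconds after boot:", ticks_to_seconds(stat["starttime"], 100))
    print("Allocations per second:", simple_rate((0.0, 1000), (60.0, 1600)))
    print("Surviving heap objects:", surviving_heap_objects(1600, 1450))
```

On a Linux host, the same functions could be fed the content of `/proc/self/stat` and the tick rate reported by `os.sysconf("SC_CLK_TCK")` instead of the sample values.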
The following table describes the basic metrics that are supported by a Prometheus instance that is used to monitor an Alibaba Cloud service.
Category | Metric name | Description |
---|---|---|
ECS | cpu_util_lization | The CPU utilization of an Elastic Compute Service (ECS) instance. |
internet_in_rate | The average rate of inbound traffic from the Internet to an ECS instance. | |
internet_out_rate | The average rate of outbound traffic from an ECS instance to the Internet. | |
disk_read_bps | The bit rate of reads to all disks of an ECS instance. | |
disk_write_bps | The bit rate of writes to all disks of an ECS instance. | |
vpc_public_ip_internet_in_Rate | The average rate of inbound traffic from the Internet to the IP address of an ECS instance. | |
vpc_public_ip_internet_out_Rate | The average rate of outbound traffic from the IP address of an ECS instance to the Internet. | |
cpu_total | (Agent) cpu.total | |
memory_totalspace | (Agent) memory.total.space | |
memory_usedutilization | (Agent) memory.used.utilization | |
diskusage_utilization | (Agent) disk.usage.utilization_device | |
RDS | cpu_usage_average | The CPU utilization. |
disk_usage | The disk usage. | |
iops_usage | The IOPS usage. | |
connection_usage | The connection utilization. | |
data_delay | The latency of read-only instances. | |
memory_usage | The memory usage. | |
mysql_network_in_new | The inbound bandwidth of an ApsaraDB RDS for MySQL instance. | |
mysql_network_out_new | The outbound bandwidth of an ApsaraDB RDS for MySQL instance. | |
mysql_active_sessions | The number of active sessions of an ApsaraDB RDS for MySQL instance. | |
sqlserver_network_in_new | The inbound bandwidth of an ApsaraDB RDS for SQL Server instance. | |
sqlserver_network_out_new | The outbound bandwidth of an ApsaraDB RDS for SQL Server instance. | |
NAT | snat_connection | The number of SNAT connections. |
snat_connection_drop_limit | The cumulative number of SNAT connections dropped due to the limit on the number of concurrent connections. | |
snat_connection_drop_rate_limit | The cumulative number of SNAT connections dropped due to the limit on the number of new connections. | |
net_rx_rate | The inbound bandwidth. | |
net_tx_rate | The outbound bandwidth. | |
net_rx_pkgs | The rate of inbound packets. | |
net_tx_pkgs | The rate of outbound packets. | |
RocketMQ | consumer_lag_gid | The number of accumulated messages. |
receive_message_count_gid | The number of messages received per minute by a consumer group. | |
send_message_count_gid | The number of messages sent per minute by a producer group. | |
consumer_lag_topic | The number of accumulated messages of a topic or group. | |
receive_message_count_topic | The number of messages of a topic received per minute by a consumer group. | |
send_message_count_topic | The number of messages of a topic sent per minute by a producer group. | |
receive_message_count | The number of messages received per minute. | |
send_message_count | The number of messages sent per minute. | |
SLB | healthy_server_count | The number of healthy backend ECS instances. |
unhealthy_server_count | The number of unhealthy backend ECS instances. | |
packet_tx | The number of outbound packets per second. | |
packet_rx | The number of inbound packets per second. | |
traffic_rx_new | The inbound bandwidth. | |
traffic_tx_new | The outbound bandwidth. | |
active_connection | The number of active connections over TCP. | |
inactive_connection | The number of inactive connections on a port. | |
new_connection | The number of new connections over TCP. | |
max_connection | The number of concurrent connections on a port. | |
instance_active_connection | The number of active connections established to an instance. | |
instance_new_connection | The number of new connections established to an instance per second. | |
instance_max_connection | The maximum number of concurrent connections established to an instance per second. | |
instance_drop_connection | The number of connections that are dropped per second on an instance. | |
instance_traffic_rx | The inbound traffic per second of an instance. Unit: bit. | |
instance_traffic_tx | The outbound traffic per second of an instance. Unit: bit. | |
E-MapReduce (EMR) | active_applications | The number of active jobs. |
active_users | The number of active users. | |
aggregate_containers_allocated | The total number of allocated containers. | |
aggregate_containers_released | The total number of released containers. | |
allocated_containers | The number of allocated containers. | |
apps_completed | The number of completed jobs. | |
apps_failed | The number of failed jobs. | |
apps_killed | The number of terminated jobs. | |
apps_pending | The number of pending jobs. | |
apps_running | The number of running jobs. | |
apps_submitted | The number of submitted jobs. | |
available_mb | The size of the memory available to the current queue. | |
available_vcores | The number of vCores available to the current queue. | |
pending_containers | The number of pending containers. | |
reserved_containers | The number of reserved containers. | |
EIP | net_rx_rate | The inbound bandwidth. |
net_tx_rate | The outbound bandwidth. | |
net_rx_pkgs_rate | The rate of inbound packets. | |
net_tx_pkgs_rate | The rate of outbound packets. | |
out_ratelimit_drop_speed | The rate at which packets are dropped due to throttling. | |
OSS | availability | The availability. |
request_valid_rate | The ratio of valid requests. | |
success_rate | The ratio of successful requests. | |
network_error_rate | The ratio of failed requests due to network issues. | |
total_request_count | The total number of requests. | |
valid_count | The number of valid requests. | |
internet_send | The outbound traffic over the Internet. | |
internet_recv | The inbound traffic over the Internet. | |
intranet_send | The outbound traffic over the internal network. | |
intranet_recv | The inbound traffic over the internal network. | |
success_count | The total number of successful requests. | |
network_error_count | The total number of failed requests due to network issues. | |
client_timeout_count | The total number of failed requests due to client timeouts. | |
Elasticsearch | node_cpu_utilization | The CPU utilization of a node. |
node_heap_memory_utilization | The heap memory utilization of a node. | |
node_stats_exception_log_count | The number of exceptions. | |
node_stats_full_gc_collection_count | The number of full heap garbage collections (full GCs). | |
node_disk_utilization | The disk usage of a node. | |
node_load_1m | The average load of a node over the last 1 minute. | |
cluster_query_qps | The queries per second (QPS) of a cluster. | |
cluster_index_qps | The index QPS of a cluster. | |
Logstash | cpu_percent | The CPU utilization of a node. |
node_heap_memory | The memory usage of a node. | |
node_disk_usage | The disk usage of a node. | |
DRDS | cpu_utilization | The CPU utilization. |
connection_count | The number of connections. | |
logic_qps | The logical QPS. | |
logic_rt | The logical response time (RT). | |
memory_utilization | The memory usage. | |
network_input_traffic | The inbound bandwidth. | |
network_output_traffic | The outbound bandwidth. | |
physics_qps | The physical QPS. | |
physics_rt | The physical RT. | |
thread_count | The number of active threads. | |
com_insert_select | The number of INSERT ... SELECT statements that are executed per second on a private ApsaraDB RDS for MySQL instance. | |
com_replace | The number of REPLACE statements that are executed per second on a private ApsaraDB RDS for MySQL instance. | |
com_replace_select | The number of REPLACE ... SELECT statements that are executed per second on a private ApsaraDB RDS for MySQL instance. | |
com_select | The number of SELECT statements that are executed per second on a private ApsaraDB RDS for MySQL instance. | |
com_update | The number of UPDATE statements that are executed per second on a private ApsaraDB RDS for MySQL instance. | |
conn_usage | The connection usage of a private ApsaraDB RDS for MySQL instance. | |
cpu_usage | The CPU utilization of a private ApsaraDB RDS for MySQL instance. | |
disk_usage | The disk usage of a private ApsaraDB RDS for MySQL instance. | |
ibuf_dirty_ratio | The dirty page ratio of the buffer pool of a private ApsaraDB RDS for MySQL instance. | |
ibuf_pool_reads | The number of physical reads per second on a private ApsaraDB RDS for MySQL instance. | |
ibuf_read_hit | The read hit ratio of the buffer pool of a private ApsaraDB RDS for MySQL instance. | |
ibuf_request_r | The number of logical reads per second on a private ApsaraDB RDS for MySQL instance. | |
ibuf_request_w | The number of logical writes per second on a private ApsaraDB RDS for MySQL instance. | |
ibuf_use_ratio | The utilization of the buffer pool of a private ApsaraDB RDS for MySQL instance. | |
inno_data_read | The amount of data read per second on a private ApsaraDB RDS for MySQL instance that uses InnoDB. | |
inno_data_written | The amount of data written per second to a private ApsaraDB RDS for MySQL instance that uses InnoDB. | |
inno_row_delete | The number of rows deleted per second from a private ApsaraDB RDS for MySQL instance that uses InnoDB. | |
inno_row_insert | The number of rows inserted per second to a private ApsaraDB RDS for MySQL instance that uses InnoDB. | |
inno_row_readed | The number of rows read per second on a private ApsaraDB RDS for MySQL instance that uses InnoDB. | |
inno_row_update | The number of rows updated per second on a private ApsaraDB RDS for MySQL instance that uses InnoDB. | |
innodb_log_write_requests | The number of write requests per second to the logs of a private ApsaraDB RDS for MySQL instance that uses InnoDB. | |
innodb_log_writes | The number of logical writes per second to the logs of a private ApsaraDB RDS for MySQL instance that uses InnoDB. | |
innodb_os_log_fsyncs | The number of times fsync is called per second to write data to the logs of a private ApsaraDB RDS for MySQL instance that uses InnoDB. | |
input_traffic_ps | The inbound bandwidth of a private ApsaraDB RDS for MySQL instance. | |
iops_usage | The IOPS usage of a private ApsaraDB RDS for MySQL instance. | |
mem_usage | The memory usage of a private ApsaraDB RDS for MySQL instance. | |
output_traffic_ps | The outbound bandwidth of a private ApsaraDB RDS for MySQL instance. | |
qps | The QPS of a private ApsaraDB RDS for MySQL instance. | |
slave_lag | The latency of a private read-only ApsaraDB RDS for MySQL instance. | |
slow_queries | The slow queries per second of a private ApsaraDB RDS for MySQL instance. | |
tb_tmp_disk | The number of temporary tables created per second on a private ApsaraDB RDS for MySQL instance. | |
Kafka | instance_disk_capacity | The disk usage of an instance. |
instance_message_input | The number of messages produced on an instance. | |
instance_message_output | The number of messages consumed on an instance. | |
topic_message_input | The number of messages produced in a topic. | |
topic_message_output | The number of messages consumed in a topic. | |
MongoDB | cpu_utilization | The CPU utilization. |
memory_utilization | The memory usage. | |
disk_utilization | The disk usage. | |
iops_utilization | The IOPS usage. | |
qps | The QPS. | |
connect_amount | The number of used connections. | |
instance_disk_amount | The disk space occupied by an instance. | |
data_disk_amount | The disk space occupied by data. | |
log_disk_amount | The disk space occupied by logs. | |
intranet_in | The inbound traffic over the internal network. | |
intranet_out | The outbound traffic over the internal network. | |
number_requests | The number of requests. | |
op_insert | The number of insert operations. | |
op_query | The number of query operations. | |
op_update | The number of update operations. | |
op_delete | The number of delete operations. | |
op_getmore | The number of getMore operations. | |
op_command | The number of operations performed by running commands. | |
PolarDB | active_connections | The number of active connections. |
blks_read_delta | The number of reads to a data block. | |
cluster_active_sessions | The number of active sessions. | |
cluster_connection_utilization | The connection utilization. | |
cluster_cpu_utilization | The CPU utilization. | |
cluster_data_io | The I/O throughput per second of a storage engine. | |
cluster_data_iops | The IOPS of a storage engine. | |
cluster_mem_hit_ratio | The cache hit ratio. | |
cluster_memory_utilization | The memory usage. | |
cluster_qps | The QPS. | |
cluster_slow_queries_ps | The number of slow queries per second. | |
cluster_tps | The number of transactions per second. | |
conn_usage | The connection usage. | |
cpu_total | The CPU utilization. | |
db_age | The maximum database age. | |
instance_connection_utilization | The connection usage of an instance. | |
instance_cpu_utilization | The CPU utilization of an instance. | |
instance_input_bandwidth | The inbound bandwidth of an instance. | |
instance_memory_utilization | The memory usage of an instance. | |
instance_output_bandwidth | The outbound bandwidth of an instance. | |
mem_usage | The memory usage. | |
pls_data_size | The disk data size of a PolarDB for PostgreSQL cluster. | |
pls_iops | The IOPS of a PolarDB for PostgreSQL cluster. | |
pls_iops_read | The read IOPS of a PolarDB for PostgreSQL cluster. | |
pls_iops_write | The write IOPS of a PolarDB for PostgreSQL cluster. | |
pls_pg_wal_dir_size | The size of write-ahead logging (WAL) files of a PolarDB for PostgreSQL cluster. | |
pls_throughput | The I/O throughput of a PolarDB for PostgreSQL cluster. | |
pls_throughput_read | The read I/O throughput of a PolarDB for PostgreSQL cluster. | |
pls_throughput_write | The write I/O throughput of a PolarDB for PostgreSQL cluster. | |
swell_time | The point in time at which data bloat occurs in a PolarDB for PostgreSQL cluster. | |
tps | The TPS of a PolarDB for PostgreSQL cluster. | |
cluster_iops | The IOPS. | |
Redis | intranet_in_ratio | The bandwidth utilization of writes. |
intranet_out_ratio | The bandwidth utilization of reads. | |
failed_count | The number of failed operations. | |
cpu_usage | The CPU utilization. | |
used_memory | The memory usage. | |
used_connection | The number of used connections. | |
used_qps | The QPS in use. |
The following table describes the basic metrics of Message Queue for Apache RocketMQ supported by Prometheus Service.
Category | Metric name | Description |
---|---|---|
Producer | rocketmq_producer_requests | The number of API calls that are made to send messages. |
rocketmq_producer_messages | The number of sent messages. | |
rocketmq_producer_message_size_bytes | The total size of sent messages. | |
rocketmq_producer_send_success_rate | The success rate of message sending. | |
rocketmq_producer_failure_api_calls | The number of failed API calls that are made to send messages. | |
rocketmq_producer_send_rt_milliseconds_avg | The average time required to send messages. | |
rocketmq_producer_send_rt_milliseconds_min | The minimum time required to send messages. | |
rocketmq_producer_send_rt_milliseconds_max | The maximum time required to send messages. | |
rocketmq_producer_send_rt_milliseconds_p95 | The 95th percentile of the time required to send messages. | |
rocketmq_producer_send_rt_milliseconds_p99 | The 99th percentile of the time required to send messages. | |
Consumer | rocketmq_consumer_requests | The number of API calls that are made to consume messages. |
rocketmq_consumer_send_back_requests | The number of API calls that are made to return messages after consumers fail to consume messages. | |
rocketmq_consumer_send_back_messages | The number of messages returned by consumers after the consumers fail to consume the messages. | |
rocketmq_consumer_messages | The number of consumed messages. | |
rocketmq_consumer_message_size_bytes | The total size of messages consumed within 1 minute. | |
rocketmq_consumer_ready_and_inflight_messages | The number of lagging messages, including ready messages and inflight messages. | |
rocketmq_consumer_ready_messages | The number of ready messages. | |
rocketmq_consumer_inflight_messages | The number of inflight messages. | |
rocketmq_consumer_queue_time_milliseconds | The queuing duration of messages. | |
rocketmq_consumer_message_await_time_milliseconds_avg | The average time required for consumer clients to allocate resources to process messages. | |
rocketmq_consumer_message_await_time_milliseconds_min | The minimum time required for consumer clients to allocate resources to process messages. | |
rocketmq_consumer_message_await_time_milliseconds_max | The maximum time required for consumer clients to allocate resources to process messages. | |
rocketmq_consumer_message_await_time_milliseconds_p95 | The 95th percentile of the time required for consumer clients to allocate resources to process messages. | |
rocketmq_consumer_message_await_time_milliseconds_p99 | The 99th percentile of the time required for consumer clients to allocate resources to process messages. | |
rocketmq_consumer_message_process_time_milliseconds_avg | The average time required for consumers to process messages. | |
rocketmq_consumer_message_process_time_milliseconds_min | The minimum time required for consumers to process messages. | |
rocketmq_consumer_message_process_time_milliseconds_max | The maximum time required for consumers to process messages. | |
rocketmq_consumer_message_process_time_milliseconds_p95 | The 95th percentile of the time required for consumers to process messages. | |
rocketmq_consumer_message_process_time_milliseconds_p99 | The 99th percentile of the time required for consumers to process messages. | |
rocketmq_consumer_consume_success_rate | The success rate of message consumption. | |
rocketmq_consumer_failure_api_calls | The number of failed API calls that are made to consume messages. | |
rocketmq_consumer_to_dlq_messages | The number of dead-letter messages. | |
The following table describes the basic metrics of Message Queue for RabbitMQ supported by Prometheus Service.
Category | Metric name | Description |
---|---|---|
Overview | rabbitmq_instance_api_total | The number of instance-level API calls that are initiated per second. |
rabbitmq_connections_opened_total | The total number of opened connections. | |
rabbitmq_connections_closed_total | The total number of closed connections. | |
rabbitmq_channels_opened_total | The total number of opened channels. | |
rabbitmq_channels_closed_total | The total number of closed channels. | |
rabbitmq_queues_declared_total | The total number of declared queues. | |
rabbitmq_queues_deleted_total | The total number of deleted queues. | |
rabbitmq_exchange_declared_total | - | |
rabbitmq_exchange_deleted_total | - | |
rabbitmq_exchange_bind_total | - | |
rabbitmq_exchange_unbind_total | - | |
rabbitmq_queue_bind_total | - | |
rabbitmq_queue_unbind_total | - | |
rabbitmq_connections | The number of connections that are currently open. | |
rabbitmq_channels | The number of channels that are currently open. | |
Connections | rabbitmq_connection_channels | The number of channels on connections. |
Exchange | rabbitmq_exchange_messages_published_in_total | The number of inbound messages. |
rabbitmq_exchange_messages_published_out_total | The number of outbound messages. | |
Queues | rabbitmq_queue_messages_published_total | The total number of messages published to queues. |
rabbitmq_queue_messages_ready | The number of messages that are ready to be delivered to consumers. | |
rabbitmq_queue_messages_unacked | The number of messages that are being scheduled. | |
rabbitmq_queue_deliver_total | The total number of messages that have been delivered to consumers but not yet consumed. | |
rabbitmq_queue_get_total | - | |
rabbitmq_queue_ack_total | - | |
rabbitmq_queue_uack_total | - | |
rabbitmq_queue_recover_total | - | |
rabbitmq_queue_reject_total | - | |
rabbitmq_queue_consumers | The number of consumers in queues. |