When you use Prometheus Service of Application Real-Time Monitoring Service (ARMS), you are charged based on the number of data entries that are reported for billable metrics. Metrics are classified into basic metrics and custom metrics. Basic metrics are free of charge. Custom metrics have been billed since January 6, 2020.
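
The billing rule above can be illustrated with a small classifier. This is only a sketch and not the ARMS billing implementation. It assumes that a metric counts as basic when its (job name, metric name) pair appears in the basic-metric tables. The `BASIC_METRICS` excerpt, the function name `count_billable_samples`, and the `my-app` samples are hypothetical.

```python
from collections import Counter

# Assumption: a tiny excerpt of (job name, metric name) pairs taken from
# the basic-metric tables. A real check would need the complete tables.
BASIC_METRICS = {
    ("apiserver", "apiserver_request_total"),
    ("node-exporter", "node_load1"),
    ("etcd", "etcd_server_has_leader"),
}


def count_billable_samples(samples):
    """Count reported data entries per metric class.

    `samples` is an iterable of (job, metric) tuples, one tuple per
    reported data entry. Pairs found in BASIC_METRICS are treated as
    free basic metrics. All other pairs are treated as custom metrics.
    """
    counts = Counter()
    for job, metric in samples:
        kind = "basic" if (job, metric) in BASIC_METRICS else "custom"
        counts[kind] += 1
    return counts


reported = [
    ("apiserver", "apiserver_request_total"),
    ("node-exporter", "node_load1"),
    ("my-app", "orders_processed_total"),  # hypothetical custom metric
    ("my-app", "orders_processed_total"),
]
counts = count_billable_samples(reported)
print(counts["basic"], counts["custom"])
```

In this sketch, only the entries counted as `custom` would contribute to the bill.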

The following tables describe the basic metrics of Kubernetes clusters that are supported by Prometheus Service.

  • The following table describes the job that collects data about the status of a Prometheus instance and the basic metrics that are collected.
    Job name Metric name
    _arms-prom/kubelet/1 promhttp_metric_handler_requests_in_flight
    go_memstats_mallocs_total
    go_memstats_lookups_total
    go_memstats_last_gc_time_seconds
    go_memstats_heap_sys_bytes
    go_memstats_heap_released_bytes
    go_memstats_heap_objects
    go_memstats_heap_inuse_bytes
    go_memstats_heap_idle_bytes
    go_memstats_heap_alloc_bytes
    go_memstats_gc_sys_bytes
    go_memstats_gc_cpu_fraction
    go_memstats_frees_total
    go_memstats_buck_hash_sys_bytes
    go_memstats_alloc_bytes_total
    go_memstats_alloc_bytes
    scrape_duration_seconds
    go_info
    go_goroutines
    scrape_samples_post_metric_relabeling
    go_gc_duration_seconds_sum
    go_gc_duration_seconds_count
    blackbox_exporter_config_last_reload_successful
    blackbox_exporter_config_last_reload_success_timestamp_seconds
    scrape_samples_scraped
    blackbox_exporter_build_info
    arms_prometheus_target_scrapes_sample_out_of_order_total
    arms_prometheus_target_scrapes_sample_out_of_bounds_total
    arms_prometheus_target_scrapes_sample_duplicate_timestamp_total
    scrape_series_added
    arms_prometheus_target_scrapes_exceeded_sample_limit_total
    arms_prometheus_target_scrapes_cache_flush_forced_total
    arms_prometheus_target_scrape_pools_total
    statsd_metric_mapper_cache_gets_total
    statsd_metric_mapper_cache_hits_total
    statsd_metric_mapper_cache_length
    arms_prometheus_target_scrape_pools_failed_total
    up
    arms_prometheus_target_scrape_pool_reloads_total
    arms_prometheus_target_scrape_pool_reloads_failed_total
  • The following table describes the job that collects data about an API server and the basic metrics that are collected.
    Job name Metric name
    apiserver apiserver_request_duration_seconds_bucket (discarded by default)
    apiserver_admission_controller_admission_duration_seconds_bucket
    apiserver_request_total
    rest_client_requests_total
    apiserver_admission_webhook_admission_duration_seconds_bucket
    apiserver_current_inflight_requests
    up
    apiserver_admission_webhook_admission_duration_seconds_count
    scrape_samples_post_metric_relabeling
    scrape_samples_scraped
    scrape_series_added
    scrape_duration_seconds
  • The following table describes the job that collects data from Ingresses and the basic metrics that are collected.
    Job name Metric name
    arms-ack-ingress nginx_ingress_controller_request_duration_seconds_bucket
    nginx_ingress_controller_response_duration_seconds_bucket
    nginx_ingress_controller_response_size_bucket
    nginx_ingress_controller_request_size_bucket
    nginx_ingress_controller_bytes_sent_bucket
    go_gc_duration_seconds
    nginx_ingress_controller_nginx_process_connections
    nginx_ingress_controller_request_duration_seconds_sum
    nginx_ingress_controller_request_duration_seconds_count
    nginx_ingress_controller_bytes_sent_sum
    nginx_ingress_controller_request_size_sum
    nginx_ingress_controller_response_duration_seconds_count
    nginx_ingress_controller_response_duration_seconds_sum
    nginx_ingress_controller_response_size_count
    nginx_ingress_controller_bytes_sent_count
    nginx_ingress_controller_response_size_sum
    nginx_ingress_controller_request_size_count
    promhttp_metric_handler_requests_total
    nginx_ingress_controller_nginx_process_connections_total
    go_memstats_mcache_sys_bytes
    go_memstats_lookups_total
    go_threads
    go_memstats_sys_bytes
    go_memstats_last_gc_time_seconds
    go_memstats_heap_sys_bytes
    go_memstats_heap_objects
    go_memstats_heap_inuse_bytes
    go_memstats_heap_idle_bytes
    go_memstats_heap_alloc_bytes
    go_memstats_gc_sys_bytes
    promhttp_metric_handler_requests_in_flight
    go_memstats_stack_sys_bytes
    go_memstats_stack_inuse_bytes
    go_memstats_gc_cpu_fraction
    go_memstats_frees_total
    go_memstats_buck_hash_sys_bytes
    go_memstats_alloc_bytes_total
    go_memstats_alloc_bytes
    nginx_ingress_controller_nginx_process_num_procs
    go_info
    go_memstats_mallocs_total
    go_memstats_other_sys_bytes
    go_goroutines
    scrape_samples_post_metric_relabeling
    scrape_samples_scraped
    process_virtual_memory_max_bytes
    process_virtual_memory_bytes
    scrape_duration_seconds
    go_memstats_heap_released_bytes
    go_gc_duration_seconds_sum
    go_memstats_next_gc_bytes
    go_gc_duration_seconds_count
    nginx_ingress_controller_config_hash
    nginx_ingress_controller_config_last_reload_successful
    nginx_ingress_controller_config_last_reload_successful_timestamp_seconds
    nginx_ingress_controller_ingress_upstream_latency_seconds_count
    nginx_ingress_controller_ingress_upstream_latency_seconds_sum
    process_start_time_seconds
    nginx_ingress_controller_nginx_process_cpu_seconds_total
    scrape_series_added
    nginx_ingress_controller_nginx_process_oldest_start_time_seconds
    nginx_ingress_controller_nginx_process_read_bytes_total
    nginx_ingress_controller_nginx_process_requests_total
    nginx_ingress_controller_nginx_process_resident_memory_bytes
    nginx_ingress_controller_nginx_process_virtual_memory_bytes
    nginx_ingress_controller_nginx_process_write_bytes_total
    nginx_ingress_controller_requests
    go_memstats_mcache_inuse_bytes
    nginx_ingress_controller_success
    process_resident_memory_bytes
    process_open_fds
    process_max_fds
    process_cpu_seconds_total
    go_memstats_mspan_sys_bytes
    up
    go_memstats_mspan_inuse_bytes
    nginx_ingress_controller_ssl_expire_time_seconds
    nginx_ingress_controller_leader_election_status
  • The following table describes the job that collects data from CoreDNS and the basic metrics that are collected.
    Job name Metric name
    arms-ack-coredns coredns_forward_request_duration_seconds_bucket
    coredns_dns_request_size_bytes_bucket
    coredns_dns_response_size_bytes_bucket
    coredns_kubernetes_dns_programming_duration_seconds_bucket
    coredns_dns_request_duration_seconds_bucket
    coredns_plugin_enabled
    coredns_health_request_duration_seconds_bucket
    go_gc_duration_seconds
    coredns_forward_responses_total
    coredns_forward_request_duration_seconds_sum
    coredns_forward_request_duration_seconds_count
    coredns_dns_requests_total
    coredns_forward_conn_cache_misses_total
    coredns_dns_responses_total
    coredns_cache_entries
    coredns_cache_hits_total
    coredns_forward_conn_cache_hits_total
    coredns_forward_requests_total
    coredns_dns_request_size_bytes_sum
    coredns_dns_response_size_bytes_count
    coredns_dns_response_size_bytes_sum
    coredns_dns_request_size_bytes_count
    scrape_duration_seconds
    scrape_samples_scraped
    scrape_series_added
    up
    scrape_samples_post_metric_relabeling
    go_memstats_lookups_total
    go_memstats_last_gc_time_seconds
    go_memstats_heap_sys_bytes
    coredns_build_info
    go_memstats_heap_released_bytes
    go_memstats_heap_objects
    go_memstats_heap_inuse_bytes
    go_memstats_heap_idle_bytes
    go_memstats_heap_alloc_bytes
    go_memstats_gc_sys_bytes
    go_memstats_sys_bytes
    go_memstats_stack_sys_bytes
    go_memstats_mallocs_total
    go_memstats_gc_cpu_fraction
    go_memstats_stack_inuse_bytes
    go_memstats_frees_total
    go_memstats_buck_hash_sys_bytes
    go_memstats_alloc_bytes_total
    go_memstats_alloc_bytes
    coredns_cache_misses_total
    go_memstats_other_sys_bytes
    go_memstats_mcache_inuse_bytes
    go_goroutines
    process_virtual_memory_max_bytes
    process_virtual_memory_bytes
    go_gc_duration_seconds_sum
    go_gc_duration_seconds_count
    go_memstats_next_gc_bytes
    coredns_dns_request_duration_seconds_count
    coredns_reload_failed_total
    coredns_panics_total
    coredns_local_localhost_requests_total
    coredns_kubernetes_dns_programming_duration_seconds_sum
    coredns_kubernetes_dns_programming_duration_seconds_count
    coredns_dns_request_duration_seconds_sum
    coredns_hosts_reload_timestamp_seconds
    coredns_health_request_failures_total
    process_start_time_seconds
    process_resident_memory_bytes
    process_open_fds
    process_max_fds
    process_cpu_seconds_total
    coredns_health_request_duration_seconds_sum
    coredns_health_request_duration_seconds_count
    go_memstats_mspan_sys_bytes
    coredns_forward_max_concurrent_rejects_total
    coredns_forward_healthcheck_broken_total
    go_memstats_mcache_sys_bytes
    go_memstats_mspan_inuse_bytes
    go_threads
    go_info
  • The following table describes the job that collects data from kube-state-metrics and the basic metrics that are collected.
    Job name Metric name
    _kube-state-metrics kube_pod_container_status_waiting_reason
    kube_pod_status_phase
    kube_pod_container_status_last_terminated_reason
    kube_pod_container_status_terminated_reason
    kube_pod_status_ready
    kube_node_status_condition
    kube_pod_container_status_running
    kube_pod_container_status_restarts_total
    kube_pod_container_info
    kube_pod_container_status_waiting
    kube_pod_container_status_terminated
    kube_pod_labels
    kube_pod_owner
    kube_pod_info
    kube_pod_container_resource_limits
    kube_persistentvolume_status_phase
    kube_pod_container_resource_requests_memory_bytes
    kube_pod_container_resource_requests_cpu_cores
    kube_pod_container_resource_limits_memory_bytes
    kube_node_status_capacity
    kube_service_info
    kube_pod_container_resource_limits_cpu_cores
    kube_deployment_status_replicas_updated
    kube_deployment_status_replicas_unavailable
    kube_deployment_spec_replicas
    kube_deployment_created
    kube_deployment_metadata_generation
    kube_deployment_status_replicas
    kube_deployment_labels
    kube_deployment_status_observed_generation
    kube_deployment_status_replicas_available
    kube_deployment_spec_strategy_rollingupdate_max_unavailable
    kube_daemonset_status_desired_number_scheduled
    kube_daemonset_updated_number_scheduled
    kube_daemonset_status_number_ready
    kube_daemonset_status_number_misscheduled
    kube_daemonset_status_number_available
    kube_daemonset_status_current_number_scheduled
    kube_daemonset_created
    kube_node_status_allocatable_cpu_cores
    kube_node_status_capacity_memory_bytes
    kube_node_spec_unschedulable
    kube_node_status_allocatable_memory_bytes
    kube_node_labels
    kube_node_info
    kube_namespace_labels
    kube_node_status_capacity_cpu_cores
    kube_node_status_capacity_pods
    kube_node_status_allocatable_pods
    kube_node_spec_taint
    kube_statefulset_status_replicas
    kube_statefulset_replicas
    kube_statefulset_created
    up
    scrape_samples_scraped
    scrape_duration_seconds
    scrape_samples_post_metric_relabeling
    scrape_series_added
  • The following table describes the job that collects data from kubelet and the basic metrics that are collected.
    Job name Metric name
    _arms/kubelet/metric rest_client_request_duration_seconds_bucket
    apiserver_client_certificate_expiration_seconds_bucket
    kubelet_pod_worker_duration_seconds_bucket
    kubelet_pleg_relist_duration_seconds_bucket
    workqueue_queue_duration_seconds_bucket
    rest_client_requests_total
    go_gc_duration_seconds
    process_cpu_seconds_total
    process_resident_memory_bytes
    kubernetes_build_info
    kubelet_node_name
    kubelet_certificate_manager_client_ttl_seconds
    kubelet_certificate_manager_client_expiration_renew_errors
    scrape_duration_seconds
    go_goroutines
    scrape_samples_post_metric_relabeling
    scrape_samples_scraped
    scrape_series_added
    up
    apiserver_client_certificate_expiration_seconds_count
    workqueue_adds_total
    workqueue_depth
  • The following table describes the job that collects data from cAdvisor and the basic metrics that are collected.
    Job name Metric name
    _arms/kubelet/cadvisor container_memory_failures_total (discarded by default)
    container_memory_rss
    container_spec_memory_limit_bytes
    container_memory_failcnt
    container_memory_cache
    container_memory_swap
    container_memory_usage_bytes
    container_memory_max_usage_bytes
    container_cpu_load_average_10s
    container_fs_reads_total (discarded by default)
    container_fs_writes_total (discarded by default)
    container_network_transmit_errors_total
    container_network_receive_bytes_total
    container_network_transmit_packets_total
    container_network_receive_errors_total
    container_memory_working_set_bytes
    container_cpu_usage_seconds_total
    container_fs_reads_bytes_total
    container_fs_writes_bytes_total
    container_spec_cpu_quota
    container_cpu_cfs_periods_total
    container_cpu_cfs_throttled_periods_total
    container_cpu_cfs_throttled_seconds_total
    container_fs_inodes_free
    container_fs_io_time_seconds_total
    container_fs_io_time_weighted_seconds_total
    container_fs_limit_bytes
    container_tasks_state (discarded by default)
    container_fs_read_seconds_total (discarded by default)
    container_fs_write_seconds_total (discarded by default)
    container_fs_usage_bytes
    container_fs_inodes_total
    container_fs_io_current
    scrape_duration_seconds
    scrape_samples_scraped
    machine_cpu_cores
    machine_memory_bytes
    scrape_samples_post_metric_relabeling
    scrape_series_added
    up
    _arms-prom/kube-apiserver/cadvisor scrape_duration_seconds
    up
    scrape_samples_scraped
    scrape_samples_post_metric_relabeling
    scrape_series_added
  • The following table describes the job that collects data from the Container Service for Kubernetes (ACK) scheduler and the basic metrics that are collected.
    Job name Metric name
    ack-scheduler rest_client_request_duration_seconds_bucket
    scheduler_pod_scheduling_attempts_bucket
    rest_client_requests_total
    scheduler_pending_pods
    scheduler_scheduler_cache_size
    up
  • The following table describes the job that collects data from etcd and the basic metrics that are collected.
    Job name Metric name
    etcd etcd_disk_backend_commit_duration_seconds_bucket
    up
    etcd_server_has_leader
    etcd_debugging_mvcc_keys_total
    etcd_debugging_mvcc_db_total_size_in_bytes
    etcd_server_leader_changes_seen_total
  • The following table describes the job that collects data from nodes and the basic metrics that are collected.
    Job name Metric name
    node-exporter node_filesystem_size_bytes
    node_filesystem_readonly
    node_filesystem_free_bytes
    node_filesystem_avail_bytes
    node_cpu_seconds_total
    node_network_receive_bytes_total
    node_network_receive_errs_total
    node_network_transmit_bytes_total
    node_network_receive_packets_total
    node_network_transmit_drop_total
    node_network_transmit_errs_total
    node_network_up
    node_network_transmit_packets_total
    node_network_receive_drop_total
    go_gc_duration_seconds
    node_load5
    node_filefd_allocated
    node_exporter_build_info
    node_disk_written_bytes_total
    node_disk_writes_completed_total
    node_disk_write_time_seconds_total
    node_nf_conntrack_entries
    node_nf_conntrack_entries_limit
    node_processes_max_processes
    node_processes_pids
    node_sockstat_TCP_alloc
    node_sockstat_TCP_inuse
    node_sockstat_TCP_tw
    node_timex_offset_seconds
    node_timex_sync_status
    node_uname_info
    node_vmstat_pgfault
    node_vmstat_pgmajfault
    node_vmstat_pgpgin
    node_vmstat_pgpgout
    node_disk_reads_completed_total
    node_disk_read_time_seconds_total
    process_cpu_seconds_total
    node_disk_read_bytes_total
    node_disk_io_time_weighted_seconds_total
    node_disk_io_time_seconds_total
    node_disk_io_now
    node_context_switches_total
    node_boot_time_seconds
    process_resident_memory_bytes
    node_intr_total
    node_load1
    go_goroutines
    scrape_duration_seconds
    node_load15
    scrape_samples_post_metric_relabeling
    node_netstat_Tcp_PassiveOpens
    scrape_samples_scraped
    node_netstat_Tcp_CurrEstab
    scrape_series_added
    node_netstat_Tcp_ActiveOpens
    node_memory_MemTotal_bytes
    node_memory_MemFree_bytes
    node_memory_MemAvailable_bytes
    node_memory_Cached_bytes
    up
    node_memory_Buffers_bytes
  • The following table describes the job that collects data from GPUs and the basic metrics that are collected.
    Job name Metric name
    gpu-exporter go_gc_duration_seconds
    promhttp_metric_handler_requests_total
    scrape_series_added
    up
    scrape_duration_seconds
    scrape_samples_scraped
    scrape_samples_post_metric_relabeling
    go_memstats_mcache_inuse_bytes
    process_virtual_memory_max_bytes
    process_virtual_memory_bytes
    process_start_time_seconds
    go_memstats_next_gc_bytes
    go_memstats_heap_objects
    process_resident_memory_bytes
    process_open_fds
    process_max_fds
    go_memstats_other_sys_bytes
    go_gc_duration_seconds_count
    go_memstats_heap_alloc_bytes
    process_cpu_seconds_total
    nvidia_gpu_temperature_celsius
    go_memstats_stack_inuse_bytes
    nvidia_gpu_power_usage_milliwatts
    nvidia_gpu_num_devices
    nvidia_gpu_memory_used_bytes
    nvidia_gpu_memory_total_bytes
    go_memstats_stack_sys_bytes
    nvidia_gpu_memory_allocated_bytes
    nvidia_gpu_duty_cycle
    nvidia_gpu_allocated_num_devices
    promhttp_metric_handler_requests_in_flight
    go_memstats_sys_bytes
    go_memstats_gc_sys_bytes
    go_memstats_gc_cpu_fraction
    go_memstats_heap_released_bytes
    go_memstats_frees_total
    go_threads
    go_memstats_mspan_sys_bytes
    go_memstats_buck_hash_sys_bytes
    go_memstats_alloc_bytes_total
    go_memstats_heap_sys_bytes
    go_memstats_mspan_inuse_bytes
    go_memstats_alloc_bytes
    go_info
    go_memstats_last_gc_time_seconds
    go_memstats_heap_inuse_bytes
    go_memstats_mcache_sys_bytes
    go_memstats_lookups_total
    go_memstats_mallocs_total
    go_gc_duration_seconds_sum
    go_goroutines
    go_memstats_heap_idle_bytes
  • The following table describes the job that collects data from persistent volumes (PVs) and the basic metrics that are collected.
    Job name Metric name
    k8s-csi-cluster-pv cluster_pvc_detail_num_total
    cluster_pv_detail_num_total
    cluster_pv_status_num_total
    cluster_scrape_collector_success
    cluster_scrape_collector_duration_seconds
    alibaba_cloud_storage_operator_build_info
    cluster_pvc_status_num_total
    scrape_duration_seconds
    scrape_samples_post_metric_relabeling
    scrape_samples_scraped
    scrape_series_added
    up
    k8s-csi-node-pv cluster_scrape_collector_duration_seconds
    cluster_scrape_collector_success
    alibaba_cloud_csi_driver_build_info
    up
    scrape_series_added
    scrape_samples_post_metric_relabeling
    scrape_samples_scraped
    scrape_duration_seconds
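
Many of the basic metrics listed above are cumulative counters whose names end in `_total`, so a useful value usually comes from the increase between two scrapes rather than from a single reading. The following sketch derives a CoreDNS cache hit ratio from `coredns_cache_hits_total` and `coredns_cache_misses_total`. The helper names and the sample values are hypothetical, and the counter-reset handling is a simplified assumption, not the exact logic used by Prometheus Service.

```python
def counter_delta(previous, current):
    """Increase of a cumulative counter between two scrapes.

    If the counter went backwards (for example after a process restart),
    the current value is taken as the increase since the reset.
    """
    return current if current < previous else current - previous


def cache_hit_ratio(prev, curr):
    """Cache hit ratio over the interval between two scrapes.

    `prev` and `curr` map metric names to counter values.
    Returns None when no lookups happened in the interval.
    """
    hits = counter_delta(prev["coredns_cache_hits_total"],
                         curr["coredns_cache_hits_total"])
    misses = counter_delta(prev["coredns_cache_misses_total"],
                           curr["coredns_cache_misses_total"])
    total = hits + misses
    return hits / total if total else None


# Hypothetical counter values from two consecutive scrapes.
prev = {"coredns_cache_hits_total": 900, "coredns_cache_misses_total": 100}
curr = {"coredns_cache_hits_total": 1200, "coredns_cache_misses_total": 200}
ratio = cache_hit_ratio(prev, curr)
print(ratio)
```

The same pattern applies to other counter pairs in the tables, as long as both counters are read at the same two points in time.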

The following table describes the basic metrics that are supported by a Prometheus instance that is used to monitor an Alibaba Cloud service.

Category Metric name Description
ECS cpu_utilization The CPU utilization of an Elastic Compute Service (ECS) instance.
internet_in_rate The average rate of inbound traffic from the Internet to an ECS instance.
internet_out_rate The average rate of outbound traffic from an ECS instance to the Internet.
disk_read_bps The bit rate of reads to all disks of an ECS instance.
disk_write_bps The bit rate of writes to all disks of an ECS instance.
vpc_public_ip_internet_in_Rate The average rate of inbound traffic from the Internet to the IP address of an ECS instance.
vpc_public_ip_internet_out_Rate The average rate of outbound traffic from the IP address of an ECS instance to the Internet.
cpu_total (Agent) cpu.total
memory_totalspace (Agent) memory.total.space
memory_usedutilization (Agent) memory.used.utilization
diskusage_utilization (Agent) disk.usage.utilization_device
RDS cpu_usage_average The CPU utilization.
disk_usage The disk usage.
iops_usage The IOPS usage.
connection_usage The connection usage.
data_delay The latency of read-only instances.
memory_usage The memory usage.
mysql_network_in_new The inbound bandwidth of an ApsaraDB RDS for MySQL instance.
mysql_network_out_new The outbound bandwidth of an ApsaraDB RDS for MySQL instance.
mysql_active_sessions The number of active sessions of an ApsaraDB RDS for MySQL instance.
sqlserver_network_in_new The inbound bandwidth of an ApsaraDB RDS for SQL Server instance.
sqlserver_network_out_new The outbound bandwidth of an ApsaraDB RDS for SQL Server instance.
NAT snat_connection The number of SNAT connections.
snat_connection_drop_limit The cumulative number of SNAT connections dropped due to the limit on the number of concurrent connections.
snat_connection_drop_rate_limit The cumulative number of SNAT connections dropped due to the limit on the number of new connections.
net_rx_rate The inbound bandwidth.
net_tx_rate The outbound bandwidth.
net_rx_pkgs The rate of inbound packets.
net_tx_pkgs The rate of outbound packets.
RocketMQ consumer_lag_gid The number of accumulated messages.
receive_message_count_gid The number of messages received per minute by a consumer group.
send_message_count_gid The number of messages sent per minute by a producer group.
consumer_lag_topic The number of accumulated messages of a topic or group.
receive_message_count_topic The number of messages of a topic received per minute by a consumer group.
send_message_count_topic The number of messages of a topic sent per minute by a producer group.
receive_message_count The number of messages received per minute.
send_message_count The number of messages sent per minute.
SLB healthy_server_count The number of healthy backend ECS instances.
unhealthy_server_count The number of unhealthy backend ECS instances.
packet_tx The number of outbound packets per second.
packet_rx The number of inbound packets per second.
traffic_rx_new The inbound bandwidth.
traffic_tx_new The outbound bandwidth.
active_connection The number of active connections over TCP.
inactive_connection The number of inactive connections on a port.
new_connection The number of new connections over TCP.
max_connection The number of concurrent connections on a port.
instance_active_connection The number of active connections established to an instance.
instance_new_connection The number of new connections established to an instance per second.
instance_max_connection The maximum number of concurrent connections established to an instance per second.
instance_drop_connection The number of connections that are dropped per second on an instance.
instance_traffic_rx The inbound traffic per second of an instance. Unit: bit.
instance_traffic_tx The outbound traffic per second of an instance. Unit: bit.
E-MapReduce (EMR) active_applications The number of active jobs.
active_users The number of active users.
aggregate_containers_allocated The total number of allocated containers.
aggregate_containers_released The total number of released containers.
allocated_containers The number of allocated containers.
apps_completed The number of completed jobs.
apps_failed The number of failed jobs.
apps_killed The number of terminated jobs.
apps_pending The number of pending jobs.
apps_running The number of running jobs.
apps_submitted The number of submitted jobs.
available_mb The size of the memory available to the current queue.
available_vcores The number of vCores available to the current queue.
pending_containers The number of pending containers.
reserved_containers The number of reserved containers.
EIP net_rx_rate The inbound bandwidth.
net_tx_rate The outbound bandwidth.
net_rx_pkgs_rate The rate of inbound packets.
net_tx_pkgs_rate The rate of outbound packets.
out_ratelimit_drop_speed The rate at which packets are dropped due to throttling.
OSS availability The availability.
request_valid_rate The ratio of valid requests.
success_rate The ratio of successful requests.
network_error_rate The ratio of failed requests due to network issues.
total_request_count The total number of requests.
valid_count The number of valid requests.
internet_send The outbound traffic over the Internet.
internet_recv The inbound traffic over the Internet.
intranet_send The outbound traffic over the internal network.
intranet_recv The inbound traffic over the internal network.
success_count The total number of successful requests.
network_error_count The total number of failed requests due to network issues.
client_timeout_count The total number of failed requests due to client timeouts.
Elasticsearch node_cpu_utilization The CPU utilization of a node.
node_heap_memory_utilization The heap memory utilization of a node.
node_stats_exception_log_count The number of exceptions.
node_stats_full_gc_collection_count The number of full heap garbage collections (full GCs).
node_disk_utilization The disk usage of a node.
node_load_1m The average load of a node over the last 1 minute.
cluster_query_qps The queries per second (QPS) of a cluster.
cluster_index_qps The index QPS of a cluster.
Logstash cpu_percent The CPU utilization of a node.
node_heap_memory The memory usage of a node.
node_disk_usage The disk usage of a node.
DRDS cpu_utilization The CPU utilization.
connection_count The number of connections.
logic_qps The logical QPS.
logic_rt The logical response time (RT).
memory_utilization The memory usage.
network_input_traffic The inbound bandwidth.
network_output_traffic The outbound bandwidth.
physics_qps The physical QPS.
physics_rt The physical RT.
thread_count The number of active threads.
com_insert_select The number of INSERT ... SELECT statements that are executed per second on a private ApsaraDB RDS for MySQL instance.
com_replace The number of REPLACE statements that are executed per second on a private ApsaraDB RDS for MySQL instance.
com_replace_select The number of REPLACE ... SELECT statements that are executed per second on a private ApsaraDB RDS for MySQL instance.
com_select The number of SELECT statements that are executed per second on a private ApsaraDB RDS for MySQL instance.
com_update The number of UPDATE statements that are executed per second on a private ApsaraDB RDS for MySQL instance.
conn_usage The connection usage of a private ApsaraDB RDS for MySQL instance.
cpu_usage The CPU utilization of a private ApsaraDB RDS for MySQL instance.
disk_usage The disk usage of a private ApsaraDB RDS for MySQL instance.
ibuf_dirty_ratio The dirty page ratio of the buffer pool of a private ApsaraDB RDS for MySQL instance.
ibuf_pool_reads The number of physical reads per second on a private ApsaraDB RDS for MySQL instance.
ibuf_read_hit The read hit ratio of the buffer pool of a private ApsaraDB RDS for MySQL instance.
ibuf_request_r The number of logical reads per second on a private ApsaraDB RDS for MySQL instance.
ibuf_request_w The number of logical writes per second on a private ApsaraDB RDS for MySQL instance.
ibuf_use_ratio The utilization of the buffer pool of a private ApsaraDB RDS for MySQL instance.
inno_data_read The amount of data read per second on a private ApsaraDB RDS for MySQL instance that uses InnoDB.
inno_data_written The amount of data written per second to a private ApsaraDB RDS for MySQL instance that uses InnoDB.
inno_row_delete The number of rows deleted per second from a private ApsaraDB RDS for MySQL instance that uses InnoDB.
inno_row_insert The number of rows inserted per second to a private ApsaraDB RDS for MySQL instance that uses InnoDB.
inno_row_readed The number of rows read per second on a private ApsaraDB RDS for MySQL instance that uses InnoDB.
inno_row_update The number of rows updated per second on a private ApsaraDB RDS for MySQL instance that uses InnoDB.
innodb_log_write_requests The number of write requests per second to the logs of a private ApsaraDB RDS for MySQL instance that uses InnoDB.
innodb_log_writes The number of logical writes per second to the logs of a private ApsaraDB RDS for MySQL instance that uses InnoDB.
innodb_os_log_fsyncs The number of times fsync is called per second to write data to the logs of a private ApsaraDB RDS for MySQL instance that uses InnoDB.
input_traffic_ps The inbound bandwidth of a private ApsaraDB RDS for MySQL instance.
iops_usage The IOPS usage of a private ApsaraDB RDS for MySQL instance.
mem_usage The memory usage of a private ApsaraDB RDS for MySQL instance.
output_traffic_ps The outbound bandwidth of a private ApsaraDB RDS for MySQL instance.
qps The QPS of a private ApsaraDB RDS for MySQL instance.
slave_lag The latency of a private read-only ApsaraDB RDS for MySQL instance.
slow_queries The slow queries per second of a private ApsaraDB RDS for MySQL instance.
tb_tmp_disk The number of temporary tables created per second on a private ApsaraDB RDS for MySQL instance.
Kafka instance_disk_capacity The disk usage of an instance.
instance_message_input The number of messages produced on an instance.
instance_message_output The number of messages consumed on an instance.
topic_message_input The number of messages produced in a topic.
topic_message_output The number of messages consumed in a topic.
MongoDB cpu_utilization The CPU utilization.
memory_utilization The memory usage.
disk_utilization The disk usage.
iops_utilization The IOPS usage.
qps The QPS.
connect_amount The number of used connections.
instance_disk_amount The disk space occupied by an instance.
data_disk_amount The disk space occupied by data.
log_disk_amount The disk space occupied by logs.
intranet_in The inbound traffic over the internal network.
intranet_out The outbound traffic over the internal network.
number_requests The number of requests.
op_insert The number of insert operations.
op_query The number of query operations.
op_update The number of update operations.
op_delete The number of delete operations.
op_getmore The number of getMore operations.
op_command The number of operations performed by running commands.
PolarDB active_connections The number of active connections.
blks_read_delta The number of reads to a data block.
cluster_active_sessions The number of active sessions.
cluster_connection_utilization The connection usage.
cluster_cpu_utilization The CPU utilization.
cluster_data_io The I/O throughput per second of a storage engine.
cluster_data_iops The IOPS of a storage engine.
cluster_mem_hit_ratio The cache hit ratio.
cluster_memory_utilization The memory usage.
cluster_qps The QPS.
cluster_slow_queries_ps The number of slow queries per second.
cluster_tps The number of transactions per second.
conn_usage The connection usage.
cpu_total The CPU utilization.
db_age The maximum database age.
instance_connection_utilization The connection usage of an instance.
instance_cpu_utilization The CPU utilization of an instance.
instance_input_bandwidth The inbound bandwidth of an instance.
instance_memory_utilization The memory usage of an instance.
instance_output_bandwidth The outbound bandwidth of an instance.
mem_usage The memory usage.
pls_data_size The disk data size of a PolarDB for PostgreSQL cluster.
pls_iops The IOPS of a PolarDB for PostgreSQL cluster.
pls_iops_read The read IOPS of a PolarDB for PostgreSQL cluster.
pls_iops_write The write IOPS of a PolarDB for PostgreSQL cluster.
pls_pg_wal_dir_size The size of write-ahead logging (WAL) files of a PolarDB for PostgreSQL cluster.
pls_throughput The I/O throughput of a PolarDB for PostgreSQL cluster.
pls_throughput_read The read I/O throughput of a PolarDB for PostgreSQL cluster.
pls_throughput_write The write I/O throughput of a PolarDB for PostgreSQL cluster.
swell_time The point in time at which data bloat occurs in a PolarDB for PostgreSQL cluster.
tps The number of transactions per second (TPS) of a PolarDB for PostgreSQL cluster.
cluster_iops The IOPS.
Redis intranet_in_ratio The bandwidth utilization of writes.
intranet_out_ratio The bandwidth utilization of reads.
failed_count The number of failed operations.
cpu_usage The CPU utilization.
used_memory The memory usage.
used_connection The number of used connections.
used_qps The number of queries per second (QPS).

The following table describes the basic metrics of Message Queue for Apache RocketMQ supported by Prometheus Service.

Category Metric name Description
Producer rocketmq_producer_requests The number of API calls that are made to send messages.
rocketmq_producer_messages The number of sent messages.
rocketmq_producer_message_size_bytes The total size of sent messages.
rocketmq_producer_send_success_rate The success rate of message sending.
rocketmq_producer_failure_api_calls The number of failed API calls that are made to send messages.
rocketmq_producer_send_rt_milliseconds_avg The average time required to send messages.
rocketmq_producer_send_rt_milliseconds_min The minimum time required to send messages.
rocketmq_producer_send_rt_milliseconds_max The maximum time required to send messages.
rocketmq_producer_send_rt_milliseconds_p95 The 95th percentile of the time required to send messages.
rocketmq_producer_send_rt_milliseconds_p99 The 99th percentile of the time required to send messages.
Consumer rocketmq_consumer_requests The number of API calls that are made to consume messages.
rocketmq_consumer_send_back_requests The number of API calls that are made to return messages after consumers fail to consume messages.
rocketmq_consumer_send_back_messages The number of messages returned from consumers after consumers fail to consume messages.
rocketmq_consumer_messages The number of consumed messages.
rocketmq_consumer_message_size_bytes The total size of messages consumed within 1 minute.
rocketmq_consumer_ready_and_inflight_messages The number of lagging messages, including ready messages and inflight messages.
rocketmq_consumer_ready_messages The number of ready messages.
rocketmq_consumer_inflight_messages The number of inflight messages.
rocketmq_consumer_queue_time_milliseconds The queuing duration of messages.
rocketmq_consumer_message_await_time_milliseconds_avg The average time required for consumer clients to allocate resources to process messages.
rocketmq_consumer_message_await_time_milliseconds_min The minimum time required for consumer clients to allocate resources to process messages.
rocketmq_consumer_message_await_time_milliseconds_max The maximum time required for consumer clients to allocate resources to process messages.
rocketmq_consumer_message_await_time_milliseconds_p95 The 95th percentile of the time required for consumer clients to allocate resources to process messages.
rocketmq_consumer_message_await_time_milliseconds_p99 The 99th percentile of the time required for consumer clients to allocate resources to process messages.
rocketmq_consumer_message_process_time_milliseconds_avg The average time required for consumers to process messages.
rocketmq_consumer_message_process_time_milliseconds_min The minimum time required for consumers to process messages.
rocketmq_consumer_message_process_time_milliseconds_max The maximum time required for consumers to process messages.
rocketmq_consumer_message_process_time_milliseconds_p95 The 95th percentile of the time required for consumers to process messages.
rocketmq_consumer_message_process_time_milliseconds_p99 The 99th percentile of the time required for consumers to process messages.
rocketmq_consumer_consume_success_rate The success rate of message consumption.
rocketmq_consumer_failure_api_calls The number of failed API calls that are made to consume messages.
rocketmq_consumer_to_dlq_messages The number of dead-letter messages.
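
The consumer table describes rocketmq_consumer_ready_and_inflight_messages as covering both ready and inflight messages. The following minimal Python sketch illustrates that relationship, assuming the lag is the plain sum of the two component metrics; the table does not state this explicitly. The function names are hypothetical and are not part of Prometheus Service or RocketMQ.

```python
# Metric names as listed in the consumer table above.
READY = "rocketmq_consumer_ready_messages"
INFLIGHT = "rocketmq_consumer_inflight_messages"
LAG = "rocketmq_consumer_ready_and_inflight_messages"


def lag_from_parts(ready: float, inflight: float) -> float:
    """Return the lag implied by the two component metrics.

    Assumption: lag = ready + inflight. This is inferred from the
    metric description, not confirmed by the table.
    """
    return ready + inflight


def lag_check_expr() -> str:
    """Build a PromQL expression for the gap between the reported lag and
    the sum of its components. A nonzero result for a series would mean
    the additive assumption does not hold for that series."""
    return f"{LAG} - ({READY} + {INFLIGHT})"


if __name__ == "__main__":
    print(lag_from_parts(120, 30))
    print(lag_check_expr())
```

The expression returned by lag_check_expr() can be pasted into any PromQL query editor to check the assumption against real data before building alerts on either form of the lag.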