Enable Prometheus monitoring for Agent Sandbox

ACS Agent Sandbox exposes Prometheus metrics for instance lifecycle, resource status, and runtimes via its two core components: Sandbox Controller and Sandbox Manager. You can collect these metrics with either Managed Service for Prometheus or a self-hosted Prometheus instance and visualize them on a Grafana dashboard.

Prerequisites

On your cluster's Add-ons page, ensure the following components meet the minimum version requirements:
- ack-agent-sandbox-controller: v0.5.14 or later.
- ack-sandbox-manager: v0.6.1 or later.

Alibaba Cloud Prometheus

Go to the ARMS Prometheus Integration Management page. In the upper-left corner, select the region of your cluster. On the Integrated Environments tab, locate your cluster and click its instance name to open the instance details page.
On the instance details page, click Add Integration next to Addon Type. In the panel that appears, find and click Agent Sandbox Monitoring, keep the default integration name, and then click OK.

Self-managed Prometheus

Configure scrape rules

Agent Sandbox monitoring involves scraping metrics from two components:

Sandbox Controller: Exposes metrics through the /metrics endpoint on the Kubernetes API server.
Sandbox Manager: Deployed in the sandbox-system namespace and exposes metrics on HTTP port 8080 at the /metrics path.

Open Source Prometheus

In prometheus.yml, add the following scrape jobs to the scrape_configs section.

Sandbox Controller

scrape_configs:
- job_name: agent-sandbox-controller
  scrape_interval: 30s
  scrape_timeout: 30s
  metrics_path: /metrics
  scheme: https
  honor_labels: true
  honor_timestamps: true
  params:
    hosting: ["true"]
    job: ["agent-sandbox-controller"]
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names: [default]
  authorization:
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    insecure_skip_verify: false
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    server_name: kubernetes
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_component]
    separator: ;
    regex: apiserver
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_service_label_provider]
    separator: ;
    regex: kubernetes
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: https
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_component]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: https
    action: replace

Sandbox Manager

scrape_configs:
- job_name: sandbox-manager
  scrape_interval: 30s
  scrape_timeout: 30s
  metrics_path: /metrics
  scheme: http
  honor_labels: true
  honor_timestamps: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - sandbox-system
  relabel_configs:
  - source_labels:
    - __meta_kubernetes_endpoint_port_name
    separator: ;
    regex: manager
    replacement: $1
    action: keep

Prometheus Operator

The community edition of Prometheus Operator uses the ServiceMonitor custom resource to define scrape rules.

Sandbox Controller

Save the following content as sandbox-controller-servicemonitor.yaml.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    release: ack-prometheus-operator # Change this as needed based on the labelSelector configuration of your Prometheus Operator.
  name: sandbox-controller
  namespace: monitoring
spec:
  endpoints:
    - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      bearerTokenSecret:
        key: ''
      honorLabels: true
      honorTimestamps: true
      interval: 30s
      params:
        hosting:
          - 'true'
        job:
          - agent-sandbox-controller
      path: /metrics
      port: https
      relabelings:
        - action: keep
          regex: https
          sourceLabels:
            - __meta_kubernetes_endpoint_port_name
        - action: replace
          sourceLabels:
            - __meta_kubernetes_namespace
          targetLabel: namespace
        - action: replace
          regex: Node;(.*)
          replacement: '${1}'
          separator: ;
          sourceLabels:
            - __meta_kubernetes_endpoint_address_target_kind
            - __meta_kubernetes_endpoint_address_target_name
          targetLabel: node
        - action: replace
          regex: Pod;(.*)
          replacement: '${1}'
          separator: ;
          sourceLabels:
            - __meta_kubernetes_endpoint_address_target_kind
            - __meta_kubernetes_endpoint_address_target_name
          targetLabel: pod
        - action: replace
          sourceLabels:
            - __meta_kubernetes_service_name
          targetLabel: service
        - action: replace
          regex: ^$
          sourceLabels:
            - __meta_kubernetes_service_label_component
          targetLabel: __tmp_job_fallback
        - action: replace
          regex: (.+);
          replacement: '${1}'
          separator: ;
          sourceLabels:
            - __meta_kubernetes_service_name
            - __meta_kubernetes_service_label_component
          targetLabel: job
        - action: replace
          replacement: https
          targetLabel: endpoint
      scheme: https
      scrapeTimeout: 30s
      tlsConfig:
        ca: {}
        caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        cert: {}
        serverName: kubernetes
  jobLabel: component
  namespaceSelector:
    matchNames:
      - default
  selector:
    matchLabels:
      component: apiserver
      provider: kubernetes

Create the ServiceMonitor resource.

kubectl apply -f sandbox-controller-servicemonitor.yaml

Sandbox Manager

Save the following ServiceMonitor content as a YAML file and run kubectl apply -f to create the resource.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    release: ack-prometheus-operator # This must match the serviceMonitorSelector of your Prometheus Operator for discovery and scraping to work. Change this value as needed based on your actual labelSelector.
    app.kubernetes.io/instance: ack-sandbox-manager
    app.kubernetes.io/name: ack-sandbox-manager
    component: sandbox-manager
  name: sandbox-manager
  namespace: sandbox-system
spec:
  endpoints:
  - interval: 30s
    path: /metrics
    port: manager
  namespaceSelector:
    matchNames:
    - sandbox-system
  selector:
    matchLabels:
      app.kubernetes.io/instance: ack-sandbox-manager
      app.kubernetes.io/name: ack-sandbox-manager
      component: sandbox-manager

View monitoring dashboards

Alibaba Cloud Prometheus

Log on to the Alibaba Cloud console. In the left navigation bar, select Operations > Prometheus Monitoring. On the Others tab, you can view the following Sandbox monitoring dashboards.

Sandbox Instance: View the status, lifecycle, and resource usage of specific Sandbox instances.
Sandbox Controller: View cloud-side lifecycle management for Sandbox resources, including resource statistics and management performance.
Sandbox Manager: View the execution status of Sandbox resource declarations, such as the execution performance of the E2B protocol.

Self-managed Prometheus

If you use self-managed Prometheus, you can import the following Grafana dashboard JSON templates and configure the data source.

Name	Version	Description	Download
Sandbox Instance	v1.0.0	Monitors the metadata, current status, and resource usage of Sandbox instances.	Sandbox Instance-v1.0.0.json
Sandbox Controller	v1.0.0	Monitors cloud-side lifecycle management for Sandbox resources, including resource statistics and management performance.	Sandbox Controller-v1.0.0.json
Sandbox Manager	v1.0.0	Monitors the execution status of Sandbox resource declarations, such as the execution performance of the E2B protocol.	Sandbox Manager-v1.0.0.json

Billing

Enabling Alibaba Cloud Prometheus Monitoring may incur additional charges. For details, see the billing overview.

Metrics

This section lists the Prometheus metrics exposed by the Sandbox Controller and Sandbox Manager components. You can use these metrics to configure alerting rules or create custom dashboards.

Sandbox controller metrics

The Sandbox Controller manages the lifecycle of Sandbox instances and SandboxSet resources. The controller's /metrics endpoint exposes the following metrics.

Sandbox instance metrics

Use these metrics to monitor the basic information, lifecycle status, and readiness of each Sandbox instance in the cluster.

Note

Status-based metrics, such as sandbox_status_unpaused, sandbox_status_unpaused_time, and sandbox_status_inplace_updating, generate a time series only when an instance enters the corresponding state. If no instances in the cluster are in that state (for example, if a pause operation has never been performed), querying the corresponding metric returns no data. This is expected and does not indicate a scrape failure.

Metric name	Type	Description	Labels
`sandbox_created`	Gauge	The Unix timestamp of the Sandbox instance's creation.	`name`, `namespace`
`sandbox_status_phase`	Gauge	The current phase of the Sandbox instance. The value is 1 for the current phase. Possible values for the `phase` label include: Pending, Running, Paused, Resuming, Failed, Succeeded, and Terminating.	`name`, `namespace`, `phase`
`sandbox_status_ready`	Gauge	Indicates if the Sandbox instance is ready. `1` for true; `0` for false.	`name`, `namespace`
`sandbox_status_ready_time`	Gauge	The Unix timestamp of when the instance last entered the Ready state.	`name`, `namespace`
`sandbox_status_inplace_updating`	Gauge	Indicates if the `InplaceUpdate` condition of the Sandbox instance is `False`. `1` if `False`; `0` if `True`.	`name`, `namespace`
`sandbox_status_unpaused`	Gauge	Indicates if the `SandboxPaused` condition of the Sandbox instance is `False`. `1` if `False`; `0` if `True`.	`name`, `namespace`
`sandbox_status_unpaused_time`	Gauge	The Unix timestamp when the `SandboxPaused` condition transitioned to `False`.	`name`, `namespace`
`sandbox_status_inplace_updating_time`	Gauge	The Unix timestamp when the `InplaceUpdate` condition transitioned to `False`.	`name`, `namespace`
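
As one way to alert on the status metrics above, the following sketch flags instances that stay in the Pending phase and instances that are Running but not ready. The alert names, durations, and severity labels are assumptions chosen for illustration. Tune them to your workload.

```yaml
groups:
- name: sandbox-instance-status         # illustrative group name
  rules:
  - alert: SandboxStuckPending
    # sandbox_status_phase is 1 for the phase the instance is currently in.
    expr: sandbox_status_phase{phase="Pending"} == 1
    for: 10m                            # illustrative duration
    labels:
      severity: warning
    annotations:
      summary: "Sandbox {{ $labels.namespace }}/{{ $labels.name }} has been Pending for 10 minutes"
  - alert: SandboxRunningNotReady
    # Restrict the readiness check to Running instances so that Paused,
    # Succeeded, or Terminating instances do not trigger the alert.
    expr: |
      sandbox_status_ready == 0
      and on (name, namespace)
      sandbox_status_phase{phase="Running"} == 1
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Sandbox {{ $labels.namespace }}/{{ $labels.name }} is Running but not ready"
```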

Sandbox instance resource metrics

Each Sandbox instance corresponds to a pod. Its resource metrics are the same as the cluster's cAdvisor pod resource metrics. For detailed metric descriptions, see Container cluster basic metrics.

SandboxSet metrics

Use these metrics to monitor the replica status of SandboxSet resources and to identify issues such as insufficient replicas or scaling anomalies.

Metric name	Type	Description	Labels
`sandboxset_replicas`	Gauge	The current number of SandboxSet replicas.	`name`, `namespace`
`sandboxset_available_replicas`	Gauge	The current number of available SandboxSet replicas.	`name`, `namespace`
`sandboxset_desired_replicas`	Gauge	The desired number of SandboxSet replicas.	`name`, `namespace`
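
A replica shortfall can be detected by comparing the available and desired gauges directly. The rule below is a sketch. The alert name and the 10-minute window are assumptions, not recommended values.

```yaml
groups:
- name: sandboxset-replicas             # illustrative group name
  rules:
  - alert: SandboxSetReplicasUnavailable
    # Both gauges carry the same name and namespace labels, so the
    # comparison matches one-to-one per SandboxSet.
    expr: sandboxset_available_replicas < sandboxset_desired_replicas
    for: 10m                            # illustrative duration
    labels:
      severity: warning
    annotations:
      summary: "SandboxSet {{ $labels.namespace }}/{{ $labels.name }} has fewer available replicas than desired"
```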

Controller runtime metrics

Use these metrics to monitor the reconcile performance and error conditions of the controller-runtime framework. This helps you determine if the controller is healthy.

Metric name	Type	Description	Labels
`controller_runtime_reconcile_total`	Counter	Total number of reconcile operations for each controller.	`controller`, `result`
`controller_runtime_reconcile_errors_total`	Counter	Total number of reconcile errors for each controller.	`controller`
`controller_runtime_terminal_reconcile_errors_total`	Counter	Total number of terminal reconcile errors for each controller.	`controller`
`controller_runtime_active_workers`	Gauge	Current number of active workers for each controller.	`controller`
`controller_runtime_webhook_requests_total`	Counter	Total number of admission requests, broken down by HTTP status code.	`webhook`, `code`
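
A common way to judge controller health from these counters is the share of reconcile operations that end in an error. The following sketch assumes a 10% threshold and a 5-minute rate window, both of which are illustrative. Because other components expose metrics with the same names, you may need to add a `job` matcher that fits your scrape configuration.

```yaml
groups:
- name: sandbox-controller-reconcile    # illustrative group name
  rules:
  - alert: SandboxControllerReconcileErrorsHigh
    # Error ratio per controller over the last 5 minutes.
    expr: |
      sum by (controller) (rate(controller_runtime_reconcile_errors_total[5m]))
      /
      sum by (controller) (rate(controller_runtime_reconcile_total[5m]))
      > 0.1
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "More than 10% of reconciles fail for controller {{ $labels.controller }}"
```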

Workqueue metrics

Use these metrics to monitor the backlog and processing status of the controller's workqueue, and to identify issues such as queue buildup or stuck threads.

Metric name	Type	Description	Labels
`workqueue_depth`	Gauge	Current depth of the workqueue.	`controller`, `name`
`workqueue_unfinished_work_seconds`	Gauge	The number of seconds of work that is in progress but not yet complete. A high value may indicate a stuck thread.	`controller`, `name`
`workqueue_longest_running_processor_seconds`	Gauge	The number of seconds the longest-running processor in the workqueue has been active.	`controller`, `name`
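
Queue buildup and stuck workers can each be expressed as a simple threshold on these gauges. The thresholds below (100 items, 300 seconds) are placeholders for illustration only. Suitable values depend on cluster size and load.

```yaml
groups:
- name: sandbox-controller-workqueue    # illustrative group name
  rules:
  - alert: SandboxWorkqueueBacklog
    # A depth that stays high means items arrive faster than they are processed.
    expr: workqueue_depth > 100
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "Workqueue {{ $labels.name }} has held more than 100 items for 10 minutes"
  - alert: SandboxWorkqueueProcessorStuck
    # A single processor that runs for minutes may indicate a stuck thread.
    expr: workqueue_longest_running_processor_seconds > 300
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "A processor of workqueue {{ $labels.name }} has been running for over 300 seconds"
```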

API server request metrics

Use these metrics to monitor the controller's requests to the Kubernetes API Server. This helps you troubleshoot throttling or connection issues.

Metric name	Type	Description	Labels
`rest_client_requests_total`	Counter	Total number of HTTP requests, categorized by status code and method.	`code`, `method`
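
Throttling usually appears as HTTP 429 responses and server-side problems as 5xx responses. The sketch below alerts on either. The alert name and window are assumptions, and the regular expression assumes the `code` label holds the numeric HTTP status code.

```yaml
groups:
- name: sandbox-controller-apiserver    # illustrative group name
  rules:
  - alert: SandboxControllerApiErrors
    # PromQL label regexes are fully anchored: this matches 429 and 500-599.
    expr: |
      sum by (code, method) (rate(rest_client_requests_total{code=~"429|5.."}[5m])) > 0
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "API server requests ({{ $labels.method }}) are returning status {{ $labels.code }}"
```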

Process and runtime metrics

Use these metrics to monitor the resource usage of the controller process, including memory, garbage collection (GC), and goroutines, and to identify potential resource leaks.

Metric name	Type	Description
`up`	Gauge	Indicates the scrape connectivity status. `1` if the most recent scrape of the target succeeded; `0` if it failed.
`go_goroutines`	Gauge	The current number of goroutines.
`go_gc_duration_seconds`	Summary	A summary of the stop-the-world garbage collection pause durations.
`process_resident_memory_bytes`	Gauge	Resident memory size of the process, in bytes.
`process_open_fds`	Gauge	The number of open file descriptors for the process.
`go_memstats_alloc_bytes`	Gauge	Number of heap bytes allocated and still in use.
`go_memstats_sys_bytes`	Gauge	Total bytes of memory obtained from the operating system.
`go_memstats_heap_inuse_bytes`	Gauge	Number of heap bytes that are in use.
`go_memstats_heap_objects`	Gauge	The current number of allocated objects.
`go_memstats_heap_alloc_bytes`	Gauge	Number of heap bytes allocated and still in use. Identical to `go_memstats_alloc_bytes`.
`go_memstats_heap_idle_bytes`	Gauge	Number of heap bytes waiting to be used.
`go_memstats_heap_released_bytes`	Gauge	Number of heap bytes released to the operating system.
`go_memstats_heap_sys_bytes`	Gauge	Number of heap bytes obtained from the system.
`go_memstats_alloc_bytes_total`	Counter	Total number of bytes allocated to date (including freed bytes).
`go_memstats_next_gc_bytes`	Gauge	The heap size threshold in bytes that triggers the next garbage collection.
`go_memstats_last_gc_time_seconds`	Gauge	The Unix timestamp of the last garbage collection.
`go_memstats_gc_sys_bytes`	Gauge	Bytes of memory used for garbage collection system metadata.
`go_memstats_buck_hash_sys_bytes`	Gauge	Bytes of memory used by the profiling bucket hash table.
`go_memstats_mspan_sys_bytes`	Gauge	Bytes of memory used for `mspan` structures.
`go_memstats_mcache_sys_bytes`	Gauge	Bytes of memory used for `mcache` structures.
`go_memstats_other_sys_bytes`	Gauge	Bytes of memory used for other system allocations.
`go_memstats_stack_sys_bytes`	Gauge	Bytes of memory obtained from the system for stack space.

Sandbox manager metrics

Sandbox Manager handles sandbox resource claims, lifecycle operations (such as clone, delete, pause, resume, and snapshot), and the routing proxy. The Manager's /metrics endpoint exposes the following metrics.

Sandbox claim metrics

Use these metrics to monitor sandbox resource claim operations, including success rate, duration, and retry count.

Metric name	Type	Description	Labels
`sandbox_claim_total`	Counter	Total number of claim operations.	-
`sandbox_claim_creation_responses`	Counter	The number of sandbox creation requests, broken down by result.	`result`
`sandbox_claim_duration_seconds`	Histogram	Duration of claim operations, in seconds.	-
`sandbox_claim_retries`	Histogram	Number of retries per claim operation.	-

Lifecycle operation metrics

Use these metrics to monitor the duration and outcome of Sandbox lifecycle operations. This helps you evaluate performance and troubleshoot failures.

Metric name	Type	Description	Labels
`sandbox_clone_duration_seconds`	Histogram	Duration of the Sandbox clone operation, in seconds.	-
`sandbox_delete_duration_seconds`	Histogram	Duration of the Sandbox delete operation, in seconds.	-
`sandbox_delete_responses`	Counter	Total Sandbox delete requests by result.	`result`
`sandbox_pause_duration_seconds`	Histogram	Duration of the Sandbox pause operation, in seconds.	-
`sandbox_resume_duration_seconds`	Histogram	Duration of the Sandbox resume operation, in seconds.	-
`sandbox_snapshot_duration_seconds`	Histogram	Duration of the Sandbox snapshot creation, in seconds.	-
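
To compare lifecycle operations on a dashboard, tail latencies can be recorded per operation. The following sketch covers pause and resume and tracks delete outcomes. The same pattern applies to the clone, delete, and snapshot histograms. The recorded series names and the 99th-percentile choice are assumptions.

```yaml
groups:
- name: sandbox-lifecycle-recording     # illustrative group name
  rules:
  - record: sandbox:pause_duration_seconds:p99_5m
    expr: |
      histogram_quantile(0.99,
        sum by (le) (rate(sandbox_pause_duration_seconds_bucket[5m])))
  - record: sandbox:resume_duration_seconds:p99_5m
    expr: |
      histogram_quantile(0.99,
        sum by (le) (rate(sandbox_resume_duration_seconds_bucket[5m])))
  # Delete requests per second, split by the result label.
  - record: sandbox:delete_responses:rate5m
    expr: sum by (result) (rate(sandbox_delete_responses[5m]))
```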

Route and network metrics

Use these metrics to monitor the proxy routing table and peer nodes in Sandbox Manager, and the performance of route synchronization.

Metric name	Type	Description	Labels
`sandbox_routes`	Gauge	Current number of routes in the proxy routing table.	-
`sandbox_peers`	Gauge	Current number of connected peer nodes.	-
`sandbox_route_sync_duration_seconds`	Histogram	Duration of route synchronization, in seconds.	-
`sandbox_route_sync_total`	Counter	Total number of route synchronizations.	-
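
Route synchronization performance can be tracked with two derived series: how often synchronization runs and how long it typically takes. This is a sketch only. The recorded series names are hypothetical, and no alert threshold is suggested because this document does not state the expected synchronization frequency.

```yaml
groups:
- name: sandbox-route-recording         # illustrative group name
  rules:
  # Route synchronizations per second over the last 5 minutes.
  - record: sandbox:route_sync:rate5m
    expr: sum(rate(sandbox_route_sync_total[5m]))
  # Mean duration of a route synchronization.
  - record: sandbox:route_sync_duration_seconds:avg_5m
    expr: |
      sum(rate(sandbox_route_sync_duration_seconds_sum[5m]))
      /
      sum(rate(sandbox_route_sync_duration_seconds_count[5m]))
```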

Controller runtime metrics (Manager)

Use these metrics to monitor reconciliation performance and errors in the controller-runtime framework.

Metric name	Type	Description	Labels
`controller_runtime_reconcile_total`	Counter	Total number of reconciliations for each controller.	`controller`, `result`
`controller_runtime_reconcile_errors_total`	Counter	Total number of reconciliation errors for each controller.	`controller`
`controller_runtime_terminal_reconcile_errors_total`	Counter	Total number of terminal reconciliation errors for each controller.	`controller`
`controller_runtime_active_workers`	Gauge	Current number of active workers for each controller.	`controller`
`controller_runtime_reconcile_time_seconds`	Histogram	Reconciliation duration in seconds.	`controller`
`controller_runtime_max_concurrent_reconciles`	Gauge	Maximum number of concurrent reconciles for each controller.	`controller`
`controller_runtime_reconcile_panics_total`	Counter	Total number of reconciliation panics for each controller.	`controller`
`controller_runtime_webhook_panics_total`	Counter	Total number of Webhook panics.	-
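
The Manager-only metrics in this table make it possible to watch worker saturation and panics. The sketch below assumes the Manager is scraped under the job name `sandbox-manager`, as in the open source Prometheus scrape job. Adjust the matcher if your job label differs. Alert names and thresholds are illustrative.

```yaml
groups:
- name: sandbox-manager-reconcile       # illustrative group name
  rules:
  - alert: SandboxManagerWorkersSaturated
    # All configured workers busy for a sustained period.
    expr: |
      controller_runtime_active_workers{job="sandbox-manager"}
      /
      controller_runtime_max_concurrent_reconciles{job="sandbox-manager"}
      >= 1
    for: 15m
    labels:
      severity: warning
    annotations:
      summary: "All reconcile workers of controller {{ $labels.controller }} are busy"
  - alert: SandboxManagerReconcilePanic
    # Any increase in the panic counter within the last 10 minutes.
    expr: increase(controller_runtime_reconcile_panics_total{job="sandbox-manager"}[10m]) > 0
    labels:
      severity: critical
    annotations:
      summary: "Controller {{ $labels.controller }} panicked during reconciliation"
```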

Workqueue metrics (Manager)

Use these metrics to monitor the backlog and processing status of the controller's work queue.

Metric name	Type	Description	Labels
`workqueue_depth`	Gauge	Current depth of the work queue.	`controller`, `name`
`workqueue_unfinished_work_seconds`	Gauge	Total runtime, in seconds, of work currently in progress.	`controller`, `name`
`workqueue_longest_running_processor_seconds`	Gauge	Runtime in seconds of the longest-running processor for the work queue.	`controller`, `name`
`workqueue_adds_total`	Counter	Total number of items added to the work queue.	`controller`, `name`
`workqueue_retries_total`	Counter	Total number of retries handled by the work queue.	`controller`, `name`
`workqueue_queue_duration_seconds`	Histogram	The duration an item spends in the work queue before being processed.	`controller`, `name`
`workqueue_work_duration_seconds`	Histogram	The time taken to process an item from the work queue.	`controller`, `name`
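
The two histograms separate time spent waiting in the queue from time spent processing, which helps locate a bottleneck. The recording rules below are a sketch. The recorded series names are hypothetical, and the `job="sandbox-manager"` matcher assumes the job name from the open source Prometheus scrape job.

```yaml
groups:
- name: sandbox-manager-workqueue       # illustrative group name
  rules:
  # 99th percentile of time an item waits before processing starts.
  - record: sandbox_manager:workqueue_queue_duration_seconds:p99_5m
    expr: |
      histogram_quantile(0.99,
        sum by (name, le) (rate(workqueue_queue_duration_seconds_bucket{job="sandbox-manager"}[5m])))
  # 99th percentile of time taken to process an item.
  - record: sandbox_manager:workqueue_work_duration_seconds:p99_5m
    expr: |
      histogram_quantile(0.99,
        sum by (name, le) (rate(workqueue_work_duration_seconds_bucket{job="sandbox-manager"}[5m])))
  # Share of queue additions that are retries.
  - record: sandbox_manager:workqueue_retries:ratio_5m
    expr: |
      sum by (name) (rate(workqueue_retries_total{job="sandbox-manager"}[5m]))
      /
      sum by (name) (rate(workqueue_adds_total{job="sandbox-manager"}[5m]))
```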

API server request metrics (Manager)

Use these metrics to monitor requests from the Manager to the Kubernetes API Server.

Metric name	Type	Description	Labels
`rest_client_requests_total`	Counter	Number of HTTP requests, broken down by status code, method, and host.	`code`, `method`, `host`

Process and runtime metrics (Manager)

These metrics monitor the resource usage of the Manager process.

Metric name	Type	Description	Labels
`process_cpu_seconds_total`	counter	Total CPU time used by the process, in seconds.	-
`process_resident_memory_bytes`	gauge	Resident memory size of the process, in bytes.	-
`process_open_fds`	gauge	Number of open file descriptors for the process.	-
`process_max_fds`	gauge	Maximum number of file descriptors for the process.	-
`process_virtual_memory_bytes`	gauge	Virtual memory size of the process, in bytes.	-
`process_virtual_memory_max_bytes`	gauge	Maximum virtual memory size for the process, in bytes.	-
`process_start_time_seconds`	gauge	Start time of the process, in seconds since the Unix epoch.	-
`process_network_receive_bytes_total`	counter	Total bytes received by the process over the network.	-
`process_network_transmit_bytes_total`	counter	Total bytes transmitted by the process over the network.	-
`go_goroutines`	gauge	Current number of goroutines.	-
`go_threads`	gauge	Number of OS threads created.	-
`go_info`	gauge	Information about the Go environment.	`version`
`go_gc_duration_seconds`	summary	Distribution of stop-the-world GC pause durations.	`quantile`
`go_memstats_alloc_bytes`	gauge	Bytes of allocated heap objects.	-
`go_memstats_alloc_bytes_total`	counter	Total bytes allocated for heap objects (includes freed bytes).	-
`go_memstats_sys_bytes`	gauge	Total bytes of memory obtained from the operating system.	-
`go_memstats_heap_alloc_bytes`	gauge	Bytes of allocated heap objects.	-
`go_memstats_heap_idle_bytes`	gauge	Bytes of memory in idle spans.	-
`go_memstats_heap_inuse_bytes`	gauge	Bytes of memory in in-use spans.	-
`go_memstats_heap_objects`	gauge	Number of allocated heap objects.	-
`go_memstats_heap_released_bytes`	gauge	Bytes of memory released to the operating system.	-
`go_memstats_heap_sys_bytes`	gauge	Bytes of memory obtained from the operating system for the heap.	-
`go_memstats_stack_sys_bytes`	gauge	Bytes of memory obtained from the operating system for stack space.	-
`go_memstats_stack_inuse_bytes`	gauge	Bytes of memory in use by the stack allocator.	-
`go_memstats_mspan_sys_bytes`	gauge	Bytes of memory used for mspan structures.	-
`go_memstats_mspan_inuse_bytes`	gauge	Bytes of memory in use by mspan structures.	-
`go_memstats_mcache_sys_bytes`	gauge	Bytes of memory used for mcache structures.	-
`go_memstats_mcache_inuse_bytes`	gauge	Bytes of memory in use by mcache structures.	-
`go_memstats_buck_hash_sys_bytes`	gauge	Bytes of memory used by the profiling bucket hash table.	-
`go_memstats_gc_sys_bytes`	gauge	Bytes of memory used for garbage collection system metadata.	-
`go_memstats_other_sys_bytes`	gauge	Bytes of memory used for other system allocations.	-
`go_memstats_next_gc_bytes`	gauge	Target heap size in bytes for the next garbage collection cycle.	-
`go_memstats_last_gc_time_seconds`	gauge	Timestamp of the last completed garbage collection cycle, in seconds since the Unix epoch.	-
`go_memstats_frees_total`	counter	Cumulative count of heap objects freed.	-
`go_memstats_mallocs_total`	counter	Cumulative count of heap objects allocated.	-
`go_gc_cycles_automatic_gc_cycles_total`	counter	Total number of automatic garbage collection cycles initiated by the Go runtime.	-
`go_gc_cycles_forced_gc_cycles_total`	counter	Total number of garbage collection cycles forced by the application.	-
`go_gc_cycles_total_gc_cycles_total`	counter	Total number of completed garbage collection cycles.	-
`go_gc_gogc_percent`	gauge	User-configured heap growth target percentage.	-
`go_gc_gomemlimit_bytes`	gauge	User-configured Go runtime memory limit, in bytes.	-
`go_gc_heap_goal_bytes`	gauge	Target heap size at the end of a garbage collection cycle, in bytes.	-
`go_gc_heap_live_bytes`	gauge	Heap memory occupied by objects that were marked live by the previous garbage collection cycle.	-
`go_gc_heap_objects_objects`	gauge	Number of objects on the heap (live or unswept).	-
`go_gc_heap_tiny_allocs_objects_total`	counter	Total number of tiny allocations combined into blocks.	-
`go_gc_heap_allocs_bytes_total`	counter	Cumulative memory allocated to the heap by the application.	-
`go_gc_heap_allocs_objects_total`	counter	Cumulative count of heap allocations triggered by the application.	-
`go_gc_heap_frees_bytes_total`	counter	Cumulative heap memory freed by the garbage collector.	-
`go_gc_heap_frees_objects_total`	counter	Cumulative count of heap allocations freed by the garbage collector.	-
`go_gc_heap_allocs_by_size_bytes`	histogram	Distribution of heap allocations by approximate size.	`le`
`go_gc_heap_frees_by_size_bytes`	histogram	Distribution of freed heap allocations by approximate size.	`le`
`go_gc_scan_globals_bytes`	gauge	Total scannable global variable space, in bytes.	-
`go_gc_scan_heap_bytes`	gauge	Total scannable heap space, in bytes.	-
`go_gc_scan_stack_bytes`	gauge	Total stack bytes scanned in the previous garbage collection cycle.	-
`go_gc_scan_total_bytes`	gauge	Total scannable space, in bytes.	-
`go_gc_stack_starting_size_bytes`	gauge	Stack size of new goroutines, in bytes.	-
`go_gc_limiter_last_enabled_gc_cycle`	gauge	The GC cycle when the GC CPU limiter was last enabled.	-
`go_gc_pauses_seconds`	histogram	Distribution of GC pause durations (deprecated).	`le`
`go_sched_gomaxprocs_threads`	gauge	The current `runtime.GOMAXPROCS` setting.	-
`go_sched_goroutines_goroutines`	gauge	Number of active goroutines.	-
`go_sched_latencies_seconds`	histogram	Distribution of time goroutines spend waiting in the scheduler's run queue.	`le`
`go_sched_pauses_stopping_gc_seconds`	histogram	Distribution of GC-related stop-the-world stop latencies.	`le`
`go_sched_pauses_stopping_other_seconds`	histogram	Distribution of non-GC-related stop-the-world stop latencies.	`le`
`go_sched_pauses_total_gc_seconds`	histogram	Distribution of GC-related stop-the-world pause latencies.	`le`
`go_sched_pauses_total_other_seconds`	histogram	Distribution of non-GC-related stop-the-world pause latencies.	`le`
`go_sync_mutex_wait_total_seconds_total`	counter	Cumulative time goroutines have spent blocked on `sync.Mutex` or `sync.RWMutex`, in seconds.	-
`go_cgo_go_to_c_calls_calls_total`	counter	Total number of CGO calls from Go to C in the current process.	-
`go_cpu_classes_gc_mark_assist_cpu_seconds_total`	counter	Estimated total CPU time goroutines spent assisting the GC mark phase, in seconds.	-
`go_cpu_classes_gc_mark_dedicated_cpu_seconds_total`	counter	Estimated total CPU time spent on dedicated processors for the GC mark phase, in seconds.	-
`go_cpu_classes_gc_mark_idle_cpu_seconds_total`	counter	Estimated total CPU time spent on idle CPU resources for the GC mark phase, in seconds.	-
`go_cpu_classes_gc_pause_cpu_seconds_total`	counter	Estimated total CPU time the application spent paused for GC, in seconds.	-
`go_cpu_classes_gc_total_cpu_seconds_total`	counter	Estimated total CPU time spent running GC work, in seconds.	-
`go_cpu_classes_idle_cpu_seconds_total`	counter	Estimated total available CPU time not spent running any Go or Go runtime code, in seconds.	-
`go_cpu_classes_scavenge_assist_cpu_seconds_total`	counter	Estimated total CPU time spent returning unused memory in response to memory pressure, in seconds.	-
`go_cpu_classes_scavenge_background_cpu_seconds_total`	counter	Estimated total CPU time spent returning unused memory in the background, in seconds.	-
`go_cpu_classes_scavenge_total_cpu_seconds_total`	counter	Estimated total CPU time spent returning unused memory, in seconds.	-
`go_cpu_classes_total_cpu_seconds_total`	counter	Estimated total available CPU time for user Go code or the Go runtime, in seconds.	-
`go_cpu_classes_user_cpu_seconds_total`	counter	Estimated total CPU time spent running user Go code, in seconds.	-
`go_memory_classes_heap_free_bytes`	gauge	Memory that is completely free and eligible to be returned to the operating system, but has not yet been returned, in bytes.	-
`go_memory_classes_heap_objects_bytes`	gauge	Memory occupied by live objects and dead objects that have not yet been marked as free, in bytes.	-
`go_memory_classes_heap_released_bytes`	gauge	Memory that is completely free and has been returned to the operating system, in bytes.	-
`go_memory_classes_heap_stacks_bytes`	gauge	Memory allocated from the heap and reserved for stack space, in bytes.	-
`go_memory_classes_heap_unused_bytes`	gauge	Memory reserved for heap objects but not currently in use, in bytes.	-
`go_memory_classes_metadata_mcache_free_bytes`	gauge	Memory reserved for runtime mcache structures but not in use, in bytes.	-
`go_memory_classes_metadata_mcache_inuse_bytes`	gauge	Memory occupied by runtime mcache structures that are currently in use, in bytes.	-
`go_memory_classes_metadata_mspan_free_bytes`	gauge	Memory reserved for runtime mspan structures but not in use, in bytes.	-
`go_memory_classes_metadata_mspan_inuse_bytes`	gauge	Memory occupied by runtime mspan structures that are currently in use, in bytes.	-
`go_memory_classes_metadata_other_bytes`	gauge	Memory reserved for or used by runtime metadata, in bytes.	-
`go_memory_classes_os_stacks_bytes`	gauge	Stack memory allocated by the underlying operating system, in bytes.	-
`go_memory_classes_other_bytes`	gauge	Memory used for trace buffers, debug structures, and similar data, in bytes.	-
`go_memory_classes_profiling_buckets_bytes`	gauge	Memory used for profiling stack trace hash maps, in bytes.	-
`go_memory_classes_total_bytes`	gauge	Total memory mapped by the Go runtime into the current process (read-write), in bytes.	-
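
Of the many process metrics above, file descriptor usage and goroutine growth are typical leak indicators. The sketch below assumes the `sandbox-manager` job name from the open source Prometheus scrape job. The 80% and 10,000 thresholds are placeholders, not recommendations.

```yaml
groups:
- name: sandbox-manager-process         # illustrative group name
  rules:
  - alert: SandboxManagerFileDescriptorsHigh
    # Open descriptors as a fraction of the process limit.
    expr: |
      process_open_fds{job="sandbox-manager"}
      /
      process_max_fds{job="sandbox-manager"}
      > 0.8
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "Sandbox Manager {{ $labels.instance }} uses over 80% of its file descriptors"
  - alert: SandboxManagerGoroutinesHigh
    # A goroutine count that stays high can point to a goroutine leak.
    expr: go_goroutines{job="sandbox-manager"} > 10000
    for: 30m
    labels:
      severity: warning
    annotations:
      summary: "Sandbox Manager {{ $labels.instance }} has more than 10000 goroutines"
```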

Certificate monitoring metrics

Use these metrics to monitor certificate read operations.

Metric name	Type	Description
`certwatcher_read_certificate_total`	Counter	Counts the total number of certificate reads.
`certwatcher_read_certificate_errors_total`	Counter	Counts the total number of certificate read errors.
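
Certificate read errors are rare in a healthy deployment, so any increase is usually worth a notification. The following rule is a minimal sketch. The alert name, the one-hour window, and the severity are assumptions.

```yaml
groups:
- name: sandbox-manager-certificates    # illustrative group name
  rules:
  - alert: SandboxCertificateReadErrors
    # Fires when the error counter grew at any point in the last hour.
    expr: increase(certwatcher_read_certificate_errors_total[1h]) > 0
    labels:
      severity: warning
    annotations:
      summary: "Certificate reads are failing on {{ $labels.instance }}"
```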