Limitations
Only monitoring data from the previous 30 days is available.
Root account metrics
Some metrics, such as the Query metric, are tied to the root account. The root account is a dedicated internal account for managing StarRocks instances. You cannot view or access it directly.
View metrics
Log on to the E-MapReduce console.
In the left navigation pane, choose EMR Serverless > StarRocks.
In the top menu bar, select the region where your instance is deployed.
Click the ID of the instance you want to monitor.
Click the Monitoring and Alerting tab.
Set the Resource Group and Select Time parameters to filter the metrics you want to view. The Resource Group parameter supports the following values:
| Value | Description |
|---|---|
| default_wg | Default resource group for query tasks |
| default_mv_wg | Default resource group for materialized views |
Metrics reference
Metrics are organized into three top-level sections: Instance, Compute group, and Storage.
Instance
Overview
| Metric | Description |
|---|---|
| FE Availability | Availability of frontend nodes (FEs) |
| BE/CN Availability | Availability of backend nodes (BEs) or compute nodes (CNs) |
| FE Count | Number of FEs |
| BE or CN Count | Number of BEs or CNs |
| Disk Usage (Avg) | Average disk usage across all BEs in the instance |
| Storage | Actual storage used by StarRocks. Available only for compute-storage separation scenarios. Updated with approximately a one-hour delay. |
| Compaction Score (Max) | Highest compaction score of each FE. Available only for shared-nothing instances. |
| FE Detection | FE status detected via HTTP requests. On = detection succeeded; Off = detection failed. |
| BE/CN Node Status | BE/CN node status as reported by FE. If the number of alive nodes looks abnormal, run SHOW COMPUTE NODES to inspect node details. |
Query
| Metric | Description |
|---|---|
| Queries per minute | Number of query tasks per minute |
| Number of query faults per minute | Number of query errors per minute |
| Query latency p99 | Query latency at the 99th percentile |
| Slow Query | Number of slow queries per minute |
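A p99 latency is the value at or below which 99% of the queries in the window completed. This page does not state which percentile method the console uses, so the sketch below assumes the nearest-rank method. The function name and the sample latencies are hypothetical.

```python
import math


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile of values using the nearest-rank method.

    Assumption: nearest-rank is used purely for illustration; the console's
    actual aggregation method is not documented on this page.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    # Rank is 1-based: the smallest rank whose share of samples is >= pct.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical query latencies in milliseconds for one window.
    latencies_ms = [12, 15, 11, 240, 18, 14, 13, 16, 19, 17]
    print("p99 latency (ms):", percentile_nearest_rank(latencies_ms, 99))
```

A single slow query can dominate p99 in a small window, which is why this metric is usually read alongside Slow Query rather than on its own.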
FE
| Metric | Description |
|---|---|
| FE transaction resolution statistics | Transaction status statistics per FE or across all FEs, per minute |
| FE Disk Usage | Disk space used per FE or across all FEs. Updated every hour. |
FE CPU
| Metric | Description |
|---|---|
| CPU Util | CPU utilization of each FE |
| FE CPU Load 1min | Average CPU load of each FE over the previous minute |
FE mem
| Metric | Description |
|---|---|
| JVM Heap Usage | Ratio of used memory to maximum memory in the Java Virtual Machine (JVM) heap |
| JVM Young GC | Frequency and duration of garbage collection in the young generation space |
| JVM Heap | JVM heap memory usage |
| JVM Old GC | Frequency and duration of garbage collection in the old generation space |
FE net
| Metric | Description |
|---|---|
| Network Receive Rate | Data received per second |
| Net Out | Data sent per second |
| FE Connections | Number of active connections to each FE |
Resource group
| Metric | Description |
|---|---|
| Query | Number of query tasks running on the selected resource group, per minute |
| Query Latency p99 | Query latency at the 99th percentile for the selected resource group |
| Query (Resource Group) | Number of query tasks running across all resource groups, per minute |
Materialized view
| Metric | Description |
|---|---|
| MV Status | Status of each materialized view. 0 = active; 1 = inactive. |
| MV Refresh Duration p99 | Time required to refresh materialized views, at the 99th percentile |
| MV Jobs (Total) | Total number of refresh tasks |
| MV Jobs (Successful) | Number of successful refresh tasks |
| Purge job failed | Number of failed refresh tasks |
| Purge Job Empty | Number of refresh tasks canceled because no new data was available |
| MV Jobs (Running) | Number of refresh tasks currently in progress |
| Purge job pending | Number of refresh tasks waiting to run |
| MV Hit Count | Number of queries rewritten using each materialized view, excluding queries run directly on materialized views |
| MV Query Count | Number of queries rewritten using each materialized view, including queries run directly on materialized views |
Tables
| Metric | Description |
|---|---|
| DataBase Tables | Distribution of tables across databases in the instance |
| Table Count | Total number of tables in the instance |
| Tablet Count | Total number of tablets in the instance |
| Table Scan Bytes | Total data scanned from non-system tables. Unit: bytes. |
| Table Load Bytes | Total data imported to non-system tables. Unit: bytes. |
Others
| Metric | Description |
|---|---|
| Transfer Progress | Progress of table migration. Applies only to cluster migration scenarios. |
Compute group
Overview
| Metric | Description |
|---|---|
| CPU Util (Avg) | Average CPU utilization across all BEs or CNs |
| Mem Util (Avg) | Average memory utilization across all BEs or CNs |
| Disk Usage (Max) | Maximum disk usage across all data disks on all BEs or CNs |
| BE/CN Node Status | BE/CN status detected via HTTP requests. On = detection succeeded; Off = detection failed. |
Compaction
| Metric | Description |
|---|---|
| Maximum Compaction Score | Highest compaction score across all FEs |
| Mem (Compaction) | Memory used by compaction tasks |
| Compaction Bytes | Data compacted per minute during base and cumulative compaction |
| Compaction Rowsets | Rowsets compacted per minute during base and cumulative compaction |
BE/CN
| Metric | Description |
|---|---|
| Query Scan Bytes | Data scanned per query on each BE |
| Query Scan Rows | Rows scanned per query on each BE |
| Request Statistics | Total requests on specific nodes, including create table, publish version, and clone table operations |
| Engine Requests (Failed) | Failed requests on BEs, including create table, publish version, and clone table operations |
| Transaction Requests | Transaction phase statistics per minute |
BE/CN CPU
| Metric | Description |
|---|---|
| CPU Util | CPU utilization |
| BE/CN CPU Load 1min | Average CPU load of specific nodes over the previous minute |
BE/CN mem
| Metric | Description |
|---|---|
| Memory utilization | Node memory utilization, including BE/CN process memory, UDF memory, and reserved memory for BE/CN |
| Process Mem (BE/CN) | Memory used by the BE/CN process |
| Process memory | Process memory based on kernel-collected memory items. Items outside the collection scope are labeled Other. For details, see Memory_management. |
| Node Mem | Node memory divided into three components: pod available memory (Pod Avail Mem), process memory (Process Mem), and non-process memory (Non Process Mem) |
| Node mem (BE/CN) | Total node memory, the 81% memory threshold, node memory usage, and process memory usage. The available memory for BE/CN is jointly restricted by the 0.9 coefficient in the StarRocks code and the mem_limit configuration parameter (default: 0.9). By default, the actual available memory for BE/CN is 81% of total node memory. |
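The 81% threshold in the last row is the product of the two 0.9 factors (0.9 × 0.9 = 0.81). A minimal sketch of that arithmetic follows. The helper name, the GiB unit, and the 64 GiB node size are illustrative assumptions, not values from the console.

```python
# Coefficient described in the table above as applied inside the StarRocks code.
STARROCKS_COEFFICIENT = 0.9


def effective_be_memory_gib(total_node_mem_gib, mem_limit=0.9):
    """Return the memory available to BE/CN under the two-factor rule above.

    mem_limit mirrors the configuration parameter of the same name, written
    here as a fraction (default 0.9). The function name and the GiB unit are
    assumptions made for this example.
    """
    if not 0 < mem_limit <= 1:
        raise ValueError("mem_limit must be a fraction in (0, 1]")
    return total_node_mem_gib * STARROCKS_COEFFICIENT * mem_limit


if __name__ == "__main__":
    total = 64.0  # hypothetical node size in GiB
    usable = effective_be_memory_gib(total)
    print(f"{usable:.2f} GiB usable ({usable / total:.0%} of node memory)")
```

Lowering mem_limit shrinks the result proportionally: with mem_limit set to 0.8, the same rule gives 72% of total node memory.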
BE/CN disk
| Metric | Description |
|---|---|
| Disk usage | Ratio of used disk space to total capacity, including data, trash, and other categories |
| Used disk space | Absolute disk space used |
| Disk Usage (Data) | Disk space occupied by data files on specific nodes |
| Disk Usage (Data) | Disk usage of data files on specific nodes |
BE/CN disk IO
| Metric | Description |
|---|---|
| Read Traffic (SUM) | Read traffic across all disks per second on specific nodes |
| Disk IO (Write) | Write traffic across all disks per second on specific nodes |
| Disk IOPS (Read) | Read operations per second across all disks on specific nodes |
| Disk IOPS (Write) | Write operations per second across all disks on specific nodes |
| Disk IO Latency (Read) | Average read latency across all disks |
| Disk IO Latency (Write) | Average write latency across all disks |
| IO Util (Max) | Percentage of time that an I/O device (such as a disk or network interface) is busy |
BE/CN net
| Metric | Description |
|---|---|
| Net (In) | Data received per second |
| Net (Out) | Data sent per second |
| TCP connection count | Number of TCP connections |
Cache
Note: These metrics are available only for compute-storage separation scenarios.
| Metric | Description |
|---|---|
| FSLIB Cache Hit Ratio | Cache hit ratio per minute |
| FSLIB Cache Hit/Miss | Number of cache hits and misses per minute |
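The two cache metrics are related: a hit ratio is conventionally hits divided by total lookups (hits plus misses). This page does not give the exact formula the console uses, so the sketch below assumes that conventional definition. The function name and the sample counts are hypothetical.

```python
def cache_hit_ratio(hits, misses):
    """Return hits / (hits + misses), or 0.0 when there were no lookups.

    Assumption: this is the conventional definition of a hit ratio; the
    console's own calculation is not documented on this page.
    """
    if hits < 0 or misses < 0:
        raise ValueError("hits and misses must be non-negative")
    total = hits + misses
    if total == 0:
        return 0.0  # no cache lookups in this window
    return hits / total


if __name__ == "__main__":
    # Hypothetical per-minute counts read from the Hit/Miss chart.
    print(f"hit ratio: {cache_hit_ratio(hits=900, misses=100):.1%}")
```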
Storage
Note: These metrics are available only for StarRocks shared-data instances.
| Metric | Description |
|---|---|
| Storage | Amount of fully managed data. Unit: GiB. |
| Storage IO | Read and write traffic of fully managed data |
Resource group
| Metric | Description |
|---|---|
| Resource Group Use CPU Cores | Number of CPU cores used by a specific resource group. The value is an estimated average across two consecutive sampling periods. Available for StarRocks V3.1.4 and later. |
| Resource Group CPU Usage (v2.x) | Ratio of CPU time consumed by a specific resource group to total CPU time |
| Resource Group Mem Usage | Memory used by a specific resource group |
| Running tasks | Number of query tasks running on a specific resource group |
| Resource Group Concurrency Overflow | Number of queries that have reached the concurrency limit in a specific resource group |
| Number of times the large query limit is triggered | Number of times the large query limit has been reached in a specific resource group |
Others
| Metric | Description |
|---|---|
| Page Cache Hit Rate | Proportion of requests that hit the page cache |
| Publish Version Latency P99 | Time consumed to publish a version when data is written to StarRocks, at the 99th percentile |
Storage
Data storage
Note: These metrics are available only for StarRocks shared-data instances.
| Metric | Description |
|---|---|
| Storage | Amount of fully managed data. Unit: GiB. The value is updated with a delay of about one hour. |
| Storage IO | Read and write traffic of fully managed data |
Disk usage
Compute-storage separation
| Metric | Description |
|---|---|
| Disk usage | Ratio of used disk space to total capacity |
| Used disk space | Amount of disk space used |
Shared-nothing
| Metric | Description |
|---|---|
| Free space percentage | Percentage of available space on specific nodes |
| Disk Usage (Avail) | Available disk space on specific nodes |
| Disk Usage (Data) | Disk space occupied by data files on specific nodes |
| Disk Usage (Sum) | Usage across available space, cache, and data files on the disk |
Disk IO
| Metric | Description |
|---|---|
| Disk IO (Read) | Read traffic across all disks per second on specific nodes |
| Disk IO (Write) | Write traffic across all disks per second on specific nodes |
| Disk IOPS (Read) | Read operations per second across all disks on specific nodes |
| Disk IOPS (Write) | Write operations per second across all disks on specific nodes |
| Disk IO Latency (Read) | Average read latency across all disks |
| Disk IO Latency (Write) | Average write latency across all disks |
| IO Util (Max) | Percentage of time that an I/O device (such as a disk or network interface) is busy |