E-MapReduce: View monitoring reports

Last Updated: Mar 26, 2026

Use the Monitoring and Alerting tab to track key performance metrics for your EMR Serverless StarRocks instances and identify issues quickly.

Limitations

Only monitoring data from the previous 30 days is available.

Root account metrics

Some metrics, such as the Query metric, are tied to the root account. The root account is a dedicated internal account used to manage StarRocks instances. You cannot view or access it directly.

View metrics

  1. Log on to the E-MapReduce console.

  2. In the left navigation pane, choose EMR Serverless > StarRocks.

  3. In the top menu bar, select the region where your instance is deployed.

  4. Click the ID of the instance you want to monitor.

  5. Click the Monitoring and Alerting tab.

  6. Set the Resource Group and Select Time parameters to filter the metrics you want to view. The Resource Group parameter supports the following values:

    | Value | Description |
    | --- | --- |
    | default_wg | Default resource group for query tasks |
    | default_mv_wg | Default resource group for materialized views |

Metrics reference

Metrics are organized into three top-level sections: Instance, Compute group, and Storage.

Instance

Overview

| Metric | Description |
| --- | --- |
| FE Availability | Availability of frontend nodes (FEs) |
| BE/CN Availability | Availability of backend nodes (BEs) or compute nodes (CNs) |
| FE Count | Number of FEs |
| BE or CN Count | Number of BEs or CNs |
| Disk Usage (Avg) | Average disk usage across all BEs in the instance |
| Storage | Actual storage used by StarRocks. Available only for compute-storage separation scenarios. Updated with approximately a one-hour delay. |
| Compaction Score (Max) | Highest compaction score of each FE. Available only for shared-nothing instances. |
| FE Detection | FE status detected via HTTP requests. On = detection succeeded; Off = detection failed. |
| BE/CN Node Status | BE/CN node status as reported by FE. If the number of alive nodes looks abnormal, run SHOW COMPUTE NODES to inspect node details. |
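
As a rough sketch of the follow-up check suggested for BE/CN Node Status, the Python below scans rows shaped like the output of SHOW COMPUTE NODES and lists the nodes that are not alive. The row layout (one dict per node with `ComputeNodeId`, `IP`, and `Alive` keys, where `Alive` is the text "true" or "false"), the helper name `find_dead_nodes`, and the sample values are assumptions for illustration. Check the columns that your StarRocks version actually returns.

```python
from typing import Dict, List


def find_dead_nodes(rows: List[Dict[str, str]]) -> List[str]:
    """Return the IPs of nodes whose Alive column is not 'true'."""
    dead = []
    for row in rows:
        # Assumption: Alive is reported as the text "true" or "false".
        if str(row.get("Alive", "")).strip().lower() != "true":
            dead.append(row.get("IP", "<unknown>"))
    return dead


# Hypothetical rows. In practice they would come from running
# SHOW COMPUTE NODES through a MySQL-compatible client.
sample_rows = [
    {"ComputeNodeId": "10001", "IP": "192.0.2.11", "Alive": "true"},
    {"ComputeNodeId": "10002", "IP": "192.0.2.12", "Alive": "false"},
]

print(find_dead_nodes(sample_rows))
```

An empty result means every listed node reported itself alive, so an abnormal chart is more likely a reporting delay than a node failure.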

Query

| Metric | Description |
| --- | --- |
| Queries per minute | Number of query tasks per minute |
| Number of query faults per minute | Number of query errors per minute |
| Query latency p99 | Query latency at the 99th percentile |
| Slow Query | Number of slow queries per minute |
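
To make "latency at the 99th percentile" concrete, here is a minimal sketch that computes p99 over a list of per-query latencies with the nearest-rank method. The method, the function name `percentile_nearest_rank`, and the sample latencies are assumptions for illustration. The console does not document which percentile estimator it uses, so its values may differ slightly.

```python
import math
from typing import Sequence


def percentile_nearest_rank(values: Sequence[float], pct: float) -> float:
    """Return the smallest value that is >= pct percent of all values."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    # 1-based rank of the percentile in the sorted list.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: 98 fast queries and 2 slow ones.
latencies_ms = [12.0] * 98 + [40.0, 950.0]
p99 = percentile_nearest_rank(latencies_ms, 99)
print(p99)
```

In this sample the single 950 ms outlier does not move p99, which is why p99 is usually read together with the Slow Query count.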

FE

| Metric | Description |
| --- | --- |
| FE transaction resolution statistics | Transaction status statistics per FE or across all FEs, per minute |
| FE Disk Usage | Disk space used per FE or across all FEs. Updated every hour. |

FE CPU

| Metric | Description |
| --- | --- |
| CPU Util | CPU utilization of each FE |
| FE CPU Load 1min | Average CPU load of each FE over the previous minute |

FE mem

| Metric | Description |
| --- | --- |
| JVM Heap Usage | Ratio of used memory to maximum memory in the JVM heap |
| JVM Young GC | Frequency and duration of garbage collection in the young generation space |
| JVM Heap | JVM heap memory usage |
| JVM Old GC | Frequency and duration of garbage collection in the old generation space of the Java Virtual Machine (JVM) |

FE net

| Metric | Description |
| --- | --- |
| Network Receive Rate | Data received per second |
| Net Out | Data sent per second |
| FE Connections | Number of active connections to each FE |

Resource group

| Metric | Description |
| --- | --- |
| Query | Number of query tasks running on the selected resource group, per minute |
| Query Latency p99 | Query latency at the 99th percentile for the selected resource group |
| Query (Resource Group) | Number of query tasks running across all resource groups, per minute |

Materialized view

| Metric | Description |
| --- | --- |
| MV Status | Status of each materialized view. 0 = active; 1 = inactive. |
| MV Refresh Duration p99 | Time required to refresh materialized views, at the 99th percentile |
| MV Jobs (Total) | Total number of refresh tasks |
| MV Jobs (Successful) | Number of successful refresh tasks |
| Purge job failed | Number of failed refresh tasks |
| Purge Job Empty | Number of refresh tasks canceled because no new data was available |
| MV Jobs (Running) | Number of refresh tasks currently in progress |
| Purge job pending | Number of refresh tasks waiting to run |
| MV Hit Count | Number of queries rewritten using each materialized view, excluding queries run directly on materialized views |
| MV Query Count | Number of queries rewritten using each materialized view, including queries run directly on materialized views |
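
MV Hit Count and MV Query Count differ only in whether direct queries on a materialized view are counted. The sketch below illustrates that distinction, assuming each query can be labeled as either `rewritten` (the optimizer rewrote it to use the view) or `direct` (it selected from the view by name). The event format, the function name `mv_counts`, and the view names are hypothetical and are not how StarRocks records these metrics internally.

```python
from collections import Counter
from typing import Iterable, Tuple


def mv_counts(events: Iterable[Tuple[str, str]]) -> Tuple[Counter, Counter]:
    """Tally (hit_count, query_count) per materialized view.

    Each event is (mv_name, kind), where kind is 'rewritten' or 'direct'.
    """
    hit_count: Counter = Counter()
    query_count: Counter = Counter()
    for mv_name, kind in events:
        if kind == "rewritten":
            # Rewritten queries count toward both metrics.
            hit_count[mv_name] += 1
            query_count[mv_name] += 1
        elif kind == "direct":
            # Direct queries count toward MV Query Count only.
            query_count[mv_name] += 1
        else:
            raise ValueError(f"unknown query kind: {kind!r}")
    return hit_count, query_count


# Hypothetical workload against two views.
events = [
    ("mv_sales", "rewritten"),
    ("mv_sales", "rewritten"),
    ("mv_sales", "direct"),
    ("mv_users", "direct"),
]
hits, queries = mv_counts(events)
print(dict(hits), dict(queries))
```

Under these assumptions, a view whose Query Count is high while its Hit Count stays at zero is being read directly and not through query rewrite.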

Tables

| Metric | Description |
| --- | --- |
| DataBase Tables | Distribution of tables across databases in the instance |
| Table Count | Total number of tables in the instance |
| Tablet Count | Total number of tablets in the instance |
| Table Scan Bytes | Total data scanned from non-system tables. Unit: bytes. |
| Table Load Bytes | Total data imported to non-system tables. Unit: bytes. |

Others

| Metric | Description |
| --- | --- |
| Transfer Progress | Progress of table migration. Applies only to cluster migration scenarios. |

Compute group

Overview

| Metric | Description |
| --- | --- |
| CPU Util (Avg) | Average CPU utilization across all BEs or CNs |
| Mem Util (Avg) | Average memory utilization across all BEs or CNs |
| Disk Usage (Max) | Maximum disk usage across all data disks on all BEs or CNs |
| BE/CN Node Status | BE/CN status detected via HTTP requests. On = detection succeeded; Off = detection failed. |

Compaction

| Metric | Description |
| --- | --- |
| Maximum Compaction Score | Highest compaction score across all FEs |
| Mem (Compaction) | Memory used by compaction tasks |
| Compaction Bytes | Data compacted per minute during base and cumulative compaction |
| Compaction Rowsets | Rowsets compacted per minute during base and cumulative compaction |

BE/CN

| Metric | Description |
| --- | --- |
| Query Scan Bytes | Data scanned per query on each BE |
| Query Scan Rows | Rows scanned per query on each BE |
| Request Statistics | Total requests on specific nodes, including create table, publish version, and clone table operations |
| Engine Requests (Failed) | Failed requests on BEs, including create table, publish version, and clone table operations |
| Transaction Requests | Transaction phase statistics per minute |

BE/CN CPU

| Metric | Description |
| --- | --- |
| CPU Util | CPU utilization |
| BE/CN CPU Load 1min | Average CPU load of specific nodes over the previous minute |

BE/CN mem

| Metric | Description |
| --- | --- |
| Memory utilization | Node memory utilization, including BE/CN process memory, UDF memory, and reserved memory for BE/CN |
| Process Mem (BE/CN) | Memory used by the BE/CN process |
| Process memory | Process memory based on kernel-collected memory items. Items outside the collection scope are labeled Other. For details, see Memory_management. |
| Node Mem | Node memory divided into three components: pod available memory (Pod Avail Mem), process memory (Process Mem), and non-process memory (Non Process Mem) |
| Node mem (BE/CN) | Total node memory, the 81% memory threshold, node memory usage, and process memory usage. The available memory for BE/CN is jointly restricted by the 0.9 coefficient in the StarRocks code and the mem_limit configuration parameter (default: 0.9). By default, the actual available memory for BE/CN is 81% of total node memory. |
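
The 81% threshold in Node mem (BE/CN) is the product of the two factors described above: 0.9 × 0.9 = 0.81. The sketch below works through that arithmetic for a node of a given size. The function name `effective_memory_gib` and the 64 GiB example node are assumptions for illustration, and the sketch treats mem_limit as a plain fraction of node memory.

```python
CODE_COEFFICIENT = 0.9   # fixed coefficient described in the text
DEFAULT_MEM_LIMIT = 0.9  # default value of the mem_limit parameter


def effective_memory_gib(total_node_memory_gib: float,
                         mem_limit: float = DEFAULT_MEM_LIMIT) -> float:
    """Memory available to BE/CN after both limits are applied."""
    if total_node_memory_gib < 0:
        raise ValueError("total_node_memory_gib must be non-negative")
    if not 0 < mem_limit <= 1:
        raise ValueError("mem_limit must be a fraction in (0, 1]")
    return total_node_memory_gib * CODE_COEFFICIENT * mem_limit


# Hypothetical 64 GiB node with the default mem_limit.
available = effective_memory_gib(64)
print(f"{available:.2f} GiB available, {available / 64:.0%} of the node")
```

Lowering mem_limit reduces the threshold proportionally. For example, a mem_limit of 0.8 gives 0.9 × 0.8 = 72% of node memory under the same assumptions.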

BE/CN disk

| Metric | Description |
| --- | --- |
| Disk usage | Ratio of used disk space to total capacity, including data, trash, and other categories |
| Used disk space | Absolute disk space used |
| Disk Usage (Data) | Disk space occupied by data files on specific nodes |
| Disk Usage (Data) | Disk usage of data files on specific nodes |

BE/CN disk IO

| Metric | Description |
| --- | --- |
| Read Traffic (SUM) | Read traffic across all disks per second on specific nodes |
| Disk IO (Write) | Write traffic across all disks per second on specific nodes |
| Disk IOPS (Read) | Read operations per second across all disks on specific nodes |
| Disk IOPS (Write) | Write operations per second across all disks on specific nodes |
| Disk IO Latency (Read) | Average read latency across all disks |
| Disk IO Latency (Write) | Average write latency across all disks |
| IO Util (Max) | Percentage of time that an I/O device (such as a disk or network interface) is busy |

BE/CN net

| Metric | Description |
| --- | --- |
| Net (In) | Data received per second |
| Net (Out) | Data sent per second |
| TCP connection count | Number of TCP connections |

Cache

Note

These metrics are available only for compute-storage separation scenarios.

| Metric | Description |
| --- | --- |
| FSLIB Cache Hit Ratio | Cache hit ratio per minute |
| FSLIB Cache Hit/Miss | Number of cache hits per minute |
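
A cache hit ratio is conventionally the share of lookups served from the cache. The sketch below computes it from per-minute hit and miss counts. The formula hits / (hits + misses), the function name `cache_hit_ratio`, and the sample counts are assumptions for illustration, because the exact calculation behind FSLIB Cache Hit Ratio is not documented here.

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Return hits / (hits + misses), or 0.0 when there were no lookups."""
    if hits < 0 or misses < 0:
        raise ValueError("hits and misses must be non-negative")
    total = hits + misses
    if total == 0:
        # No lookups in this minute, so there is nothing to measure.
        return 0.0
    return hits / total


# Hypothetical counts for one minute.
print(f"{cache_hit_ratio(900, 100):.0%}")
```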

Storage

Note

These metrics are available only for StarRocks shared-data instances.

| Metric | Description |
| --- | --- |
| Storage | Amount of fully managed data. Unit: GiB. |
| Storage IO | Read and write traffic of fully managed data |

Resource group

| Metric | Description |
| --- | --- |
| Resource Group Use CPU Cores | Number of CPU cores used by a specific resource group. The value is an estimated average across two consecutive sampling periods. Available for StarRocks V3.1.4 and later. |
| Resource Group CPU Usage (v2.x) | Ratio of CPU time consumed by a specific resource group to total CPU time |
| Resource Group Mem Usage | Memory used by a specific resource group |
| Running tasks | Number of query tasks running on a specific resource group |
| Resource Group Concurrency Overflow | Number of queries that have reached the concurrency limit in a specific resource group |
| Number of times the large query limit is triggered | Number of times the large query limit has been reached in a specific resource group |
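
One common way to estimate "CPU cores used" between two samples is to divide the CPU time consumed in the interval by the length of the interval. The sketch below shows that calculation, assuming the input is a cumulative CPU-seconds counter sampled at two points in time. The counter, the function name `cores_used`, and the sample values are hypothetical, and StarRocks may derive Resource Group Use CPU Cores differently.

```python
def cores_used(prev_cpu_seconds: float,
               curr_cpu_seconds: float,
               interval_seconds: float) -> float:
    """Average number of cores busy between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if curr_cpu_seconds < prev_cpu_seconds:
        raise ValueError("cumulative CPU time must not decrease")
    # CPU-seconds consumed per wall-clock second equals cores in use.
    return (curr_cpu_seconds - prev_cpu_seconds) / interval_seconds


# Hypothetical samples taken 30 seconds apart.
print(cores_used(100.0, 160.0, 30.0))
```

Because the result is an average over the interval, short bursts that saturate more cores than this value can still occur within a sampling period.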

Others

| Metric | Description |
| --- | --- |
| Page Cache Hit Rate | Number of requests that hit the page cache |
| Publish Version Latency P99 | Time consumed to publish a version when data is written to StarRocks, at the 99th percentile |

Storage

Data storage

Note

These metrics are available only for StarRocks shared-data instances.

| Metric | Description |
| --- | --- |
| Storage | Amount of fully managed data. Unit: GiB. The value is updated with a delay of about one hour. |
| Storage IO | Read and write traffic of fully managed data |

Disk usage

Compute-storage separation

| Metric | Description |
| --- | --- |
| Disk usage | Disk usage |
| Used disk space | Amount of disk space used |

Shared-nothing (compute-storage integration)

| Metric | Description |
| --- | --- |
| Free space percentage | Percentage of available space on specific nodes |
| Disk Usage (Avail) | Available disk space on specific nodes |
| Disk Usage (Data) | Disk space occupied by data files on specific nodes |
| Disk Usage (Sum) | Usage across available space, cache, and data files on the disk |

Disk IO

| Metric | Description |
| --- | --- |
| Disk IO (Read) | Read traffic across all disks per second on specific nodes |
| Disk IO (Write) | Write traffic across all disks per second on specific nodes |
| Disk IOPS (Read) | Read operations per second across all disks on specific nodes |
| Disk IOPS (Write) | Write operations per second across all disks on specific nodes |
| Disk IO Latency (Read) | Average read latency across all disks |
| Disk IO Latency (Write) | Average write latency across all disks |
| IO Util (Max) | Percentage of time that an I/O device (such as a disk or network interface) is busy |