E-MapReduce: View monitoring reports

Last Updated: Jul 10, 2025

EMR Serverless StarRocks provides monitoring and alerting features that allow you to view the status and key performance metrics of EMR Serverless StarRocks instances in real time. This helps you identify issues efficiently.

Limits

Only monitoring data from the previous 30 days is available.
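As an illustration of this limit, the following sketch trims a requested time range to the previous 30 days before it is used to look up metrics. The function name `clamp_to_retention` and the `RETENTION` constant are hypothetical and are not part of any EMR API.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the retention window is exactly 30 days, as stated in the limit above.
RETENTION = timedelta(days=30)

def clamp_to_retention(start, end, now=None):
    """Trim (start, end) to the last 30 days; return None if nothing remains."""
    now = now or datetime.now(timezone.utc)
    earliest = now - RETENTION
    start = max(start, earliest)   # drop anything older than the window
    end = min(end, now)            # no data exists in the future
    return (start, end) if start < end else None
```

A range that lies entirely outside the window yields `None`, which a caller can treat as "no data available".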

Precautions

Some metrics, such as the Query metric, include activity from the root account. The root account is a dedicated account that is used to manage StarRocks instances. You cannot view or use the root account.

Procedure

  1. Go to the homepage of E-MapReduce (EMR) Serverless StarRocks.

    1. Log on to the EMR console.

    2. In the left-side navigation pane, choose EMR Serverless > StarRocks.

    3. In the top navigation bar, select a region based on your business requirements.

  2. Click the ID of the instance.

  3. Click the Monitoring And Alerting tab.

  4. On the Monitoring And Alerting tab, configure the Resource Group and Select Time parameters to view specific metrics.

    Valid values of the Resource Group parameter:

    • default_wg: the default resource group used by query tasks.

    • default_mv_wg: the default resource group used by materialized views.

Metrics

Instance

  • Overview

    Metric

    Description

    FE Availability

    The availability of frontend nodes (FEs).

    BE/CN Availability

    The availability of backend nodes (BEs) or compute nodes (CNs).

    FE Count

    The number of FEs.

    BE or CN Count

    The number of BEs or CNs.

    Disk Usage (Avg)

    The average disk usage of all BEs in the StarRocks instance.

    Storage

    The actual storage space used by StarRocks. This metric is available only for compute-storage separation scenarios. The value of the metric is updated with a delay of about one hour.

    Compaction Score (Max)

    The highest Compaction Score of each FE. This metric is available only for StarRocks shared-nothing instances.

    FE Detection

    The detection status of FEs. EMR Serverless StarRocks detects the status of FEs by sending HTTP requests. The value On indicates that the detection result is normal, and the value Off indicates that the detection fails.

    BE/CN Node Status

    The status of BE/CN nodes reported by FE. If the number of Alive nodes is abnormal, you can use the SHOW COMPUTE NODES command to view node details.
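When the BE/CN Node Status chart shows fewer Alive nodes than expected, the output of SHOW COMPUTE NODES can be filtered for the affected nodes. The sketch below is one possible way to do that in Python. It assumes that the rows have already been fetched with a MySQL-compatible client as dictionaries, and that the result contains `IP`, `Alive`, and `ErrMsg` columns. The helper name `find_unavailable_nodes` is hypothetical.

```python
def find_unavailable_nodes(rows):
    """Return (ip, error message) pairs for nodes that are not reported as alive.

    `rows` is assumed to be a list of dicts, one per row of
    SHOW COMPUTE NODES, keyed by column name.
    """
    unavailable = []
    for row in rows:
        # Assumption: the Alive column holds the string "true" or "false".
        if str(row.get("Alive", "")).strip().lower() != "true":
            unavailable.append((row.get("IP"), row.get("ErrMsg", "")))
    return unavailable
```

Keeping the filter separate from the database call makes it easy to check against sample rows before pointing it at a live instance.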

  • Query

    Metric

    Description

    Queries per minute

    The number of query tasks per minute.

    Number of query faults per minute

    The number of query errors per minute.

    Query latency p99

    The 99th percentile (P99) of query latency.

    Slow Query

    The number of slow queries per minute.
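For readers unfamiliar with P99, the following sketch shows what a 99th percentile latency means, using the nearest-rank method on a list of samples. How the console actually aggregates latency is not described here, so the method and the function name `p99` are assumptions for illustration only.

```python
def p99(latencies_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    if not latencies_ms:
        raise ValueError("at least one sample is required")
    ordered = sorted(latencies_ms)
    # 1-based rank = ceil(0.99 * n), computed with integers to avoid float rounding.
    rank = -(-99 * len(ordered) // 100)
    return ordered[rank - 1]
```

With this definition, 99% of the sampled queries finished at or below the returned value, which is why a few very slow queries can raise P99 while the average stays low.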

  • FE

    Metric

    Description

    FE transaction resolution statistics

    The statistics on the transaction status of each FE or all FEs per minute.

    FE Disk Usage

    The data disk space used by each FE or by all FEs. The metric value is updated every hour.

  • FE CPU

    Metric

    Description

    CPU Util

    The CPU utilization of each FE.

    FE CPU Load 1min

    The average CPU load of each FE in the previous minute.

  • FE Mem

    Metric

    Description

    JVM Heap Usage

    The ratio of used memory to maximum memory in the JVM heap.

    JVM Young GC

    The number of garbage collections performed in the young generation space of the JVM, and the time that they take.

    JVM Heap

    The usage of JVM heap memory.

    JVM Old GC

    The number of garbage collections performed in the old generation space of the Java virtual machine (JVM), and the time that they take.

  • FE Net

    Metric

    Description

    Network Receive Rate

    The amount of data that is received per second.

    Net Out

    The amount of data that is sent per second.

    FE Connections

    The number of active connections to each FE.

  • Resource Group

    Metric

    Description

    Query

    The number of query tasks that run on the selected resource group per minute.

    Query Latency p99

    The 99th percentile (P99) of query latency.

    Query (Resource Group)

    The number of query tasks that run on all the resource groups per minute.

  • Materialized View

    Metric

    Description

    MV Status

    The status of materialized views. Valid values: 0 and 1. The value 0 indicates that the materialized view is active, and the value 1 indicates that the materialized view is inactive.

    MV Refresh Duration p99

    The 99th percentile (P99) of the time required to refresh materialized views.

    MV Jobs (Total)

    The total number of refresh tasks.

    MV Jobs (Successful)

    The number of successful refresh tasks.

    Purge job failed

    The number of failed refresh tasks.

    Purge Job Empty

    The number of refresh tasks that are canceled because no new data is available.

    MV Jobs (Running)

    The number of refresh tasks that are in progress.

    Purge job pending

    The number of refresh tasks that are waiting to run.

    MV Hit Count

    The number of queries that are rewritten on each materialized view, excluding the queries that are directly run on materialized views.

    MV Query Count

    The number of queries that are rewritten on each materialized view, including the queries that are directly run on materialized views.

  • Tables

    Metric

    Description

    DataBase Tables

    The distribution of tables across databases in the instance.

    Table Count

    The number of tables in the instance.

    Tablet Count

    The number of tablets in the instance.

    Table Scan Bytes

    The total amount of data scanned from non-system tables. Unit: bytes.

    Table Load Bytes

    The total amount of data imported to non-system tables. Unit: bytes.

  • Others

    Metric

    Description

    Transfer Progress

    The progress of table migration. This metric is applicable only to cluster migration scenarios.

Compute group

  • Overview

    Metric

    Description

    CPU Util (Avg)

    The average CPU utilization of all BEs or CNs.

    Mem Util (Avg)

    The average memory usage of all BEs or CNs.

    Disk Usage (Max)

    The highest disk usage among the data disks of all BEs or CNs.

    BE/CN Node Status

    The detection status of BEs or CNs. EMR Serverless StarRocks detects the status of BEs or CNs by sending HTTP requests. The value On indicates that the detection result is normal, and the value Off indicates that the detection fails.

  • Compaction

    Metric

    Description

    Maximum Compaction Score

    The highest compaction score of the FEs.

    Mem (Compaction)

    The memory used by compaction tasks.

    Compaction Bytes

    The amount of data that is compacted per minute during the base compaction and cumulative compaction process.

    Compaction Rowsets

    The number of rowsets that are compacted per minute during the base compaction and cumulative compaction process.

  • BE/CN

    Metric

    Description

    Query Scan Bytes

    The amount of data scanned during the queries on each BE.

    Query Scan Rows

    The number of rows scanned during the queries on each BE.

    Request Statistics

    The total number of requests on specific nodes, such as the requests to create tables, publish versions, and clone tables.

    Engine Requests (Failed)

    The number of failed requests on BEs, such as the requests to create tables, publish versions, and clone tables.

    Transaction Requests

    The statistics of transaction phases per minute.

  • BE/CN CPU

    Metric

    Description

    CPU Util

    The CPU utilization.

    BE/CN CPU Load 1min

    The average CPU load of specific nodes in the previous minute.

  • BE/CN Mem

    Metric

    Description

    Memory utilization

    The memory utilization of the node. The value covers the memory of the BE/CN process, the memory used by UDFs, and the memory reserved for the BE/CN.

    Process Mem (BE/CN)

    The memory usage of the BE/CN process.

    Process memory

    The process memory, broken down by the memory items that the kernel collects. Memory that falls outside the collection scope or is not fully collected is labeled as "Other". For more information about memory, see Memory_management.

    Node Mem

    The node memory, divided into three components: pod available memory (Pod Avail Mem), process memory (Process Mem), and non-process memory (Non Process Mem).

    Node mem (BE/CN)

    The memory of the BE/CN node, shown as four values: total node memory, the 81% node memory threshold, node memory usage, and process memory usage. The upper limit of the memory available to a BE/CN is determined by two factors: a fixed coefficient of 0.9 in the StarRocks code and the mem_limit configuration parameter (default: 0.9). With the default settings, the memory actually available to a BE/CN is 0.9 × 0.9 = 81% of the total node memory.
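The arithmetic behind the 81% threshold can be sketched as follows. The two factors come from the description above; the function and constant names are hypothetical and chosen only for this example.

```python
CODE_COEFFICIENT = 0.9   # fixed coefficient in the StarRocks code, per the description above
DEFAULT_MEM_LIMIT = 0.9  # default value of the mem_limit configuration parameter

def available_memory_gib(total_node_memory_gib, mem_limit=DEFAULT_MEM_LIMIT):
    """Upper limit of memory available to a BE/CN, in the same unit as the input."""
    return total_node_memory_gib * CODE_COEFFICIENT * mem_limit

# Example: a node with 64 GiB of memory and the default mem_limit
print(round(available_memory_gib(64), 2))
```

If mem_limit is changed, the threshold moves with it. For example, a value of 0.8 would give 0.9 × 0.8 = 72% of the total node memory under the same assumptions.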

  • BE/CN Disk

    Metric

    Description

    Disk usage

    The ratio of used disk space to total disk capacity. Used space includes data files, trash, and other files.

    Used disk space

    The amount of disk space used, as an absolute value.

    Disk Usage (Data)

    The disk space occupied by data files on specific nodes.

    Disk Usage (Data)

    The disk usage of data files on specific nodes.

  • BE/CN Disk IO

    Metric

    Description

    Read Traffic (SUM)

    The read traffic of all disks per second on specific nodes.

    Disk IO (Write)

    The write traffic of all disks per second on specific nodes.

    Disk IOPS (Read)

    The number of read operations on all disks per second on specific nodes.

    Disk IOPS (Write)

    The number of write operations on all disks per second on specific nodes.

    Disk IO Latency (Read)

    The average read latency of all disks.

    Disk IO Latency (Write)

    The average write latency of all disks.

    IO Util (Max)

    The percentage of time that an I/O device, such as a disk or a network interface, is busy over a period of time.

  • BE/CN Net

    Metric

    Description

    Net (In)

    The amount of data that is received per second.

    Net (Out)

    The amount of data that is sent per second.

    TCP connection count

    The number of TCP connections.

  • Cache

    Note

    The metrics described in the following table are available only for compute-storage separation scenarios.

    Metric

    Description

    FSLIB Cache Hit Ratio

    The cache hit ratio per minute.

    FSLIB Cache Hit/Miss

    The number of cache hits and cache misses per minute.
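The two cache metrics are related: a hit ratio can be derived from hit and miss counts. The sketch below assumes the common definition hits / (hits + misses). The exact formula used by the console is not stated here, and the function name `cache_hit_ratio` is hypothetical.

```python
def cache_hit_ratio(hits, misses):
    """Return hits / (hits + misses), or None when there were no cache requests."""
    total = hits + misses
    if total == 0:
        return None  # no requests in this minute, so the ratio is undefined
    return hits / total
```

Returning `None` for an idle minute avoids reporting a misleading 0% when the cache simply was not accessed.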

  • Storage

    Note

    The metrics described in the following table are available only for StarRocks shared-data instances.

    Metric

    Description

    Storage

    The amount of fully managed data. Unit: GiB.

    Storage IO

    The read and write traffic of fully managed data.

  • Resource Group

    Metric

    Description

    Resource Group Use CPU Cores

    The number of CPU cores used by a specific resource group. The value is an estimated average value within two consecutive sampling periods. This metric is available for StarRocks instances of V3.1.4 and later.

    Resource Group CPU Usage (v2.x)

    The ratio of the CPU time consumed by a specific resource group to the total CPU time.

    Resource Group Mem Usage

    The memory used by a specific resource group.

    Running tasks

    The number of query tasks that are running on a specific resource group.

    Resource Group Concurrency Overflow

    The number of queries that reach the concurrency limit in a specific resource group.

    Number of times the large query limit is triggered

    The number of times that the large query limit is reached in a specific resource group.
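To make the "estimated average value within two consecutive sampling periods" for Resource Group Use CPU Cores more concrete, the following sketch derives an average core count from two samples of cumulative CPU time. This is an assumption about how such an estimate can be computed, not a description of the actual implementation. The function name and the idea of a cumulative CPU-seconds counter are hypothetical.

```python
def estimate_cpu_cores(cpu_seconds_prev, cpu_seconds_curr, t_prev, t_curr):
    """Average number of CPU cores in use between two samples.

    cpu_seconds_*: cumulative CPU time consumed by the resource group (seconds).
    t_*: wall-clock timestamps of the two samples (seconds).
    """
    elapsed = t_curr - t_prev
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing timestamps")
    return (cpu_seconds_curr - cpu_seconds_prev) / elapsed
```

For example, if a resource group consumed 120 seconds of CPU time during a 60-second interval, the estimate is 2 cores on average over that interval.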

  • Others

    Metric

    Description

    Page Cache Hit Rate

    The ratio of requests that hit the page cache.

    Publish Version Latency P99

    The 99th percentile (P99) of the time taken to publish a version when data is written to StarRocks.

Storage

  • Data Storage

    Metric

    Description

    Storage

    The amount of fully managed data. Unit: GiB. This metric is available only for StarRocks shared-data instances. The value of the metric is updated with a delay of about one hour.

    Storage IO

    The read and write traffic of fully managed data. This metric is available only for StarRocks shared-data instances.

  • Disk Usage

    • Compute-storage separation

      Metric

      Description

      Disk usage

      The disk usage.

      Used disk space

      The amount of disk space used.

    • Shared-nothing

      Metric

      Description

      Free space percentage

      The percentage of the available space of specific nodes.

      Disk Usage (Avail)

      The available disk space of specific nodes.

      Disk Usage (Data)

      The disk space occupied by data files on specific nodes.

      Disk Usage (Data)

      The disk usage of data files on specific nodes.

      Disk Usage (Sum)

      The total disk usage, broken down into available space, cache, and data files.

  • Disk IO

    Metric

    Description

    Disk IO (Read)

    The read traffic of all disks per second on specific nodes.

    Disk IO (Write)

    The write traffic of all disks per second on specific nodes.

    Disk IOPS (Read)

    The number of read operations on all disks per second on specific nodes.

    Disk IOPS (Write)

    The number of write operations on all disks per second on specific nodes.

    Disk IO Latency (Read)

    The average read latency of all disks.

    Disk IO Latency (Write)

    The average write latency of all disks.

    IO Util (Max)

    The percentage of time that an I/O device, such as a disk or a network interface, is busy over a period of time.