E-MapReduce: View monitoring reports

Last Updated: Jan 09, 2025

E-MapReduce (EMR) Serverless StarRocks provides monitoring reports that show the status and key performance metrics of your EMR Serverless StarRocks instances in real time. This helps you identify issues quickly.

Limits

Report data is retained for 30 days. Data older than 30 days is no longer available.
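
The retention limit affects which time periods can return data when you filter a report. The following minimal sketch shows how a requested time range could be clipped to the 30-day window. The function name `clamp_to_retention` and the idea of clipping on the client side are illustrative assumptions, not part of the console.

```python
from datetime import datetime, timedelta, timezone

# Retention period stated in the Limits section.
RETENTION = timedelta(days=30)

def clamp_to_retention(start, end, now):
    """Clip [start, end] to the window for which report data still exists.

    Hypothetical helper. Returns None when the whole range is older than
    the retention window.
    """
    earliest = now - RETENTION
    if end < earliest:
        return None
    return (max(start, earliest), min(end, now))

now = datetime(2025, 1, 9, tzinfo=timezone.utc)
# A 45-day lookback is shortened to the most recent 30 days.
print(clamp_to_retention(now - timedelta(days=45), now - timedelta(days=1), now))
```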

Precautions

Some metrics, such as Queries per Minute, are specific to the root account. The root account is a dedicated account used to manage StarRocks instances. Users cannot view or use the root account.

Procedure

  1. Go to the Instances tab of the E-MapReduce (EMR) console.

    1. Log on to the EMR console.

    2. In the left-side navigation pane, choose EMR Serverless > StarRocks.

    3. In the top navigation bar, select a region based on your business requirements.

  2. On the Instances tab, find the EMR Serverless StarRocks instance that you want to manage and click its name.

  3. On the instance details page, click the Monitoring and Alerting tab.

  4. On the Monitoring Reports tab, filter metrics by resource group and time period.

    • You can view the monitoring data of the following resource groups:

      • default_wg: the default resource group of query tasks.

      • default_mv_wg: the default resource group of materialized views.

Metrics

Instance

  • Overview

    | Metric | Description |
    | --- | --- |
    | FE Availability | The availability of frontends (FEs). |
    | FEs | The number of FEs. |
    | FE Detection Status | The detection status of FEs. EMR Serverless StarRocks detects the status of FEs by sending HTTP requests. On indicates that the detection result is normal, and Off indicates that the detection fails. |
    | BE or CN Availability | The availability of backends (BEs) or compute nodes (CNs). |
    | BEs or CNs | The number of BEs or CNs. |
    | BE or CN Detection Status | The detection status of BEs or CNs. EMR Serverless StarRocks detects the status of BEs or CNs by sending HTTP requests. On indicates that the detection result is normal, and Off indicates that the detection fails. |
    | Disk Usage (Avg) | The average disk usage of all BEs in the cluster. |
    | Compaction Score (Max) | The highest compaction score of all FEs. |
    | Queries per Minute | The number of SELECT queries that are run per minute on each FE. |
    | Storage Size | The total amount of data that can be stored. Unit: GiB. Note: This metric is applicable only to compute-storage separation scenarios. The latency of data updates is 1 hour. |
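
The detection status metrics are described as the result of an HTTP request, reported as On or Off. The sketch below shows one way such a probe could map a response to those two values. How EMR Serverless StarRocks actually performs the check is not documented here, so the 2xx rule, the timeout, the `detection_status` function, and the example URL are all assumptions.

```python
import urllib.request

def detection_status(url, fetch=None, timeout=3.0):
    """Return "On" if an HTTP GET to `url` answers with a 2xx status, else "Off".

    `fetch` is an injectable callable (url -> HTTP status code) so the logic
    can be exercised without network access. Treating only 2xx as healthy is
    an assumption of this sketch.
    """
    def default_fetch(u):
        with urllib.request.urlopen(u, timeout=timeout) as resp:
            return resp.status

    fetch = fetch or default_fetch
    try:
        status = fetch(url)
    except OSError:
        # URLError, HTTPError, and timeouts are all OSError subclasses.
        return "Off"
    return "On" if 200 <= status < 300 else "Off"

def refused(_url):
    raise OSError("connection refused")

# Hypothetical endpoint; the stub fetchers stand in for real responses.
probe_url = "http://fe.example.internal:8030/api/health"
print(detection_status(probe_url, fetch=lambda u: 200))
print(detection_status(probe_url, fetch=refused))
```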

  • Query

    | Metric | Description |
    | --- | --- |
    | Queries per Minute | The number of query tasks per minute. |
    | Queries per Minute (Resource Group) | The number of query tasks per minute in the selected resource group. |
    | Query Latency | The 99th percentile query latency. |
    | Query Latency (Resource Group) | The 99th percentile query latency in the selected resource group. |
    | Query Failures per Minute | The number of query failures per minute. |
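
Query Latency is reported as a 99th percentile rather than an average. As an illustration of why the two can differ sharply, the sketch below computes a percentile with the nearest-rank method over a made-up set of latencies. The exact percentile method used by the report is not stated, so nearest-rank is an assumption, and the sample values are invented.

```python
import math

def percentile_nearest_rank(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Invented latencies in milliseconds: nine fast queries and one slow one.
latencies_ms = [12, 15, 11, 14, 13, 900, 16, 12, 13, 15]

p99 = percentile_nearest_rank(latencies_ms, 99)
mean = sum(latencies_ms) / len(latencies_ms)
print(p99, mean)
```

With these samples the single slow query determines the 99th percentile, while the mean sits far below it. This is why a percentile is useful for spotting tail latency.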

  • FE

    | Metric | Description |
    | --- | --- |
    | FE CPU Utilization | The CPU utilization of each FE. |
    | FE CPU Load 1min | The average CPU load of each FE in the previous minute. |
    | FE Memory Usage | The memory usage of each FE. |
    | FE Available Memory | The size of available memory on each FE. |
    | FE Connections | The number of active connections to each FE. |
    | FE Transaction Status Statistics | The transaction status statistics per minute of each FE. |

  • Materialized View

    | Metric | Description |
    | --- | --- |
    | Materialized View Status | The status of each materialized view. Valid values: 0 and 1. A value of 0 indicates that the materialized view is active, and a value of 1 indicates that the materialized view is inactive. |
    | 99th Percentile Refresh Duration | The 99th percentile of the time required to refresh each materialized view. |
    | Refresh Tasks | The total number of refresh tasks. |
    | Successful Refresh Tasks | The number of successful refresh tasks. |
    | Failed Refresh Tasks | The number of failed refresh tasks. |
    | Empty Refresh Tasks | The number of refresh tasks that are canceled because no new data is available. |
    | Running Refresh Tasks | The number of refresh tasks that are in progress. |
    | Pending Refresh Tasks | The number of refresh tasks that are waiting to run. |
    | Query Rewritings | The number of queries that are rewritten on each materialized view, excluding the queries that are directly run on materialized views. |
    | Queries | The number of queries that are rewritten on each materialized view, including the queries that are directly run on materialized views. |

  • Database and Table Information

    | Metric | Description |
    | --- | --- |
    | Database Table Distribution | The distribution of tables across databases in the instance. |
    | Tables | The number of tables in each database in the instance. |
    | Tablets | The number of tablets on each BE in the instance. |
    | Data Scanned from Tables | The total amount of data scanned from each non-system table. Unit: bytes. |
    | Data Imported to Tables | The total amount of data imported to each non-system table. Unit: bytes. |

  • Others

    | Metric | Description |
    | --- | --- |
    | Migration Tool: Table Migration Progress | The progress of table migration. This metric is applicable only to cluster migration scenarios. |

Warehouse

  • Overview

    | Metric | Description |
    | --- | --- |
    | CPU Utilization (Avg) | The average CPU utilization of all BEs or CNs. |
    | Memory Usage (Avg) | The average memory usage of all BEs or CNs. |
    | Disk Usage (Max) | The maximum usage of multiple data disks of all BEs or CNs. |
    | Compaction Score (Max) | The highest compaction score of all nodes. This value reflects the current compaction pressure. |
    | Node Detection Status | The detection status of nodes. EMR Serverless StarRocks detects the status of nodes by sending HTTP requests. On indicates that the detection result is normal, and Off indicates that the detection fails. |

  • Compaction

    | Metric | Description |
    | --- | --- |
    | Compacted Data per Minute | The amount of data that is compacted per minute during the base compaction and cumulative compaction process. |
    | Compacted Rowsets per Minute | The number of rowsets that are compacted per minute during the base compaction and cumulative compaction process. |
    | Maximum Compaction Score | The highest compaction score of all FEs. |
    | Compaction Memory | The size of memory used by compaction tasks. |

  • BE

    | Metric | Description |
    | --- | --- |
    | CPU Utilization | The CPU utilization of each BE. |
    | BE CPU Load 1min | The average CPU load of each BE in the previous minute. |
    | Scanned Data in Queries | The amount of data scanned during the queries on each BE. |
    | Scanned Rows in Queries | The number of rows scanned during the queries on each BE. |
    | Request Statistics | The total number of requests on BEs, including requests to create tables, publish versions, and clone tables. |
    | Failed Request Statistics | The total number of failed requests on BEs, including requests to create tables, publish versions, and clone tables. |
    | Transaction Phase Statistics | The statistics of transaction phases per minute. |

  • BE Memory

    | Metric | Description |
    | --- | --- |
    | Memory Usage | The memory usage of each BE. |
    | Process Memory | The size of memory used by each BE. |
    | Available Memory | The size of available memory on each BE. |
    | Memory Pie Chart {Selected nodes} | The items that consume memory on the selected nodes. You can use the pie chart to roughly view the proportion of memory used by each item. |
    | Memory Stacked Chart {Selected nodes} | |

  • BE Disk

    | Metric | Description |
    | --- | --- |
    | Space Proportion | The disk space proportions of the following items: available space, cache files, data files, and others. Other items include trash and expired data. |
    | Used Space | The total disk space occupied by the following items: available space, cache files, data files, and others. Other items include trash and expired data. |
    | Used Space {Selected nodes} | The disk space occupied by each item on the selected BEs. |
    | Available Space | The available disk space of each BE. |
    | Available Space in Percentage | The percentage of the available space of each BE. |
    | Used Space (Data) | The disk space occupied by data files on each BE. |
    | Space Usage (Data) | The disk usage of data files on each BE. |
    | Used Space (Cache) | The disk space occupied by cache files on each BE. |
    | Space Usage (Cache) | The disk usage of cache files on each BE. |
    | Used Space (Other) | The disk space occupied by other items on each BE. Other items include trash and expired data. |
    | Space Usage (Other) | The disk usage of other items on each BE. Other items include trash and expired data. |
    | Read Traffic (Sum) | The read traffic of all disks per second on each BE. |
    | Read IOPS (Sum) | The number of read operations on all disks per second on each BE. |
    | Read Latency (Avg) | The average read latency of all disks. |
    | Write Traffic (Sum) | The write traffic of all disks per second on each BE. |
    | Write IOPS (Sum) | The number of write operations on all disks per second on each BE. |
    | Write Latency (Avg) | The average write latency of all disks. |
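
The Space Proportion metric splits disk space into four items: available space, cache files, data files, and others. As a rough illustration of how such shares relate to the absolute sizes, the sketch below converts four byte counts into percentages. The function name and the sample sizes are invented for the example, and the report may compute or round its values differently.

```python
GIB = 1024 ** 3  # bytes per GiB

def space_proportions(available, cache, data, other):
    """Return each item's share of the total disk space as a percentage.

    Inputs are sizes in bytes. The four items mirror the ones listed for
    the Space Proportion metric.
    """
    parts = {"available": available, "cache": cache, "data": data, "other": other}
    total = sum(parts.values())
    if total <= 0:
        raise ValueError("total disk space must be positive")
    return {name: round(100 * size / total, 2) for name, size in parts.items()}

# Invented sizes for a single 100 GiB disk.
shares = space_proportions(
    available=60 * GIB, cache=10 * GIB, data=25 * GIB, other=5 * GIB
)
print(shares)
```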

  • BE Network

    | Metric | Description |
    | --- | --- |
    | Receive Rate | The amount of data that is received over the network per second on each BE. |
    | Send Rate | The amount of data that is sent over the network per second on each BE. |
    | TCP Connections | The number of TCP connections to each BE. |

  • Cache

    | Metric | Description |
    | --- | --- |
    | FSLIB Cache Hit Ratio | The cache hit ratio per minute. |
    | FSLIB Cache Hits | The number of cache hits per minute. |
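
A cache hit ratio is conventionally the number of hits divided by the total number of lookups. The sketch below applies that conventional definition to one minute of counts. The miss count is an assumed input, because the report lists only hits and the ratio, and the report's exact formula is not documented here.

```python
def hit_ratio(hits, misses):
    """Conventional cache hit ratio: hits / (hits + misses).

    Returns 0.0 when there were no lookups in the interval.
    """
    lookups = hits + misses
    return hits / lookups if lookups else 0.0

# Invented counts for a one-minute interval.
print(hit_ratio(hits=90, misses=10))
```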

  • Fully Managed Storage

    Note

    The metrics described in the following table are applicable only to fully managed storage in compute-storage separation scenarios.

    | Metric | Description |
    | --- | --- |
    | Storage Usage Trend | The amount of data in fully managed storage. Unit: GiB. |
    | Read and Write Traffic | The read and write traffic of fully managed storage. |

  • Resource Group

    | Metric | Description |
    | --- | --- |
    | Used CPU Cores of Resource Group V3.1.x | The number of CPU cores used by the selected resource group. This value is an estimate, which is the average value within two consecutive sampling periods. This metric is applicable to EMR Serverless StarRocks instances of V3.1.4 and later. |
    | CPU Utilization of Resource Group V2.x | The proportion of the CPU time consumed by the selected resource group to the total CPU time. |
    | Used Memory of Resource Group | The size of memory used by the selected resource group. |
    | Running Tasks in Resource Group | The number of query tasks that are in progress in the selected resource group. |
    | Concurrency Limit Events in Resource Group | The number of queries that reach the concurrency limit in the selected resource group. |
    | Large Query Limit Events in Resource Group | The number of times that the large query limit is reached in the selected resource group. |
    | Query Latency in Resource Group | The 99th percentile and average query latency in the selected resource group. |
    | Queries per Minute in Resource Group | The number of queries that are run per minute in the selected resource group. |
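
The two CPU metrics for resource groups are expressed differently: one as a number of cores estimated between samples, the other as a share of total CPU time. The sketch below shows one plausible reading of each. The cumulative CPU-time counter, the sampling interval, and both function names are hypothetical; the source states only that the core count is an estimate averaged across sampling periods and that utilization is a proportion of CPU time.

```python
def used_cores(cpu_seconds_prev, cpu_seconds_now, interval_seconds):
    """Average number of busy cores between two samples of a cumulative
    CPU-time counter (hypothetical input)."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return (cpu_seconds_now - cpu_seconds_prev) / interval_seconds

def cpu_time_share(group_cpu_seconds, total_cpu_seconds):
    """Share of total CPU time consumed by one resource group (0.0 to 1.0)."""
    return group_cpu_seconds / total_cpu_seconds if total_cpu_seconds else 0.0

# Invented samples: 90 CPU-seconds consumed over a 30-second interval.
print(used_cores(1200.0, 1290.0, 30.0))
# Invented totals: the group used 45 of 180 CPU-seconds.
print(cpu_time_share(45.0, 180.0))
```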