AnalyticDB for MySQL: View monitoring information

Last Updated: Jan 24, 2024

AnalyticDB for MySQL Data Warehouse Edition (V3.0) clusters in elastic mode for Cluster Edition and Data Lakehouse Edition (V3.0) clusters provide metrics that cover data queries and writes, resource groups, table statistics, and cluster running status. You can view the metrics for a time range within the last month in the AnalyticDB for MySQL console or by calling API operations. The metrics help you identify and resolve issues based on cluster performance and running status.

Usage notes

  • You can view monitoring information for a time range of up to two days within the last month.

  • You can view the health status information only for clusters of V3.1.6 and later.

  • For AnalyticDB for MySQL Data Warehouse Edition (V3.0) clusters, you can view the Resource Group Monitoring information only when the cluster meets the following requirements:

    • The cluster is in elastic mode for Cluster Edition.

    • The cluster has 32 cores or more.

    • The minor version of the cluster is 3.1.3.2 or later.

View monitoring information about a Data Warehouse Edition (V3.0) cluster

Procedure

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Data Warehouse Edition (V3.0) tab, find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, click Monitoring Information.

  3. On the Monitoring Information page, click the Instance Resource Monitoring, Resource Group Monitoring, or Table Information Statistics tab to view the corresponding monitoring information.

Metrics

  • Health status metrics

    Important

    If the value of a health status metric is risky or unavailable, contact technical support.

    Metric

    Description

    Instance Access Node Status

    The access layer of AnalyticDB for MySQL is composed of multiple instance access nodes and provides features such as protocol layer access, SQL parsing and optimization, real-time sharding of written data, data scheduling, and query scheduling.

    Valid values:

    • Healthy: All the instance access nodes are available.

    • Risky: Fifty percent or more of the instance access nodes are unavailable.

    • Unavailable: All the instance access nodes are unavailable.

    Health Status of Compute Node Groups

    The compute engine of AnalyticDB for MySQL is composed of compute node groups and supports the integrated execution of distributed massively parallel processing (MPP) and directed acyclic graph (DAG) architectures. The compute engine can work with intelligent optimizers to support high concurrency and hybrid loads of complex SQL statements. Additionally, the cloud native infrastructure allows compute nodes to be elastically scaled out within seconds based on business requirements. This allows resources to be efficiently utilized.

    Valid values:

    • Healthy: All the compute nodes are available.

    • Risky: Fifty percent or more of the compute nodes are unavailable.

    • Unavailable: All the compute nodes are unavailable.

    Health Status of Storage Node Groups

    The storage engine of AnalyticDB for MySQL is composed of storage nodes and supports real-time data writes with strong consistency and high availability in compliance with the Raft consensus protocol. The storage engine uses data sharding and Multi-Raft to support parallel processing, tiered storage to separate hot and cold data at lower costs, and hybrid row-column storage and intelligent indexing to provide ultra-high performance.

    Valid values:

    • Healthy: All the storage nodes are available.

    • Risky: Fifty percent or more of the storage nodes are unavailable.

    • Unavailable: All the storage nodes are unavailable.
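
The valid values above amount to a small classification rule. The following Python sketch is one way to express it, assuming you already know how many nodes in a group are unavailable. The names `Health` and `classify_node_group` are illustrative and are not part of any AnalyticDB for MySQL API. The table does not say which value applies when some, but fewer than half, of the nodes are unavailable, so the sketch returns `None` in that case rather than guessing.

```python
from enum import Enum
from typing import Optional


class Health(Enum):
    HEALTHY = "Healthy"
    RISKY = "Risky"
    UNAVAILABLE = "Unavailable"


def classify_node_group(total: int, unavailable: int) -> Optional[Health]:
    """Map node counts to the status values listed in the table.

    Returns None when some, but fewer than half, of the nodes are
    unavailable, because the table does not define that case.
    """
    if total <= 0 or not 0 <= unavailable <= total:
        raise ValueError("need total > 0 and 0 <= unavailable <= total")
    if unavailable == total:
        return Health.UNAVAILABLE  # all nodes are unavailable
    if unavailable == 0:
        return Health.HEALTHY      # all nodes are available
    if unavailable * 2 >= total:
        return Health.RISKY        # fifty percent or more are unavailable
    return None                    # not covered by the table
```

For example, `classify_node_group(4, 2)` returns `Health.RISKY`, because two of four nodes is exactly fifty percent.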

  • Instance Resource Monitoring metrics

    Metric

    Unit

    Description

    Average CPU Utilization

    %

    Displays the following monitoring information:

    • Maximum CPU Utilization of Read/Write Node

    • CPU Utilization of Read/Write Node

    • Maximum CPU Utilization of Compute Node

    • CPU Utilization of Compute Node

    Note

    After you change a C32 cluster from reserved mode to elastic mode, the average CPU utilization increases. For more information, see the "FAQ" section of this topic.

    Disk I/O Throughput

    MB

    Displays the following monitoring information:

    • Read Throughput of Read and Write Nodes

    • Write Throughput of Read and Write Nodes

    • Read Throughput of Compute Node

    • Write Throughput of Compute Node

    Disk IOPS

    N/A

    Displays the following monitoring information:

    • Average Reads per Second of Read and Write Nodes

    • Average Writes per Second of Read and Write Nodes

    • Average Reads per Second of Compute Nodes

    • Average Writes per Second of Compute Nodes

    Disk I/O Usage

    %

    Displays the disk I/O usage of read and write nodes.

    Disk I/O Waiting Time

    ms

    Displays the disk I/O wait time of read and write nodes.

    Cluster Connections

    N/A

    Displays the number of successful connections.

    Disk Space Used

    MB

    Displays the maximum disk space used by a cluster.

    Hot Data Space Used

    MB

    Displays the disk space used by hot data in a cluster.

    Cold Data Space Used

    MB

    Displays the disk space used by cold data in a cluster.

    Query

    QPS

    N/A

    Displays the queries per second (QPS).

    Query Response Time

    ms

    Displays the following monitoring information:

    • Average Query Response Time

    • Maximum Query Response Time

    Query Waiting Time

    ms

    Displays the following monitoring information:

    • Average Waiting Time for Query

    • Maximum Waiting Time for Query

    Write

    Write Response Time

    ms

    Displays the following monitoring information:

    • Average Write Response Time

    • Maximum Write Response Time

    Delete Response Time

    ms

    Displays the following monitoring information:

    • Average Deletion Response Time

    • Maximum Deletion Response Time

    Update Response Time

    ms

    Displays the following monitoring information:

    • Average Update Response Time

    • Maximum Update Response Time

    Write Throughput

    MB

    Displays the average write throughput of a cluster.

    TPS

    N/A

    Displays the following monitoring information:

    • Total transactions per second (TPS), including the write TPS, delete TPS, and update TPS

    • Write TPS

    • Delete TPS

    • Update TPS

  • Resource Group Monitoring metrics

    | Metric | Unit | Description |
    | --- | --- | --- |
    | Average CPU Utilization | % | Displays the average CPU utilization of each resource group. |
    | Query Response Time | ms | Displays the average response time of queries processed by each resource group. |
    | QPS | N/A | Displays the number of queries processed by each resource group per second. |
    | Query Waiting Time | ms | Displays the average wait time of queries processed by each resource group. |
    | Scheduled Nodes Actually Scaled Out in Resource Group | N/A | Displays the number of nodes added to each resource group in a scheduled scaling plan. |
    | Scheduled Nodes to Be Scaled Out in Resource Group | N/A | Displays the number of nodes that need to be added to each resource group in a scheduled scaling plan. For information about how to create a scaling plan for a resource group, see Create a resource scaling plan. |
    | Total Nodes in Resource Group | N/A | Displays the total number of nodes in a resource group, calculated by using the following formula: Total number of nodes = Number of basic nodes + Number of effective nodes in scheduled scaling plans. |
    | Basic Nodes in Resource Group | N/A | Displays the number of basic nodes in a resource group. |
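
The Total Nodes in Resource Group formula can be worked through with a short Python sketch. The class and field names are hypothetical and chosen only to mirror the terms in the formula. The node counts in the example are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceGroupNodes:
    basic_nodes: int                # Basic Nodes in Resource Group
    effective_scheduled_nodes: int  # effective nodes in scheduled scaling plans

    @property
    def total_nodes(self) -> int:
        # Total number of nodes = basic nodes + effective scheduled nodes
        return self.basic_nodes + self.effective_scheduled_nodes


# Illustrative values: 4 basic nodes plus 2 nodes from a scaling plan.
group = ResourceGroupNodes(basic_nodes=4, effective_scheduled_nodes=2)
print(group.total_nodes)  # 6
```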

  • Table Information Statistics metrics

    | Metric | Unit | Description |
    | --- | --- | --- |
    | Database | N/A | Displays the name of the database to which the table belongs. |
    | Table Name | N/A | Displays the name of the table. |
    | Table Rows Count | N/A | Displays the total number of rows in the table. |
    | Amount of Table Data (KB) | KB | Displays the amount of the table data that is stored on the disk, excluding index data and primary key index data. |
    | Amount of Index Data (KB) | KB | Displays the amount of index data of the table, excluding primary key index data. |
    | Amount of Primary Key Index Data (KB) | KB | Displays the amount of primary key index data of the table. |
    | Partitions | N/A | Displays the number of partitions in the table. |

    Note

    If you create a non-partitioned table in AnalyticDB for MySQL, the table contains one partition. The value of this metric is displayed as 1 for non-partitioned tables.

View monitoring information about a Data Lakehouse Edition (V3.0) cluster

Procedure

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. On the Data Lakehouse Edition (V3.0) tab, find the cluster that you want to manage and click the cluster ID.

  2. In the left-side navigation pane, choose Cluster Management > Monitoring Information.

  3. On the Monitoring Information page, select a time range and click Search in the upper-right corner.

Metrics

The instance and cluster metrics are displayed for an AnalyticDB for MySQL Data Lakehouse Edition (V3.0) cluster.

  • Instance metrics

    Important

    If the value of a health status metric is risky or unavailable, contact technical support.

    Metric

    Unit

    Description

    Instance Running Status

    N/A

    Valid values:

    • Preparing

    • Creating

    • Running

    • Restoring Backup

    • Changing Specifications

    • Creating Network

    • Releasing Network

    Instance Health Status

    N/A

    Valid values:

    • Healthy: If Instance Access Node Status, Health Status of Compute Node Groups, and Health Status of Storage Node Groups are all healthy and the cluster is detected to be alive, Instance Health Status is healthy.

    • Unavailable: If at least one of Instance Access Node Status, Health Status of Compute Node Groups, and Health Status of Storage Node Groups is unavailable, Instance Health Status is unavailable.

    • Risky: If at least one of Instance Access Node Status, Health Status of Compute Node Groups, and Health Status of Storage Node Groups is risky, Instance Health Status is risky.

    Instance Access Node Status

    The access layer of AnalyticDB for MySQL is composed of multiple instance access nodes and provides features such as protocol layer access, SQL parsing and optimization, real-time sharding of written data, data scheduling, and query scheduling.

    N/A

    Valid values:

    • Healthy: All the instance access nodes are available.

    • Risky: Fifty percent or more of the instance access nodes are unavailable.

    • Unavailable: All the instance access nodes are unavailable.

    Health Status of Compute Node Groups

    The compute engine of AnalyticDB for MySQL is composed of compute node groups and supports the integrated execution of distributed massively parallel processing (MPP) and directed acyclic graph (DAG) architectures. The compute engine can work with intelligent optimizers to support high concurrency and hybrid loads of complex SQL statements. Additionally, the cloud native infrastructure allows compute nodes to be elastically scaled out within seconds based on business requirements. This allows resources to be efficiently utilized.

    N/A

    Valid values:

    • Healthy: All the compute nodes are available.

    • Risky: Fifty percent or more of the compute nodes are unavailable.

    • Unavailable: All the compute nodes are unavailable.

    Health Status of Storage Node Groups

    The storage engine of AnalyticDB for MySQL is composed of storage nodes and supports real-time data writes with strong consistency and high availability in compliance with the Raft consensus protocol. The storage engine uses data sharding and Multi-Raft to support parallel processing, tiered storage to separate hot and cold data at lower costs, and hybrid row-column storage and intelligent indexing to provide ultra-high performance.

    N/A

    Valid values:

    • Healthy: All the storage nodes are available.

    • Risky: Fifty percent or more of the storage nodes are unavailable.

    • Unavailable: All the storage nodes are unavailable.

    Average CPU Utilization

    %

    Displays the following monitoring information:

    • Maximum CPU Utilization of Storage Node

    • Maximum CPU Utilization at Access Layer

    • Maximum CPU Utilization of Compute Node

    • Average CPU Utilization of Storage Nodes

    • Average CPU Utilization at Access Layer

    • Average CPU Utilization of Compute Nodes

    Cluster Connections

    N/A

    Displays the number of successful connections.

    Write Response Time

    ms

    Displays the following monitoring information:

    • Maximum Write Response Time

    • Average Write Response Time

    Query Response Time

    ms

    Displays the following monitoring information:

    • Maximum Query Response Time

    • Average Query Response Time

    Disk I/O Throughput

    MB

    Displays the following monitoring information:

    • Write Throughput of Compute Node

    • Write Throughput of Storage Node

    • Read Throughput of Storage Node

    • Read Throughput of Compute Node

    Disk IOPS

    N/A

    Displays the following monitoring information:

    • Disk Write IOPS of Compute Node

    • Disk Write IOPS of Storage Node

    • Disk Read IOPS of Storage Node

    • Disk Read IOPS of Compute Node

    Disk I/O Usage of Read/Write Node

    %

    Displays the average disk I/O usage.

    Disk I/O Waiting Time of Read/Write Node

    ms

    Displays the average disk I/O wait time.

    Total Disk Space Used

    MB

    Displays the following monitoring information:

    • Disk Space Used on Compute Node

    • Disk Space Used on Storage Node

    Cold Data Space Used

    MB

    Displays the disk space used by cold data in a cluster.

    Hot Data Space Used

    MB

    Displays the disk space used by hot data in a cluster.
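
The Instance Health Status row in the table above combines three component statuses and a liveness check. The following Python sketch shows one possible reading of that rule. The function name and string constants are illustrative. The table does not state which value wins when one component is unavailable and another is risky, so the sketch assumes that Unavailable takes precedence. It returns `None` for any combination the table does not describe.

```python
from typing import Optional

HEALTHY, RISKY, UNAVAILABLE = "Healthy", "Risky", "Unavailable"


def instance_health(access: str, compute: str, storage: str,
                    cluster_alive: bool) -> Optional[str]:
    """Combine the three component statuses into Instance Health Status."""
    components = (access, compute, storage)
    if UNAVAILABLE in components:
        return UNAVAILABLE  # assumption: Unavailable outranks Risky
    if RISKY in components:
        return RISKY
    if all(c == HEALTHY for c in components) and cluster_alive:
        return HEALTHY
    return None  # for example, all components healthy but the cluster is not alive
```

For example, `instance_health("Healthy", "Risky", "Healthy", True)` returns `"Risky"`.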

  • Cluster metrics

    Metric

    Unit

    Description

    Instance Access Node Status

    The access layer of AnalyticDB for MySQL is composed of multiple instance access nodes and provides features such as protocol layer access, SQL parsing and optimization, real-time sharding of written data, data scheduling, and query scheduling.

    N/A

    Valid values:

    • Healthy: All the instance access nodes are available.

    • Risky: Fifty percent or more of the instance access nodes are unavailable.

    • Unavailable: All the instance access nodes are unavailable.

    Health Status of Compute Node Groups

    The compute engine of AnalyticDB for MySQL is composed of compute node groups and supports the integrated execution of distributed massively parallel processing (MPP) and directed acyclic graph (DAG) architectures. The compute engine can work with intelligent optimizers to support high concurrency and hybrid loads of complex SQL statements. Additionally, the cloud native infrastructure allows compute nodes to be elastically scaled out within seconds based on business requirements. This allows resources to be efficiently utilized.

    N/A

    Valid values:

    • Healthy: All the compute nodes are available.

    • Risky: Fifty percent or more of the compute nodes are unavailable.

    • Unavailable: All the compute nodes are unavailable.

    Health Status of Storage Node Groups

    The storage engine of AnalyticDB for MySQL is composed of storage nodes and supports real-time data writes with strong consistency and high availability in compliance with the Raft consensus protocol. The storage engine uses data sharding and Multi-Raft to support parallel processing, tiered storage to separate hot and cold data at lower costs, and hybrid row-column storage and intelligent indexing to provide ultra-high performance.

    N/A

    Valid values:

    • Healthy: All the storage nodes are available.

    • Risky: Fifty percent or more of the storage nodes are unavailable.

    • Unavailable: All the storage nodes are unavailable.

    Access Metrics

    CPU Utilization

    %

    Displays the following monitoring information:

    • Maximum CPU Utilization at Access Layer

    • Average CPU Utilization at Access Layer

    Computing Resource Group Metrics

    CPU Utilization

    %

    Displays the following monitoring information:

    • Maximum CPU Utilization of Compute Node

    • Average CPU Utilization of Compute Nodes

    Storage Metrics

    CPU Utilization

    %

    Displays the following monitoring information:

    • Maximum CPU Utilization of Storage Node

    • Average CPU Utilization of Storage Nodes

    Total Disk Space Used

    MB

    Displays the total disk space used by storage nodes.

    Disk Usage

    %

    Displays the disk usage of storage nodes.

FAQ

  • Q: Why does the average CPU utilization increase after I change a cluster from reserved mode to elastic mode?

    A: After you change a C32 cluster from reserved mode to elastic mode, the specifications of a single node decrease to 8 cores. By default, BUILD jobs occupy 3 cores, so they take up a larger share of each node's CPU and the average CPU utilization increases. If the increase does not affect your business, you can ignore it. If your business is affected, upgrade your cluster or submit a ticket. For more information about BUILD jobs, see BUILD.

  • Q: Why are the values of Amount of Index Data (KB) and Amount of Primary Key Index Data (KB) metrics large?

    A: Large values of the preceding metrics may be caused by the following reasons:

    • Indexes and primary key indexes are created for a large number of columns.

    • Individual values in index columns are long, or the total length of all values in an index column is large. For example, an index column stores long strings.

    • The number of distinct values in index columns is large. This causes a low index compression ratio. For example, index column A contains four distinct values: A1, A2, A3, and A4. Such data is difficult to compress, which results in a low index compression ratio.

    • Primary key values are long, or the primary key is a composite key that consists of multiple columns.

  • Q: A long response time is displayed on the Monitoring Information page, but no corresponding time-consuming SQL statements are found on the Diagnostics and Optimization page. Why?

    A: When a query returns a large amount of data, caching the result set takes a long time. The total duration that is displayed on the Diagnostics and Optimization page consists of the queuing time, execution plan duration, and execution duration, and excludes the time spent caching the result set. We recommend that you look for the corresponding time-consuming SQL statements on the SQL Audit page.
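
The first answer in this FAQ comes down to simple arithmetic: a fixed number of cores reserved for BUILD jobs is a larger fraction of a smaller node. The following Python sketch works through the numbers from that answer. The function name is illustrative. The 32-core figure is a hypothetical node size used only for contrast, because this topic does not state the node size in reserved mode.

```python
def build_core_share(node_cores: int, build_cores: int = 3) -> float:
    """Return the percentage of a node's cores occupied by BUILD jobs."""
    if node_cores <= 0 or not 0 <= build_cores <= node_cores:
        raise ValueError("need node_cores > 0 and 0 <= build_cores <= node_cores")
    return build_cores / node_cores * 100


print(build_core_share(8))   # 37.5: the 8-core node described in the answer
print(build_core_share(32))  # 9.375: hypothetical larger node, for contrast only
```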

Related operations for Data Warehouse Edition

| Operation | Description |
| --- | --- |
| DescribeDBClusterPerformance | Queries the performance data of an AnalyticDB for MySQL cluster. |
| DescribeDBClusterResourcePoolPerformance | Queries the monitoring information about resource groups within an AnalyticDB for MySQL cluster. |
| DescribeDBClusterHealthStatus | Queries the health status of an AnalyticDB for MySQL cluster. |
| DescribeInclinedTables | Queries the table statistics of an AnalyticDB for MySQL cluster. |
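
Before you call DescribeDBClusterPerformance, it can help to check the time range against the limits in the "Usage notes" section. The following Python sketch validates a range and assembles a parameter dictionary without sending any request. It rests on several assumptions that you should verify against the API reference: the parameter names `DBClusterId`, `Key`, `StartTime`, and `EndTime`, the UTC timestamp format, and the reading of "the last month" as 30 days. The cluster ID and metric key in the usage example are placeholders.

```python
from datetime import datetime, timedelta, timezone

MAX_SPAN = timedelta(days=2)       # a range of up to two days
MAX_LOOKBACK = timedelta(days=30)  # assumption: "the last month" means 30 days


def build_performance_request(cluster_id: str, key: str,
                              start: datetime, end: datetime,
                              now: datetime) -> dict:
    """Validate the time range and return request parameters (nothing is sent)."""
    if end <= start:
        raise ValueError("end must be later than start")
    if end - start > MAX_SPAN:
        raise ValueError("time range is longer than two days")
    if start < now - MAX_LOOKBACK or end > now:
        raise ValueError("time range must lie within the last month")
    fmt = "%Y-%m-%dT%H:%MZ"  # assumed timestamp format, expressed in UTC
    return {
        "Action": "DescribeDBClusterPerformance",
        "DBClusterId": cluster_id,  # assumed parameter name
        "Key": key,                 # assumed parameter name
        "StartTime": start.astimezone(timezone.utc).strftime(fmt),
        "EndTime": end.astimezone(timezone.utc).strftime(fmt),
    }


# Usage with placeholder values and a fixed "now" so the result is reproducible.
now = datetime(2024, 1, 24, 12, 0, tzinfo=timezone.utc)
request = build_performance_request(
    "am-example-cluster", "ExampleMetricKey",
    start=now - timedelta(days=1), end=now, now=now,
)
print(request["StartTime"], request["EndTime"])
```

Passing `now` explicitly keeps the check deterministic, which makes the function easy to test.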