You can manage the resources of Hologres instances based on the metrics displayed on the Monitoring Information tab of the instance details page in the Hologres console. Hologres allows you to view the resource usage of instances, view the execution status of SQL statements, and resolve errors based on the metrics. This topic describes the metrics of Hologres.
- CPU Utilization (%)
- Memory Usage (%)
- Storage Usage (bytes)
- Connections
- QPS
- Query Latency (ms)
- Real-time Import (RPS)
- I/O Throughput (Bytes)
- The Connections metric is not supported in Hologres V0.8.
- If an instance of a version earlier than V0.9.27 contains more than 1,000 tables, the Real-time Import (RPS) metric may be lost or the metric value may be excessively low for the large amount of data. In this case, data accuracy is affected.
CPU Utilization (%)
The CPU Utilization (%) metric indicates the CPU utilization of the instance.
In Hologres, computing resources are reserved by backend processes. Even if no queries are in progress, the CPU utilization is not zero.
Hologres supports multi-core parallel computing. In many cases, the CPU utilization can rise to 100% during a single query. This indicates that the computing resources are being fully utilized.
- Check whether a large amount of offline data is imported by using INSERT operations, and whether the amount of data is still increasing.
- Check whether data is queried or written with high queries per second (QPS) performance.
- Check whether hybrid loads exist in the preceding or other scenarios.
If your instance cannot run normally due to insufficient resources, you can upgrade the specifications of the instance. This allows you to handle complex queries on larger amounts of data.
Memory Usage (%)
The Memory Usage (%) metric indicates the memory usage of the instance.
In Hologres, memory resources are reserved by backend processes. Even if no queries are in progress, specific metadata or index data is still loaded to the memory to improve the computing speed. If no queries are in progress, a memory usage of 30% to 40% is normal.
- Causes
- The number of tables and data volume grow and overwhelm the computing capacity of the instance. Memory usage increases proportionally with the volume of metadata and the number of indexes. As the numbers of tables and indexes and the data volume increase, memory usage grows.
- Indexes are not properly set. For example, bitmap indexes are created or dictionary encoding is enabled for excessive fields.
- Impacts
- Instance stability is affected. For example, if a large amount of metadata occupies
the memory space available for queries, errors such as
SERVER_INTERNAL_ERROR
,ERPC_ERROR_CONNECTION_CLOSED
, andTotal memory used by all existing queries exceeded memory limitation
may occur during queries. - Instance performance is affected. If a large amount of metadata occupies the cache space available for queries, cache hits decrease and query latency increases.
- Instance stability is affected. For example, if a large amount of metadata occupies
the memory space available for queries, errors such as
- Solutions
- Delete data that is no longer needed to query to release the occupied memory.
- Set indexes properly. You can delete unnecessary bitmap indexes or disable dictionary encoding in specific business scenarios.
- Upgrade the specifications of the instance to increase its computing and storage resources.
We recommend that you upgrade an instance based on specific scenarios.
- In scenarios where disk data can be read at a specific latency and no strict requirement is imposed on the response time (RT), we recommend that you select an appropriate instance type based on the volume of your data. One compute unit (CU) that includes 1 CPU core and 4 GB of memory can store 50 GB to 100 GB of data. Every 32 CUs can store up to 3 TB of data.
- In serving scenarios that require a short RT, we recommend that you cache all hotspot data in the memory. By default, the cache accounts for 30% of total memory. In such scenarios, 1.3 GB of memory out of 1 CU is used for data cache, and table metadata also occupies part of the data cache. For example, in a scenario that requires a short RT, 100 GB of hotspot data needs to be cached in the memory. After the data is read and decompressed, it occupies more than 100 GB of memory. In this case, at least 320 GB of memory is required, which corresponds to at least 96 CUs.
Storage Usage (bytes)
The Storage Usage (bytes) metric indicates the size of the logical disk that is used to store the data of all databases in the instance.
If you are using the subscription billing method and the quota of the storage resources is used up, the billing method is changed to pay-as-you-go. In this case, you can upgrade the specifications of the instance to increase the storage resources of the instance. You can also configure a lifecycle for the data of the instance based on your needs and delete redundant data in real time. Otherwise, you may be charged for extra storage. For information about how to configure a lifecycle, see CREATE TABLE. You can query the storage size of a table or database in Hologres by executing SQL statements. For more information, see Query the storage sizes of tables and databases.
Connections
The Connections metric indicates the total number of SQL connections in a Hologres instance, including Java Database Connectivity (JDBC) connections to PostgreSQL databases in the active and idle states.
The number of SQL connections in an instance depends on the instance type of the instance. For more information, see Instance types.
- The number of SQL connections reaches or even exceeds the value of the
max_connections
parameter. In this case, you can execute theshow max_connections;
statement to view the maximum number of connections allowed for the instance. - The
FATAL: sorry, too many clients already connection limit exceeded for superusers
error occurs. - The
FATAL: remaining connection slots are reserved for non-replication superuser connections
error occurs.
QPS
The QPS metric indicates the average number of SQL statements that are executed per second, including the SELECT, INSERT, UPDATE, and DELETE statements. This metric is updated every 20 seconds. If only a single SQL statement is executed within 20 seconds, the QPS of SQL statements is 0.05 (1/20 = 0.05).
Query Latency (ms)
The Query Latency (ms) metric indicates the average RT of SQL statements, including the SELECT, INSERT, UPDATE, and DELETE statements. You can query slow query logs to analyze the query latency. For more information, see Query and analyze slow query logs.
Real-time Import (RPS)
The real-time Import (RPS) metric indicates the number of entries that are imported or updated per second by using SQL statements or SDKs.
The insert RPS indicates the number of entries that are imported per second by using foreign tables, COPY statements, or Hologres tables.
The update RPS indicates the number of entries that are imported per second by executing the UPDATE or DELETE statement.
- Use the INSERT, UPDATE, and DELETE statements that are optimized by using fixed plans.
- Use Flink to write data to Hologres.
- Read and write data by using Holo Client.
- Write data from Apache Spark to Hologres.
- Synchronize data from DataHub to Hologres in real time.
- Use the
INSERT INTO VALUES ()
statement to write data by using the PostgreSQL client or a JDBC client.
I/O Throughput (Bytes)
The I/O Throughput (Bytes) metric indicates the I/O throughput of the instance. This metric describes the amount of data read from or written to an instance and reflects how busy the read or write operations are. You can understand the workloads on the instance based on the I/O Throughput (Bytes) metric and troubleshoot relevant issues in real time.