This topic describes the commonly used SQL statements for health checks in Hologres.
Health checks for table planning
- Check whether the number of table groups or the number of shards is too large.
If the Hologres instance has fewer than 256 CPU cores, we recommend that you use only the default table group. If the instance has 256 or more CPU cores, you can retain a total of two or three table groups. We recommend that you retain no more than three table groups for a database. The number of shards must be greater than the number of worker nodes but less than 60% of the number of CPU cores. If you use the replication feature in Hologres, you must proportionally reduce the number of shards or proportionally increase the number of worker nodes based on the replica count. The following syntax is for your reference:
-- Query the number of table groups.
select count(distinct tablegroup_name) from hologres.hg_table_group_properties;

-- Check the configuration of each table group. We recommend that you include no more than 3,000 tables in each table group. The number of tables is indicated by the table_num property.
select tablegroup_name as table_group
       ,max(property_value) filter( where property_key='shard_count') as shard_count
       ,max(property_value) filter( where property_key='replica_count') as replica_count
       ,max(property_value) filter( where property_key='is_default_tg') as is_default
       ,max(property_value) filter( where property_key='table_num') as table_num
from hologres.hg_table_group_properties
group by tablegroup_name;
If the database has more than three table groups, we recommend that you specify a table group as the core table group and merge the redundant table groups. Alternatively, you can specify a primary table group and a secondary table group that contains dimension tables. For more information, see Use the resharding feature to rebalance tables and data.
- Check whether a schema has too many tables.
If a schema has too many tables, the system may generate too many small files, which can lead to memory overuse. The following syntax is for your reference:
-- Check the number of tables in each schema.
select table_namespace as schema
       ,count(distinct table_name) as total_tables
       ,count(distinct table_name) filter( where property_key='storage_format') as inner_tables
from hologres.hg_table_properties
where table_namespace NOT IN ('hologres','hologres_statistic')
group by table_namespace
order by table_namespace;
If a schema has too many small partitioned tables, merge them into a standard table.
- Check whether the table statistics are updated in time.
The following syntax is for your reference:
-- Retrieve tables whose statistics have not been updated in the past day.
select schema_name
       ,table_name
       ,total_rows
       ,analyze_timestamp
from hologres_statistic.hg_table_statistic
where analyze_timestamp < now() - interval '1 days'
order by schema_name
        ,table_name
limit 200;
If the statistics of a table have not been updated in time, run the analyze <Table name> command to update the statistics.
- Check whether a Hologres instance has too many resource groups.
An excessive number of resource groups may reduce the utilization of CPU and memory resources. We recommend that you create no more than three resource groups for a Hologres instance and make sure that more than 30% of resources are allocated to the default resource group.
The following syntax is for your reference:
SELECT * FROM pg_holo_resource_groups where property_key='worker_limit';
Health checks for table design
- Check whether row-oriented tables are used only in necessary scenarios.
Row-oriented tables are used only in specific scenarios, such as a scenario in which a Flink data source is joined with a dimension table. Make sure that you do not use row-oriented tables in unnecessary scenarios. The following syntax is for your reference:
select table_namespace as schema
       ,table_name as tables
from hologres.hg_table_properties
where property_key = 'orientation'
  and property_value = 'row'
  and table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog');
If you use row-oriented tables in unnecessary scenarios, change them to column-oriented tables or tables that are both row-oriented and column-oriented.
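Because the orientation of a table cannot be changed after the table is created, switching a row-oriented table to column-oriented storage requires recreating the table and backfilling the data. The following is a minimal sketch, assuming a hypothetical table my_table with two columns; the same recreate-and-backfill pattern also applies to other properties that are fixed at creation time, such as distribution_key and clustering_key:

```sql
begin;
create table my_table_col (id bigint, name text);
-- Storage properties such as orientation must be set in the same
-- transaction that creates the table.
call set_table_property('my_table_col', 'orientation', 'column');
commit;

-- Backfill the data from the old row-oriented table.
insert into my_table_col select id, name from my_table;
```

After you verify the data, you can drop the old table and rename the new one.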
- Check whether the distribution key is specified for a table and whether more than two columns constitute the distribution key.
We recommend that you specify one to two columns to constitute the distribution key. The following syntax is for your reference:
-- Query all tables that use more than two columns to constitute the distribution key.
select table_namespace as schema
       ,table_name as tables
       ,property_value as distribution_key
from hologres.hg_table_properties
where property_key = 'distribution_key'
  and table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
  and array_length(string_to_array(property_value,','),1) > 2;

-- Query all tables for which the distribution key is not specified.
select distinct table_namespace as schema
       ,table_name as tables
from hologres.hg_table_properties a
where property_key='storage_format'
  and not exists (
        select 1
        from hologres.hg_table_properties b
        where b.property_key = 'distribution_key'
          and a.table_namespace = b.table_namespace
          and a.table_name = b.table_name)
  and table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
order by schema
        ,tables;
If a table uses more than two columns to constitute the distribution key, recreate the table and specify one or two columns to constitute the distribution key.
- Check whether a table has more than 20 dictionary encoding columns.
We recommend that you enable dictionary encoding for no more than 20 columns, and only for low-cardinality columns. If you are not sure which columns to specify, enable automatic dictionary encoding for the table so that the system decides which columns to encode, instead of encoding all columns. The following syntax is for your reference:
-- Query all tables that have more than 20 dictionary encoding columns and for which automatic dictionary encoding is not enabled.
select table_namespace as schema
       ,table_name as tables
       ,property_value as dictionary_encoding_columns
from hologres.hg_table_properties
where property_key = 'dictionary_encoding_columns'
  and table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
  and array_length(string_to_array(property_value,','),1) > 20
  and property_value not like '%:auto';
If a table has more than 20 dictionary encoding columns, run the SET_TABLE_PROPERTY or UPDATE_TABLE_PROPERTY function to modify the dictionary_encoding_columns property. For more information, see SET_TABLE_PROPERTY or ALTER TABLE.
- Check whether bitmap indexes are configured for more than 30 columns.
We recommend that you configure bitmap indexes for no more than 30 columns, and only for columns that are used in equality filters. If you configure bitmap indexes for too many columns of the STRING type, excessive storage space and memory are consumed. The following syntax is for your reference:
-- Query all tables in which bitmap indexes are configured for more than 30 columns.
select table_namespace as schema
       ,table_name as tables
       ,property_value as bitmap_columns
from hologres.hg_table_properties
where property_key = 'bitmap_columns'
  and table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
  and array_length(string_to_array(property_value,','),1) > 30;
If bitmap indexes are configured for more than 30 columns, run the SET_TABLE_PROPERTY or UPDATE_TABLE_PROPERTY function to modify the bitmap_columns property. For more information, see SET_TABLE_PROPERTY or ALTER TABLE.
- Check whether more than two columns are used to constitute the clustering key of a table.
Clustering key-based queries follow the leftmost matching principle. Therefore, we recommend that you specify no more than two columns to constitute the clustering key. Otherwise, clustering key-based queries cannot be performed in some scenarios. The following syntax is for your reference:
-- Query all tables that use more than two columns to constitute the clustering key.
select table_namespace as schema
       ,table_name as tables
       ,property_value as clustering_key
from hologres.hg_table_properties
where property_key = 'clustering_key'
  and table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
  and array_length(string_to_array(property_value,','),1) > 2;
If a table uses more than two columns to constitute the clustering key, recreate the table and specify one or two columns used for data sorting to constitute the clustering key.
- Check whether only one event time column is specified as the segment key for a table.
The event time column stores the timestamps of real-time data writes.
The segment key is used to split files. Therefore, we recommend that you specify only one event time column of a timestamp or integer type as the segment key. The following syntax is for your reference:
-- Query all tables that use more than one event time column as the segment key.
select table_namespace as schema
       ,table_name as tables
       ,property_value as segment_key
from hologres.hg_table_properties
where property_key = 'segment_key'
  and table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
  and array_length(string_to_array(property_value,','),1) > 1;
If a table uses more than one event time column as the segment key, recreate the table and specify only one event time column as the segment key.
- Check whether the time to live (TTL) of a table is less than seven days.
The TTL of a table is the longest period of time that a data record can be stored in the table; after this period expires, the record is recycled. We recommend that you set the TTL to at least seven days. Otherwise, duplicate data records may exist. The following syntax is for your reference:
-- Query all tables whose TTL is less than seven days. The TTL is indicated by the time_to_live_in_seconds property.
select table_namespace as schema
       ,table_name as tables
       ,property_value as time_to_live_in_seconds
from hologres.hg_table_properties
where property_key = 'time_to_live_in_seconds'
  and table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
  and property_value::bigint < 604800;
If the TTL of a table is less than seven days, run the SET_TABLE_PROPERTY function to modify the TTL. For more information, see SET_TABLE_PROPERTY.
- Check whether the binary logging feature is enabled only in necessary scenarios.
The binary logging feature is useful but resource-consuming, and it affects the performance of data writes. Generating binary logs for a row-oriented table consumes far fewer resources than generating them for a column-oriented table. Therefore, we recommend that you enable the feature for column-oriented tables only in necessary scenarios. The following syntax is for your reference:
-- Query all tables for which the binary logging feature is enabled.
select table_namespace as schema
       ,table_name as tables
from hologres.hg_table_properties
where property_key = 'binlog.level'
  and property_value = 'replica'
  and table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog');

-- Query all tables for which the TTL of binary logs is set to more than seven days. We recommend that you shorten the TTL.
select table_namespace as schema
       ,table_name as tables
       ,property_value as "binlog.ttl"
from hologres.hg_table_properties
where property_key = 'binlog.ttl'
  and property_value::bigint > 604800
  and table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog');
If the binary logging feature is enabled in unnecessary scenarios, run the SET_TABLE_PROPERTY function to modify the feature settings and the TTL of binary logs. For more information, see SET_TABLE_PROPERTY.
- Check whether data is unevenly distributed to shards.
In Hologres, data of a table is distributed to and stored in multiple shards. If some shards store far more data than other shards, the data is unevenly distributed. In this case, we recommend that you select one or more appropriate columns to constitute the distribution key. The following syntax is for your reference:
-- Query the number of rows stored in each shard. Replace table_name with the name of the table that you want to check.
select hg_shard_id
       ,count(*)
from table_name
group by hg_shard_id;
If the data distribution is severely skewed, change the columns that constitute the distribution key and reinsert the data.
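To quantify the skew, the per-shard counts can be reduced to a single ratio. The following is a sketch assuming a hypothetical table my_table; a ratio close to 1 indicates an even distribution:

```sql
-- Compare the largest shard with the average shard size. A ratio far
-- above 1 (for example, greater than 2) suggests a skewed distribution key.
with shard_rows as (
    select hg_shard_id, count(*) as cnt
    from my_table
    group by hg_shard_id
)
select max(cnt)::numeric / avg(cnt) as skew_ratio
from shard_rows;
```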
Health checks for instance status
- Check the resource usage.
You can analyze CPU utilization, memory usage, and connection usage by using CloudMonitor. For more information, see Metrics of Hologres.
- Check the statistics about query statements.
You can check the proportions, success rates, and latencies of different types of query statements. The following syntax is for your reference:
-- Check the number of DML statements, including SELECT, INSERT, UPDATE, and DELETE statements, executed on each database in the past seven days.
select datname, query_date, count(*)
from hologres.query_log
where query_date > to_char(current_date - interval '7 days','YYYYMMDD')
  and command_tag in ('SELECT','INSERT','UPDATE','DELETE')
group by datname, query_date
order by datname, query_date desc;

-- Check the execution status of DML statements in the past day.
select datname, query_date, command_tag,
       count(*) filter( where status='SUCCESS') as SUCCESS,
       count(*) filter( where status='FAILED') as FAILED
from hologres.query_log
where query_date > to_char(current_date - interval '1 days','YYYYMMDD')
  and command_tag in ('SELECT','INSERT','UPDATE','DELETE')
group by datname, query_date, command_tag
order by datname, query_date desc;

-- Check the average query latencies of successful DML statements in the past day.
select datname, query_date, command_tag, count(*), AVG(duration) as duration
from hologres.query_log
where query_date > to_char(current_date - interval '1 days','YYYYMMDD')
  and command_tag in ('SELECT','INSERT','UPDATE','DELETE')
  and status = 'SUCCESS'
group by datname, query_date, command_tag
order by datname, query_date desc;
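The success and failure counts above can also be folded into a single failure rate for each database and statement type. This sketch assumes the same hologres.query_log columns used in the preceding queries:

```sql
-- Failure rate of DML statements per database and statement type in the past seven days.
select datname,
       command_tag,
       count(*) filter (where status = 'FAILED')::numeric / count(*) as failure_rate
from hologres.query_log
where query_date > to_char(current_date - interval '7 days', 'YYYYMMDD')
  and command_tag in ('SELECT','INSERT','UPDATE','DELETE')
group by datname, command_tag
order by failure_rate desc;
```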
- Check slow queries.
Check the slow queries that took the most time in the past day. The following syntax is for your reference:
-- Query the slow queries that took the most time in the past day.
select status as "Status",
       duration as "Time consumed (ms)",
       optimization_cost as "Optimization time (ms)",
       start_query_cost as "Startup time (ms)",
       get_next_cost as "Execution time (ms)",
       duration-optimization_cost-start_query_cost-get_next_cost as "Others (ms)",
       query_id as "QueryID",
       query
from hologres.hg_query_log
where query_start > current_date - interval '1 days'
  and command_tag in ('SELECT')
order by duration desc,
         start_query_cost desc,
         optimization_cost,
         get_next_cost desc,
         duration-optimization_cost-start_query_cost-get_next_cost desc
limit 200;
Check the slow queries that consumed the most resources in the past day. The following syntax is for your reference:
-- Query the slow queries that consumed the most resources in the past day.
select status as "Status",
       duration as "Time consumed (ms)",
       query_start as "Started at",
       (read_bytes/1048576)::text || ' MB' as "Bytes",
       (memory_bytes/1048576)::text || ' MB' as "Memory",
       (shuffle_bytes/1048576)::text || ' MB' as "Shuffle",
       (cpu_time_ms/1000)::text || ' s' as "CPU time",
       physical_reads as "Physical reads",
       query_id as "QueryID",
       query
from hologres.hg_query_log
where query_start > current_date - interval '1 days'
  and command_tag in ('SELECT','INSERT','UPDATE','DELETE')
order by duration desc,
         read_bytes desc,
         shuffle_bytes desc,
         memory_bytes desc,
         cpu_time_ms desc,
         physical_reads desc
limit 500;