This topic describes commonly used SQL statements for health checks in Hologres.

Health checks about table planning

  • Check whether the numbers of table groups and shards are too large.
    If the Hologres instance has fewer than 256 CPU cores, we recommend that you use only the default table group. If the instance has 256 CPU cores or more, you can retain two or three table groups, but no more than three per database. The number of shards must be greater than the number of worker nodes and less than 60% of the number of CPU cores. For example, a hypothetical 256-core instance with four worker nodes should have more than four shards and fewer than about 154 shards (60% of 256). If you use the replication feature in Hologres, you must proportionally reduce the number of shards or increase the number of worker nodes based on the replica count. The following syntax is for your reference:
    -- Query the number of table groups.
    select  count(distinct tablegroup_name)
    from    hologres.hg_table_group_properties;
    
    -- Check the configuration of each table group. We recommend that you include no more than 3,000 tables in each table group. The number of tables is indicated by the table_num property.
    select  tablegroup_name as table_group
            ,max(property_value) filter( where property_key='shard_count') as shard_count
            ,max(property_value) filter( where property_key='replica_count') as replica_count
            ,max(property_value) filter( where property_key='is_default_tg') as is_default
            ,max(property_value) filter( where property_key='table_num') as table_num
    from    hologres.hg_table_group_properties
    group by tablegroup_name;
    If the database has more than three table groups, we recommend that you specify a table group as the core table group and merge the redundant table groups. Alternatively, you can specify a primary table group and a secondary table group that contains dimension tables. For more information, see Use the resharding feature to rebalance tables and data.
  • Check whether a schema has too many tables.
    If a schema contains too many tables, the system may generate a large number of small files, which causes excessive memory usage. The following syntax is for your reference:
    -- Check the number of tables in each schema.
    select  table_namespace as schema
            ,count(distinct table_name) as total_tables
            ,count(distinct table_name) filter( where property_key='storage_format') as inner_tables
    from    hologres.hg_table_properties
    where   table_namespace NOT IN ('hologres','hologres_statistic')
    group by table_namespace
    order by table_namespace;
    If the schema has too many small partitioned tables, merge them into a standard table, as shown in the sketch below.
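    A minimal sketch of such a merge, assuming a hypothetical partitioned parent table public.orders_parted and a standard table public.orders with the same columns:
    -- Hypothetical table names; selecting from the partitioned parent
    -- reads all of its partitions.
    insert into public.orders
    select * from public.orders_parted;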
  • Check whether the table statistics are updated in time.
    The following syntax is for your reference:
    -- Retrieve tables whose statistics have not been updated in the past day.
    select  schema_name
            ,table_name
            ,total_rows
            ,analyze_timestamp
    from    hologres_statistic.hg_table_statistic
    where   analyze_timestamp < now() - interval '1 days'
    order by schema_name
             ,table_name
    limit   200;
    If the statistics of a table have not been updated in time, run the analyze <table_name> command to update them, as shown below.
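    A minimal example, assuming a hypothetical table named public.orders:
    -- Refresh the optimizer statistics for one table.
    analyze public.orders;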
  • Check whether a Hologres instance has too many resource groups.

    An excessive number of resource groups may reduce the utilization of CPU and memory resources. We recommend that you create no more than three resource groups for a Hologres instance and make sure that the default resource group retains more than 30% of the total resources.

    The following syntax is for your reference:
    select * from pg_holo_resource_groups
    where property_key='worker_limit';

Health checks about table design

  • Check whether row-oriented tables are used only in necessary scenarios.
    Row-oriented tables are suitable only for specific scenarios, such as a scenario in which a Flink data source is joined with a dimension table. Make sure that you do not use row-oriented tables in other scenarios. The following syntax is for your reference:
    select  table_namespace as schema
            ,table_name as tables
    from    hologres.hg_table_properties
    where
              property_key = 'orientation'
    and     property_value = 'row'
    and     table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog');
    If you use row-oriented tables in unnecessary scenarios, recreate them as column-oriented tables or as tables that are both row-oriented and column-oriented, as shown in the sketch below.
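    A minimal recreate sketch, assuming a hypothetical source table public.dim_users; the orientation property can be set only in the transaction that creates the table:
    begin;
    create table public.dim_users_col (
        user_id   bigint,
        user_name text
    );
    -- 'column' creates a column-oriented table; 'row,column' creates a
    -- table that is both row-oriented and column-oriented.
    call set_table_property('public.dim_users_col', 'orientation', 'column');
    commit;
    
    -- Backfill the data, then switch reads and writes to the new table.
    insert into public.dim_users_col
    select user_id, user_name from public.dim_users;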
  • Check whether the distribution key is specified for a table and whether more than two columns constitute the distribution key.
    We recommend that you specify one to two columns to constitute the distribution key. The following syntax is for your reference:
    -- Query all tables that use more than two columns to constitute the distribution key.
    select  table_namespace as schema
            ,table_name as tables
            ,property_value as distribution_key
    from    hologres.hg_table_properties
    where   property_key = 'distribution_key'
    and     table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
    and     array_length(string_to_array(property_value,','),1) > 2;
    
    -- Query all tables for which the distribution key is not specified.
    select  distinct table_namespace as schema
            ,table_name as tables
    from    hologres.hg_table_properties a
    where   property_key = 'storage_format'
    and     not exists (
                select  1
                from    hologres.hg_table_properties b
                where   b.property_key = 'distribution_key'
                and     a.table_namespace = b.table_namespace
                and     a.table_name = b.table_name
            )
    and     table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
    order by schema
             ,tables;
    If a table uses more than two columns to constitute the distribution key, recreate the table and specify one or two columns as the distribution key, as shown in the sketch below.
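    A minimal recreate sketch, assuming a hypothetical table public.orders whose queries join and filter on user_id; the distribution key can be set only in the transaction that creates the table:
    begin;
    create table public.orders_new (
        order_id bigint,
        user_id  bigint,
        amount   numeric(18,2)
    );
    call set_table_property('public.orders_new', 'distribution_key', 'user_id');
    commit;
    
    -- Backfill the data from the hypothetical old table.
    insert into public.orders_new
    select order_id, user_id, amount from public.orders;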
  • Check whether a table has more than 20 dictionary encoding columns.
    We recommend that you enable dictionary encoding for no more than 20 low-cardinality columns. If you are not sure which columns to specify, enable automatic dictionary encoding so that Hologres selects suitable columns instead of encoding all columns. The following syntax is for your reference:
    -- Query all tables that have more than 20 dictionary encoding columns and for which automatic dictionary encoding is not enabled.
    select  table_namespace as schema
            ,table_name as tables
            ,property_value as dictionary_encoding_columns
    from    hologres.hg_table_properties
    where   property_key = 'dictionary_encoding_columns'
    and     table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
    and     array_length(string_to_array(property_value,','),1) > 20
    and     property_value not like '%:auto';
    If the table has more than 20 dictionary encoding columns, run the SET_TABLE_PROPERTY or UPDATE_TABLE_PROPERTY function to modify the dictionary_encoding_columns property, as shown below. For more information, see SET_TABLE_PROPERTY or ALTER TABLE.
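    A minimal example, assuming a hypothetical table public.orders with two low-cardinality text columns; the :auto suffix lets Hologres decide whether to apply dictionary encoding:
    call set_table_property('public.orders', 'dictionary_encoding_columns', 'status:auto,channel:auto');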
  • Check whether bitmap indexes are configured for more than 30 columns.
    We recommend that you configure bitmap indexes for no more than 30 columns, and only for columns that are used in equality filters. If you configure bitmap indexes for too many columns of the STRING type, excessive storage space and memory are consumed. The following syntax is for your reference:
    -- Query all tables in which bitmap indexes are configured for more than 30 columns.
    select  table_namespace as schema
            ,table_name as tables
            ,property_value as bitmap_columns
    from    hologres.hg_table_properties
    where   property_key = 'bitmap_columns'
    and     table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
    and     array_length(string_to_array(property_value,','),1) > 30;
    If bitmap indexes are configured for more than 30 columns, run the SET_TABLE_PROPERTY or UPDATE_TABLE_PROPERTY function to modify the bitmap_columns property, as shown below. For more information, see SET_TABLE_PROPERTY or ALTER TABLE.
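    A minimal example, assuming a hypothetical table public.orders that is frequently filtered with equality conditions on city and status:
    call set_table_property('public.orders', 'bitmap_columns', 'city,status');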
  • Check whether more than two columns are used to constitute the clustering key of a table.
    Clustering key-based queries follow the leftmost matching principle. Therefore, we recommend that you specify no more than two columns to constitute the clustering key. Otherwise, clustering key-based queries cannot be performed in some scenarios. The following syntax is for your reference:
    -- Query all tables that use more than two columns to constitute the clustering key.
    select  table_namespace as schema
            ,table_name as tables
            ,property_value as clustering_key
    from    hologres.hg_table_properties
    where   property_key = 'clustering_key'
    and     table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
    and     array_length(string_to_array(property_value,','),1) > 2;
    If a table uses more than two columns to constitute the clustering key, recreate the table and specify one or two columns that are used for data sorting as the clustering key, as shown in the sketch below.
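    A minimal recreate sketch, assuming a hypothetical table public.events whose queries are range-filtered by event_time; like the distribution key, the clustering key can be set only in the transaction that creates the table:
    begin;
    create table public.events_new (
        event_time timestamptz,
        user_id    bigint,
        event_type text
    );
    call set_table_property('public.events_new', 'clustering_key', 'event_time');
    commit;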
  • Check whether only one event time column is specified as the segment key for a table. The event time column stores the timestamps of real-time data writes.
    The segment key is used to split files. Therefore, we recommend that you specify only one event time column of a timestamp or integer type as the segment key. The following syntax is for your reference:
    -- Query all tables that use more than one event time column as the segment key.
    select  table_namespace as schema
            ,table_name as tables
            ,property_value as segment_key
    from    hologres.hg_table_properties
    where   property_key = 'segment_key'
    and     table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
    and     array_length(string_to_array(property_value,','),1) > 1;
    If a table uses more than one event time column as the segment key, recreate the table and specify only one event time column as the segment key, as shown below.
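    A minimal sketch, assuming a hypothetical table public.events_seg; like the clustering key, the segment key can be set only in the transaction that creates the table:
    begin;
    create table public.events_seg (
        event_time timestamptz,
        user_id    bigint
    );
    call set_table_property('public.events_seg', 'segment_key', 'event_time');
    commit;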
  • Check whether the time to live (TTL) of a table is less than seven days.
    The TTL of a table is the maximum period of time for which a data record is stored in the table; the record is recycled after this period expires. Because expired records are recycled asynchronously rather than at an exact point in time, a TTL of less than seven days may leave duplicate data records. Therefore, we recommend that you set the TTL to at least seven days. The following syntax is for your reference:
    -- Query all tables whose TTL is less than seven days. The TTL is indicated by the time_to_live_in_seconds property.
    select  table_namespace as schema
            ,table_name as tables
            ,property_value as time_to_live_in_seconds
    from    hologres.hg_table_properties
    where   property_key = 'time_to_live_in_seconds'
    and     table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
    and     property_value::bigint < 604800;
    If the TTL of a table is less than seven days, run the SET_TABLE_PROPERTY function to modify the TTL, as shown below. For more information, see SET_TABLE_PROPERTY.
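    A minimal example that sets a seven-day TTL (604800 seconds) on a hypothetical table public.orders:
    call set_table_property('public.orders', 'time_to_live_in_seconds', '604800');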
  • Check whether the binary logging feature is enabled only in necessary scenarios.
    The binary logging feature is useful but resource-intensive, and it affects the performance of data writes. Binary logging consumes far fewer resources on a row-oriented table than on a column-oriented table. Therefore, we recommend that you enable the feature for column-oriented tables only in necessary scenarios. The following syntax is for your reference:
    -- Query all tables for which the binary logging feature is enabled.
    select  table_namespace as schema
            ,table_name as tables
    from    hologres.hg_table_properties
    where   property_key = 'binlog.level'
    and     property_value = 'replica'
    and     table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
    ;
    
    -- Query all tables for which the TTL of binary logs is set to more than seven days. We recommend that you shorten the TTL.
    select  table_namespace as schema
            ,table_name as tables
            ,property_value as "binlog.ttl"
    from    hologres.hg_table_properties
    where   property_key = 'binlog.ttl'
    and     property_value::bigint > 604800
    and     table_namespace NOT IN ('hologres','hologres_statistic','pg_catalog')
    ;
    If the binary logging feature is enabled in unnecessary scenarios, run the SET_TABLE_PROPERTY function to disable the feature or to shorten the TTL of binary logs, as shown below. For more information, see SET_TABLE_PROPERTY.
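    A minimal example on a hypothetical table public.orders: the first call disables binary logging, and the second shortens the TTL of binary logs to one day (86400 seconds):
    call set_table_property('public.orders', 'binlog.level', 'none');
    call set_table_property('public.orders', 'binlog.ttl', '86400');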
  • Check whether data is unevenly distributed to shards.
    In Hologres, data of a table is distributed to and stored in multiple shards. If some shards store far more data than other shards, the data is unevenly distributed. In this case, we recommend that you select one or more appropriate columns to constitute the distribution key. The following syntax is for your reference:
    -- Replace table_name with the name of the table that you want to check.
    select  hg_shard_id
            ,count(*)
    from    table_name
    group by hg_shard_id;
    If the data distribution is significantly uneven, change the columns that constitute the distribution key and reinsert the data. The sketch below shows one way to quantify the skew.
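    A hedged sketch that computes the ratio of the largest shard to the smallest for a hypothetical table public.orders; a ratio far greater than 1 indicates a poor distribution key:
    select  max(cnt)::numeric / min(cnt) as skew_ratio
    from (
        select  hg_shard_id
                ,count(*) as cnt
        from    public.orders
        group by hg_shard_id
    ) t;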

Health checks about instance status

  • Check the resource usage.

    You can analyze CPU utilization, memory usage, and connection usage by using CloudMonitor. For more information, see Metrics of Hologres.

  • Check the statistics about query statements.
    You can check the proportions, success rates, and latencies of different types of query statements. The following syntax is for your reference:
    -- Check the number of DML statements, including SELECT, INSERT, UPDATE, and DELETE statements, executed on each database in the past seven days.
    select datname, query_date, count(*) from hologres.hg_query_log
    where query_date > to_char(current_date - interval '7 days','YYYYMMDD')
    and command_tag in ('SELECT','INSERT','UPDATE','DELETE')
    group by datname, query_date
    order by datname, query_date desc;
    
    -- Check the execution status of DML statements in the past day.
    select datname, query_date, command_tag,
           count(*) filter( where status='SUCCESS') as SUCCESS,
           count(*) filter( where status='FAILED') as FAILED
    from hologres.hg_query_log
    where query_date > to_char(current_date - interval '1 days','YYYYMMDD')
    and command_tag in ('SELECT','INSERT','UPDATE','DELETE')
    group by datname, query_date, command_tag
    order by datname, query_date desc;
    
    -- Check the query latencies of DML statements in the past day.
    select datname, query_date, command_tag, count(*), AVG(duration) as duration
    from hologres.hg_query_log
    where query_date > to_char(current_date - interval '1 days','YYYYMMDD')
    and command_tag in ('SELECT','INSERT','UPDATE','DELETE')
    and status = 'SUCCESS'
    group by datname, query_date, command_tag
    order by datname, query_date desc;
  • Check slow queries.
    Check the slow queries that consumed the most time in the past day. The following syntax is for your reference:
    -- Query the slow queries that consumed the most time in the past day.
    select status as "Status",
           duration as "Time consumed (ms)",
           optimization_cost as "Optimization time (ms)",
           start_query_cost as "Startup time (ms)",
           get_next_cost as "Execution time (ms)",
           duration-optimization_cost-start_query_cost-get_next_cost as "Others (ms)",
           query_id as "QueryID",
           query
     from hologres.hg_query_log
     where query_start > current_date - interval '1 days'
      and command_tag in ('SELECT')
     order by duration desc,
              start_query_cost desc,
              optimization_cost desc,
              get_next_cost desc,
              duration-optimization_cost-start_query_cost-get_next_cost desc
     limit 200;
    Check the slow queries that consumed the most resources in the past day. The following syntax is for your reference:
    -- Query the slow queries that consumed the most resources in the past day.
    select status as "Status",
           duration as "Time consumed (ms)",
           query_start as "Started at",
           (read_bytes/1048576)::text || ' MB' as "Read",
           (memory_bytes/1048576)::text || ' MB' as "Memory",
           (shuffle_bytes/1048576)::text || ' MB' as "Shuffle",
           (cpu_time_ms/1000)::text || ' s' as "CPU time",
           physical_reads as "Physical reads",
           query_id as "QueryID",
           query
     from hologres.hg_query_log
     where query_start > current_date - interval '1 days'
     and command_tag in ('SELECT','INSERT','UPDATE','DELETE')
     order by duration desc,
              read_bytes desc,
              shuffle_bytes desc,
              memory_bytes desc,
              cpu_time_ms desc,
              physical_reads desc
     limit 500;