This topic describes best practices to optimize the query performance on internal tables in Hologres.

Update statistical information

Statistics determine whether a suitable execution plan can be generated. Hologres analyzes sample data of the tables to be queried and collects statistics such as data distribution characteristics, table statistics, column statistics, numbers of rows and columns, column width, cardinality, frequency, maximum and minimum values, long key values, and bucket distribution characteristics. These statistics help update the estimated cost of operator execution, narrow down the query range, and estimate the optimal join order, memory overheads, and degree of parallelism (DOP). This way, the optimizer can generate an optimal execution plan. For more information about the statistics, see Using EXPLAIN.

Statistics collection has the following drawback: if statistics are collected manually or periodically instead of in real time, they may not reflect the latest data characteristics. You must execute the EXPLAIN statement and check whether the statistical information in its output is correct. In the output, the rows parameter indicates the estimated number of rows processed by an operator, and the width parameter indicates the estimated row width.

If the statistical information is outdated, a poor execution plan may be generated, as demonstrated in the following example.

  • Table tmp1 contains 10 million rows of data, and table tmp contains 1,000 rows of data. In Hologres, the default value of the rows parameter in statistical information is 1000. In the following execution result of the EXPLAIN statement, the number of rows in tmp1 is not the actual value, which indicates that the statistical information is not up to date.
    Seq Scan on tmp1 (cost=0.00..5.01 rows=1000 width=1)
  • When tmp1 and tmp are joined, the expected execution plan is that tmp1 is processed before tmp and the smaller table tmp is used to build the hash table for the HASH JOIN operation, because the data volume of tmp1 is much larger than that of tmp. However, the statistical information for tmp1 is not up to date. As a result, tmp1 is used to build the hash table, which leads to low efficiency or even an out of memory (OOM) exception. Therefore, you must execute the ANALYZE statement to collect statistical information for both tables that are involved in the JOIN operation.
    analyze tmp;
    analyze tmp1;
    After you execute the ANALYZE statement, the tables that are involved in the JOIN operation are processed in the correct sequence. In the execution result of the EXPLAIN statement after the ANALYZE statement is executed, tmp1 is processed before tmp, tmp is used to build the hash table, and the number of rows in tmp1 is 10 million. This indicates that the statistical information is up to date.

If rows=1000 is displayed for tmp1 in the execution result of the EXPLAIN statement, the statistical information is not synchronized to the optimizer. Low query performance is usually caused by a lack of statistical information. You can execute the ANALYZE TABLE statement to collect statistical information and synchronize the information to the optimizer in time.

We recommend that you execute the ANALYZE TABLE statement in the following scenarios:
  • Data is imported.
  • A large number of INSERT, UPDATE, and DELETE operations are performed.
  • Statistics are required for both internal and foreign tables.
  • Statistics are required for a parent partitioned table.
  • If you encounter one of the following problems, you must execute the ANALYZE TABLE statement before you run an import task. This can systematically improve efficiency.
    • An OOM exception occurs when multiple tables are joined.

      The Query executor exceeded total memory limitation xxxxx: yyyy bytes used error is reported.

    • Import efficiency is low.

      When you query or import data in Hologres, tasks cannot end in time. This results in low efficiency.

Select a join order algorithm for the optimizer
  • When complex SQL join operations are involved and many tables are joined, the optimizer takes an extended period of time to select the optimal join order. Adjusting the join order policy may reduce the amount of time required to optimize queries. The following syntax is used to select a join order algorithm for the optimizer:
    set optimizer_join_order = '<value>'; 
  • Parameters
    Parameter Description
    value The join order algorithm to be used by the optimizer. Valid values:
    • query: uses the join order that is specified in the SQL statement. This algorithm has the lowest optimizer overheads.
    • greedy: uses a greedy algorithm to obtain a better join order. This algorithm has moderate optimizer overheads.
    • exhaustive: uses a dynamic planning algorithm to obtain the optimal join order. This generates the optimal execution plan. However, this algorithm has the highest optimizer overheads.
    Default value: exhaustive.
  • Usage notes
    • You can use the exhaustive algorithm to obtain the globally optimal join order. The amount of time required for optimization depends on the number of tables to be joined. For example, if more than 10 tables are to be joined, much time may be required to obtain the optimal join order. It takes less time to use the query or greedy algorithm, but neither of them can ensure that the optimal join order can be obtained.
    • You can set a join order algorithm for a session or a query.
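    For example, when a query joins many tables and optimization time matters more than plan quality, the query algorithm can be set for the current session. This is a sketch based on the syntax above; t1, t2, and t3 are placeholder table names:

    ```sql
    -- Use the join order written in the SQL statement for this session
    -- to reduce optimization time for a many-table join.
    set optimizer_join_order = 'query';
    select *
    from t1
    join t2 on t1.a = t2.a
    join t3 on t2.b = t3.b;

    -- Restore the default algorithm afterwards.
    set optimizer_join_order = 'exhaustive';
    ```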
You can use one of the following methods to check whether the statistical information is up to date:
  • Query the settings of the analyze_tuple column in the system table named hologres.hg_table_properties to check whether the number of rows is correct. Alternatively, you can view the value of the rows parameter in the Scan node.
  • Query the system table named hologres.hg_stats to obtain information such as the histogram, average width, and number of distinct values of each column.
    Note In Hologres V0.10 or later, statistical information is recorded in the system table named pg_stats.
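For example, the checks above can be issued as SQL queries. The filter column names used here (table_name, tablename) are assumptions based on the system tables mentioned above and should be verified in your Hologres version:

```sql
-- Check the collected row count for a table (see the analyze_tuple column).
select * from hologres.hg_table_properties
where table_name = 'tmp1';  -- 'tmp1' is a placeholder table name

-- Inspect per-column statistics such as histograms,
-- average width, and the number of distinct values.
select * from hologres.hg_stats
where tablename = 'tmp1';
```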

View and change the number of shards after scale-up

The number of shards affects the DOP of queries and is critical to query performance. If only a few shards are available, the DOP is low. Excessive shards increase query startup overheads and decrease query efficiency. In addition, a large number of small files occupy much metadata management space in the memory. You can improve query efficiency and reduce memory overheads by setting an appropriate number of shards based on the instance type.

The number of shards is pre-allocated based on the instance type of a Hologres instance. The default number of shards for an instance is approximately the number of CPU cores that are used for core queries in the instance. The number of CPU cores that are used for core queries is smaller than the number of purchased CPU cores. The purchased CPU cores are allocated to different nodes such as query nodes, access nodes, control nodes, and scheduling nodes. For more information about the default numbers of shards for instances of different instance types, see Instance types. After an instance is scaled up, the number of shards is not automatically changed for a database created before the instance is scaled up. You must change the number of shards based on your business requirements. For a database created after the instance is scaled up, the number of shards is determined based on the new instance type.

You must change the number of shards in the following scenarios:
  • After a Hologres instance is scaled up, the business continues to grow and the existing number of shards can no longer meet query efficiency requirements. In this case, you can create a table group and specify a larger number of shards for the table group. Existing tables and data remain in the original table group, and you must import them into the new table group for resharding.
  • After a Hologres instance is scaled up, new business needs to be launched without affecting the original business. In this case, you can create a table group and specify an appropriate number of shards for the table group without modifying the schema of the original table.
Note You can create multiple table groups in a database. Make sure that the total number of shards for all table groups does not exceed the default number of shards recommended for the instance. This way, you can make the most out of CPU resources.
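As a sketch of the resharding scenario above, a new table group with a larger shard count can be created and a table rebuilt in it. The hg_create_table_group call and the table_group property shown here follow common Hologres usage and should be verified against your version; all names and the shard count are examples:

```sql
-- Create a table group with 32 shards (name and count are examples).
call hg_create_table_group('new_tg', 32);

-- Re-create the table in the new table group and reimport its data.
begin;
create table tmp_new (a int, b int, c int);
call set_table_property('tmp_new', 'table_group', 'new_tg');
commit;
insert into tmp_new select * from tmp;
```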

For more information about how to create a table group or modify the default table group of a database, see Best practices for specifying table groups.

Select an appropriate distribution key when you create a table

A distribution key is used to evenly divide data into multiple shards. This can help prevent data skew. If you specify the distribution keys of multiple tables as the join keys to perform a local JOIN operation on the tables, queries can be accelerated. When you create a table, you can select an appropriate distribution key based on the following rules:
  • Specify the join keys as the distribution keys.
    For example, the execution result of the EXPLAIN statement indicates that the execution plan for querying tmp and tmp1 contains a redistribution operation but no local JOIN operation. As a result, query efficiency is low. You must re-create the two tables and specify the join keys as the distribution keys. This prevents additional overheads caused by data redistribution when the tables are joined. You can execute the following DDL statements to re-create the two tables and specify the join keys as the distribution keys:
    begin;
    create table tmp(a int, b int, c int);
    call set_table_property('tmp', 'distribution_key', 'a');
    commit;
    
    begin;
    create table tmp1(a int, b int, c int);
    call set_table_property('tmp1', 'distribution_key', 'b');
    commit;
    
    -- Specify the distribution keys as the join keys. 
    select count(1) from tmp join tmp1 on tmp.a = tmp1.b ;
    After you reset the distribution keys for the two tables, execute the EXPLAIN statement again. The execution result indicates that data is evenly distributed to shards by hash key and that the Redistribute Motion operator is no longer generated. This means that data is not redistributed and additional network overheads are prevented. Hologres has four types of Motion operators that correspond to four data redistribution scenarios, as described in the following table.
    Motion operator type Description
    Redistribute Motion Data is hash-distributed or randomly distributed to one or more shards.
    Broadcast Motion Data is copied to all shards. The Broadcast Motion operator is superior only in scenarios where both the number of shards and the number of broadcast tables are small.
    Gather Motion Data is gathered to one shard.
    Forward Motion Data is transmitted between an external data source or execution engine and a Hologres execution engine. This operator is used for federated queries.
    When you view the execution result of the EXPLAIN statement, take note of the following items:
    • If the Motion operator is time-consuming, you can redesign the distribution key.
    • If a Gather Motion or Broadcast Motion operator is generated due to invalid statistical information, you can execute the ANALYZE TABLE statement to use the more efficient Redistribute Motion operator.
    • You can execute the following statements to prohibit the generation of Motion operators and then compare the query time of different Motion operators:
      -- Prohibit the generation of Broadcast Motion operators. 
      set optimizer_enable_motion_broadcast = off; 
      
      -- Prohibit the generation of Redistribute Motion operators. 
      set optimizer_enable_motion_redistribute = off; 
  • Select a column that is frequently used in the GROUP BY clause as the distribution key.
    The GROUP BY operation redistributes data based on the key of the column in the GROUP BY clause. If the GROUP BY operation is time-consuming, you can use the column in the GROUP BY clause as the distribution key.
    -- Data is hash-redistributed based on the values in column a. 
    select a from t1 group by a; 
  • Select a column in which data is evenly distributed and discrete as the distribution key.
    Queries slow down if data is unevenly distributed between multiple shards. You can execute the following statement to check whether data is evenly distributed:
    select count(1) from t1 group by hg_shard_id;
    If data is unevenly distributed, you must select a column in which data is evenly distributed and discrete as the distribution key.
    Note To change the distribution key, you must create another table and import data.

Disable dictionary encoding

For queries on columns of the STRING type (including the TEXT, CHAR, and VARCHAR types), dictionary encoding reduces the amount of time consumed to compare strings, but the related encoding and decoding cause large overheads.

By default, Hologres enables dictionary encoding for all columns of the STRING type. You can leave the dictionary_encoding_columns property empty or disable dictionary encoding for specific columns. If you modify the dictionary encoding setting, data files are re-encoded for storage, which consumes CPU and memory resources. We recommend that you modify dictionary encoding settings during off-peak hours.

You can obtain statistical information to check whether the Decode operator is time-consuming. If it is, disable dictionary encoding. If a table contains a large number of columns of the STRING type, we recommend that you disable dictionary encoding to improve performance.

If a table contains a large number of columns of the STRING type, you can determine whether to enable dictionary encoding for the columns based on your business requirements. For example, you can execute the following statements:
begin;
create table tbl (a int not null, b text not null, c int not null, d int);
-- An empty value disables dictionary encoding for all columns.
call set_table_property('tbl', 'dictionary_encoding_columns', '');
commit;

Other optimization items

  • Select an appropriate storage type.
    Hologres supports row-oriented, column-oriented, and hybrid row-column storage modes. You can select a storage mode that best suits your business scenario. The following table describes the storage modes.
    Storage mode Scenario Disadvantage
    Row-oriented storage Point query scenarios in which data is queried with high QPS based on primary keys, and scenarios in which entire rows are read and high performance of the UPDATE, DELETE, and INSERT operations is required. The performance of large-scale data queries, full table scans, and aggregations is low.
    Column-oriented storage Data analysis scenarios such as multi-column range queries, single-table aggregation, and multi-table joins. The performance of the UPDATE and DELETE operations is lower than that in the row-oriented storage mode. In scenarios without specified indexes, the performance of point queries is also lower than that in the row-oriented storage mode.
    Hybrid row-column storage Scenarios in which both the row-oriented and column-oriented storage modes are applicable. The memory overheads are high.
  • Select an appropriate data type.
    Hologres supports a variety of data types. You can select a data type that best suits your business needs. The following rules apply:
    • Select a type that requires minimal storage space.
      • Preferentially use the INT type rather than the BIGINT type.
      • Preferentially use the DECIMAL or NUMERIC type with the minimal PRECISION and SCALE that meet your needs. To prevent errors in statistical calculations, do not use the FLOAT or DOUBLE PRECISION type.
      • Do not use the FLOAT or DOUBLE type for columns in the GROUP BY clause.
      • Preferentially use the TEXT type rather than the VARCHAR(N) and CHAR(N) types. If you must use VARCHAR(N) or CHAR(N), set N as small as possible.
      • Use the TIMESTAMPTZ and DATE types rather than the TEXT type to represent a date.
    • Use consistent data types for JOIN operations.

      When you join tables, make sure that the columns that match each other use the same data type. This prevents the additional overheads caused when Hologres implicitly converts columns to different data types.

    • Do not use the FLOAT or DOUBLE type in operations such as UNION and GROUP BY.

      Operations such as UNION and GROUP BY do not support the DOUBLE PRECISION or FLOAT type. You must use the DECIMAL type.
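      For example, a floating-point column can be cast to DECIMAL before it participates in a GROUP BY operation. This is a sketch; the column name d_col in table t2 is hypothetical:

      ```sql
      -- Cast the DOUBLE PRECISION column to DECIMAL so that GROUP BY is supported.
      select cast(d_col as decimal(18, 4)) as d_dec, count(1)
      from t2
      group by 1;
      ```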

  • Select an appropriate primary key.
    Primary keys are used to uniquely identify data. They are applicable to data import scenarios with duplicate primary key values. When you import data, you can specify the option parameter to select a deduplication mode. Valid values of option:
    • ignore: ignores new data.
    • update: uses new data to overwrite existing data.

    Proper settings of primary keys can help the optimizer generate proper execution plans in specific scenarios, such as scenarios in which GROUP BY pk,a,b,c is executed for data queries.

    In column-oriented storage mode, primary keys may degrade the performance of data writing. In most cases, the performance of data writing without configured primary keys is three times higher than that with configured primary keys.

  • Select an appropriate partitioned table.
    Hologres tables support only one level of partitioning. Proper partition settings accelerate queries. Improper partition settings such as a large number of partitions cause excessive small files and degrade query performance.
    Note For data imported by day, we recommend that you create a partitioned table for the data of each day. The data is separately stored. Only the data of the current day is accessed.
    You can configure partitions in the following scenarios:
    • When all data in a child partition needs to be deleted, you can drop or truncate the child partitioned table without affecting data in other child partitions. The DROP and TRUNCATE statements provide higher performance than the DELETE statement.
    • For queries of partition key columns in predicate conditions, you can find corresponding partitions based on indexes of the partition key columns. You can also directly query child partitions.
    • A partitioned table is required for real-time data that is imported on a periodic basis. If new data is imported every day, you can use the date column as a partition key column and import data to a child partition every day. For example, you can execute the following statements:
      begin;
      create table insert_partition(c1 bigint not null, c2 boolean, c3 float not null, c4 text, c5 timestamptz not null) partition by list(c4);
      call set_table_property('insert_partition', 'orientation', 'column');
      commit;
      create table insert_partition_child1 partition of insert_partition for values in('20190707');
      create table insert_partition_child2 partition of insert_partition for values in('20190708');
      create table insert_partition_child3 partition of insert_partition for values in('20190709');
      
      select * from insert_partition where c4 >= '20190708';
      select * from insert_partition_child3;
  • Select an appropriate index.
    Hologres uses various types of indexes to implement different features. You can select an index that best suits your business scenario to improve query performance. The following table describes the supported indexes.
    Index type Index name Description Recommendation Sample query statement
    clustering_key Clustered index A file-level index. Data in a file is sorted by this index. For range queries, Hologres can use the orderliness of the clustered index to filter data. Use a range or filter query column as the clustered index. Index-based data filtering complies with the leftmost prefix matching principle. We recommend that you specify no more than two columns. select sum(a) from tb1 where a > 100 and a < 200;
    bitmap_columns Bitmap index A file-level index. Bitmaps are created for data in a file based on this index. For equivalent queries, Hologres can encode the data of each row by value and perform bit operations to locate the rows where the data resides. The time complexity is O(1). Use an equivalent query column as the bitmap index. select * from tb1 where a = 100;
    segment_key (also known as event_time_column) Segment key A file-level index. Data is written to files in append-only mode, and small files are then merged based on this index. A segment key identifies the boundary of a file, so Hologres can use it to find the required files. A segment key is designed for ordered, range-specific data such as timestamps and dates, and is therefore strongly correlated with the data write time. Use this index to filter files, and then use the bitmap index or clustered index for range or equivalent queries within files. Index-based data filtering complies with the leftmost prefix matching principle, so we recommend that you specify only a single column, preferably the first non-empty timestamp field. select sum(a) from tb1 where ts > '2020-01-01' and a < '2020-03-02';
    Both the clustered index and the segment key comply with the leftmost prefix matching principle of traditional databases such as MySQL: data is sorted by the values in the leftmost column of the index first, and rows with equal values in that column are then sorted by the next column. For example, you can execute the following statements:
    call set_table_property('tmp', 'clustering_key', 'a,b,c');
    select * from tmp where a > 1 ;  --The clustered index can be used. 
    select * from tmp where a > 1 and c > 2 ;   --The clustered index can be used only for column a. 
    select * from tmp where a > 1 and b > 2 ;  --The clustered index can be used for both columns a and b. 
    select * from tmp where a > 1 and b > 2 and c > 3 ; --The clustered index can be used for columns a, b, and c. 
    select * from tmp where b > 1 and c > 2 ;   --The clustered index cannot be used for column b or c. 
    A bitmap index supports AND and OR queries among multiple columns. Sample statements:
    call set_table_property('tmp', 'bitmap_columns', 'a,b,c');
    select * from tmp where a = 1 and b = 2 ;  --The bitmap index can be used. 
    select * from tmp where a = 1 or b = 2 ;  --The bitmap index can be used. 
    Note You can add a bitmap index after you create a table. The clustered index and segment key are specified only when you create a table and cannot be added after the table is created.

Check whether an index is used

Create a table named tmp and specify index fields by executing the following statements:
begin;
create table tmp(a int not null, b int not null, c int not null);
call set_table_property('tmp', 'clustering_key', 'a');
call set_table_property('tmp', 'segment_key', 'b');
call set_table_property('tmp', 'bitmap_columns', 'a,b,c');
commit;
  • To check whether the clustered index is used, execute the following statement:
    explain select * from tmp where a > 1;
  • To check whether the bitmap index is used, execute the following statement:
    explain select * from tmp where c = 1;
  • To check whether the segment key is used, execute the following statement:
    explain select * from tmp where b > 1;

Optimize single-stage aggregation to multi-stage aggregation

If aggregate operators are time-consuming, check whether local shard-level pre-aggregation is performed.

You can perform aggregation operations on individual shards to decrease the volume of data that is involved in the final aggregation operation and improve performance. The following aggregation operations are supported:
  • Three-stage aggregation: aggregates data by file, then aggregates data in each shard, and finally summarizes the aggregation results of all shards.
  • Two-stage aggregation: aggregates data in each shard and then summarizes the aggregation results of all shards.
You can execute the following statement to forcefully perform multi-stage aggregation in Hologres:
set optimizer_force_multistage_agg = on;

Optimize SQL statements

You can optimize SQL statements to improve query efficiency.
  • Optimize data writes and updates by using fixed plans.

    Fixed plans are a specific optimization provided by Hologres for high-throughput data write, update, and delete scenarios. Simplified execution paths are used to improve performance and throughput by several times. For more information about fixed plans, see Accelerate the execution of SQL statements by using fixed plans.

  • Change operators.

    If an SQL statement contains operators that are not supported by a Hologres execution engine, the system sends the operators to a PostgreSQL execution engine. In this case, query performance is low. You must optimize the SQL statement. The following example demonstrates how to optimize an SQL statement:

    Hologres does not support the NOT IN operation, so the system uses a PostgreSQL execution engine to perform it. To optimize the statement, change NOT IN to NOT EXISTS. Before optimization, execute the following statement to check the query plan for the SQL statement:
    explain select * from tmp where a not in (select a from tmp1);
    The External operator indicates that the SQL statement is executed in the PostgreSQL execution engine.
    After optimization, execute the following statement to check the query plan for the SQL statement. The SQL statement is not executed in the PostgreSQL execution engine.
    explain select * from tmp where not exists (select a from tmp1 where a = tmp.a);
    You can change functions to send operators to a Hologres built-in execution engine. The following table describes recommended changing results.
    Function not supported by Hologres Recommended function Example Description
    regexp_split_to_table(string text, pattern text ) unnest(string_to_array)
    select name,unnest(string_to_array(age,',')) from demo;
    The regexp_split_to_table function supports regular expressions.
    substring extract(hour from to_timestamp(c1, 'YYYYMMDD HH24:MI:SS'))
    select cast(substring(c1, 13, 2) as int) AS hour from t2;
    Change it to:
    select extract(hour from to_timestamp(c1, 'YYYYMMDD HH24:MI:SS')) from t2;
    Some Hologres V0.10.x versions and all earlier versions do not support the substring function.
    regexp_replace replace
    select regexp_replace(c1::text,'-','') from t2;
    Change it to:
    select replace(c1::text,'-','') from t2;
    The replace function does not support regular expressions.
    at time zone 'utc' Delete the at time zone 'utc' function.
    select date_trunc('day',to_timestamp(c1, 'YYYYMMDD HH24:MI:SS')  at time zone 'utc') from t2
    Change it to:
    select date_trunc('day',to_timestamp(c1, 'YYYYMMDD HH24:MI:SS') ) from t2;
    None.
    cast(text to timestamp) to_timestamp
    select cast(c1 as timestamp) from t2;
    Change it to:
    select to_timestamp(c1, 'yyyyMMdd hh24:mi:ss') from t2;
    None.
    timestamp::text to_char
    select c1::text from t2;
    Change it to:
    select to_char(c1, 'yyyyMMdd hh24:mi:ss') from t2;
    None.
    array_agg string_agg
    select array_agg(c1) from t2;
    Change it to:
    select string_agg(c1,' ') from t2;
    None.
  • Use INSERT ON CONFLICT instead of UPDATE.

    The UPDATE statement uses table-level locks, which have a relatively coarse granularity. Concurrent UPDATE operations therefore spend more time waiting for locks. The INSERT ON CONFLICT statement uses row-level locks and provides higher execution efficiency.
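    For example, an UPDATE on a table with primary key pk can often be rewritten as follows. This is a sketch; the table name t and the columns pk and val are hypothetical:

    ```sql
    -- Before: UPDATE acquires a table-level lock.
    update t set val = 'new' where pk = 1;

    -- After: INSERT ON CONFLICT acquires a row-level lock.
    insert into t (pk, val) values (1, 'new')
    on conflict (pk) do update set val = excluded.val;
    ```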

  • Do not perform fuzzy searches.

    Fuzzy search operations such as LIKE cannot use indexes, so a fuzzy search triggers a full table scan.

  • Prevent OOM exceptions.
    The following table describes the commands that you can run to prevent OOM exceptions.
    Command Description Valid values Recommendation
    set hg_experimental_scan_node_dop=<number>; Sets the number of tables that the scan operator concurrently scans in a shard. The maximum value is the number of tables in a shard. By default, this parameter is set to the optimal value for query performance. 0 to 512
    Note If the value is too large, an OOM exception may occur. As a result, the import or query may fail, or the instance may even restart and make the service unavailable. If the value is too small, query performance may be low.
    1. Execute the SHOW hg_experimental_scan_node_dop statement to query the current value.
    2. Divide the queried value by two to obtain the recommended value of the hg_experimental_scan_node_dop parameter.
      Note You can also use other values based on your business requirements.
    set hg_experimental_dml_bulkload_dop=<number>; Sets the number of tables that are concurrently written or updated. The maximum value is the number of tables in a shard, and the minimum value is 1. By default, this parameter is set to the optimal value for query performance. 0 to 512
    Note If the value is too large, an OOM exception may occur. As a result, the import or query may fail, or the instance may even restart and make the service unavailable. If the value is too small, query performance may be low.
    1. Execute the SHOW hg_experimental_dml_bulkload_dop statement to query the current value.
    2. Divide the queried value by two to obtain the recommended value of the hg_experimental_dml_bulkload_dop parameter.
      Note You can also use other values based on your business requirements.