Use Practice: Fixed Plan Speeds Up SQL Execution

Use Practice: Fixed Plan Speeds Up SQL Execution


This article will introduce how to speed up SQL operation through fixed plan in Hologres

SQL Execution.Query execution process


In Hologres, when the client initiates an SQL, the execution process is as follows (take one of the nodes as an example):
1.The Frontend (FE) node parses and authenticates the SQL, and distributes it to different execution modules of the Query Engine.
2.The execution engine (Query Engine) will take different execution paths according to the characteristics of SQL.
oIf it is a point-check/point-write scenario, the Query Optimizer (QO) will be skipped, and data will be directly distributed to the backend to reduce data transmission links and achieve better performance. The entire execution link is also called the Fixed Plan. The point check (with HBase's KV query) and point write scenarios will go directly to the Fixed Plan.
oIf it is an OLAP query and write scenario: First, the optimizer (Query Optimizer, QO) parses the SQL to generate an execution plan. In the execution plan, the operator execution cost, statistical information, space clipping, etc. are estimated. Through the generated execution plan, QO decides to use HQE, PQE, SQE or Hive QE to perform real computation on the operator.
3.The execution engine determines the correct execution plan, then obtains data through the storage engine (Storage Engine, SE), and finally merges the data on each shard and returns it to the client.

Fixed Plan is Hologres' unique execution engine optimization method. Traditional SQL execution needs to go through multiple components such as optimizer, coordinator, query engine, storage engine, etc., while Fixed Plan chooses the short path (Short-Cut) to optimize the execution of SQL. Bypass the overhead of the optimizer, coordinator, and some query engines. The Fixed FrontEnd is directly connected to the Fixed Query Engine to double the SQL execution efficiency. It is a key optimization method to support high-throughput real-time writing and high-concurrency queries.


The SQL that can be selected by the Fixed Plan needs to meet certain conditions and certain parameter configurations. These configuration parameters and SQL conditions will be introduced below.


Related GUC parameters
GUC list
The following are the parameter configurations that the fixed plan needs to use:
Note: The following Fixed Plan related parameters have been enabled by default in the Holo Client and take effect at the session level.
GUC use
1) Check if GUC is turned on
Use the show command to check whether the GUC is enabled:
show < GUC_name > ;

--Example: Check whether fixed plan is enabled at the instance level
show hg_experimental_enable_fixed_dispatcher ;

2) Turn on GUC at the session level
GUC parameters can be set at the session level through the set command. Session-level parameters only take effect in the current session. When the connection is disconnected, they will become invalid. It is recommended to add them before SQL and execute them together.
Syntax examples are as follows.
set < GUC_name > = < values > ;
GUC_name is the name of the GUC parameter, and values is the value of the GUC parameter.
An example of use is as follows.
--insert on conflict multi-line records support Fixed Plan writing
set hg_experimental_enable_fixed_dispatcher_for_multi_values = on ;

3) Database level
You can use the alter database xx set xxx command to set the GUC parameters at the DB level. After the execution is completed, it will take effect at the entire DB level. After the setting is completed, the current connection needs to be disconnected again to take effect. The newly created DB will not take effect and needs to be manually set again. Syntax examples are as follows.
alter database < db_name > set < GUC_name > = < values > ;
Where: db_name is the database name, GUC_name is the name of the GUC parameter, and values is the value of the GUC parameter.
The usage example is as follows:
--DB level open fixed plan
alter database < db_name > set
hg_experimental_enable_fixed_dispatcher = on ;
Requirements for data types
•Each column of the table cannot be MONEY or MONEY ARRAY
•Types supported by DML (INSERT/UPDATE/DELETE) columns and SELECT (select target columns and where columns must be satisfied) columns:
oBOOLEAN (alias BOOL)
oSMALLINT
oINTEGER (alias INT or INT4)
oBIGINT (alias INT8)
oFLOAT (alias FLOAT4)
oDOUBLE PRECISION (alias FLOAT8)
oCHAR(n)
oVARCHAR(n)
oBYTEA
oJSON and JSONB
oTEXT (alias VARCHAR)
oTIMESTAMP WITH TIME ZONE (alias TIMESTAMPTZ)
oDATE
oTIMESTAMP
oDECIMAL (alias NUMERIC)
oROARINGBITMAP
oarray type
boolean[]
smallint[]
int4[]
int8[]
float4[]
float8[]
char(n)[]
varchar(n)[]
text[]
INSERT scene
Insert expression
fixed plan supports the following insert expressions:
---Write a single line
insert into table ( col1 , col2 , col3.. ) values ( ? , ? , ?.. ) on conflict xxx ;
--- write multiple lines
insert into table ( col1 , col2 , col3.. ) values ( ? , ? , ?.. ) , ( ? , ? , ?.. ) on conflict xxx ;
•Support inner table, not outer table
•Partitioned sub-table is supported, but partitioned main table is not supported (V1.3 version supports writing to partitioned main table)
Insert on conflict single line
Support scenarios:
1.Support no on conflict expression
2.Support on conflict do nothing expressions
3.When on conflict do update is supported, all insert non-PK (Primary Key, primary key, hereinafter referred to as PK) columns must be updated. Whether PK is updated can be updated, and it can only be updated by col = excluded.col
4.Example of use:
begin ;
create table test_insert_oneline (
pk1 int ,
pk2 int ,
col1 int ,
col2 int ,
primary key ( pk1 , pk2 )
) ;
commit ;

--update all non-PK columns can be Fixed Plan
insert into test_insert_oneline values ( 1 , 2 , 3 , 4 ) on conflict ( pk1 , pk2 ) do update set col1 = excluded .col1 , col2 = excluded .col2 ;

--update all columns (including PK and non-PK), can be Fixed Plan
insert into test_insert_oneline values(1,2,3,4) on conflict(pk1,pk2) do update set col1 = excluded.col1, col2 = excluded.col2, pk1 = excluded.pk1, pk2 = excluded.pk2;

--Must update all non-pk columns to be inserted, this example does not contain col2, so it cannot be Fixed Plan
insert into test_insert_oneline values ( 1 , 2 , 3 , 4 ) on conflict ( pk1 , pk2 ) do update set col1 = excluded .col1 ;

--It must be updated by set col = excluded.col, so it cannot be Fixed Plan
insert into test_insert_oneline values ( 1 , 2 , 3 , 4 ) on conflict ( pk1 , pk2 ) do update set col1 = excluded .col1 , col2 = 5 ;
Insert on conflict multiple lines
1) When insert on conflict has multiple lines, the expression is as follows:
set hg_experimental_enable_fixed_dispatcher_for_multi_values =on;
insert into table(col1,col2,col3..) values(?,?,?..),(?,?,?..) on conflict xxx;
•需要配置GUC hg_experimental_enable_fixed_dispatcher_for_multi_values为on
•Atomicity is not guaranteed, that is, when writing multiple entries at a time, if no error is reported, all are written normally. If an error is reported, it may not be written at all or may be partially written and partially written


2) Write multiple lines Another way of writing:
set hg_experimental_enable_fixed_dispatcher_for_multi_values = on ;
insert into table select
unnest ( ARRAY [ true , false , true ] :: bool []) ,
unnest ( ARRAY [ 1 , 2 , 3 ] :: int4 []) ,
unnest ( ARRAY [ 1.11 , 2.222 , 3 ] :: float4 []) on conflict xxx ;
•Need to configure hg_experimental_enable_fixed_dispatcher_for_multi_values to on
•The column being written cannot be an array type
•ARRAY in unnest must be explicitly cast to the array type of the corresponding column type
•Example of use:
begin;
create table test_insert_multiline(
pk1 int8,
col1 float4,
primary key(pk1)
);
commit;

--支持Fixed Plan
set hg_experimental_enable_fixed_dispatcher_for_multi_values =on;
insert into test_insert_multiline select unnest ( ARRAY [ 1 , 2 , 3 ] :: int8 []) , unnest ( ARRAY [ 1.11 , 2.222 , 3 ] :: float4 []) on conflict do nothing ;

--unnest ARRAY has no explicit cast, does not support Fixed Plan
insert into test_insert_multiline select unnest ( ARRAY [ 1 , 2 , 3 ]) , unnest ( ARRAY [ 1.11 , 2.222 , 3 ]) on conflict do nothing ;

--The first column is int8, so it should be cast to int8[], the example here is int4[], so Fixed Plan is not supported
insert into test_insert_multiline select unnest ( ARRAY [ 1 , 2 , 3 ] :: int4 []) , unnest ( ARRAY [ 1.11 , 2.222 , 3 ] :: float4 []) on conflict do nothing ;
local update scene
Hologres supports updating some columns of the table through the primary key. Fixed Plan can also support partial updating, and the following conditions must be met
•Insert columns need to correspond one-to-one with update columns, including quantity and order
•Can only be updated by col = excluded.col

Default column
When the table contains the default column, the conditions of the fixed plan can be followed when writing:
•Support when inserting a single
•When inserting multiple entries, the instance version needs to be >= 1.1.36. If it is lower than that, please upgrade. At the same time, you need to set the GUC parameter hg_experimental_enable_fixed_dispatcher_for_multi_values to on
•Tables with default columns do not support Fixed Plan for insert on conflict expressions
•Example of use:
begin ;
create table test_insert_default (
pk1 int ,
col1 int default 99 ,
primary key ( pk1 )
) ;
commit ;

--Support Fixed Plan
insert into test_insert_default ( pk1 ) values ( 1 ) ;

--Requires V1.1.36+ version support
set hg_experimental_enable_fixed_dispatcher_for_multi_values = on ;
insert into test_insert_default ( pk1 ) values ( 1 ) , ( 2 ) , ( 3 ) ;
Serial column
When the table has an auto-incrementing sequence serial, it supports single or multiple writes to take the fixed plan. The usage conditions are as follows:
•Need to configure GUC hg_experimental_enable_fixed_dispatcher_autofill_series to on
•When writing multiple lines, you need to configure the GUC parameter hg_experimental_enable_fixed_dispatcher_for_multi_values to on
•Tables with serial columns do not support Fixed Plan for insert on conflict expressions
begin ;
create table test_insert_serial (
pk1 int ,
col1 serial ,
primary key ( pk1 )
) ;
commit ;

--Support Fixed Plan
set hg_experimental_enable_fixed_dispatcher_autofill_series = on ;
insert into test_insert_serial (pk1) values(1);

--支持Fixed Plan
set hg_experimental_enable_fixed_dispatcher_autofill_series =on;
set hg_experimental_enable_fixed_dispatcher_for_multi_values =on;
insert into test_insert_serial ( pk1 ) values ( 1 ) , ( 2 ) , ( 3 ) ;

UPDATE scene
Update expression
The expressions that can take the fixed plan during update are as follows:
set hg_experimental_enable_fixed_dispatcher_for_update = on ;
update table set col1 = ? , col2 = ? where pk1 = ? and pk2 = ?;
•Need to configure GUC hg_experimental_enable_fixed_dispatcher_for_update to on.
•Support inner table, not outer table
•Cannot be a partitioned main table, supports partitioned sub-tables
•The table must have a primary key (pk)
Conditions of use of Update
The situations supported by the updtae scene are as follows:
•The column of set cannot be pk
•where there are and only all pk
•You can use pk in (?,?,?) or pk = ANY() to change multiple lines at one time eg pk1 in (1,2) and pk2 = any('{3,4}') and pk3 = 5 <=> Change (1,3,5), (1,4,5), (2,3,5), (2,4,5) four
•The same column in where can only have one condition (exactly the same as one condition)
•Example of use:
begin ;
create table test_update (
pk1 int ,
pk2 int ,
col1 int ,
col2 int ,
primary key(pk1, pk2)
);
commit;

--支持Fixed Plan
set hg_experimental_enable_fixed_dispatcher_for_update =on;
update test_update set col1 = 1, col2 = 2 where pk1 = 3 and pk2 = 4;

--支持Fixed Plan
set hg_experimental_enable_fixed_dispatcher_for_update =on;
update test_update set col1 = 1 where pk1 = 3 and pk2 = 4;

--支持Fixed Plan
set hg_experimental_enable_fixed_dispatcher_for_update =on;
update test_update set col1 = 1 , col2 = 2 where pk1 in ( 1 , 2 ) and pk2 = any ( '{3,4}' ) ;

--pk1 multiple filter conditions, does not support Fixed Plan
update test_update set col1 = 1 , col2 = 2 where pk1 = 3 and pk1 = 4 ;

--pk1 multiple filter conditions, does not support Fixed Plan
update test_update set col1 = 1 , col2 = 2 where pk1 in ( 1 , 2 ) and pk1 = 1 ;

--pk1 Multiple filter conditions, but the same filter conditions, support Fixed Plan
set hg_experimental_enable_fixed_dispatcher_for_update = on ;
update test_update set col1 = 1 , col2 = 2 where pk1 in ( 1 , 2 ) and pk1 in ( 1 , 2 ) and pk2 = 4 ;


DELETE scene
Delete expression
set hg_experimental_enable_fixed_dispatcher_for_delete = on ;
delete from table where pk1 = ? and pk2 = ? and pk3 = ?;
•Need to configure GUC hg_experimental_enable_fixed_dispatcher_for_delete to on
•Support inner table, not outer table
•Does not support partitioned main table, supports partitioned sub-table
•Table must have pk
Conditions of use of Delete
The scenarios supported by delete scenarios are as follows:
•where there are and only all pk
•You can use pk in (?,?,?) or pk = ANY() to delete multiple entries at once eg pk1 in (1,2) and pk2 = any('{3,4}') and pk3 = 5 <=> Delete (1,3,5), (1,4,5), (2,3,5), (2,4,5) four
•There can only be one condition in the same column (exactly the same as one condition)
•Example of use:
begin ;
create table test_delete (
pk1 int ,
pk2 int ,
col1 int ,
col2 int ,
primary key ( pk1 , pk2 )
) ;
commit ;

--Support Fixed Plan, more scenes are consistent with Update samples
set hg_experimental_enable_fixed_dispatcher_for_delete = on ;
delete from test_delete where pk1 = 1 and pk2 = 2 ;


SELECT scene
Select expression
select col1 , col2 , col3 ,... from table where pk1 = ? and pk2 = ? and pk3 = ?;
•Support inner table, not outer table
•Does not support partitioned main table, supports partitioned sub-table
•Table must have pk
Check (key/value) scene
The situations supported by the check-in scene are as follows:
•where there are and only all pk
•You can use pk in (?,?,?) or pk = ANY() to check multiple entries at once eg pk1 in (1,2) and pk2 = any('{3,4}') and pk3 = 5 <=> Check (1,3,5), (1,4,5), (2,3,5), (2,4,5) four
•There can only be one condition in the same column (exactly the same as one condition)
•If there is a limit, the value of limit must be > 0
begin ;
create table test_select (
pk1 int ,
pk2 int ,
col1 int ,
col2 int ,
primary key ( pk1 , pk2 )
) ;
commit ;

--Support Fixed Plan, more scene conditions are consistent with the Update sample
select * from test_select where pk1 = 1 and pk2 = 2 ;
PrefixScan Scenario
The PrefixScan scenario means that the table has multiple primary keys, and only a few columns of primary keys are checked according to the left matching principle . The query is as follows:
set hg_experimental_enable_fixed_dispatcher_for_scan = on;
select col1,col2,col3,... from table where pk1 = ? and pk2 = ?;
select col1 , col2 , col3 ,... from table where pk1 = ? and pk2 > ? and pk 2 < ?; -- Since version 1.1.48, the last column condition of pk is range
select col1 , col2 , col3 ,... from table where pk1 = ? and pk2 between ? and ?; -- Since version 1.1.48, the last column condition of pk is range

•Need to configure GUC hg_experimental_enable_fixed_dispatcher_for_scan to on, and V1.1.24+ version
•PrefixScan supports the last column condition of pk as range since version 1.1.48.
Note: prefixScan returns all result lines at one time. If the number of bytes of the result is greater than hg_experimental_fixed_scan_bytesize_limit, an error will be reported scan result size larger than fixed scan size limit. You can configure hg_experimental_fixed_scan_bytesize_limit to set a value more suitable for the scene. The default value is 1048576, which is 1MB.
•The table must have a distribution key


The usage conditions of PrefixScan are as follows:
•where there is and only the prefix of pk. Definition of prefix: If pk is (pk1, pk2, pk3) (pk1), (pk1, pk2) is prefix
•where must contain all distribution keys
•eg If the table pk is (pk1,pk2,pk3,pk4) and the distribution key is 'pk1,pk3', the support conditions are as follows:
begin ;
create table test_select_prefix (
pk1 int,
pk2 int,
pk3 int,
pk4 int,
primary key(pk1, pk2,pk3,pk4)
);
call set_table_property('test_select_prefix', 'distribution_key', 'pk1,pk3');
commit ;
--Does not contain all distribution keys, cannot go to fixed plan
select * from test_select_prefix where pk1 = ? and pk2 = ?;

--Not the prefix of pk, you can't go to the fixed plan
select * from test_select_prefix where pk1 = ? and pk3 = ?;
--You can take the fixed plan
set hg_experimental_enable_fixed_dispatcher_for_scan = on ;
select * from test_select_prefix where pk1 = ? and pk2 = ? and pk3 = ?;

•You can use pk in (?,?,?) or pk = ANY() to check multiple items at once
pk1 in (1,2) and pk2 = 3 <=> scan(1,3),(2,3) two groups
pk2 =any('{3,4}') and pk1 in (1,2) <=> scan(1,3),(1,4),(2,3),(2,4) four groups
•There can only be one condition in the same column (exactly the same as one condition)
•If there is limit, the value of limit must be > 0
•The usage example is as follows:
begin ;
create table test_scan (
pk1 int ,
pk2 int ,
pk3 int ,
col1 int ,
primary key(pk1, pk2, pk3)
);
CALL SET_TABLE_PROPERTY ('test_scan', 'distribution_key', 'pk1,pk2');
commit;

INSERT INTO test_scan values (1,2,3,4);

--支持Fixed Plan
set hg_experimental_enable_fixed_dispatcher_for_scan = on;
select * from test_scan where pk1 = 1 and pk2 = 2;

--支持Fixed Plan
set hg_experimental_enable_fixed_dispatcher_for_scan = on;
select * from test_scan where pk1 = 1 and pk2 in (2,3);

--支持Fixed Plan
set hg_experimental_enable_fixed_dispatcher_for_scan = on;
select * from test_scan where pk1 = ANY ( '{3,4}' ) and pk2 in ( 2 , 3 ) ;

--Support fixed plan, the last column of pk is the range condition, which needs to be supported by version 1.1.48 and above
set hg_experimental_enable_fixed_dispatcher_for_scan = on ;
select * from test_scan where pk1 = 1 and pk2 > 1 and pk2 < 3 ;

--Support fixed plan, the last column of pk is the range condition, which needs to be supported by version 1.1.48 and above
set hg_experimental_enable_fixed_dispatcher_for_scan = on ;
select * from test_scan where pk1 = 1 and pk2 between 1 and 3 ;

--Does not include all distribution keys, does not support Fixed Plan
select * from test_scan where pk1 = 1 ;

-- Does not meet the primary key prefix Prefix, does not support Fixed Plan
select * from test_scan where pk2 = 2 ;
Verify Fixed Plan
The update class SQL executed through FixedPlan will be displayed as SDK type in the real-time import RPS panel of the console , including operations of Insert, Update and Delete types. It is recommended that the real-time write classes Insert, Update, and Delete should be optimized to the Fixed Plan solution as much as possible to improve the efficiency of data update.
You can also check the SQL execution plan (explain sql) to verify whether it complies with the Fixed Plan. If there is a FixedXXXNode in the returned execution plan, it means that the Fixed Plan is triggered. If the FixedXXXNode execution plan is not generated, please check whether the conditions are met according to the support scenarios above.

performance tuning
In some scenarios, if you have enabled the Fixed Plan but still need to perform performance tuning, you can choose the following methods.
•Hologres V1.1.49 version has been optimized for the Fixed Plan enumeration scenario, and the throughput has been improved by more than 30% in the case of large-scale enumeration. If necessary, please upgrade the instance to V1.1.49 or later.
•The client should accumulate batches reasonably (using Holo Client will automatically accumulate batches), that is, the number of SQL commands to be executed at one time. Practice has proved that the number of 512 or a multiple of 512 will perform better.


Related Articles

Explore More Special Offers

  1. Short Message Service(SMS) & Mail Service

    50,000 email package starts as low as USD 1.99, 120 short messages start at only USD 1.00