Hints are used as supplementary SQL syntax to allow you to change the execution method of SQL statements. This facilitates SQL optimization. Hologres V2.2 and later support the hint syntax. You can use hints to change the execution method of SQL statements to optimize SQL statements and achieve better performance. This topic describes usage notes and use scenarios of hints.
Prerequisites
Before using hints, make sure that you have set the GUC parameter pg_hint_plan_enable_hint to on at the session level or database level based on your business requirements:
Enable the hint feature at the session level.
SET pg_hint_plan_enable_hint=on;Enable the hint feature at the database level. This configuration takes effect for new connections.
ALTER DATABASE <dbname> SET pg_hint_plan_enable_hint=on;
Limits
To use hints, you must make sure that the version of your Hologres instance is V2.2 or later. If the version of your Hologres instance is V2.1 or earlier, upgrade your instance.
Usage notes
You can specify hints for regular tables, subqueries, and common table expressions (CTEs). Views are not supported. Regular tables include foreign tables.
Hints must be enclosed between
/*+HINTand*/. Comments are not allowed in hints.Keywords of hints are not case-sensitive.
A hint can contain multiple keywords.
A hint applies to a specific query level of an SQL statement. A hint that applies to a parent query can contain only parameters in the parent query, and a hint that applies to a subquery can contain only parameters in the subquery. In the following sample code, the hint
/*+HINT Leading(tt t2) */can contain only thettandt2parameters but not thet1, t3, and tparameters. Similarly, the hint/*+HINT Leading(t t1) */can contain only thetandt1parameters but not thet2, t3, and ttparameters.SELECT /*+HINT Leading(t t1) */ * FROM t1 join ( SELECT /*+HINT Leading(tt t2) */ * FROM t2 join ( SELECT * FROM t3 ) tt ) t;For the
INSERT INTO ... SELECTstatement, the INSERT operation applies to the destination table and the outermost source table of the SELECT operation. The SELECT operation does not apply to the destination table. To prevent conflicts, you cannot specify hints for the SELECT operation if a hint is specified for the INSERT operation. Examples:Correct usage
-- Example 1: Specify the target, t1, and t2 parameters in the hint for the INSERT operation. INSERT /*+HINT Leading(target (t1 t2)) */ INTO target SELECT t1.a,t2.b FROM t1 JOIN t2 ON t1.a=t2.a -- Example 2: Specify the t1 and t2 parameters in the hint for the SELECT operation. INSERT INTO target SELECT /*+HINT Leading(t2 t1) */ t1.a,t2.b FROM t1 JOIN t2 ON t1.a=t2.a;Incorrect usage
-- Specify hints for the INSERT and SELECT operations. In this case, an error message is reported. INSERT /*+HINT Leading(target (t1 t2)) */ INTO target SELECT /*+HINT Leading(t2 t1) */ t1.a,t2.b FROM t1 JOIN t2 ON t1.a=t2.a; -- The following error message is reported. ERROR: insert statement with hint should not have sub select with hint at the same time
Grand Unified Configuration (GUC) hints take effect for the entire query statement even if you configure GUC hints for a subquery. Subsequent queries are not affected by the GUC hints that are configured for the current query statement.
In this example, a GUC hint is used to disable
count distinct reusefor the subquery. This configuration takes effect on the entire query.SELECT count(DISTINCT a), count(DISTINCT b) FROM ( SELECT /*+HINT set(hg_experimental_enable_reuse_cte_of_count_distinct off) */ t1.a t2.b FROM t1 JOIN t2 ON t1.a = t2.a)You can use parentheses () in hints to specify parameter priorities. Nested parentheses () are allowed. For example, the hint
Leading(t1 t2 t3)specifies that t1 and t2 are joined, and then t3 is joined. The hintLeading(t1 (t2 t3))specifies thatt2 and t3 are joined, and then t1 is joined.Join method hints and join order hints require at least two valid parameters to take effect.
NoteParameters that specify tables, subqueries, and CTEs of the query level to which a hint applies are considered as valid parameters for the hint. For example, the hints
Leading(t1 t1)andLeading(t1)contain only one valid parameter t1.Runtime filter hints take effect only on hash joins.
If the generated candidate plan does not contain the table join method specified by a join method hint, the hint does not take effect. For example, if you specify HashJoin(t1 t2) but the generated plan specifies that
t1 and t3 are joinedand then t2 is joined, the hint does not take effect. In this case, you can specify the hintLeading(t1 t2)to forcefully specify the join order.If the SELECT keyword is followed by multiple consecutive hints that are enclosed between
/*HINT+and*/, only the first hint takes effect. For example, in theSELECT /*+HINT HashJoin(t1 t2) */ /*+HINT Leading(t1 t2) */ ...statement, only the hash join hint takes effect, and the leading hint is ignored.If tables defined in the same type of hints conflict with each other, the first hint takes effect.
NoteConflicts occur in the following scenarios:
The same tables are specified in two hints.
The table sets are the same in two hints.
Parameters in a join order hint are a subset of the parameters in another join order hint.
Parameters in a hint are not a subset of the parameters in another hint. The hints can be join method hints, runtime filter hints, or skew join hints.
Example 1: The hints
HashJoin(t1 t2)andNestLoop(t2 t1)contain the same tables. In this case, a conflict occurs and only the hintHashJoin(t1 t2)is parsed.SELECT /*+HINT HashJoin(t1 t2) NestLoop(t2 t1) */ ...Example 2: Tables in the hint
Leading(t1 t2)are a subset of the tables in the hintLeading(t1 t2 t3). In this case, a conflict occurs and only the hintLeading(t1 t2)is parsed.SELECT /*+HINT Leading(t1 t2) Leading(t1 t2 t3) */ ...
If the specified hint plan does not meet the generation conditions, the execution plan cannot be generated. For example, you specify NestLoop and Right Join for two tables, but the plan is not supported. As a result, the following error message is reported:
ERROR: ORCA failed to produce a plan : No plan has been computed for required properties.
Hint format
The following code shows the format of hints in SQL statements:
SELECT|UPDATE|INSERT|DELETE /*+HINT <HintName(params)> */ ...HintName(params) specifies the hint keyword and the parameters. For more information, see Hint keywords.
Keywords of hints are not case-sensitive.
You can specify hints only after the INSERT, UPDATE, DELETE, and SELECT keywords.
Hints must be enclosed between
/*+HINTand*/.
Hint keywords
The following table describes the supported hint keywords and their parameter formats.
Type | Parameter format | Description | Example | Remarks |
Join method |
| Forcefully uses a nested loop join. | |
|
| Forcefully uses a hash join. | | ||
Join order |
| Forcefully uses a join order. | | |
| Forcefully uses a join order and a join direction. Note A join pair is a pair of tables or other connected objects that are enclosed in parentheses (). Join pairs support the nested structure. | | ||
Runtime filter |
| Forcefully triggers a runtime filter for hash joins on specified tables. | |
|
GUC |
| Specifies a GUC parameter value during the plan construction. Note
| |
|
Motion hint |
| Forces one side of the table set in a JOIN operation to be broadcast. | |
|
| Forces one side of the table set in a JOIN operation not to be broadcast. | | ||
| Forces one side of the table set in a JOIN operation to be gathered. | | ||
| Forces one side of the table set in a JOIN operation not to be gathered. | |
Scenarios
This section provides examples to demonstrate the scenarios in which hints can be used. The tables in the examples are created by executing the following data definition language (DDL) statements:
CREATE TABLE target (a int primary key, b int);
CREATE TABLE t1 (a int, b int);
CREATE TABLE t2 (a int, b int);
CREATE TABLE t3 (a int);
CREATE TABLE t4 (a int);Use hints to adjust join orders
You can use join order hints to adjust the order for table joins. In most cases, if statistical information is missing or incorrect, the join order is invalid. This significantly degrades the query performance of SQL statements.
In most cases, the statistical information is missing because the ANALYZE statement is not promptly executed. For more information about the ANALYZE statement, see ANALYZE and auto-analyze.
In most cases, after a filter or join operation is performed, the actual number of data rows in the result set greatly differs from the estimated number of data rows. In this case, the statistical information is incorrect.
If the join order is invalid, you can use GUC parameters or hints to manually adjust the join order based on your business requirements. Compared with GUC parameters, hints are easier and more convenient to use.
In this example, t1 Join t2 is specified. In hash joins, the small table is used to create a hash table, as indicated in the lower part of the hash operator in the execution plan. If the number of rows in table t2 is greater than the number of rows in table t1, the query performance deteriorates. In this case, you can execute the ANALYZE statement to update the statistical information or use hints to adjust the join order. For example, you can use the hint Leading(t2 t1) to change the join order to t2 Join t1 to make the execution plan valid. This helps improve the execution efficiency.
Sample SQL statements
SELECT /*+HINT Leading(t2 t1) */ t1.a FROM t1 JOIN t2 ON t1.a = t2.a;Comparison of execution plans
Execution plan when the hint feature is disabled
QUERY PLAN ----------------------------------------------------------------------------------- Gather (cost=0.00..10.07 rows=1000 width=4) -> Hash Join (cost=0.00..10.05 rows=1000 width=4) Hash Cond: (t1.a = t2.a) -> Redistribution (cost=0.00..5.01 rows=1000 width=4) Hash Key: t1.a -> Local Gather (cost=0.00..5.00 rows=1000 width=4) -> Seq Scan on t1 (cost=0.00..5.00 rows=1000 width=4) -> Hash (cost=5.01..5.01 rows=1000 width=4) -> Redistribution (cost=0.00..5.01 rows=1000 width=4) Hash Key: t2.a -> Local Gather (cost=0.00..5.00 rows=1000 width=4) -> Seq Scan on t2 (cost=0.00..5.00 rows=1000 width=4)Execution plan when the hint feature is enabled
QUERY PLAN ----------------------------------------------------------------------------------- Gather (cost=0.00..10.07 rows=1000 width=4) -> Hash Join (cost=0.00..10.05 rows=1000 width=4) Hash Cond: (t2.a = t1.a) -> Redistribution (cost=0.00..5.01 rows=1000 width=4) Hash Key: t2.a -> Local Gather (cost=0.00..5.00 rows=1000 width=4) -> Seq Scan on t2 (cost=0.00..5.00 rows=1000 width=4) -> Hash (cost=5.01..5.01 rows=1000 width=4) -> Redistribution (cost=0.00..5.01 rows=1000 width=4) Hash Key: t1.a -> Local Gather (cost=0.00..5.00 rows=1000 width=4) -> Seq Scan on t1 (cost=0.00..5.00 rows=1000 width=4)
Use GUC hints
In specific scenarios, you must use GUC parameters to achieve better query performance. GUC hints are used to configure GUC parameters at the query level. You can configure GUC parameters before you perform queries to achieve the same effect. GUC hints help you configure GUC parameters for a query in an efficient manner. After the query is complete, the GUC parameters become invalid. The GUC parameters do not take effect on other queries.
Sample SQL statements
SELECT /*+HINT set(hg_experimental_query_batch_size 512) */t1.a FROM t1 JOIN t2 ON t1.a = t2.a;Execution plan
QUERY PLAN Hash Join (cost=0.00..10.00 rows=1 width=4) Hash Cond: (t1.a = t2.a) -> Gather (cost=0.00..5.00 rows=1 width=4) -> Local Gather (cost=0.00..5.00 rows=1 width=4) -> Seq Scan on t1 (cost=0.00..5.00 rows=1 width=4) -> Hash (cost=5.00..5.00 rows=1 width=4) -> Gather (cost=0.00..5.00 rows=1 width=4) -> Local Gather (cost=0.00..5.00 rows=1 width=4) -> Seq Scan on t2 (cost=0.00..5.00 rows=1 width=4)
Use hints for CTEs and subqueries
This section provides examples on how to use hints for CTEs or subqueries to optimize execution plans.
Sample SQL statements
WITH c1 AS ( SELECT /*+HINT Leading(t2 t1) */ t1.a FROM ( ( SELECT /*+HINT leading(t2 t1) */ t1.a FROM t1 JOIN t2 ON t1.a = t2.a ) AS t1 JOIN ( SELECT /*+HINT NestLoop(t4 t3) */ t4.a FROM t3 JOIN t4 ON t3.a = t4.a ) AS t2 ON t1.a = t2.a ) ), c2 AS ( SELECT /*+HINT leading(t1 t2) */ t2.a FROM ( ( SELECT /*+HINT Leading(t1 t2) */ t1.a FROM t1 JOIN t2 ON t1.a = t2.a ) AS t1 JOIN ( SELECT /*+HINT Leading(t4 t3) */ t4.a FROM t3 JOIN t4 ON t3.a = t4.a ) AS t2 ON t1.a = t2.a ) ) SELECT /*+HINT NestLoop(v2 v1) */ * FROM ( ( SELECT /*+HINT Leading (c1 t2) */ c1.a FROM c1 JOIN t2 ON c1.a = t2.a ) AS v1 JOIN ( SELECT /*+HINT Leading (t1 c2) */ c2.a FROM t1 JOIN c2 ON t1.a = c2.a ) AS v2 ON v1.a = v2.a ) ORDER BY v2.a;Execution plan
QUERY PLAN Sort (cost=0.00..10660048.36 rows=1 width=8) Sort Key: t4_1.a -> Gather (cost=0.00..10660048.36 rows=1 width=8) -> Nested Loop (cost=0.00..10660048.36 rows=1 width=8) Join Filter: ((t1.a = t4_1.a) AND (t1.a = t1_1.a) AND (t2_1.a = t1_1.a) AND (t2_1.a = t4_1.a)) -> Hash Join (cost=0.00..25.01 rows=1 width=8) Hash Cond: (t1_1.a = t4_1.a) -> Redistribution (cost=0.00..5.00 rows=1 width=4) Hash Key: t1_1.a -> Local Gather (cost=0.00..5.00 rows=1 width=4) -> Seq Scan on t1 t1_1 (cost=0.00..5.00 rows=1 width=4) -> Hash (cost=20.00..20.00 rows=1 width=4) -> Hash Join (cost=0.00..20.00 rows=1 width=4) Hash Cond: ((t1_2.a = t4_1.a) AND (t1_2.a = t3_1.a) AND (t2_2.a = t3_1.a) AND (t2_2.a = t4_1.a)) -> Hash Join (cost=0.00..10.00 rows=1 width=8) Hash Cond: (t1_2.a = t2_2.a) -> Redistribution (cost=0.00..5.00 rows=1 width=4) Hash Key: t1_2.a -> Local Gather (cost=0.00..5.00 rows=1 width=4) -> Seq Scan on t1 t1_2 (cost=0.00..5.00 rows=1 width=4) -> Hash (cost=5.00..5.00 rows=1 width=4) -> Redistribution (cost=0.00..5.00 rows=1 width=4) Hash Key: t2_2.a -> Local Gather (cost=0.00..5.00 rows=1 width=4) -> Seq Scan on t2 t2_2 (cost=0.00..5.00 rows=1 width=4) -> Hash (cost=10.00..10.00 rows=1 width=8) -> Hash Join (cost=0.00..10.00 rows=1 width=8) Hash Cond: (t4_1.a = t3_1.a) -> Redistribution (cost=0.00..5.00 rows=1 width=4) Hash Key: t4_1.a -> Local Gather (cost=0.00..5.00 rows=1 width=4) -> Seq Scan on t4 t4_1 (cost=0.00..5.00 rows=1 width=4) -> Hash (cost=5.00..5.00 rows=1 width=4) -> Redistribution (cost=0.00..5.00 rows=1 width=4) Hash Key: t3_1.a -> Local Gather (cost=0.00..5.00 rows=1 width=4) -> Seq Scan on t3 t3_1 (cost=0.00..5.00 rows=1 width=4) -> Materialize (cost=0.00..10385.07 rows=40 width=8) -> Broadcast (cost=0.00..10385.07 rows=40 width=8) -> Hash Join (cost=0.00..10385.07 rows=1 width=8) Hash Cond: (t1.a = t2_1.a) -> Hash Join (cost=0.00..10380.07 rows=1 width=4) Hash Cond: ((t4.a = t1.a) AND (t3.a = t1.a) AND (t3.a = t2.a) AND (t4.a = t2.a)) -> Redistribution (cost=0.00..10370.07 rows=1 width=8) Hash Key: t4.a -> Nested Loop (cost=0.00..10370.07 rows=1 width=8) Join Filter: (t3.a = t4.a) -> Local Gather (cost=0.00..5.00 rows=1 width=4) -> Seq Scan on t3 (cost=0.00..5.00 rows=1 width=4) -> Materialize (cost=0.00..5.00 rows=40 width=4) -> Broadcast (cost=0.00..5.00 rows=40 width=4) -> Local Gather (cost=0.00..5.00 rows=1 width=4) -> Seq Scan on t4 (cost=0.00..5.00 rows=1 width=4) -> Hash (cost=10.00..10.00 rows=1 width=8) -> Hash Join (cost=0.00..10.00 rows=1 width=8) Hash Cond: (t2.a = t1.a) -> Redistribution (cost=0.00..5.00 rows=1 width=4) Hash Key: t2.a -> Local Gather (cost=0.00..5.00 rows=1 width=4) -> Seq Scan on t2 (cost=0.00..5.00 rows=1 width=4) -> Hash (cost=5.00..5.00 rows=1 width=4) -> Redistribution (cost=0.00..5.00 rows=1 width=4) Hash Key: t1.a -> Local Gather (cost=0.00..5.00 rows=1 width=4) -> Seq Scan on t1 (cost=0.00..5.00 rows=1 width=4) -> Hash (cost=5.00..5.00 rows=1 width=4) -> Redistribution (cost=0.00..5.00 rows=1 width=4) Hash Key: t2_1.a -> Local Gather (cost=0.00..5.00 rows=1 width=4) -> Seq Scan on t2 t2_1 (cost=0.00..5.00 rows=1 width=4)
Use hints in INSERT statements
In most cases, you can use hints in the INSERT INTO ... SELECT syntax if the destination table is associated with the source table and you want to specify join orders or perform other adjustments. The SQL logic of the INSERT INTO ... SELECT syntax is complex and hints must be added based on the business plan.
Example 1: The hint takes effect on the destination table of the INSERT operation and the outermost source table of the SELECT operation.
If the amount of data returned by the
t1 Join t2operation is small, the amount of data in the destination table named target is large, and the statistical information is not updated, the generated execution plan may not be optimal. In this case, you can use hints to adjust the join order to achieve better performance.Sample SQL statements
--The hint takes effect on the destination table of the INSERT operation and the outermost source table of the SELECT operation. INSERT /*+HINT Leading(target (t1 t2)) */ INTO target SELECT t1.a,t2.b FROM t1 JOIN t2 ON t1.a=t2.a;Execution plan
QUERY PLAN ----------------------------------------------------------------------------------------------------------------- Gather (cost=0.00..26.57 rows=1000 width=8) -> Insert (cost=0.00..26.54 rows=1000 width=8) -> Project (cost=0.00..16.12 rows=1000 width=8) -> Hash Right Join (cost=0.00..15.12 rows=1000 width=12) Hash Cond: (target.a = t1.a) -> Local Gather (cost=0.00..5.00 rows=1000 width=4) -> Seq Scan on target (cost=0.00..5.00 rows=1000 width=4) -> Hash (cost=10.07..10.07 rows=1000 width=8) -> Redistribution (cost=0.00..10.07 rows=1000 width=8) Hash Key: t1.a -> Hash Join (cost=0.00..10.06 rows=1000 width=8) Hash Cond: (t1.a = t2.a) -> Redistribution (cost=0.00..5.01 rows=1000 width=4) Hash Key: t1.a -> Local Gather (cost=0.00..5.00 rows=1000 width=4) -> Seq Scan on t1 (cost=0.00..5.00 rows=1000 width=4) -> Hash (cost=5.01..5.01 rows=1000 width=8) -> Redistribution (cost=0.00..5.01 rows=1000 width=8) Hash Key: t2.a -> Local Gather (cost=0.00..5.00 rows=1000 width=8) -> Seq Scan on t2 (cost=0.00..5.00 rows=1000 width=8)
Example 2: The hint takes effect on the SELECT subquery.
Sample SQL statements
NoteThe hints in the following INSERT statements can achieve the same effect.
INSERT INTO target SELECT /*+HINT Leading(t2 t1) */ t1.a,t2.b FROM t1 JOIN t2 ON t1.a=t2.a; INSERT /*+HINT Leading(t2 t1) */ INTO target SELECT t1.a,t2.b FROM t1 JOIN t2 ON t1.a=t2.a;Execution plan
QUERY PLAN ----------------------------------------------------------------------------------------------------------- Gather (cost=0.00..26.57 rows=1000 width=8) -> Insert (cost=0.00..26.54 rows=1000 width=8) -> Project (cost=0.00..16.12 rows=1000 width=8) -> Hash Left Join (cost=0.00..15.12 rows=1000 width=12) Hash Cond: (t1.a = target.a) -> Redistribution (cost=0.00..10.07 rows=1000 width=8) Hash Key: t1.a -> Hash Join (cost=0.00..10.06 rows=1000 width=8) Hash Cond: (t2.a = t1.a) -> Redistribution (cost=0.00..5.01 rows=1000 width=8) Hash Key: t2.a -> Local Gather (cost=0.00..5.00 rows=1000 width=8) -> Seq Scan on t2 (cost=0.00..5.00 rows=1000 width=8) -> Hash (cost=5.01..5.01 rows=1000 width=4) -> Redistribution (cost=0.00..5.01 rows=1000 width=4) Hash Key: t1.a -> Local Gather (cost=0.00..5.00 rows=1000 width=4) -> Seq Scan on t1 (cost=0.00..5.00 rows=1000 width=4) -> Hash (cost=5.00..5.00 rows=1000 width=4) -> Local Gather (cost=0.00..5.00 rows=1000 width=4) -> Seq Scan on target (cost=0.00..5.00 rows=1000 width=4)
Use hints in UPDATE statements
In most cases, you can use hints in UPDATE statements if the destination table is associated with the source table and manual adjustment is required.
Sample SQL statements
In this example, the amount of data in table t1 is larger than the amount of data in table target. You can configure a hint to use table target to create a hash table. This way, the join order is adjusted.
UPDATE /*+HINT Leading(t1 target) */ target SET b=t1.b+1 FROM t1 WHERE t1.a=target.a;Comparison of execution plans
Execution plan when the hint feature is disabled
QUERY PLAN ----------------------------------------------------------------------------------------------- Gather (cost=0.00..52.77 rows=1000 width=1) -> Update (cost=0.00..52.76 rows=1000 width=1) -> Project (cost=0.00..11.09 rows=1000 width=32) -> Hash Join (cost=0.00..10.08 rows=1000 width=32) Hash Cond: (target.a = t1.a) -> Local Gather (cost=0.00..5.00 rows=1000 width=28) -> Seq Scan on target (cost=0.00..5.00 rows=1000 width=28) -> Hash (cost=5.01..5.01 rows=1000 width=8) -> Redistribution (cost=0.00..5.01 rows=1000 width=8) Hash Key: t1.a -> Local Gather (cost=0.00..5.00 rows=1000 width=8) -> Seq Scan on t1 (cost=0.00..5.00 rows=1000 width=8)Execution plan when the hint feature is enabled
QUERY PLAN ---------------------------------------------------------------------------------------------- Gather (cost=0.00..52.77 rows=1000 width=1) -> Update (cost=0.00..52.76 rows=1000 width=1) -> Project (cost=0.00..11.09 rows=1000 width=32) -> Hash Join (cost=0.00..10.08 rows=1000 width=32) Hash Cond: (t1.a = target.a) -> Redistribution (cost=0.00..5.01 rows=1000 width=8) Hash Key: t1.a -> Local Gather (cost=0.00..5.00 rows=1000 width=8) -> Seq Scan on t1 (cost=0.00..5.00 rows=1000 width=8) -> Hash (cost=5.00..5.00 rows=1000 width=28) -> Local Gather (cost=0.00..5.00 rows=1000 width=28) -> Seq Scan on target (cost=0.00..5.00 rows=1000 width=28)
Use runtime filter hints
Hologres supports runtime filters. If an SQL statement does not meet the conditions for triggering a runtime filter, you can use a hint to forcefully trigger a runtime filter to improve query performance.
You can use a hint to forcefully trigger a runtime filter only for hash joins.
Runtime filters that are forcefully triggered do not always improve query performance. Use runtime filter hints based on your business requirements.
Sample SQL statements
SELECT /*+HINT runtimefilter(t1 t2) */ * FROM t1 JOIN t2 ON t1.a = t2.a;Execution plan
-- Runtime Filter appears in the execution plan, which indicates that a runtime filter is triggered. QUERY PLAN ----------------------------------------------------------------------------------- Gather (cost=0.00..10.13 rows=1000 width=16) -> Hash Join (cost=0.00..10.07 rows=1000 width=16) Hash Cond: (t1.a = t2.a) Runtime Filter Cond: (t1.a = t2.a) -> Redistribution (cost=0.00..5.01 rows=1000 width=8) Hash Key: t1.a -> Local Gather (cost=0.00..5.00 rows=1000 width=8) -> Seq Scan on t1 (cost=0.00..5.00 rows=1000 width=8) Runtime Filter Target Expr: t1.a -> Hash (cost=5.01..5.01 rows=1000 width=8) -> Redistribution (cost=0.00..5.01 rows=1000 width=8) Hash Key: t2.a -> Local Gather (cost=0.00..5.00 rows=1000 width=8) -> Seq Scan on t2 (cost=0.00..5.00 rows=1000 width=8)
Use motion hints
Example 1: Force the
t1table to be broadcast during the JOIN operation. This is typically used when statistics are inaccurate, leading to the shuffling of a large amount of data. Since Hash JOIN can broadcast a table only on the build side, it is necessary to use theLeadinghint to specify the JOIN order. This avoids the situation where the lack of accurate statistics results in a high cost penalty for broadcasting, which might otherwise cause the optimizer to choose thet1 JOIN t2order.Sample SQL statements
SELECT /*+HINT Leading(t2 t1) Broadcast(t1) */ * FROM t1 JOIN t2 ON t1.a = t2.a;Execution plan
QUERY PLAN ---------------------------------------------------------------------------------------------- Gather (cost=0.00..100000000000000005366162204393472.00 rows=1000 width=16) -> Hash Join (cost=0.00..100000000000000005366162204393472.00 rows=1000 width=16) Hash Cond: (t2.a = t1.a) -> Local Gather (cost=0.00..5.00 rows=1000 width=8) -> Seq Scan on t2 (cost=0.00..5.00 rows=1000 width=8) -> Hash (cost=100000000000000005366162204393472.00..100000000000000005366162204393472.00 rows=3000 width=8) -> Broadcast (cost=0.00..100000000000000005366162204393472.00 rows=3000 width=8) -> Local Gather (cost=0.00..5.00 rows=1000 width=8) -> Seq Scan on t1 (cost=0.00..5.00 rows=1000 width=8)
Example 2: Force the
t1table not to be broadcast. This is typically used when the statistics for the table that would be broadcast are inaccurate, leading to a significant underestimation of the actual number of rows. As a result, broadcasting a large table can lead to very poor performance.Sample data
CREATE TABLE test1(a int, b int); CREATE TABLE test2(a int, b int); INSERT INTO test1 SELECT 1, i FROM generate_series(1, 10) AS i; INSERT INTO test2 SELECT 1,i FROM generate_series(1, 1000000) AS i; analyze test1,test2;Sample SQL statements
Perform broadcast without using a hint
explain SELECT * FROM test1 JOIN test2 ON test1.b = test2.b;Use a hint to disable broadcast
explain SELECT /*+HINT NoBroadcast(test1) */ * FROM test1 JOIN test2 ON test1.b = test2.b;Execution plan
Perform broadcast without using a hint
QUERY PLAN --------------------------------------------------------------------------------- Gather (cost=0.00..51.98 rows=1000000 width=16) -> Hash Join (cost=0.00..13.12 rows=1000000 width=16) Hash Cond: (test2.b = test1.b) -> Local Gather (cost=0.00..5.23 rows=1000000 width=8) -> Seq Scan on test2 (cost=0.00..5.20 rows=1000000 width=8) -> Hash (cost=5.00..5.00 rows=200 width=8) -> Broadcast (cost=0.00..5.00 rows=200 width=8) -> Local Gather (cost=0.00..5.00 rows=10 width=8) -> Seq Scan on test1 (cost=0.00..5.00 rows=10 width=8)Use a hint to disable broadcast
QUERY PLAN --------------------------------------------------------------------------------- Gather (cost=0.00..53.23 rows=1000000 width=16) -> Hash Join (cost=0.00..14.37 rows=1000000 width=16) Hash Cond: (test2.b = test1.b) -> Redistribution (cost=0.00..6.48 rows=1000000 width=8) Hash Key: test2.b -> Local Gather (cost=0.00..5.23 rows=1000000 width=8) -> Seq Scan on test2 (cost=0.00..5.20 rows=1000000 width=8) -> Hash (cost=5.00..5.00 rows=10 width=8) -> Redistribution (cost=0.00..5.00 rows=10 width=8) Hash Key: test1.b -> Local Gather (cost=0.00..5.00 rows=10 width=8) -> Seq Scan on test1 (cost=0.00..5.00 rows=10 width=8)