Usage notes and use scenarios of hints - Hologres - Alibaba Cloud Documentation Center

Hints are used as supplementary SQL syntax to allow you to change the execution method of SQL statements. This facilitates SQL optimization. Hologres V2.2 and later support the hint syntax. You can use hints to change the execution method of SQL statements to optimize SQL statements and achieve better performance. This topic describes usage notes and use scenarios of hints.

Prerequisites

Before using hints, make sure that you have set the GUC parameter pg_hint_plan_enable_hint to on at the session level or database level based on your business requirements:

Enable the hint feature at the session level.
```
SET pg_hint_plan_enable_hint=on;
```
Enable the hint feature at the database level. This configuration takes effect for new connections.
```
ALTER DATABASE <dbname> SET pg_hint_plan_enable_hint=on;
```

Limits

To use hints, you must make sure that the version of your Hologres instance is V2.2 or later. If the version of your Hologres instance is V2.1 or earlier, upgrade your instance.

Usage notes

You can specify hints for regular tables, subqueries, and common table expressions (CTEs). Views are not supported. Regular tables include foreign tables.
Hints must be enclosed between /*+HINT and */. Comments are not allowed in hints.
Keywords of hints are not case-sensitive.
A hint can contain multiple keywords.
A hint applies to a specific query level of an SQL statement. A hint that applies to a parent query can contain only parameters in the parent query, and a hint that applies to a subquery can contain only parameters in the subquery. In the following sample code, the hint /*+HINT Leading(tt t2) */ can contain only the tt and t2 parameters but not the t1, t3, and t parameters. Similarly, the hint /*+HINT Leading(t t1) */ can contain only the t and t1 parameters but not the t2, t3, and tt parameters.
```
SELECT /*+HINT Leading(t t1) */ * FROM t1 join (
  SELECT /*+HINT Leading(tt t2) */ * FROM t2 join (
    SELECT * FROM t3
  ) tt
) t;
```

For the INSERT INTO ... SELECT statement, the INSERT operation applies to the destination table and the outermost source table of the SELECT operation. The SELECT operation does not apply to the destination table. To prevent conflicts, you cannot specify hints for the SELECT operation if a hint is specified for the INSERT operation. Examples:

Correct usage

-- Example 1: Specify the target, t1, and t2 parameters in the hint for the INSERT operation. 
INSERT /*+HINT Leading(target (t1 t2)) */ INTO target SELECT t1.a,t2.b FROM t1 JOIN t2 ON t1.a=t2.a

-- Example 2: Specify the t1 and t2 parameters in the hint for the SELECT operation. 
INSERT INTO target SELECT /*+HINT Leading(t2 t1) */ t1.a,t2.b FROM t1 JOIN t2 ON t1.a=t2.a;

Incorrect usage

-- Specify hints for the INSERT and SELECT operations. In this case, an error message is reported. 
INSERT /*+HINT Leading(target (t1 t2)) */ INTO target SELECT /*+HINT Leading(t2 t1) */ t1.a,t2.b FROM t1 JOIN t2 ON t1.a=t2.a;

-- The following error message is reported.
ERROR: insert statement with hint should not have sub select with hint at the same time

Grand Unified Configuration (GUC) hints take effect for the entire query statement even if you configure GUC hints for a subquery. Subsequent queries are not affected by the GUC hints that are configured for the current query statement.
In this example, a GUC hint is used to disable count distinct reuse for the subquery. This configuration takes effect on the entire query.
```
SELECT
    count(DISTINCT a),
    count(DISTINCT b)
FROM (
    SELECT
        /*+HINT set(hg_experimental_enable_reuse_cte_of_count_distinct off) */
        t1.a t2.b
    FROM
        t1
        JOIN t2 ON t1.a = t2.a)
```
You can use parentheses () in hints to specify parameter priorities. Nested parentheses () are allowed. For example, the hint Leading(t1 t2 t3) specifies that t1 and t2 are joined, and then t3 is joined. The hint Leading(t1 (t2 t3)) specifies that t2 and t3 are joined, and then t1 is joined.
Join method hints and join order hints require at least two valid parameters to take effect.
Note
Parameters that specify tables, subqueries, and CTEs of the query level to which a hint applies are considered as valid parameters for the hint. For example, the hints Leading(t1 t1) and Leading(t1) contain only one valid parameter t1.
Runtime filter hints take effect only on hash joins.
If the generated candidate plan does not contain the table join method specified by a join method hint, the hint does not take effect. For example, if you specify HashJoin(t1 t2) but the generated plan specifies that t1 and t3 are joined and then t2 is joined, the hint does not take effect. In this case, you can specify the hint Leading(t1 t2) to forcefully specify the join order.
If the SELECT keyword is followed by multiple consecutive hints that are enclosed between /*HINT+ and */, only the first hint takes effect. For example, in the SELECT /*+HINT HashJoin(t1 t2) */ /*+HINT Leading(t1 t2) */ ... statement, only the hash join hint takes effect, and the leading hint is ignored.
If tables defined in the same type of hints conflict with each other, the first hint takes effect.
Note
Conflicts occur in the following scenarios:
- The same tables are specified in two hints.
- The table sets are the same in two hints.
- Parameters in a join order hint are a subset of the parameters in another join order hint.
- Parameters in a hint are not a subset of the parameters in another hint. The hints can be join method hints, runtime filter hints, or skew join hints.
- Example 1: The hints HashJoin(t1 t2) and NestLoop(t2 t1) contain the same tables. In this case, a conflict occurs and only the hint HashJoin(t1 t2) is parsed.
```
SELECT /*+HINT HashJoin(t1 t2) NestLoop(t2 t1) */ ...
```
- Example 2: Tables in the hint Leading(t1 t2) are a subset of the tables in the hint Leading(t1 t2 t3). In this case, a conflict occurs and only the hint Leading(t1 t2) is parsed.
```
SELECT /*+HINT Leading(t1 t2) Leading(t1 t2 t3) */ ...
```
If the specified hint plan does not meet the generation conditions, the execution plan cannot be generated. For example, you specify NestLoop and Right Join for two tables, but the plan is not supported. As a result, the following error message is reported: ERROR: ORCA failed to produce a plan : No plan has been computed for required properties.

Hint format

The following code shows the format of hints in SQL statements:

SELECT|UPDATE|INSERT|DELETE /*+HINT <HintName(params)>  */ ...

HintName(params) specifies the hint keyword and the parameters. For more information, see Hint keywords.

Note

Keywords of hints are not case-sensitive.
You can specify hints only after the INSERT, UPDATE, DELETE, and SELECT keywords.
Hints must be enclosed between /*+HINT and */.

Hint keywords

The following table describes the supported hint keywords and their parameter formats.

Type	Parameter format	Description	Example	Remarks
Join method	`NestLoop(table1 table2[ table...])`	Forcefully uses a nested loop join.	`SELECT /+HINT NestLoop(table1 table2) / * FROM table1 JOIN table2 ON table1.a = table2.a;`	The hints take effect only when at least two valid parameters are specified in the hints. Note Parameters that specify tables, subqueries, or CTEs of the query level to which a hint applies are considered as valid parameters for the hint. The hints are supported in Hologres V2.2 and later.
Join method	`HashJoin(table1 table2[ table...])`	Forcefully uses a hash join.	`SELECT /+HINT HashJoin(table1 table2 table3) / table1.a FROM table1 JOIN table2 ON table1.a = table2.a JOIN table3 ON table1.a = table3.a;`
Join order	`Leading(table1 table2[ table...])`	Forcefully uses a join order.	`SELECT /+HINT Leading(table2 table1) / table1.a from table1 Join table2 on table1.a = table2.a;`
Join order	`Leading(<Join pair>)`	Forcefully uses a join order and a join direction. Note A join pair is a pair of tables or other connected objects that are enclosed in parentheses (). Join pairs support the nested structure.	`SELECT /+HINT Leading((table2 table1) (table3 table4)) / * FROM table1 LEFT JOIN table2 ON table1.a = table2.a RIGHT JOIN table3 ON table1.a = table3.a LEFT JOIN table4 ON table3.a = table4.a ORDER BY table1.a;`
Runtime filter	`RuntimeFilter(table1 table2[ table...])`	Forcefully triggers a runtime filter for hash joins on specified tables.	`SELECT /+HINT RuntimeFilter(table1 table2) / * FROM table1 JOIN table2 ON table1.a = table2.a;`	The hint takes effect only for hash joins. The hint is supported in Hologres V2.2 and later.
GUC	`Set(GUC-parameter value)`	Specifies a GUC parameter value during the plan construction. Note GUC-parameter indicates the name of the GUC parameter. value indicates the value of the GUC parameter. For more information about the GUC parameters supported by Hologres, see GUC parameters.	`EXPLAIN SELECT /+HINT set(hg_experimental_enable_reuse_cte_of_count_distinct off) / count(DISTINCT a), count(DISTINCT b) FROM table1;`	The hint takes effect only for the current query. After the current query is complete, subsequent queries are not affected by the GUC hint. The hint is supported in Hologres V2.2 and later.
Motion hint	`Broadcast(table table[ table...])`	Forces one side of the table set in a JOIN operation to be broadcast.	`SELECT /+HINT Broadcast(t2) Leading(t1 t2) / * FROM t1 JOIN t2 ON t1.a=t2.a;`	The hint is used with Leading in most cases. The hint is supported in Hologres V3.0 and later.
	`NoBroadcast(table table[ table...])`	Forces one side of the table set in a JOIN operation not to be broadcast.	`SELECT /+HINT NoBroadcast(t2) Leading(t1 t2) / * FROM t1 JOIN t2 ON t1.a=t2.a;`
	`Gather(table table[ table...])`	Forces one side of the table set in a JOIN operation to be gathered.	`SELECT /+HINT Gather(t2) Leading(t1 t2) / * FROM t1 JOIN t2 ON t1.a=t2.a;`
	`NoGather(table table[ table...])`	Forces one side of the table set in a JOIN operation not to be gathered.	`SELECT /+HINT NoGather(t2) Leading(t1 t2) / * FROM t1 JOIN t2 ON t1.a=t2.a;`

Scenarios

This section provides examples to demonstrate the scenarios in which hints can be used. The tables in the examples are created by executing the following data definition language (DDL) statements:

CREATE TABLE target (a int primary key, b int);
CREATE TABLE t1 (a int, b int);
CREATE TABLE t2 (a int, b int);
CREATE TABLE t3 (a int);
CREATE TABLE t4 (a int);

Use hints to adjust join orders

You can use join order hints to adjust the order for table joins. In most cases, if statistical information is missing or incorrect, the join order is invalid. This significantly degrades the query performance of SQL statements.

In most cases, the statistical information is missing because the ANALYZE statement is not promptly executed. For more information about the ANALYZE statement, see ANALYZE and auto-analyze.
In most cases, after a filter or join operation is performed, the actual number of data rows in the result set greatly differs from the estimated number of data rows. In this case, the statistical information is incorrect.

If the join order is invalid, you can use GUC parameters or hints to manually adjust the join order based on your business requirements. Compared with GUC parameters, hints are easier and more convenient to use.

In this example, t1 Join t2 is specified. In hash joins, the small table is used to create a hash table, as indicated in the lower part of the hash operator in the execution plan. If the number of rows in table t2 is greater than the number of rows in table t1, the query performance deteriorates. In this case, you can execute the ANALYZE statement to update the statistical information or use hints to adjust the join order. For example, you can use the hint Leading(t2 t1) to change the join order to t2 Join t1 to make the execution plan valid. This helps improve the execution efficiency.

Sample SQL statements

SELECT /*+HINT Leading(t2 t1) */ t1.a FROM t1 JOIN t2 ON t1.a = t2.a;

Comparison of execution plans

Execution plan when the hint feature is disabled

QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Gather  (cost=0.00..10.07 rows=1000 width=4)
   ->  Hash Join  (cost=0.00..10.05 rows=1000 width=4)
         Hash Cond: (t1.a = t2.a)
         ->  Redistribution  (cost=0.00..5.01 rows=1000 width=4)
               Hash Key: t1.a
               ->  Local Gather  (cost=0.00..5.00 rows=1000 width=4)
                     ->  Seq Scan on t1  (cost=0.00..5.00 rows=1000 width=4)
         ->  Hash  (cost=5.01..5.01 rows=1000 width=4)
               ->  Redistribution  (cost=0.00..5.01 rows=1000 width=4)
                     Hash Key: t2.a
                     ->  Local Gather  (cost=0.00..5.00 rows=1000 width=4)
                           ->  Seq Scan on t2  (cost=0.00..5.00 rows=1000 width=4)

Execution plan when the hint feature is enabled

QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Gather  (cost=0.00..10.07 rows=1000 width=4)
   ->  Hash Join  (cost=0.00..10.05 rows=1000 width=4)
         Hash Cond: (t2.a = t1.a)
         ->  Redistribution  (cost=0.00..5.01 rows=1000 width=4)
               Hash Key: t2.a
               ->  Local Gather  (cost=0.00..5.00 rows=1000 width=4)
                     ->  Seq Scan on t2  (cost=0.00..5.00 rows=1000 width=4)
         ->  Hash  (cost=5.01..5.01 rows=1000 width=4)
               ->  Redistribution  (cost=0.00..5.01 rows=1000 width=4)
                     Hash Key: t1.a
                     ->  Local Gather  (cost=0.00..5.00 rows=1000 width=4)
                           ->  Seq Scan on t1  (cost=0.00..5.00 rows=1000 width=4)

Use GUC hints

In specific scenarios, you must use GUC parameters to achieve better query performance. GUC hints are used to configure GUC parameters at the query level. You can configure GUC parameters before you perform queries to achieve the same effect. GUC hints help you configure GUC parameters for a query in an efficient manner. After the query is complete, the GUC parameters become invalid. The GUC parameters do not take effect on other queries.

Sample SQL statements

SELECT /*+HINT set(hg_experimental_query_batch_size 512) */t1.a FROM t1 JOIN t2 ON t1.a = t2.a;

Execution plan

QUERY PLAN
Hash Join  (cost=0.00..10.00 rows=1 width=4)
  Hash Cond: (t1.a = t2.a)
  ->  Gather  (cost=0.00..5.00 rows=1 width=4)
        ->  Local Gather  (cost=0.00..5.00 rows=1 width=4)
              ->  Seq Scan on t1  (cost=0.00..5.00 rows=1 width=4)
  ->  Hash  (cost=5.00..5.00 rows=1 width=4)
        ->  Gather  (cost=0.00..5.00 rows=1 width=4)
              ->  Local Gather  (cost=0.00..5.00 rows=1 width=4)
                    ->  Seq Scan on t2  (cost=0.00..5.00 rows=1 width=4)

Use hints for CTEs and subqueries

This section provides examples on how to use hints for CTEs or subqueries to optimize execution plans.

Sample SQL statements

WITH c1 AS (
        SELECT /*+HINT Leading(t2 t1) */ t1.a FROM (
            (
                SELECT /*+HINT leading(t2 t1) */ t1.a FROM t1 JOIN t2 ON t1.a = t2.a
            ) AS t1
            JOIN
            (
                SELECT /*+HINT NestLoop(t4 t3) */ t4.a FROM t3 JOIN t4 ON t3.a = t4.a
            ) AS t2
            ON t1.a = t2.a
        )
    ),
    c2 AS (
        SELECT /*+HINT leading(t1 t2) */ t2.a FROM (
            (
                SELECT /*+HINT Leading(t1 t2) */ t1.a FROM t1 JOIN t2 ON t1.a = t2.a
            ) AS t1
            JOIN
            (
                SELECT /*+HINT Leading(t4 t3) */ t4.a FROM t3 JOIN t4 ON t3.a = t4.a
            ) AS t2
            ON t1.a = t2.a
        )
    )
    SELECT /*+HINT NestLoop(v2 v1) */  * FROM (
        (
            SELECT /*+HINT Leading (c1 t2) */ c1.a FROM c1 JOIN t2 ON c1.a = t2.a
        ) AS v1
        JOIN
        (
            SELECT /*+HINT Leading (t1 c2) */ c2.a FROM t1 JOIN c2 ON t1.a = c2.a
        ) AS v2
        ON v1.a = v2.a
    )
    ORDER BY v2.a;

Execution plan

QUERY PLAN
Sort  (cost=0.00..10660048.36 rows=1 width=8)
  Sort Key: t4_1.a
  ->  Gather  (cost=0.00..10660048.36 rows=1 width=8)
        ->  Nested Loop  (cost=0.00..10660048.36 rows=1 width=8)
              Join Filter: ((t1.a = t4_1.a) AND (t1.a = t1_1.a) AND (t2_1.a = t1_1.a) AND (t2_1.a = t4_1.a))
              ->  Hash Join  (cost=0.00..25.01 rows=1 width=8)
                    Hash Cond: (t1_1.a = t4_1.a)
                    ->  Redistribution  (cost=0.00..5.00 rows=1 width=4)
                          Hash Key: t1_1.a
                          ->  Local Gather  (cost=0.00..5.00 rows=1 width=4)
                                ->  Seq Scan on t1 t1_1  (cost=0.00..5.00 rows=1 width=4)
                    ->  Hash  (cost=20.00..20.00 rows=1 width=4)
                          ->  Hash Join  (cost=0.00..20.00 rows=1 width=4)
                                Hash Cond: ((t1_2.a = t4_1.a) AND (t1_2.a = t3_1.a) AND (t2_2.a = t3_1.a) AND (t2_2.a = t4_1.a))
                                ->  Hash Join  (cost=0.00..10.00 rows=1 width=8)
                                      Hash Cond: (t1_2.a = t2_2.a)
                                      ->  Redistribution  (cost=0.00..5.00 rows=1 width=4)
                                            Hash Key: t1_2.a
                                            ->  Local Gather  (cost=0.00..5.00 rows=1 width=4)
                                                  ->  Seq Scan on t1 t1_2  (cost=0.00..5.00 rows=1 width=4)
                                      ->  Hash  (cost=5.00..5.00 rows=1 width=4)
                                            ->  Redistribution  (cost=0.00..5.00 rows=1 width=4)
                                                  Hash Key: t2_2.a
                                                  ->  Local Gather  (cost=0.00..5.00 rows=1 width=4)
                                                        ->  Seq Scan on t2 t2_2  (cost=0.00..5.00 rows=1 width=4)
                                ->  Hash  (cost=10.00..10.00 rows=1 width=8)
                                      ->  Hash Join  (cost=0.00..10.00 rows=1 width=8)
                                            Hash Cond: (t4_1.a = t3_1.a)
                                            ->  Redistribution  (cost=0.00..5.00 rows=1 width=4)
                                                  Hash Key: t4_1.a
                                                  ->  Local Gather  (cost=0.00..5.00 rows=1 width=4)
                                                        ->  Seq Scan on t4 t4_1  (cost=0.00..5.00 rows=1 width=4)
                                            ->  Hash  (cost=5.00..5.00 rows=1 width=4)
                                                  ->  Redistribution  (cost=0.00..5.00 rows=1 width=4)
                                                        Hash Key: t3_1.a
                                                        ->  Local Gather  (cost=0.00..5.00 rows=1 width=4)
                                                              ->  Seq Scan on t3 t3_1  (cost=0.00..5.00 rows=1 width=4)
              ->  Materialize  (cost=0.00..10385.07 rows=40 width=8)
                    ->  Broadcast  (cost=0.00..10385.07 rows=40 width=8)
                          ->  Hash Join  (cost=0.00..10385.07 rows=1 width=8)
                                Hash Cond: (t1.a = t2_1.a)
                                ->  Hash Join  (cost=0.00..10380.07 rows=1 width=4)
                                      Hash Cond: ((t4.a = t1.a) AND (t3.a = t1.a) AND (t3.a = t2.a) AND (t4.a = t2.a))
                                      ->  Redistribution  (cost=0.00..10370.07 rows=1 width=8)
                                            Hash Key: t4.a
                                            ->  Nested Loop  (cost=0.00..10370.07 rows=1 width=8)
                                                  Join Filter: (t3.a = t4.a)
                                                  ->  Local Gather  (cost=0.00..5.00 rows=1 width=4)
                                                        ->  Seq Scan on t3  (cost=0.00..5.00 rows=1 width=4)
                                                  ->  Materialize  (cost=0.00..5.00 rows=40 width=4)
                                                        ->  Broadcast  (cost=0.00..5.00 rows=40 width=4)
                                                              ->  Local Gather  (cost=0.00..5.00 rows=1 width=4)
                                                                    ->  Seq Scan on t4  (cost=0.00..5.00 rows=1 width=4)
                                      ->  Hash  (cost=10.00..10.00 rows=1 width=8)
                                            ->  Hash Join  (cost=0.00..10.00 rows=1 width=8)
                                                  Hash Cond: (t2.a = t1.a)
                                                  ->  Redistribution  (cost=0.00..5.00 rows=1 width=4)
                                                        Hash Key: t2.a
                                                        ->  Local Gather  (cost=0.00..5.00 rows=1 width=4)
                                                              ->  Seq Scan on t2  (cost=0.00..5.00 rows=1 width=4)
                                                  ->  Hash  (cost=5.00..5.00 rows=1 width=4)
                                                        ->  Redistribution  (cost=0.00..5.00 rows=1 width=4)
                                                              Hash Key: t1.a
                                                              ->  Local Gather  (cost=0.00..5.00 rows=1 width=4)
                                                                    ->  Seq Scan on t1  (cost=0.00..5.00 rows=1 width=4)
                                ->  Hash  (cost=5.00..5.00 rows=1 width=4)
                                      ->  Redistribution  (cost=0.00..5.00 rows=1 width=4)
                                            Hash Key: t2_1.a
                                            ->  Local Gather  (cost=0.00..5.00 rows=1 width=4)
                                                  ->  Seq Scan on t2 t2_1  (cost=0.00..5.00 rows=1 width=4)

Use hints in INSERT statements

In most cases, you can use hints in the INSERT INTO ... SELECT syntax if the destination table is associated with the source table and you want to specify join orders or perform other adjustments. The SQL logic of the INSERT INTO ... SELECT syntax is complex and hints must be added based on the business plan.

Example 1: The hint takes effect on the destination table of the INSERT operation and the outermost source table of the SELECT operation.

If the amount of data returned by the t1 Join t2 operation is small, the amount of data in the destination table named target is large, and the statistical information is not updated, the generated execution plan may not be optimal. In this case, you can use hints to adjust the join order to achieve better performance.

Sample SQL statements

--The hint takes effect on the destination table of the INSERT operation and the outermost source table of the SELECT operation.
INSERT /*+HINT Leading(target (t1 t2)) */ INTO target SELECT t1.a,t2.b FROM t1 JOIN t2 ON t1.a=t2.a;

Execution plan

QUERY PLAN                                                    
-----------------------------------------------------------------------------------------------------------------
 Gather  (cost=0.00..26.57 rows=1000 width=8)
   ->  Insert  (cost=0.00..26.54 rows=1000 width=8)
         ->  Project  (cost=0.00..16.12 rows=1000 width=8)
               ->  Hash Right Join  (cost=0.00..15.12 rows=1000 width=12)
                     Hash Cond: (target.a = t1.a)
                     ->  Local Gather  (cost=0.00..5.00 rows=1000 width=4)
                           ->  Seq Scan on target  (cost=0.00..5.00 rows=1000 width=4)
                     ->  Hash  (cost=10.07..10.07 rows=1000 width=8)
                           ->  Redistribution  (cost=0.00..10.07 rows=1000 width=8)
                                 Hash Key: t1.a
                                 ->  Hash Join  (cost=0.00..10.06 rows=1000 width=8)
                                       Hash Cond: (t1.a = t2.a)
                                       ->  Redistribution  (cost=0.00..5.01 rows=1000 width=4)
                                             Hash Key: t1.a
                                             ->  Local Gather  (cost=0.00..5.00 rows=1000 width=4)
                                                   ->  Seq Scan on t1  (cost=0.00..5.00 rows=1000 width=4)
                                       ->  Hash  (cost=5.01..5.01 rows=1000 width=8)
                                             ->  Redistribution  (cost=0.00..5.01 rows=1000 width=8)
                                                   Hash Key: t2.a
                                                   ->  Local Gather  (cost=0.00..5.00 rows=1000 width=8)
                                                         ->  Seq Scan on t2  (cost=0.00..5.00 rows=1000 width=8)

Example 2: The hint takes effect on the SELECT subquery.

Sample SQL statements

Note

The hints in the following INSERT statements can achieve the same effect.

INSERT INTO target SELECT /*+HINT Leading(t2 t1) */ t1.a,t2.b FROM t1 JOIN t2 ON t1.a=t2.a;

INSERT /*+HINT Leading(t2 t1) */ INTO target SELECT t1.a,t2.b FROM t1 JOIN t2 ON t1.a=t2.a;

Execution plan

QUERY PLAN                                                 
-----------------------------------------------------------------------------------------------------------
 Gather  (cost=0.00..26.57 rows=1000 width=8)
   ->  Insert  (cost=0.00..26.54 rows=1000 width=8)
         ->  Project  (cost=0.00..16.12 rows=1000 width=8)
               ->  Hash Left Join  (cost=0.00..15.12 rows=1000 width=12)
                     Hash Cond: (t1.a = target.a)
                     ->  Redistribution  (cost=0.00..10.07 rows=1000 width=8)
                           Hash Key: t1.a
                           ->  Hash Join  (cost=0.00..10.06 rows=1000 width=8)
                                 Hash Cond: (t2.a = t1.a)
                                 ->  Redistribution  (cost=0.00..5.01 rows=1000 width=8)
                                       Hash Key: t2.a
                                       ->  Local Gather  (cost=0.00..5.00 rows=1000 width=8)
                                             ->  Seq Scan on t2  (cost=0.00..5.00 rows=1000 width=8)
                                 ->  Hash  (cost=5.01..5.01 rows=1000 width=4)
                                       ->  Redistribution  (cost=0.00..5.01 rows=1000 width=4)
                                             Hash Key: t1.a
                                             ->  Local Gather  (cost=0.00..5.00 rows=1000 width=4)
                                                   ->  Seq Scan on t1  (cost=0.00..5.00 rows=1000 width=4)
                     ->  Hash  (cost=5.00..5.00 rows=1000 width=4)
                           ->  Local Gather  (cost=0.00..5.00 rows=1000 width=4)
                                 ->  Seq Scan on target  (cost=0.00..5.00 rows=1000 width=4)

Use hints in UPDATE statements

In most cases, you can use hints in UPDATE statements if the destination table is associated with the source table and manual adjustment is required.

Sample SQL statements
In this example, the amount of data in table t1 is larger than the amount of data in table target. You can configure a hint to use table target to create a hash table. This way, the join order is adjusted.
```
UPDATE /*+HINT Leading(t1 target) */ target SET b=t1.b+1 FROM t1 WHERE t1.a=target.a;
```

Comparison of execution plans

Execution plan when the hint feature is disabled

QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 Gather  (cost=0.00..52.77 rows=1000 width=1)
   ->  Update  (cost=0.00..52.76 rows=1000 width=1)
         ->  Project  (cost=0.00..11.09 rows=1000 width=32)
               ->  Hash Join  (cost=0.00..10.08 rows=1000 width=32)
                     Hash Cond: (target.a = t1.a)
                     ->  Local Gather  (cost=0.00..5.00 rows=1000 width=28)
                           ->  Seq Scan on target  (cost=0.00..5.00 rows=1000 width=28)
                     ->  Hash  (cost=5.01..5.01 rows=1000 width=8)
                           ->  Redistribution  (cost=0.00..5.01 rows=1000 width=8)
                                 Hash Key: t1.a
                                 ->  Local Gather  (cost=0.00..5.00 rows=1000 width=8)
                                       ->  Seq Scan on t1  (cost=0.00..5.00 rows=1000 width=8)

Execution plan when the hint feature is enabled

QUERY PLAN                                          
----------------------------------------------------------------------------------------------
 Gather  (cost=0.00..52.77 rows=1000 width=1)
   ->  Update  (cost=0.00..52.76 rows=1000 width=1)
         ->  Project  (cost=0.00..11.09 rows=1000 width=32)
               ->  Hash Join  (cost=0.00..10.08 rows=1000 width=32)
                     Hash Cond: (t1.a = target.a)
                     ->  Redistribution  (cost=0.00..5.01 rows=1000 width=8)
                           Hash Key: t1.a
                           ->  Local Gather  (cost=0.00..5.00 rows=1000 width=8)
                                 ->  Seq Scan on t1  (cost=0.00..5.00 rows=1000 width=8)
                     ->  Hash  (cost=5.00..5.00 rows=1000 width=28)
                           ->  Local Gather  (cost=0.00..5.00 rows=1000 width=28)
                                 ->  Seq Scan on target  (cost=0.00..5.00 rows=1000 width=28)

Use runtime filter hints

Hologres supports runtime filters. If an SQL statement does not meet the conditions for triggering a runtime filter, you can use a hint to forcefully trigger a runtime filter to improve query performance.

Important

You can use a hint to forcefully trigger a runtime filter only for hash joins.
Runtime filters that are forcefully triggered do not always improve query performance. Use runtime filter hints based on your business requirements.

Sample SQL statements

SELECT /*+HINT runtimefilter(t1 t2) */ * FROM t1 JOIN t2 ON t1.a = t2.a;

Execution plan

-- Runtime Filter appears in the execution plan, which indicates that a runtime filter is triggered. 
QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Gather  (cost=0.00..10.13 rows=1000 width=16)
   ->  Hash Join  (cost=0.00..10.07 rows=1000 width=16)
         Hash Cond: (t1.a = t2.a)
         Runtime Filter Cond: (t1.a = t2.a)
         ->  Redistribution  (cost=0.00..5.01 rows=1000 width=8)
               Hash Key: t1.a
               ->  Local Gather  (cost=0.00..5.00 rows=1000 width=8)
                     ->  Seq Scan on t1  (cost=0.00..5.00 rows=1000 width=8)
                           Runtime Filter Target Expr: t1.a
         ->  Hash  (cost=5.01..5.01 rows=1000 width=8)
               ->  Redistribution  (cost=0.00..5.01 rows=1000 width=8)
                     Hash Key: t2.a
                     ->  Local Gather  (cost=0.00..5.00 rows=1000 width=8)
                           ->  Seq Scan on t2  (cost=0.00..5.00 rows=1000 width=8)

Use motion hints

Example 1: Force the t1 table to be broadcast during the JOIN operation. This is typically used when statistics are inaccurate, leading to the shuffling of a large amount of data. Since Hash JOIN can broadcast a table only on the build side, it is necessary to use the Leading hint to specify the JOIN order. This avoids the situation where the lack of accurate statistics results in a high cost penalty for broadcasting, which might otherwise cause the optimizer to choose the t1 JOIN t2 order.

Sample SQL statements

SELECT /*+HINT Leading(t2 t1) Broadcast(t1) */ * FROM t1 JOIN t2 ON t1.a = t2.a;

Execution plan

QUERY PLAN                                          
----------------------------------------------------------------------------------------------
  Gather  (cost=0.00..100000000000000005366162204393472.00 rows=1000 width=16)
   ->  Hash Join  (cost=0.00..100000000000000005366162204393472.00 rows=1000 width=16)
         Hash Cond: (t2.a = t1.a)
         ->  Local Gather  (cost=0.00..5.00 rows=1000 width=8)
               ->  Seq Scan on t2  (cost=0.00..5.00 rows=1000 width=8)
         ->  Hash  (cost=100000000000000005366162204393472.00..100000000000000005366162204393472.00 rows=3000 width=8)
               ->  Broadcast  (cost=0.00..100000000000000005366162204393472.00 rows=3000 width=8)
                     ->  Local Gather  (cost=0.00..5.00 rows=1000 width=8)
                           ->  Seq Scan on t1  (cost=0.00..5.00 rows=1000 width=8)

Example 2: Force the t1 table not to be broadcast. This is typically used when the statistics for the table that would be broadcast are inaccurate, leading to a significant underestimation of the actual number of rows. As a result, broadcasting a large table can lead to very poor performance.

Sample data

CREATE TABLE test1(a int, b int);
CREATE TABLE test2(a int, b int);

INSERT INTO test1 SELECT 1, i FROM generate_series(1, 10) AS i;
INSERT INTO test2 SELECT 1,i FROM generate_series(1, 1000000) AS i;

analyze test1,test2;

Sample SQL statements

Perform broadcast without using a hint

explain SELECT * FROM test1 JOIN test2 ON test1.b = test2.b;

Use a hint to disable broadcast

explain SELECT /*+HINT NoBroadcast(test1)  */ * FROM test1 JOIN test2 ON test1.b = test2.b;

Execution plan

Perform broadcast without using a hint

QUERY PLAN
---------------------------------------------------------------------------------
Gather  (cost=0.00..51.98 rows=1000000 width=16)
  ->  Hash Join  (cost=0.00..13.12 rows=1000000 width=16)
        Hash Cond: (test2.b = test1.b)
        ->  Local Gather  (cost=0.00..5.23 rows=1000000 width=8)
              ->  Seq Scan on test2  (cost=0.00..5.20 rows=1000000 width=8)
        ->  Hash  (cost=5.00..5.00 rows=200 width=8)
              ->  Broadcast  (cost=0.00..5.00 rows=200 width=8)
                    ->  Local Gather  (cost=0.00..5.00 rows=10 width=8)
                          ->  Seq Scan on test1  (cost=0.00..5.00 rows=10 width=8)

Use a hint to disable broadcast

QUERY PLAN
---------------------------------------------------------------------------------
Gather  (cost=0.00..53.23 rows=1000000 width=16)
  ->  Hash Join  (cost=0.00..14.37 rows=1000000 width=16)
        Hash Cond: (test2.b = test1.b)
        ->  Redistribution  (cost=0.00..6.48 rows=1000000 width=8)
              Hash Key: test2.b
              ->  Local Gather  (cost=0.00..5.23 rows=1000000 width=8)
                    ->  Seq Scan on test2  (cost=0.00..5.20 rows=1000000 width=8)
        ->  Hash  (cost=5.00..5.00 rows=10 width=8)
              ->  Redistribution  (cost=0.00..5.00 rows=10 width=8)
                    Hash Key: test1.b
                    ->  Local Gather  (cost=0.00..5.00 rows=10 width=8)
                          ->  Seq Scan on test1  (cost=0.00..5.00 rows=10 width=8)