All Products
Search
Document Center

AnalyticDB for PostgreSQL:Dynamic join filter

Last Updated:Apr 22, 2024

The dynamic join filter feature can be used before a hash join to filter out the data that does not match the hash join from the left table by using the dynamically collected join key of the right table. This improves the hash join performance of AnalyticDB for PostgreSQL.

Precautions

  • Dynamic join filter is available only in AnalyticDB for PostgreSQL V6.3.8.0 or later. For information about how to update the minor engine version of an AnalyticDB for PostgreSQL instance, see Update the minor engine version.

  • Dynamic join filter is supported for the Legacy query optimizer but not for the ORCA optimizer. For more information about optimizers, see Optimize query performance.

  • Dynamic join filter is supported only for the Laser computing engine. For more information about the Laser computing engine, see Use the Laser computing engine.

  • Dynamic join filter is not supported for the SQL statements that contain IntiPlan in the execution plan.

  • Dynamic join filter is supported for the User Datagram Protocol (UDP) but not for the Transmission Control Protocol (TCP).

Feature introduction

In AnalyticDB for PostgreSQL, hash joins are commonly used and can implement more than 95% of correlated queries. Data queries may hit a bottleneck when hash joins are used because hash joins involve disk reads and writes, network interaction, and large amounts of computations.

The dynamic join filter feature is supported as of AnalyticDB for PostgreSQL V6.3.8.0. This feature can be used before a hash join to filter out the data that does not match the hash join from the left table by using the dynamically collected join key of the right table. This way, the disk reading, networking, and CPU computing overheads are reduced to improve the hash join performance. The following figure shows the principle of dynamic join filter.

Dynamic Join Filter

Disable or enable the dynamic join filter feature

By default, the dynamic join filter feature is enabled. You can modify the adbpg_enable_dynamic_join_filter value to disable or enable this feature.

  • Execute the following statement to disable the dynamic join filter feature for a session:

    SET adbpg_enable_dynamic_join_filter TO off;
  • Execute the following statement to enable the dynamic join filter feature for a session:

    SET adbpg_enable_dynamic_join_filter TO on;

To disable or enable the dynamic join filter feature for an instance, Submit a ticket to contact the technical support personnel.

Examples

In this example, the lineitem and part tables in the TPC-H test set are used. The following section uses TPC-H Q17 to compare the query duration before and after the dynamic join filter feature is enabled.

Note

The TPC-H performance tests described in this topic are implemented based on the TPC-H benchmark tests but cannot meet all requirements of TPC-H benchmark tests. Therefore, the test results described in this topic are incomparable with the published TPC-H benchmark test results.

  1. Generate 1 GB of test data based on the TPC-H benchmark. For more information, see the "Generate test data" section in TPC-H.

  2. Execute the following statements to create the lineitem and part tables:

    CREATE TABLE LINEITEM (
        L_ORDERKEY    BIGINT NOT NULL,
        L_PARTKEY     INTEGER NOT NULL,
        L_SUPPKEY     INTEGER NOT NULL,
        L_LINENUMBER  INTEGER NOT NULL,
        L_QUANTITY    NUMERIC(15,2) NOT NULL,
        L_EXTENDEDPRICE  NUMERIC(15,2) NOT NULL,
        L_DISCOUNT    NUMERIC(15,2) NOT NULL,
        L_TAX         NUMERIC(15,2) NOT NULL,
        L_RETURNFLAG  CHAR(1) NOT NULL,
        L_LINESTATUS  CHAR(1) NOT NULL,
        L_SHIPDATE    DATE NOT NULL,
        L_COMMITDATE  DATE NOT NULL,
        L_RECEIPTDATE DATE NOT NULL,
        L_SHIPINSTRUCT CHAR(25) NOT NULL,
        L_SHIPMODE     CHAR(10) NOT NULL,
        L_COMMENT      VARCHAR(44) NOT NULL
    ) WITH (APPENDONLY=TRUE, ORIENTATION=COLUMN)
    DISTRIBUTED BY (L_ORDERKEY, L_LINENUMBER);
    
    CREATE TABLE PART (
        P_PARTKEY     INTEGER NOT NULL,
        P_NAME        VARCHAR(55) NOT NULL,
        P_MFGR        CHAR(25) NOT NULL,
        P_BRAND       CHAR(10) NOT NULL,
        P_TYPE        VARCHAR(25) NOT NULL,
        P_SIZE        INTEGER NOT NULL,
        P_CONTAINER   CHAR(10) NOT NULL,
        P_RETAILPRICE NUMERIC(15,2) NOT NULL,
        P_COMMENT     VARCHAR(23) NOT NULL
    ) WITH (APPENDONLY=TRUE, ORIENTATION=COLUMN)
    DISTRIBUTED BY (P_PARTKEY);
  3. Execute the following \COPY statements to import 1 GB of test data to the lineitem and part tables:

    \COPY LINEITEM FROM 'lineitem.tbl' with DELIMITER '|' NULL '';
    \COPY PART FROM 'part.tbl' with DELIMITER '|' NULL '';
  4. Query the execution duration of a Q17 query when the dynamic join filter feature is disabled.

    1. Disable the dynamic join filter feature.

      SET adbpg_enable_dynamic_join_filter TO off;
    2. Execute a TCP-H Q17 query.

      SELECT
          sum(l_extendedprice) / 7.0 as avg_yearly
      FROM
          lineitem,
          part
      WHERE
          p_partkey = l_partkey
          and p_brand = 'Brand#54'
          and p_container = 'SM CAN'
          and l_quantity < (
              SELECT
                  0.2 * avg(l_quantity)
              FROM
                  lineitem
              WHERE
                  l_partkey = p_partkey
          );
    3. View the execution duration from the query result. The execution duration is 3,468 ms.

           avg_yearly
      ---------------------
       336452.465714285714
      (1 row)
      
      Time: 3468.411 ms
  5. Query the execution duration of a Q17 query when the dynamic join filter feature is enabled.

    1. Enable the dynamic join filter feature.

      SET adbpg_enable_dynamic_join_filter TO on;
    2. Execute a TCP-H Q17 query.

      SELECT
          sum(l_extendedprice) / 7.0 as avg_yearly
      FROM
          lineitem,
          part
      WHERE
          p_partkey = l_partkey
          and p_brand = 'Brand#54'
          and p_container = 'SM CAN'
          and l_quantity < (
              SELECT
                  0.2 * avg(l_quantity)
              FROM
                  lineitem
              WHERE
                  l_partkey = p_partkey
          );
    3. View the execution duration from the query result. The execution duration is 305 ms.

           avg_yearly
      ---------------------
       336452.465714285714
      (1 row)
      
      Time: 305.632 ms

After the dynamic join filter feature is enabled, the query performance is significantly improved as the execution duration is reduced from 3,468 ms to 305 ms.