All Products
Search
Document Center

PolarDB:Partition-wise join

Last Updated:Dec 26, 2025

PolarDB for PostgreSQL and support the partition-wise join feature. This feature improves the performance of join queries by reducing invalid joins between partitions.

Overview

Partition-wise join optimizes joins between two partitioned tables. When you join these tables on their partition keys, this feature eliminates invalid joins between partitions and improves query performance.

image.png

Usage

Execute the following statement to enable the partition-wise join feature:

set enable_partitionwise_join to on;

Example

The following two examples describe partition-wise joins in detail.

The example uses two tables: measurement and sales.

CREATE TABLE measurement(
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    unitsales       int
) PARTITION BY RANGE (logdate);
CREATE TABLE measurement_y2023q1 PARTITION OF measurement
    FOR VALUES FROM ('2023-01-01') TO ('2023-04-01');
CREATE TABLE measurement_y2023q2 PARTITION OF measurement
    FOR VALUES FROM ('2023-04-01') TO ('2023-07-01');
CREATE TABLE measurement_y2023q3 PARTITION OF measurement
    FOR VALUES FROM ('2023-07-01') TO ('2023-10-01');
CREATE TABLE measurement_y2023q4 PARTITION OF measurement
    FOR VALUES FROM ('2023-10-01') TO ('2024-04-01');
    
CREATE TABLE sales (
    dept_no     integer,
    part_no     varchar(2),
    country     varchar(20),
    date        date,
    amount      decimal
) PARTITION BY RANGE (date);
CREATE TABLE sales_y2023q1 PARTITION OF sales
    FOR VALUES FROM ('2023-01-01') TO ('2023-04-01');
CREATE TABLE sales_y2023q2 PARTITION OF sales
    FOR VALUES FROM ('2023-04-01') TO ('2023-07-01');
CREATE TABLE sales_y2023q3 PARTITION OF sales
    FOR VALUES FROM ('2023-07-01') TO ('2023-10-01');
CREATE TABLE sales_y2023q4 PARTITION OF sales
    FOR VALUES FROM ('2023-10-01') TO ('2024-04-01');

These statements show that:

  • The measurement table has four partitions: measurement_y2023q1, measurement_y2023q2, measurement_y2023q3, and measurement_y2023q4. These partitions correspond to the four quarters of 2023.

  • The sales table also has four partitions: sales_y2023q1, sales_y2023q2, sales_y2023q3, and sales_y2023q4. These partitions also correspond to the four quarters of 2023.

Now, execute a join query on the measurement and sales tables and view its query plan:

explain select a.* from sales a join measurement b on a.date = b.logdate where b.unitsales > 10;

When the partition-wise join feature is disabled, the measurement and sales tables are joined in their entirety. The query plan is as follows:

                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Aggregate  (cost=871.75..871.76 rows=1 width=8)
   ->  Merge Join  (cost=448.58..812.79 rows=23587 width=32)
         Merge Cond: (a.date = b.logdate)
         ->  Sort  (cost=185.83..191.03 rows=2080 width=40)
               Sort Key: a.date
               ->  Append  (cost=0.00..71.20 rows=2080 width=40)
                     ->  Seq Scan on sales_y2023q1 a  (cost=0.00..15.20 rows=520 width=40)
                     ->  Seq Scan on sales_y2023q2 a_1  (cost=0.00..15.20 rows=520 width=40)
                     ->  Seq Scan on sales_y2023q3 a_2  (cost=0.00..15.20 rows=520 width=40)
                     ->  Seq Scan on sales_y2023q4 a_3  (cost=0.00..15.20 rows=520 width=40)
         ->  Sort  (cost=262.75..268.42 rows=2268 width=8)
               Sort Key: b.logdate
               ->  Append  (cost=0.00..136.34 rows=2268 width=8)
                     ->  Seq Scan on measurement_y2023q1 b  (cost=0.00..31.25 rows=567 width=8)
                           Filter: (unitsales > 10)
                     ->  Seq Scan on measurement_y2023q2 b_1  (cost=0.00..31.25 rows=567 width=8)
                           Filter: (unitsales > 10)
                     ->  Seq Scan on measurement_y2023q3 b_2  (cost=0.00..31.25 rows=567 width=8)
                           Filter: (unitsales > 10)
                     ->  Seq Scan on measurement_y2023q4 b_3  (cost=0.00..31.25 rows=567 width=8)
                           Filter: (unitsales > 10)
(21 rows)

As you can see, the query plan shows a join on all data from the measurement and sales tables. However, this creates invalid joins. For example, the join between sales_y2023q1 and measurement_y2023q3 is always empty. This is because the join condition requires the partition keys to be equal, but the partition keys of sales_y2023q1 and measurement_y2023q3 are not equal. The join returns results only when partitions with matching partition keys are joined, such as sales_y2023q1 and measurement_y2023q1.

If the partition-wise join feature is enabled:

set enable_partitionwise_join to on;

Then, execute the same join query on the measurement and sales tables. The query plan is as follows:

explain select a.* from sales a join measurement b on a.date = b.logdate where b.unitsales > 10;
                                       QUERY PLAN                                       
----------------------------------------------------------------------------------------
 Append  (cost=21.70..453.33 rows=5896 width=128)
   ->  Hash Join  (cost=21.70..105.96 rows=1474 width=128)
         Hash Cond: (b.logdate = a.date)
         ->  Seq Scan on measurement_y2023q1 b  (cost=0.00..31.25 rows=567 width=8)
               Filter: (unitsales > 10)
         ->  Hash  (cost=15.20..15.20 rows=520 width=128)
               ->  Seq Scan on sales_y2023q1 a  (cost=0.00..15.20 rows=520 width=128)
   ->  Hash Join  (cost=21.70..105.96 rows=1474 width=128)
         Hash Cond: (b_1.logdate = a_1.date)
         ->  Seq Scan on measurement_y2023q2 b_1  (cost=0.00..31.25 rows=567 width=8)
               Filter: (unitsales > 10)
         ->  Hash  (cost=15.20..15.20 rows=520 width=128)
               ->  Seq Scan on sales_y2023q2 a_1  (cost=0.00..15.20 rows=520 width=128)
   ->  Hash Join  (cost=21.70..105.96 rows=1474 width=128)
         Hash Cond: (b_2.logdate = a_2.date)
         ->  Seq Scan on measurement_y2023q3 b_2  (cost=0.00..31.25 rows=567 width=8)
               Filter: (unitsales > 10)
         ->  Hash  (cost=15.20..15.20 rows=520 width=128)
               ->  Seq Scan on sales_y2023q3 a_2  (cost=0.00..15.20 rows=520 width=128)
   ->  Hash Join  (cost=21.70..105.96 rows=1474 width=128)
         Hash Cond: (b_3.logdate = a_3.date)
         ->  Seq Scan on measurement_y2023q4 b_3  (cost=0.00..31.25 rows=567 width=8)
               Filter: (unitsales > 10)
         ->  Hash  (cost=15.20..15.20 rows=520 width=128)
               ->  Seq Scan on sales_y2023q4 a_3  (cost=0.00..15.20 rows=520 width=128)
(25 rows)

After you enable partition-wise join, the optimization is clear. The database joins only matching partitions: sales_y2023q2 with measurement_y2023q2, sales_y2023q3 with measurement_y2023q3, and sales_y2023q4 with measurement_y2023q4. This process greatly reduces the number of invalid joins between partitions and significantly improves join query performance.