When a query joins multiple tables and uses an OR condition that references columns from different tables, the PostgreSQL optimizer treats the entire OR expression as a join filter applied after the tables are joined. This prevents index use on either side of the OR and forces full table scans on both tables.
PolarDB for PostgreSQL (Compatible with Oracle) can automatically rewrite such queries as UNION ALL, allowing the optimizer to push each predicate down to its own table and use the available indexes. The rewrite is applied only when the resulting plan costs less than the original — enabling the feature does not degrade performance for queries where the rewrite is not beneficial.
Supported versions
This feature requires PolarDB for PostgreSQL (Compatible with Oracle) 2.0, revision version 2.0.14.27.0 or later.
To check your revision version, run:
SHOW polar_version;To upgrade, see Upgrade the version.
How it works
Single-table OR: index-friendly by default
When all columns in an OR condition belong to a single table and each column is indexed, the optimizer generates a BitmapOr index path without any intervention:
EXPLAIN SELECT * FROM my_test WHERE (id = 123 OR name = '123' OR salary = 123.0); QUERY PLAN
---------------------------------------------------------------------------------------
Bitmap Heap Scan on my_test (cost=12.90..24.33 rows=3 width=15)
Recheck Cond: ((id = 123) OR ((name)::text = '123'::text) OR (salary = 123.0))
-> BitmapOr (cost=12.90..12.90 rows=3 width=0)
-> Bitmap Index Scan on my_test_id_idx (cost=0.00..4.30 rows=1 width=0)
Index Cond: (id = 123)
-> Bitmap Index Scan on my_test_name_idx (cost=0.00..4.30 rows=1 width=0)
Index Cond: ((name)::text = '123'::text)
-> Bitmap Index Scan on my_test_salary_idx (cost=0.00..4.30 rows=1 width=0)
Index Cond: (salary = 123.0)
(9 rows)Multi-table OR: the problem
When the OR condition references columns from different tables, the optimizer cannot push the individual predicates down to their respective tables. It applies the entire OR expression as a Join Filter after the join, scanning both tables in full:
Without conversion — the OR condition becomes a join filter, forcing full scans on both tables:
EXPLAIN ANALYZE SELECT * FROM t1 JOIN t2 ON t1.id = t2.id WHERE (t1.num = 1 OR t2.cnt = 2); QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Hash Join (cost=299.00..660.50 rows=110 width=51) (actual time=5.992..15.673 rows=110 loops=1)
Hash Cond: (t1.id = t2.id)
Join Filter: ((t1.num = 1) OR (t2.cnt = 2))
Rows Removed by Join Filter: 9890
-> Seq Scan on t1 (cost=0.00..174.00 rows=10000 width=25) (actual time=0.012..2.080 rows=10000 loops=1)
-> Hash (cost=174.00..174.00 rows=10000 width=26) (actual time=5.855..5.857 rows=10000 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 704kB
-> Seq Scan on t2 (cost=0.00..174.00 rows=10000 width=26) (actual time=0.007..1.779 rows=10000 loops=1)
Planning Time: 1.237 ms
Execution Time: 15.836 ms
(10 rows)The optimizer uses indexes on neither t1.num nor t2.cnt. Both tables are fully scanned.
With conversion — the optimizer rewrites the query as UNION ALL, enabling index use on each branch:
EXPLAIN ANALYZE SELECT * FROM t1 JOIN t2 ON t1.id = t2.id WHERE (t1.num = 1 OR t2.cnt = 2); QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Append (cost=85.48..411.16 rows=110 width=51) (actual time=0.396..4.822 rows=110 loops=1)
-> Hash Join (cost=85.48..297.98 rows=100 width=51) (actual time=0.395..4.639 rows=100 loops=1)
Hash Cond: (t2.id = t1.id)
-> Seq Scan on t2 (cost=0.00..174.00 rows=10000 width=26) (actual time=0.010..1.750 rows=10000 loops=1)
-> Hash (cost=84.23..84.23 rows=100 width=25) (actual time=0.333..0.335 rows=100 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 14kB
-> Bitmap Heap Scan on t1 (cost=5.06..84.23 rows=100 width=25) (actual time=0.056..0.247 rows=100 loops=1)
Recheck Cond: (num = 1)
Heap Blocks: exact=73
-> Bitmap Index Scan on t1_num_idx (cost=0.00..5.04 rows=100 width=0) (actual time=0.028..0.028 rows=100 loops=1)
Index Cond: (num = 1)
-> Nested Loop (cost=4.65..112.63 rows=10 width=51) (actual time=0.049..0.164 rows=10 loops=1)
-> Bitmap Heap Scan on t2 (cost=4.36..33.46 rows=10 width=26) (actual time=0.027..0.044 rows=10 loops=1)
Recheck Cond: (cnt = 2)
Heap Blocks: exact=10
-> Bitmap Index Scan on t2_cnt_idx (cost=0.00..4.36 rows=10 width=0) (actual time=0.019..0.019 rows=10 loops=1)
Index Cond: (cnt = 2)
-> Index Scan using t1_id_idx on t1 (cost=0.29..7.91 rows=1 width=25) (actual time=0.010..0.010 rows=1 loops=10)
Index Cond: (id = t2.id)
Filter: ((num <> 1) OR ((num = 1) IS NULL))
Planning Time: 2.903 ms
Execution Time: 4.980 ms
(22 rows)Execution time drops from 15.836 ms to 4.980 ms. The Bitmap Index Scan on t1_num_idx and Bitmap Index Scan on t2_cnt_idx lines confirm that both indexes are now used.
The second UNION ALL branch includes (num <> 1) OR ((num = 1) IS NULL) as a filter on t1. This excludes rows already returned by the first branch (t1.num = 1), ensuring correct deduplication without a DISTINCT pass.
Parameters
| Parameter | Description | Default |
|---|---|---|
polar_cbqt_convert_or_to_union_all_mode | Controls whether the rewrite is applied. OFF: disabled. ON: the optimizer rewrites when the converted plan costs less than the original. FORCE: the optimizer always rewrites, regardless of cost. | OFF |
polar_cbqt_cost_threshold | The minimum estimated cost of the original plan required to trigger the rewrite. Only queries whose plan cost exceeds this threshold are candidates. Valid values: (0,+∞). | 50000 |
Do not set polar_cbqt_cost_threshold to 0. This forces the rewrite for every query regardless of cost and can increase execution time.
Limitations
| Limitation | What to do instead |
|---|---|
| OR conditions that reference only one table are never rewritten. | Ensure the relevant columns are indexed so the optimizer can generate a BitmapOr plan automatically. |
| OR conditions with more than six parameters are not rewritten. | Manually rewrite the query as UNION ALL. |
| If a query contains more than six OR conditions, only the first six are rewritten. | Manually rewrite the full query as UNION ALL, or split it into multiple queries. |
In FORCE mode, the optimizer selects the lowest-cost path among all valid converted plans. | — |
Setting polar_cbqt_cost_threshold to 0 or using FORCE mode applies the rewrite globally. | Use a query hint to force the rewrite for a specific query without changing global settings (see Force the rewrite with a hint). |
Enable the rewrite
Prepare sample data
The examples below use the following tables and indexes:
CREATE TABLE t1(id int, num int, dsc text, log_date text);
CREATE TABLE t2(id int, cnt int, change text, op_date text);
INSERT INTO t1 SELECT i, i%100, 'test'||1, to_char('1990-10-10'::date + i, 'YYYY-MM-DD') FROM generate_series(1,10000)i;
INSERT INTO t2 SELECT i, i%1000, 'now'||i, to_char('1990-10-10'::date + i, 'YYYY-MM-DD') FROM generate_series(1,10000)i;
CREATE INDEX ON t1(id);
CREATE INDEX ON t1(num);
CREATE INDEX ON t2(id);
CREATE INDEX ON t2(cnt);
ANALYZE t1;
ANALYZE t2;Enable ON mode
Set polar_cbqt_convert_or_to_union_all_mode to ON. Set the cost threshold low enough that queries you want to optimize qualify — the original plan in this example has a cost of 660.50, so a threshold of 100 is sufficient:
SET polar_cbqt_cost_threshold TO 100;
SET polar_cbqt_convert_or_to_union_all_mode TO on;
EXPLAIN ANALYZE SELECT * FROM t1 JOIN t2 ON t1.id = t2.id WHERE (t1.num = 1 OR t2.cnt = 2);The optimizer rewrites the query as an Append (UNION ALL) plan and uses indexes on t1.num and t2.cnt. Execution time drops from 15.836 ms to 4.980 ms (see the full plan output above).
Force the rewrite with FORCE mode
ON mode converts only when the converted plan costs less than the original. FORCE mode always converts, which is useful when cost estimation is inaccurate and you know the converted plan performs better in practice.
The following query joins on unindexed text columns (t1.dsc and t2.change). In ON mode, the optimizer does not convert because the cost estimate does not favor it:
SET polar_cbqt_cost_threshold TO 100;
SET polar_cbqt_convert_or_to_union_all_mode TO on;
EXPLAIN ANALYZE SELECT * FROM t1, t2 WHERE t1.dsc = t2.change AND (t1.log_date = '2024-01-01' OR t2.op_date = '2024-01-01'); QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Hash Join (cost=299.00..660.50 rows=2 width=51) (actual time=14.321..14.325 rows=0 loops=1)
Hash Cond: (t1.dsc = t2.change)
Join Filter: ((t1.log_date = '2024-01-01'::text) OR (t2.op_date = '2024-01-01'::text))
-> Seq Scan on t1 (cost=0.00..174.00 rows=10000 width=25) (actual time=0.016..3.204 rows=10000 loops=1)
-> Hash (cost=174.00..174.00 rows=10000 width=26) (actual time=6.506..6.508 rows=10000 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 704kB
-> Seq Scan on t2 (cost=0.00..174.00 rows=10000 width=26) (actual time=0.006..1.755 rows=10000 loops=1)
Planning Time: 0.932 ms
Execution Time: 14.571 ms
(9 rows)Switching to FORCE mode forces the rewrite:
SET polar_cbqt_cost_threshold TO 100;
SET polar_cbqt_convert_or_to_union_all_mode TO force;
EXPLAIN ANALYZE SELECT * FROM t1, t2 WHERE t1.dsc = t2.change AND (t1.log_date = '2024-01-01' OR t2.op_date = '2024-01-01'); QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Append (cost=199.01..871.05 rows=2 width=51) (actual time=9.915..9.923 rows=0 loops=1)
-> Hash Join (cost=199.01..410.52 rows=1 width=51) (actual time=5.046..5.050 rows=0 loops=1)
Hash Cond: (t2.change = t1.dsc)
-> Seq Scan on t2 (cost=0.00..174.00 rows=10000 width=26) (actual time=0.015..0.015 rows=1 loops=1)
-> Hash (cost=199.00..199.00 rows=1 width=25) (actual time=5.014..5.016 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Seq Scan on t1 (cost=0.00..199.00 rows=1 width=25) (actual time=5.013..5.013 rows=0 loops=1)
Filter: (log_date = '2024-01-01'::text)
Rows Removed by Filter: 10000
-> Hash Join (cost=199.01..460.52 rows=1 width=51) (actual time=4.865..4.867 rows=0 loops=1)
Hash Cond: (t1.dsc = t2.change)
-> Seq Scan on t1 (cost=0.00..224.00 rows=9999 width=25) (actual time=0.015..0.016 rows=1 loops=1)
Filter: ((log_date <> '2024-01-01'::text) OR ((log_date = '2024-01-01'::text) IS NULL))
-> Hash (cost=199.00..199.00 rows=1 width=26) (actual time=4.828..4.829 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Seq Scan on t2 (cost=0.00..199.00 rows=1 width=26) (actual time=4.827..4.827 rows=0 loops=1)
Filter: (op_date = '2024-01-01'::text)
Rows Removed by Filter: 10000
Planning Time: 0.777 ms
Execution Time: 10.088 ms
(20 rows)The FORCE plan has a higher estimated cost (871.05 vs 660.50) but runs faster (10.088 ms vs 14.571 ms). This illustrates that cost estimates can be inaccurate for certain query shapes. Use FORCE mode only after verifying through testing that the converted plan performs better.
Force the rewrite with a hint
To force the rewrite on a single query without changing global parameters, use a hint. This is the preferred approach when you want to apply the optimization selectively.
SET polar_cbqt_convert_or_to_union_all_mode TO off;
EXPLAIN ANALYZE /*+ Set(polar_cbqt_convert_or_to_union_all_mode force) Set(polar_cbqt_cost_threshold 0) */
SELECT * FROM t1, t2 WHERE t1.dsc = t2.change AND (t1.log_date = '2024-01-01' OR t2.op_date = '2024-01-01'); QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Append (cost=199.01..871.05 rows=2 width=51) (actual time=9.684..9.691 rows=0 loops=1)
-> Hash Join (cost=199.01..410.52 rows=1 width=51) (actual time=4.711..4.714 rows=0 loops=1)
Hash Cond: (t2.change = t1.dsc)
-> Seq Scan on t2 (cost=0.00..174.00 rows=10000 width=26) (actual time=0.013..0.013 rows=1 loops=1)
-> Hash (cost=199.00..199.00 rows=1 width=25) (actual time=4.682..4.684 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Seq Scan on t1 (cost=0.00..199.00 rows=1 width=25) (actual time=4.681..4.681 rows=0 loops=1)
Filter: (log_date = '2024-01-01'::text)
Rows Removed by Filter: 10000
-> Hash Join (cost=199.01..460.52 rows=1 width=51) (actual time=4.969..4.970 rows=0 loops=1)
Hash Cond: (t1.dsc = t2.change)
-> Seq Scan on t1 (cost=0.00..224.00 rows=9999 width=25) (actual time=0.018..0.018 rows=1 loops=1)
Filter: ((log_date <> '2024-01-01'::text) OR ((log_date = '2024-01-01'::text) IS NULL))
-> Hash (cost=199.00..199.00 rows=1 width=26) (actual time=4.935..4.936 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Seq Scan on t2 (cost=0.00..199.00 rows=1 width=26) (actual time=4.934..4.934 rows=0 loops=1)
Filter: (op_date = '2024-01-01'::text)
Rows Removed by Filter: 10000
Planning Time: 0.798 ms
Execution Time: 9.858 ms
(20 rows)The hint overrides the session-level off setting for this query only, producing the same Append plan as FORCE mode.
Cost threshold behavior
polar_cbqt_cost_threshold sets the minimum plan cost before conversion is attempted. In the basic example, the original plan cost is 660.50. Setting the threshold to 1000 — above 660.50 — prevents conversion even in FORCE mode:
SET polar_cbqt_cost_threshold TO 1000;
SET polar_cbqt_convert_or_to_union_all_mode TO force;
EXPLAIN ANALYZE SELECT * FROM t1 JOIN t2 ON t1.id = t2.id WHERE (t1.num = 1 OR t2.cnt = 2); QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Hash Join (cost=299.00..660.50 rows=110 width=51) (actual time=6.374..15.802 rows=110 loops=1)
Hash Cond: (t1.id = t2.id)
Join Filter: ((t1.num = 1) OR (t2.cnt = 2))
Rows Removed by Join Filter: 9890
-> Seq Scan on t1 (cost=0.00..174.00 rows=10000 width=25) (actual time=0.011..2.038 rows=10000 loops=1)
-> Hash (cost=174.00..174.00 rows=10000 width=26) (actual time=6.266..6.268 rows=10000 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 704kB
-> Seq Scan on t2 (cost=0.00..174.00 rows=10000 width=26) (actual time=0.006..1.778 rows=10000 loops=1)
Planning Time: 0.663 ms
Execution Time: 16.036 ms
(10 rows)Because 660.50 is below the 1000 threshold, the optimizer keeps the original Hash Join plan. Set the threshold lower than the plan cost of the queries you want to optimize.
What's next
Version management — upgrade your cluster to a supported revision version