All Products
Search
Document Center

PolarDB:Cost-based query transformation (CBQT)

Last Updated:Mar 28, 2026

Cost-based query transformation (CBQT) extends PolarDB for PostgreSQL's query optimizer to evaluate whether specific transformations improve execution cost before applying them. For complex queries involving sublinks or OR conditions, CBQT can reduce execution time by several orders of magnitude.

CBQT is available on PolarDB for PostgreSQL 14 (revision version 2.0.14.13.28.0 or later) and PolarDB for PostgreSQL 11 (revision version 2.0.11.15.44.0 or later). Run SHOW polardb_version; or view the revision version in the console to check your version. To upgrade, see Version management.

How it works

PostgreSQL applies several query transformations unconditionally — subquery pull-up, outer join removal, expression preprocessing, useless join removal, and predicate pushdown — because they always preserve equivalence and maintain or improve plan quality.

Other transformations, such as sublink pushdown and OR to UNION ALL conversion, can either improve or degrade a query plan depending on data distribution and query structure. Because applying these transformations does not always result in a better plan, PolarDB for PostgreSQL uses CBQT to evaluate them selectively by comparing execution costs before applying them.

For a given query, CBQT identifies all applicable cost-based transformations across each query block and arranges them into a state space. Each state represents a combination of transformations applied to specific query blocks. CBQT then searches this state space using the configured strategy and selects the state with the lowest execution cost.

For example, given transformations A and B applicable in Query Block 1, and transformation A applicable in Query Block 2, CBQT evaluates these states:

  • None: no transformation applied

  • [1, A]: transformation A applied to Query Block 1

  • [1, B]: transformation B applied to Query Block 1

  • [2, A]: transformation A applied to Query Block 2

Under the default linear search strategy, CBQT evaluates each state in turn. If state [1, B] produces the lowest cost, CBQT uses that plan.

image

Version requirements

CBQT requires one of the following versions:

  • PolarDB for PostgreSQL 14, revision version 2.0.14.13.28.0 or later

  • PolarDB for PostgreSQL 11, revision version 2.0.11.15.44.0 or later

Run SHOW polardb_version; or view the revision version in the console to check your version. To upgrade, see Version management.

Configure CBQT

CBQT is controlled by four parameters. For clusters running revision version 2.0.14.15.29.0 or later, configure these parameters directly in the PolarDB console — see Configure cluster parameters. For all other clusters, connect to the cluster and use SET to configure them at the session level — see Connect to a PolarDB for PostgreSQL cluster.

CBQT control parameters

ParameterDescriptionDefaultValid values
polar_enable_cbqtEnables or disables CBQT.offon, off
polar_cbqt_cost_thresholdThe minimum execution cost of the original plan required to trigger CBQT. CBQT is skipped if the original plan's cost is below this threshold.50000[0, +∞)
polar_cbqt_strategyThe search strategy for the state space. linear compares costs before and after applying each state and selects the best incrementally. twophase compares applying all states against applying none, then selects the better option.linearlinear, twophase
polar_cbqt_iteration_limitThe maximum number of iterations CBQT runs when searching the state space. Higher values increase the chance of finding the optimal plan at the cost of longer planning time.10[1, +∞)

Supported transformations

CBQT currently supports two cost-based transformations, each controlled by a separate parameter:

ParameterTransformationDetails
polar_cbqt_pushdown_sublinkSublink pushdown — pushes a sublink into a subquery so the optimizer can generate a parameterized path and use an index on the inner tableSublink pushdown
polar_cbqt_convert_or_to_union_all_modeOR to UNION ALL conversion — rewrites an OR condition as a UNION ALL to enable more efficient access paths per branchConvert an OR condition into a UNION ALL

Examples

The following examples use sublink pushdown to demonstrate how CBQT parameters affect query planning and execution time.

Set up test tables

CREATE TABLE t_small(a int);
CREATE TABLE t_big(a int, b int, c int);

CREATE INDEX ON t_big(a);

INSERT INTO t_big SELECT i, i, i FROM generate_series(1, 1000000) i;
INSERT INTO t_small VALUES(1), (1000000);

ANALYZE t_small, t_big;

Verify that CBQT changes the query plan

The following query has a sublink (a IN (SELECT a FROM t_small)) inside a subquery. When CBQT and sublink pushdown are enabled, the optimizer pushes the IN condition down into the subquery. This is semantically equivalent to moving the filter earlier in the query:

-- Original query
SELECT * FROM (SELECT a, sum(b) b FROM t_big GROUP BY a) v
WHERE a IN (SELECT a FROM t_small);

-- Equivalent plan after sublink pushdown:
-- The IN condition is pushed inside the subquery so that t_big is accessed
-- with a parameterized index scan keyed on t_small.a, rather than scanning
-- all rows and filtering afterward.
SELECT * FROM (
  SELECT a, sum(b) b FROM t_big
  WHERE a IN (SELECT a FROM t_small)  -- pushed here
  GROUP BY a
) v;

Use EXPLAIN (COSTS FALSE) to verify the structural change without noise from cost numbers.

Without CBQT — sublink pushdown is configured but CBQT is off, so the transformation does not take effect:

SET polar_enable_cbqt TO off;
SET polar_cbqt_pushdown_sublink TO on;

EXPLAIN (COSTS FALSE)
SELECT * FROM (SELECT a, sum(b) b FROM t_big GROUP BY a) v
WHERE a IN (SELECT a FROM t_small);
QUERY PLAN
-----------------------------------------------------------
 Merge Semi Join
   Merge Cond: (t_big.a = t_small.a)
   ->  GroupAggregate
         Group Key: t_big.a
         ->  Index Scan using t_big_a_idx on t_big
   ->  Sort
         Sort Key: t_small.a
         ->  Seq Scan on t_small

With CBQT — CBQT evaluates the cost of pushing the sublink down and finds it beneficial:

SET polar_enable_cbqt TO on;
SET polar_cbqt_pushdown_sublink TO on;

EXPLAIN (COSTS FALSE)
SELECT * FROM (SELECT a, sum(b) b FROM t_big GROUP BY a) v
WHERE a IN (SELECT a FROM t_small);
QUERY PLAN
-----------------------------------------------------------
 GroupAggregate
   Group Key: t_big.a
   ->  Sort
         Sort Key: t_big.a
         ->  Nested Loop
               ->  Unique
                     ->  Sort
                           Sort Key: t_small.a
                           ->  Seq Scan on t_small
               ->  Index Scan using t_big_a_idx on t_big
                     Index Cond: (a = t_small.a)

The plan shifts from a full-table Merge Semi Join to a Nested Loop with an index scan keyed on t_small.a. t_big is now scanned only for the two rows that match t_small.

Measure the performance impact

Use EXPLAIN ANALYZE to compare actual execution time.

Without CBQT:

SET polar_enable_cbqt TO off;
SET polar_cbqt_pushdown_sublink TO on;

EXPLAIN ANALYZE
SELECT * FROM (SELECT a, sum(b) b FROM t_big GROUP BY a) v
WHERE a IN (SELECT a FROM t_small);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Semi Join  (cost=1.46..59511.17 rows=10000 width=12) (actual time=0.052..1274.435 rows=2 loops=1)
   Merge Cond: (t_big.a = t_small.a)
   ->  GroupAggregate  (cost=0.42..46910.13 rows=1000000 width=12) (actual time=0.033..1151.005 rows=1000000 loops=1)
         Group Key: t_big.a
         ->  Index Scan using t_big_a_idx on t_big  (cost=0.42..31910.13 rows=1000000 width=8) (actual time=0.022..433.821 rows=1000000 loops=1)
   ->  Sort  (cost=1.03..1.03 rows=2 width=4) (actual time=0.015..0.016 rows=2 loops=1)
         Sort Key: t_small.a
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on t_small  (cost=0.00..1.02 rows=2 width=4) (actual time=0.005..0.006 rows=2 loops=1)
 Planning Time: 0.904 ms
 Execution Time: 1274.539 ms
(11 rows)

With CBQT:

SET polar_enable_cbqt TO on;
SET polar_cbqt_pushdown_sublink TO on;

EXPLAIN ANALYZE
SELECT * FROM (SELECT a, sum(b) b FROM t_big GROUP BY a) v
WHERE a IN (SELECT a FROM t_small);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=17.96..17.99 rows=2 width=12) (actual time=0.060..0.063 rows=2 loops=1)
   Group Key: t_big.a
   ->  Sort  (cost=17.96..17.96 rows=2 width=8) (actual time=0.052..0.053 rows=2 loops=1)
         Sort Key: t_big.a
         Sort Method: quicksort  Memory: 25kB
         ->  Nested Loop  (cost=1.46..17.95 rows=2 width=8) (actual time=0.032..0.046 rows=2 loops=1)
               ->  Unique  (cost=1.03..1.04 rows=2 width=4) (actual time=0.014..0.018 rows=2 loops=1)
                     ->  Sort  (cost=1.03..1.03 rows=2 width=4) (actual time=0.013..0.014 rows=2 loops=1)
                           Sort Key: t_small.a
                           Sort Method: quicksort  Memory: 25kB
                           ->  Seq Scan on t_small  (cost=0.00..1.02 rows=2 width=4) (actual time=0.006..0.007 rows=2 loops=1)
               ->  Index Scan using t_big_a_idx on t_big  (cost=0.42..8.44 rows=1 width=8) (actual time=0.009..0.010 rows=1 loops=2)
                     Index Cond: (a = t_small.a)
 Planning Time: 0.644 ms
 Execution Time: 0.150 ms
(15 rows)

Execution time drops from 1274.539 ms to 0.150 ms — a reduction of over 8,000x.

Understand the cost threshold

CBQT is skipped when the original plan's cost is below polar_cbqt_cost_threshold. In the previous example, the original plan costs 59511.17. Setting the threshold above that value prevents CBQT from running:

SET polar_enable_cbqt TO on;
SET polar_cbqt_cost_threshold TO 500000;
SET polar_cbqt_pushdown_sublink TO on;

EXPLAIN ANALYZE
SELECT * FROM (SELECT a, sum(b) b FROM t_big GROUP BY a) v
WHERE a IN (SELECT a FROM t_small);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Semi Join  (cost=1.46..59511.17 rows=10000 width=12) (actual time=0.059..1253.452 rows=2 loops=1)
   Merge Cond: (t_big.a = t_small.a)
   ->  GroupAggregate  (cost=0.42..46910.13 rows=1000000 width=12) (actual time=0.041..1127.255 rows=1000000 loops=1)
         Group Key: t_big.a
         ->  Index Scan using t_big_a_idx on t_big  (cost=0.42..31910.13 rows=1000000 width=8) (actual time=0.029..414.488 rows=1000000 loops=1)
   ->  Sort  (cost=1.03..1.03 rows=2 width=4) (actual time=0.014..0.015 rows=2 loops=1)
         Sort Key: t_small.a
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on t_small  (cost=0.00..1.02 rows=2 width=4) (actual time=0.005..0.006 rows=2 loops=1)
 Planning Time: 0.280 ms
 Execution Time: 1253.558 ms
(11 rows)

The original plan is used unchanged because 59511.17 < 500000.

Choose between search strategies

When multiple sublinks are present, linear and twophase strategies can select different plans. In the following query, two sublinks are eligible for pushdown, but pushing down only the second sublink (the one joining against the smaller t_small) is optimal.

`linear` strategy — evaluates each state incrementally and selects the best:

SET polar_enable_cbqt TO on;
SET polar_cbqt_strategy TO linear;
SET polar_cbqt_pushdown_sublink TO on;

EXPLAIN
SELECT * FROM (SELECT a, sum(b) b FROM t_big GROUP BY a) v WHERE a IN (SELECT a FROM t_big)
UNION ALL
SELECT * FROM (SELECT a, sum(b) b FROM t_big GROUP BY a) v WHERE a IN (SELECT a FROM t_small);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Append  (cost=0.85..105692.60 rows=500002 width=12)
   ->  Merge Semi Join  (cost=0.85..98174.56 rows=500000 width=12)
         Merge Cond: (t_big_1.a = t_big.a)
         ->  GroupAggregate  (cost=0.42..46910.13 rows=1000000 width=12)
               Group Key: t_big_1.a
               ->  Index Scan using t_big_a_idx on t_big t_big_1  (cost=0.42..31910.13 rows=1000000 width=8)
         ->  Index Only Scan using t_big_a_idx on t_big  (cost=0.42..26264.42 rows=1000000 width=4)
   ->  GroupAggregate  (cost=17.96..17.99 rows=2 width=12)
         Group Key: t_big_2.a
         ->  Sort  (cost=17.96..17.96 rows=2 width=8)
               Sort Key: t_big_2.a
               ->  Nested Loop  (cost=1.46..17.95 rows=2 width=8)
                     ->  Unique  (cost=1.03..1.04 rows=2 width=4)
                           ->  Sort  (cost=1.03..1.03 rows=2 width=4)
                                 Sort Key: t_small.a
                                 ->  Seq Scan on t_small  (cost=0.00..1.02 rows=2 width=4)
                     ->  Index Scan using t_big_a_idx on t_big t_big_2  (cost=0.42..8.44 rows=1 width=8)
                           Index Cond: (a = t_small.a)
(18 rows)

Only the second sublink (against t_small) is pushed down. The first sublink (against t_big) remains as a Merge Semi Join because pushing it down would not reduce cost.

`twophase` strategy — compares applying all transformations against applying none:

SET polar_enable_cbqt TO on;
SET polar_cbqt_strategy TO twophase;
SET polar_cbqt_pushdown_sublink TO on;

EXPLAIN
SELECT * FROM (SELECT a, sum(b) b FROM t_big GROUP BY a) v WHERE a IN (SELECT a FROM t_big)
UNION ALL
SELECT * FROM (SELECT a, sum(b) b FROM t_big GROUP BY a) v WHERE a IN (SELECT a FROM t_small);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.85..113192.60 rows=1000002 width=12)
   ->  GroupAggregate  (cost=0.85..88174.56 rows=1000000 width=12)
         Group Key: t_big.a
         ->  Merge Semi Join  (cost=0.85..73174.56 rows=1000000 width=8)
               Merge Cond: (t_big.a = t_big_1.a)
               ->  Index Scan using t_big_a_idx on t_big  (cost=0.42..31910.13 rows=1000000 width=8)
               ->  Index Only Scan using t_big_a_idx on t_big t_big_1  (cost=0.42..26264.42 rows=1000000 width=4)
   ->  GroupAggregate  (cost=17.96..17.99 rows=2 width=12)
         Group Key: t_big_2.a
         ->  Sort  (cost=17.96..17.96 rows=2 width=8)
               Sort Key: t_big_2.a
               ->  Nested Loop  (cost=1.46..17.95 rows=2 width=8)
                     ->  Unique  (cost=1.03..1.04 rows=2 width=4)
                           ->  Sort  (cost=1.03..1.03 rows=2 width=4)
                                 Sort Key: t_small.a
                                 ->  Seq Scan on t_small  (cost=0.00..1.02 rows=2 width=4)
                     ->  Index Scan using t_big_a_idx on t_big t_big_2  (cost=0.42..8.44 rows=1 width=8)
                           Index Cond: (a = t_small.a)
(18 rows)

Both sublinks are pushed down. The twophase strategy applies all transformations or none, so it cannot selectively skip the first sublink pushdown. The result is a higher total cost (113192.60 vs. 105692.60) compared to linear.

Use linear when transformations may have mixed effects across query blocks. Use twophase when you expect all applicable transformations to be beneficial.

Limit the number of iterations

polar_cbqt_iteration_limit caps how many states CBQT evaluates. With a limit of 1, CBQT may not reach the optimal state even if it exists:

SET polar_enable_cbqt TO on;
SET polar_cbqt_strategy TO twophase;
SET polar_cbqt_iteration_limit TO 1;
SET polar_cbqt_pushdown_sublink TO on;

EXPLAIN
SELECT * FROM (SELECT a, sum(b) b FROM t_big GROUP BY a) v WHERE a IN (SELECT a FROM t_big)
UNION ALL
SELECT * FROM (SELECT a, sum(b) b FROM t_big GROUP BY a) v WHERE a IN (SELECT a FROM t_small);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.85..113192.60 rows=1000002 width=12)
   ->  GroupAggregate  (cost=0.85..88174.56 rows=1000000 width=12)
         Group Key: t_big.a
         ->  Merge Semi Join  (cost=0.85..73174.56 rows=1000000 width=8)
               Merge Cond: (t_big.a = t_big_1.a)
               ->  Index Scan using t_big_a_idx on t_big  (cost=0.42..31910.13 rows=1000000 width=8)
               ->  Index Only Scan using t_big_a_idx on t_big t_big_1  (cost=0.42..26264.42 rows=1000000 width=4)
   ->  GroupAggregate  (cost=17.96..17.99 rows=2 width=12)
         Group Key: t_big_2.a
         ->  Sort  (cost=17.96..17.96 rows=2 width=8)
               Sort Key: t_big_2.a
               ->  Nested Loop  (cost=1.46..17.95 rows=2 width=8)
                     ->  Unique  (cost=1.03..1.04 rows=2 width=4)
                           ->  Sort  (cost=1.03..1.03 rows=2 width=4)
                                 Sort Key: t_small.a
                                 ->  Seq Scan on t_small  (cost=0.00..1.02 rows=2 width=4)
                     ->  Index Scan using t_big_a_idx on t_big t_big_2  (cost=0.42..8.44 rows=1 width=8)
                           Index Cond: (a = t_small.a)
(18 rows)

The output matches the twophase result from the previous example. With only one iteration, CBQT cannot evaluate enough states to find that pushing down only the second sublink is optimal. Increase polar_cbqt_iteration_limit if you have complex queries with many eligible transformations.

Related topics