The Elastic Parallel Query (ePQ) feature of PolarDB for PostgreSQL allows you to query partitioned tables.
Background information
As the amount of data continues to grow, the size of a table increases. To facilitate management and improve query performance, you can use partitioned tables to store data. That way, a table is partitioned into multiple child tables. In addition, each child table can be further partitioned to form a multi-level partition hierarchy.
PolarDB for PostgreSQL provides the ePQ feature to support multi-node parallel queries. This allows you to use multiple compute nodes in a cluster to improve the performance of read-only queries. ePQ supports efficient multi-node parallel queries of not only regular tables but also partitioned tables.
ePQ supports the following basic features for partitioned tables:
Scan of partitioned tables whose partitioning policy is range, list, or hash
Index scan of partitioned tables
Queries of partitioned tables that are joined
In addition, ePQ supports the following advanced features for partitioned tables:
Partition pruning
Partition-wise joins
Parallel queries of multi-level partitioned tables
ePQ does not support parallel queries of partitioned tables with multi-column partition keys.
Prerequisites
Your PolarDB for PostgreSQL cluster runs the following engine:
PostgreSQL 11 with a revision version of 1.1.17 or later
PostgreSQL 14 with a revision version of 14.8.11.0 or later
You can execute one of the following statements to query the revision version of your PolarDB for PostgreSQL cluster:
PostgreSQL 11
show polar_version;PostgreSQL 14
select version();
Procedure
Parallel queries of partitioned tables
Create a partitioned table whose partitioning policy is range and create three partitions.
CREATE TABLE t1 (id INT) PARTITION BY RANGE(id); CREATE TABLE t1_p1 PARTITION OF t1 FOR VALUES FROM (0) TO (200); CREATE TABLE t1_p2 PARTITION OF t1 FOR VALUES FROM (200) TO (400); CREATE TABLE t1_p3 PARTITION OF t1 FOR VALUES FROM (400) TO (600);Enable the ePQ feature and the partitioned table scan feature of ePQ.
SET polar_enable_px TO ON; SET polar_px_enable_partition TO ON;Query the execution plan of the statement that performs a full table scan on the partitioned table.
EXPLAIN (COSTS OFF) SELECT * FROM t1; QUERY PLAN ------------------------------------------- PX Coordinator 6:1 (slice1; segments: 6) -> Append -> Partial Seq Scan on t1_p1 -> Partial Seq Scan on t1_p2 -> Partial Seq Scan on t1_p3 Optimizer: PolarDB PX Optimizer (6 rows)As shown in the preceding execution plan, ePQ starts a group of processes to scan each child table of the partitioned table in parallel. Each scan process scans a portion of the data of each child table (
Partial Seq Scan) by using theAppendoperator. A Motion operator (PX Coordinator) aggregates the scan results of all processes to the process that initiated the query and returns the results.
Static partition pruning
If the filter condition of a query uses a partition key, the ePQ optimizer can prune the partitioned table to be scanned based on the filter condition to prevent unnecessary partitions from being scanned. This reduces system resource usage and improves query performance. In this example, the t1 table created in the previous section is used. Execute the following statement to query the execution plan of a sample query with a filter condition:
EXPLAIN (COSTS OFF) SELECT * FROM t1 WHERE id < 100;
QUERY PLAN
-------------------------------------------
PX Coordinator 6:1 (slice1; segments: 6)
-> Append
-> Partial Seq Scan on t1_p1
Filter: (id < 100)
Optimizer: PolarDB PX Optimizer
(5 rows)The filter condition id < 100 of the query uses the partition key. Therefore, the ePQ optimizer removes the partitions t1_p2 and t1_p3 that do not meet the filter condition from the execution plan based on the partition boundary, and retains only the partition t1_p1 that meets the filter condition.
Partition-wise joins
During the join of partitioned tables, if the partitioning policy and boundaries are the same, and the join condition is based on a partition key, the ePQ optimizer can generate an execution plan in which the partitioned tables are joined in a partition-wise manner. This prevents a Cartesian product join of the partitioned tables, reduces system resource usage, and improves query performance.
In the following example, two partitioned tables whose partitioning policy is range are joined.
Create partitioned tables
t2andt3whose partitioning policy and boundaries are the same.CREATE TABLE t2 (id INT) PARTITION BY RANGE(id); CREATE TABLE t2_p1 PARTITION OF t2 FOR VALUES FROM (0) TO (200); CREATE TABLE t2_p2 PARTITION OF t2 FOR VALUES FROM (200) TO (400); CREATE TABLE t2_p3 PARTITION OF t2 FOR VALUES FROM (400) TO (600); CREATE TABLE t3 (id INT) PARTITION BY RANGE(id); CREATE TABLE t3_p1 PARTITION OF t3 FOR VALUES FROM (0) TO (200); CREATE TABLE t3_p2 PARTITION OF t3 FOR VALUES FROM (200) TO (400); CREATE TABLE t3_p3 PARTITION OF t3 FOR VALUES FROM (400) TO (600);Enable the ePQ feature and the partitioned table scan feature of ePQ.
SET polar_enable_px TO ON; SET polar_px_enable_partition TO ON;Disable partition-wise joins by setting the polar_px_enable_partitionwise_join parameter to OFF. Query the execution plan of a query in which an equi join is performed on the two partitioned tables based on the partition key.
SET polar_px_enable_partitionwise_join TO OFF; EXPLAIN (COSTS OFF) SELECT * FROM t2 JOIN t3 ON t2.id = t3.id; QUERY PLAN ----------------------------------------------------------- PX Coordinator 6:1 (slice1; segments: 6) -> Hash Join Hash Cond: (t2_p1.id = t3_p1.id) -> Append -> Partial Seq Scan on t2_p1 -> Partial Seq Scan on t2_p2 -> Partial Seq Scan on t2_p3 -> Hash -> PX Broadcast 6:6 (slice2; segments: 6) -> Append -> Partial Seq Scan on t3_p1 -> Partial Seq Scan on t3_p2 -> Partial Seq Scan on t3_p3 Optimizer: PolarDB PX Optimizer (14 rows)As shown in the preceding execution plan, each of the six processes that execute
slice1scans a portion of the data in each partition of the partitioned tablet2in sequence by using theAppendoperator. The Motion operatorPX Broadcastis used to receive the full data of the partitionedt3broadcast by the six processes that executeslice2. After the localHash Joinis complete, the Motion operatorPX Coordinatoraggregates and returns the results. In this case, a join is performed between each row of data int2and each row of data int3.Enable partition-wise joins by setting the polar_px_enable_partitionwise_join parameter to ON. Query the execution plan again.
SET polar_px_enable_partitionwise_join TO ON; EXPLAIN (COSTS OFF) SELECT * FROM t2 JOIN t3 ON t2.id = t3.id; QUERY PLAN ------------------------------------------------ PX Coordinator 6:1 (slice1; segments: 6) -> Append -> Hash Join Hash Cond: (t2_p1.id = t3_p1.id) -> Partial Seq Scan on t2_p1 -> Hash -> Full Seq Scan on t3_p1 -> Hash Join Hash Cond: (t2_p2.id = t3_p2.id) -> Partial Seq Scan on t2_p2 -> Hash -> Full Seq Scan on t3_p2 -> Hash Join Hash Cond: (t2_p3.id = t3_p3.id) -> Partial Seq Scan on t2_p3 -> Hash -> Full Seq Scan on t3_p3 Optimizer: PolarDB PX Optimizer (18 rows)As shown in the preceding execution plan, each of the six processes that execute
slice1scans a portion of data in each partition of the partitioned tablet2, as well as all data in the corresponding partition of the partitioned tablet3in sequence by using theAppendoperator. AHash Joinis performed on the data. Then, the Motion operatorPX Coordinatoraggregates and returns the results.During the execution, a join is performed only between each of the partitions
t2_p1,t2_p2, andt2_p3of the partitioned tablet2and each of the corresponding partitionst3_p1,t3_p2, andt3_p3of the partitioned tablet3. No join is performed on irrelevant partitions. This improves efficiency.
Parallel queries of multi-level partitioned tables
In a multi-level partitioned table, the partition key at each level may vary. For example, a table can be partitioned by time, and then further partitioned by region. If the filter condition of a query uses the partition key at each level of a multi-level partitioned table, the ePQ optimizer supports static partition pruning to remove partitions that do not need to be scanned.
As shown in the following figure, the filter condition WHERE date = '202201' AND region = 'beijing' uses both the level-1 partition key date and the level-2 partition key region. In this case, the ePQ optimizer can remove all irrelevant partitions, and generates an execution plan that contains only the partitions that meet the condition. This way, only the required partitions are scanned by the executor.
Example
Create a multi-level partitioned table.
CREATE TABLE r1 (a INT, b TIMESTAMP) PARTITION BY RANGE (b); CREATE TABLE r1_p1 PARTITION OF r1 FOR VALUES FROM ('2000-01-01') TO ('2010-01-01') PARTITION BY RANGE (a); CREATE TABLE r1_p1_p1 PARTITION OF r1_p1 FOR VALUES FROM (1) TO (1000000); CREATE TABLE r1_p1_p2 PARTITION OF r1_p1 FOR VALUES FROM (1000000) TO (2000000); CREATE TABLE r1_p2 PARTITION OF r1 FOR VALUES FROM ('2010-01-01') TO ('2020-01-01') PARTITION BY RANGE (a); CREATE TABLE r1_p2_p1 PARTITION OF r1_p2 FOR VALUES FROM (1) TO (1000000); CREATE TABLE r1_p2_p2 PARTITION OF r1_p2 FOR VALUES FROM (1000000) TO (2000000);Enable the ePQ feature and the partitioned table scan feature of ePQ.
SET polar_enable_px TO ON; SET polar_px_enable_partition TO ON;Disable the multi-level partition scan feature of ePQ by setting the polar_px_optimizer_multilevel_partitioning parameter to OFF. Query the execution plan of an SQL statement that uses both partition keys in a filter condition. The execution plan based on static partition pruning of the built-in optimizer of PostgreSQL is generated.
SET polar_px_optimizer_multilevel_partitioning TO OFF; EXPLAIN (COSTS OFF) SELECT * FROM r1 WHERE a < 1000000 AND b < '2009-01-01 00:00:00'; QUERY PLAN ---------------------------------------------------------------------------------------- Seq Scan on r1_p1_p1 r1 Filter: ((a < 1000000) AND (b < '2009-01-01 00:00:00'::timestamp without time zone)) (2 rows)Enable the multi-level partition scan feature of ePQ by setting the polar_px_optimizer_multilevel_partitioning parameter to ON. Query the execution plan again.
SET polar_px_optimizer_multilevel_partitioning TO ON; EXPLAIN (COSTS OFF) SELECT * FROM r1 WHERE a < 1000000 AND b < '2009-01-01 00:00:00'; QUERY PLAN ---------------------------------------------------------------------------------------------------- PX Coordinator 6:1 (slice1; segments: 6) -> Append -> Partial Seq Scan on r1_p1_p1 Filter: ((a < 1000000) AND (b < '2009-01-01 00:00:00'::timestamp without time zone)) Optimizer: PolarDB PX Optimizer (5 rows)As shown in the preceding execution plan, the ePQ optimizer performs static partition pruning on the multi-level partitioned table. The six processes that execute
slice1need to only perform parallel scan (Partial Seq Scan) on ther1_p1_p1partition that meets the filter condition, aggregate the scanned data by using the Motion operatorPX Coordinator, and then return the data.