This topic describes the PartitionedTableScan feature, providing information on its background, usage methods, and statistics on performance improvement.
Background information
PolarDB for PostgreSQL (Compatible with Oracle) doesn't limit the number of partitions for a table, and thus subpartitioning would lead to an exponential increase in the number of partitions.
For example, if a table is hash-partitioned into 100 partitions, and then each partition is hash-partitioned into 100 subpartitions, there would be 10,000 partitions in total for this partitioned table. If you perform a query on this table, the execution plan would look like this:
explain analyze select * from part_hash;
QUERY PLAN
-----------------------------------------------------------------------------
Append (cost=0.00..344500.00 rows=16300000 width=22)
-> Seq Scan on part_hash_sys0102 (cost=0.00..26.30 rows=1630 width=22)
-> Seq Scan on part_hash_sys0103 (cost=0.00..26.30 rows=1630 width=22)
-> Seq Scan on part_hash_sys0104 (cost=0.00..26.30 rows=1630 width=22)
...
...
...
-> Seq Scan on part_hash_sys10198 (cost=0.00..26.30 rows=1630 width=22)
-> Seq Scan on part_hash_sys10199 (cost=0.00..26.30 rows=1630 width=22)
-> Seq Scan on part_hash_sys10200 (cost=0.00..26.30 rows=1630 width=22)
Planning Time: 3183.644 ms
Execution Time: 633.779 ms
(10003 rows)
Total Memory: 216852KB
As is shown in the plan, the query would take a long time to complete, which is related to how the optimizer generates an execution plan for a partitioned table: the system generates an optimal plan for each partition and appends these plans into a single plan for the entire partitioned table. If the partitioned table only has a small number of partitions, the process would be fast and deliver satisfactory user experience. However, as the number of partitions grows, the process becomes slower, and you may find partitioned tables perform worse than non-partitioned tables.
For example, in the previous query, the table part_hash
has 10,000 partitions. Its planning time reached 3 seconds. In contrast, if the table is not partitioned, the planning time for this query would be around 0.1 millisecond only, which makes for a hundred-time difference. In addition to a long planning time, the query for the partitioned table also causes high memory usage, which could lead to an OOM issue.
This drawback of partitioned table is even worse when joins are included in a query, as is shown in the following query plan:
create table part_hash2 (a int, b int, c varchar(10))
PARTITION by HASH(a) SUBPARTITION by HASH (b) PARTITIONS 100 SUBPARTITIONS 100;
explain analyze select count(*) from part_hash a join part_hash2 b on a.a=b.b where b.c = '0001';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=48970442.90..48970442.91 rows=1 width=8) (actual time=6466.854..6859.935 rows=1 loops=1)
-> Gather (cost=48970442.68..48970442.89 rows=2 width=8) (actual time=397.780..6859.902 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=48969442.68..48969442.69 rows=1 width=8) (actual time=4.748..11.768 rows=1 loops=3)
-> Merge Join (cost=1403826.01..42177776.01 rows=2716666667 width=0) (actual time=4.736..11.756 rows=0 loops=3)
Merge Cond: (a.a = b.b)
-> Sort (cost=1093160.93..1110135.93 rows=6790000 width=4) (actual time=4.734..8.588 rows=0 loops=3)
Sort Key: a.a
Sort Method: quicksort Memory: 25kB
Worker 0: Sort Method: quicksort Memory: 25kB
Worker 1: Sort Method: quicksort Memory: 25kB
-> Parallel Append (cost=0.00..229832.35 rows=6790000 width=4) (actual time=4.665..8.518 rows=0 loops=3)
-> Parallel Seq Scan on part_hash_sys0102 a (cost=0.00..19.59 rows=959 width=4) (actual time=0.001..0.001 rows=0 loops=1)
-> Parallel Seq Scan on part_hash_sys0103 a_1 (cost=0.00..19.59 rows=959 width=4) (actual time=0.001..0.001 rows=0 loops=1)
-> Parallel Seq Scan on part_hash_sys0104 a_2 (cost=0.00..19.59 rows=959 width=4) (actual time=0.001..0.001 rows=0 loops=1)
...
-> Sort (cost=310665.08..310865.08 rows=80000 width=4) (never executed)
Sort Key: b.b
-> Append (cost=0.00..304150.00 rows=80000 width=4) (never executed)
-> Seq Scan on part_hash2_sys0102 b (cost=0.00..30.38 rows=8 width=4) (never executed)
Filter: ((c)::text = '0001'::text)
-> Seq Scan on part_hash2_sys0103 b_1 (cost=0.00..30.38 rows=8 width=4) (never executed)
Filter: ((c)::text = '0001'::text)
-> Seq Scan on part_hash2_sys0104 b_2 (cost=0.00..30.38 rows=8 width=4) (never executed)
Filter: ((c)::text = '0001'::text)
...
Planning Time: 221082.616 ms
Execution Time: 9500.148 ms
(30018 rows)
Total Memory: 679540KB
Therefore, in the case of a query that cannot limit the scanned data to specific partitions, partitioned tables may perform worse than non-partitioned tables. Although partition pruning can help pinpoint the partitions, it is not a cure-all solution, as a full-table scan may still be needed in some OLAP scenarios.
Solution
To solve the issue of prolonged queries on tables with numerous partitions, PolarDB for PostgreSQL (Compatible with Oracle) provides the PartitionedTableScan
operator. Compared with using the Append operator, it achieves higher efficiency, shorter planning time, and smaller memory usage.
If PartitionedTableScan
is used, the two SQL statements would be executed with much less planning time and memory consumption.
explain analyze select * from part_hash;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
PartitionedTableScan on part_hash (cost=0.00..1.00 rows=1 width=22) (actual time=134.348..134.352 rows=0 loops=1)(Iteration partition number 10000)
Scan Partitions: part_hash_sys0102, part_hash_sys0103, ...part_hash_sys10198, part_hash_sys10199, part_hash_sys10200
-> Seq Scan on part_hash (cost=0.00..1.00 rows=1 width=22)
Planning Time: 293.778 ms
Execution Time: 384.202 ms
(5 rows)
Total Memory: 40276KB
explain analyze select count(*) from part_hash a join part_hash2 b on a.a=b.b where b.c = '0001';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=2.02..2.03 rows=1 width=8) (actual time=152.322..152.326 rows=1 loops=1)
-> Nested Loop (cost=0.00..2.02 rows=1 width=0) (actual time=152.308..152.311 rows=0 loops=1)
Join Filter: (a.a = b.b)
-> PartitionedTableScan on part_hash a (cost=0.00..1.00 rows=1 width=4) (actual time=152.305..152.306 rows=0 loops=1)(Iteration partition number 10000)
Scan Partitions: part_hash_sys0102, part_hash_sys0103,, part_hash_sys10198, part_hash_sys10199, part_hash_sys10200
-> Seq Scan on part_hash a (cost=0.00..1.00 rows=1 width=4)
-> PartitionedTableScan on part_hash2 b (cost=0.00..1.00 rows=1 width=4) (never executed)
-> Seq Scan on part_hash2 b (cost=0.00..1.00 rows=1 width=4)
Filter: ((c)::text = '0001'::text)
Planning Time: 732.952 ms
Execution Time: 436.927 ms
(11 rows)
Total Memory: 68104KB
The following table lists the comparison.
Item |
|
|
Single Query Planning Time | 3183.644ms | 293.778ms |
Single Total Memory | 216852 KB | 40276 KB |
Join Query Planning Time | 221082.616ms | 732.952ms |
Join Total Memory | 679540 KB | 68104 KB |
Limitations
Both PolarDB for PostgreSQL (Compatible with Oracle) 1.0 and 2.0 clusters support
PartitionedTableScan
, but they must be of a revision version of V1.1.32 or later.Only
SELECT
commands can be executed usingPartitionedTableScan
. DML statements are not supported.PartitionedTableScan
cannot be used together with partition-wise joins. If the partition-wise join feature is enabled, execution plans will not include thePartitionedTableScan
operator.
If you want to enable PartitionedTable Scan
for a cluster whose revision version is earlier than V1.1.32, contact technical support.
Procedure
In this section, an example is used to illustrate how to use the PartitionedTableScan
feature.
The following partitioned table is created for demonstration.
CREATE TABLE prt1 (a int, b int, c varchar) PARTITION BY Hash(a) partitions 16;
Enable PartitionedTableScan
You can enable or disable the PartitionedTableScan
feature by configuring the polar_num_parts_for_partitionedscan
parameter.
Parameter | Valid values | Default value | Description |
polar_num_parts_for_partitionedscan | -1 to INT_MAX | 64 | The number of partitions beyond which the The parameter has two special values:
|
Example
SET polar_num_parts_for_partitionedscan to -1;
explain select * from prt1;
QUERY PLAN
-----------------------------------------------------------------
PartitionedTableScan on prt1 (cost=0.00..1.00 rows=1 width=40)
-> Seq Scan on prt1 (cost=0.00..1.00 rows=1 width=40)
(2 rows)
Use hints
You can use the hint PARTEDSCAN(table alias)
to dictate the use of PartitionedTableScan. Example:
EXPLAIN select /*+PARTEDSCAN(prt1) */ * from prt1;
QUERY PLAN
-----------------------------------------------------------------
PartitionedTableScan on prt1 (cost=0.00..1.00 rows=1 width=40)
-> Seq Scan on prt1 (cost=0.00..1.00 rows=1 width=40)
(2 rows)
PartitionedTableScan in parallel
PolarDB for PostgreSQL (Compatible with Oracle) provides the parallel append feature to improve the performance of queries on large volumes of data. Similar to parallel append
, PartitionedTableScan
can also be executed in parallel. The feature is named PartitionedTableScan Append
. It supports two parallel modes: inter-partition mode and hybrid mode.
PartitionedTableScan Append across partitions
In this mode, each worker handles a partition, and they work in parallel.
EXPLAIN (COSTS OFF) select /*+PARTEDSCAN(prt1) */ * from prt1;
QUERY PLAN
---------------------------------------------
Gather
Workers Planned: 4
-> Parallel PartitionedTableScan on prt1
-> Seq Scan on prt1
(4 rows)
In the preceding query plan, four workers are used in the execution, each of which works on a single partition. The Parallel PartitionedTableScan
operator in the query plan is the indicator of this mode.
Hybrid PartitionedTableScan Append
In hybrid mode, the system handles the execution in parallel both within and across partitions.
EXPLAIN (COSTS OFF) select /*+PARTEDSCAN(prt1) */ * from prt1;
QUERY PLAN
---------------------------------------------
Gather
Workers Planned: 8
-> Parallel PartitionedTableScan on prt1
-> Parallel Seq Scan on prt1
(4 rows)
In this example, eight workers are used. Again, the Parallel PartitionedTableScan
indicator is contained in this query plan.
Each of the modes have their cost model, and the optimizer will select the one that is less costly.
Partition pruning
PartitionedTableScan
supports partition pruning in all three stages. For more information about partition pruning, see Partition pruning.
Performance test
PartitionedTableScan
delivers higher efficiency. This performance test shows the performance improvement enabled by PartitionedTableScan
.
The following test results come from a development environment, and is provided only for reference. The actual results may vary based on different configurations and conditions. The tests are conducted in consistent environment configurations to test the performance improvement effect of PartitionedTableScan
. The number of partitions is the only variable.
The following SQL statements are used for testing:
explain select * from prt1 where b = 10;
explain select /*+PARTEDSCAN(prt1) */ * from prt1 where b = 10;
The Planning Time of a single SQL statement
Number of partitions | Append Planning Time | PartitionedTableScan Planning Time |
16 | 0.266ms | 0.067ms |
32 | 1.820ms | 0.258ms |
64 | 3.654ms | 0.402ms |
128 | 7.010ms | 0.664ms |
256 | 14.095ms | 1.247ms |
512 | 27.697ms | 2.328ms |
1024 | 73.176ms | 4.165ms |
Memory usage of a single SQL statement
Number of partitions | Append Memory | PartitionedTableScan Memory |
16 | 1,170 KB | 1,044 KB |
32 | 1,240 KB | 1,044 KB |
64 | 2,120 KB | 1,624 KB |
128 | 2,244 KB | 1,524 KB |
256 | 2,888 KB | 2,072 KB |
512 | 4,720 KB | 3,012 KB |
1024 | 8,236 KB | 5,280 KB |
Queries per Second (QPS)
pgbench -i --scale=10
pgbench -c 64 -j 64 -n -T60
Query:
explain select * from prt1 where b = 10;
explain select /*+PARTEDSCAN(prt1) */ * from prt1 where b = 10;
Number of partitions | Append QPS | PartitionedTableScan QPS |
16 | 25,318 | 93,950 |
32 | 10,906 | 61,879 |
64 | 5,281 | 30,839 |
128 | 2,195 | 16,684 |
256 | 920 | 8,372 |
512 | 92 | 3,708 |
1024 | 21 | 1,190 |
Conclusions
The preceding performance test results show that PartitionedTableScan
can significantly improve performance, especially when a large number of partitions are involved. If the tables that are involved in your business have a large number of partitions, and queries require a long time to be executed, we recommend that you use PartitionedTableScan
to improve the performance.