All Products
Search
Document Center

AnalyticDB for PostgreSQL:Partitioned tables

Last Updated:Aug 01, 2023

If a table is partitioned, the system scans only the partitions that meet specified query conditions, instead of scanning the full table. This improves query performance.

Use partitioned tables

AnalyticDB for PostgreSQL supports the range, list, and multi-level partitioned tables. The following figure shows a multi-level partitioned table that has date-based range partitions and region-based list partitions.

分区表使用

Scenarios

Take note of the following items when you decide whether to partition a table:

  • Data volumes: We recommend that you use partitioning on tables that contain large volumes of data, such as a fact table that has 10 million data rows. The standard varies based on your experience and query performance.

  • Partition columns: If a table has large volumes of data and an appropriate partition column such as day or month, you can use partitioning.

  • Data lifecycle: Typically, data must be managed in data warehouses based on data lifecycle. You can use partitioned tables to facilitate data management such as the deletion of old data.

  • Query statements: When a table is partitioned, its query performance can be improved only when its partition column is contained in query statements. If query statements that involve no partition columns are executed on the partitioned table, it may take longer for the queries to complete because all partitions are scanned.

Create a range partitioned table

You can use AnalyticDB for PostgreSQL to generate partitions by assigning a START value, an END value, and an EVERY clause that defines the partition increment value. By default, START values are inclusive and END values are exclusive. Sample statement:

CREATE TABLE sales (id int, date date, amt decimal(10,2))
DISTRIBUTED BY (id)
PARTITION BY RANGE (date)
( START (date '2016-01-01') INCLUSIVE
   END (date '2017-01-01') EXCLUSIVE
   EVERY (INTERVAL '1 day') );

You can also create a range partitioned table that uses a column of a numeric data type as the partition key. Sample statement:

CREATE TABLE rank (id int, rank int, year int, gender char(1), count int)
DISTRIBUTED BY (id)
PARTITION BY RANGE (year)
( START (2006) END (2016) EVERY (1), 
  DEFAULT PARTITION extra ); 

Create a list partitioned table

A list partitioned table can use a data type column that allows equality comparisons as the partition key. For list partitions, you must declare a partition specification for each partition (list value) that you want to create. Sample statement:

CREATE TABLE rank (id int, rank int, year int, gender 
char(1), count int ) 
DISTRIBUTED BY (id)
PARTITION BY LIST (gender)
( PARTITION girls VALUES ('F'), 
  PARTITION boys VALUES ('M'), 
  DEFAULT PARTITION other );

Create a multi-level partitioned table

You can create a table that has multi-level partitions. The following sample statement shows how to create a three-level partitioning table. Data in range partitions is partitioned by month, and data in list partitions is partitioned by region.

CREATE TABLE sales
  (id int, year int, month int, day int, region text)
DISTRIBUTED BY (id)
PARTITION BY RANGE (month)
  SUBPARTITION BY LIST (region)
    SUBPARTITION TEMPLATE (
    SUBPARTITION usa VALUES ('usa'),
    SUBPARTITION europe VALUES ('europe'),
    SUBPARTITION asia VALUES ('asia'),
    DEFAULT SUBPARTITION other_regions)
(START (1) END (13) EVERY (1), 
DEFAULT PARTITION other_months );

Optimize partitioned table queries

  • Partition granularity

    For a table partitioned by time, the granularity can be a day, week, or month. A finer granularity produces a smaller amount of data in each partition but a larger number of partitions. The number of partitions is not measured by an absolute standard. Typically, the number of partitions is considered quite large at around 200. A large number of partitions have a significant impact on database performance. For example, the query optimizer may take a longer time to generate execution plans, and maintenance operations such as VACUUM may slow down.

    In a multi-level partitioned table, the number of partitioning files may increase sharply. For example, if a table is partitioned by month and city in a scenario that contains 24 months and 100 cities, the total number of table partitions is 2,400. If the table is a column-oriented table that stores each column in a physical table and the table contains 100 columns, the system must manage more than 100,000 files for the table. You must estimate the number of partitions that are needed when you determine the partitioning method.

  • Partition pruning

    AnalyticDB for PostgreSQL supports partition pruning for partitioned tables. For more information, see Partition pruning.

Maintain partitioned tables

In partitioned tables, partitions can be managed by using various SQL statements. Examples:

  • Create a partition

    If a default partition exists, you cannot create partitions. However, you can split the default partition.

    ALTER TABLE test_partition_range ADD partition p2 start ('2017-02-01') end ('2017-02-28');
  • Delete a partition

    ALTER TABLE test_partition_range DROP partition p2;
  • Rename a partition

    ALTER TABLE test_partition_range RENAME PARTITION p2 TO Feb17;
  • Clear a partition

    ALTER TABLE test_range_partition TRUNCATE PARTITION p1;
  • Exchange a partition

    ALTER TABLE test_range_partition EXCHANGE PARTITION p2 WITH  TABLE {cos_table_name} ;
  • Split a partition

    -- Split the p2 partition into two partitions on February 20, 2017.
    ALTER TABLE test_partition_range SPLIT partition p2 at ('2017-02-20') into (partition p2, partition p3);