A global index is an index created on a non-partition key column of a partitioned table. A global index can also enforce a UNIQUE constraint.
Applicability
This feature is supported on PolarDB for PostgreSQL clusters that run PostgreSQL 14 with minor engine version 2.0.14.9.15.0 or later.
You can view the minor engine version number in the console or by running the SHOW polardb_version; statement. If the minor engine version does not meet the requirements, you can upgrade the minor engine version.
Background information
As business data grows, data partitioning becomes a key feature of enterprise-level databases and an important method for reducing data scale. A partitioned table is split into several independent child tables based on partition keys. The child tables are managed separately to improve manageability, overall performance, and load balancing.
Many users of PolarDB for PostgreSQL and use partitioned tables to manage their data. A typical use case is partitioning data by time:
Use time as the partition key for the partitioned table.
Create new subpartitions periodically, such as weekly or monthly, to store new data.
Archive old subpartitions regularly to reduce the operations and maintenance (O&M) costs of the partitioned table.
In the preceding scenario, time is typically used as the partition key, not as a primary key or unique ID. This causes two problems:
When you query data using a non-partition key, all subpartitions must be scanned because the system cannot determine which partition contains the data.
When you modify data using a non-partition key, data uniqueness cannot be guaranteed across the entire partitioned table.
To solve these problems, PolarDB for PostgreSQL and provide the global index feature. Unlike a local index, which is created on each subpartition by default, a single global index covers the data of the entire partitioned table, meaning one index corresponds to multiple subpartitions. This lets you enforce globally unique constraints and greatly improve query performance on non-partition keys.
Limits
You can still
ATTACHorDETACHsubpartitions from a partitioned table that has a global index.Specify the
GLOBALkeyword in the `CREATE INDEX` statement to create a global index. If you do not specify this keyword, a local index is created by default.You can create global indexes concurrently by specifying the
CONCURRENTLYkeyword.You cannot create global indexes on non-partitioned tables or on child tables that contain subpartitions.
Global indexes do not support expression indexes.
You cannot create global indexes on the partition key columns of a partitioned table.
Syntax
You can create a global index.
CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ] ON [ ONLY ] table_name [ USING method ]
( { column_name | ( expression ) } [ COLLATE collation ] [ opclass [ ( opclass_parameter = value [, ... ] ) ] ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
[ INCLUDE ( column_name [, ...] ) ]
[ WITH ( storage_parameter [= value] [, ... ] ) ]
[ GLOBAL/LOCAL ]
[ TABLESPACE tablespace_name ]
[ WHERE predicate ]Examples
Accelerate non-partition key queries
You can create a partitioned table with a time column as the partition key.
CREATE TABLE partition_range ( id INT, a INT, b INT, created_date TIMESTAMP WITHOUT TIME ZONE ) PARTITION BY RANGE (created_date); CREATE TABLE partition_range_part01 PARTITION OF partition_range FOR VALUES FROM (MINVALUE) TO ('2020-01-01 00:00:00'); CREATE TABLE partition_range_part02 PARTITION OF partition_range FOR VALUES FROM ('2020-01-01 00:00:00') TO ('2020-02-01 00:00:00'); CREATE TABLE partition_range_part03 PARTITION OF partition_range FOR VALUES FROM ('2020-02-01 00:00:00') TO ('2020-03-01 00:00:00');You can query the partitioned table based on a non-partition key condition.
EXPLAIN (COSTS OFF) SELECT * FROM partition_range WHERE id = 1;The result shows that all subpartitions are scanned. The partition pruning feature cannot be used.
QUERY PLAN ------------------------------------------------------------ Append -> Seq Scan on partition_range_part01 partition_range_1 Filter: (id = 1) -> Seq Scan on partition_range_part02 partition_range_2 Filter: (id = 1) -> Seq Scan on partition_range_part03 partition_range_3 Filter: (id = 1) (7 rows)You can create a local index on the partitioned table and run the query again.
CREATE INDEX partition_range_idx_local ON partition_range(id); EXPLAIN (COSTS OFF) SELECT * FROM partition_range WHERE id = 1;The result shows that the local indexes of all subpartitions are still scanned because local indexes are created on each subpartition.
QUERY PLAN -------------------------------------------------------------------------------------------------- Append -> Index Scan using partition_range_part01_id_idx on partition_range_part01 partition_range_1 Index Cond: (id = 1) -> Index Scan using partition_range_part02_id_idx on partition_range_part02 partition_range_2 Index Cond: (id = 1) -> Index Scan using partition_range_part03_id_idx on partition_range_part03 partition_range_3 Index Cond: (id = 1) (7 rows)You can use the
GLOBALkeyword to create a global index on the partitioned table and run the query again.CREATE INDEX partition_range_idx_global ON partition_range(id) GLOBAL; EXPLAIN (COSTS OFF) SELECT * FROM partition_range WHERE id = 1;The result shows that the database uses the global index to directly find the subpartition that contains the data.
QUERY PLAN ----------------------------------------------------------------------- Global Index Scan using partition_range_idx_global on partition_range Index Cond: (id = 1) (2 rows)
Unique constraint for non-partition keys
Use the same partitioned table from the preceding example. The partition key is created_date, but the column that requires a unique constraint is id.
CREATE UNIQUE INDEX partition_range_id_unique_idx ON partition_range(id);
ERROR: unique constraint on partitioned table must include all partitioning columns
DETAIL: UNIQUE constraint on table "partition_range" lacks column "created_date" which is part of the partition key.In native PostgreSQL, if you try to create a UNIQUE constraint on a non-partition key using a local index, an error occurs. The error message indicates that the partition key must be included in the index. PolarDB for PostgreSQL and enhance the global index feature. By default, a UNIQUE constraint on a non-partition key is converted to a global constraint that is based on a global index. This behavior is controlled by the polar_pk_in_non_partition_column_mode parameter. Its default value is global_index. If you change the value to none, the behavior is the same as that of native PostgreSQL.
However, this restriction does not apply when you add a UNIQUE constraint to a global index, as shown below.
CREATE UNIQUE INDEX partition_range_id_unique_idx ON partition_range(id) GLOBAL;Performance test
You can use pgbench to generate data with a scale factor of 80000 and create a partitioned table and a non-partitioned table.
Point query performance on non-partition keys
Category | TPS | |||||
Prepared Statement | Not used | Used | ||||
Concurrency | 1 | 32 | 64 | 1 | 32 | 64 |
Standard table | 27,732 | 494,433 | 430,848 | 53,935 | 985,880 | 886,882 |
Partitioned table + local index | 367 | 4,155 | 3,688 | 856 | 8,742 | 6,790 |
Partitioned table + global index | 19,006 | 308,128 | 262,941 | 45,090 | 820,924 | 731,557 |
TPC-B performance on non-partition keys
This includes point queries and Data Manipulation Language (DML) statements.
Category | TPS | |||||
Prepared Statement | Not used | Used | ||||
Concurrency | 1 | 32 | 64 | 1 | 32 | 64 |
Standard table | 1,115 | 51,025 | 60,409 | 4,822 | 90,312 | 100,802 |
Partitioned table + local index | 271 | 2,903 | 2,524 | 550 | 5,276 | 4,237 |
Partitioned table + global index | Not supported | 4,334 | 69,040 | 75,232 | ||
Conclusion
Global indexes can improve the performance of point queries and DML statements on partitioned tables by an order of magnitude.