全域索引(Global Index)是分區表上的一種索引技術,可以建立在分區表的非分區鍵上,也支援提供唯一約束。
背景資訊
分區表的本地索引(Local Index),指每個分區都以一個Local Index分區,Local Index的分區規則和表的分區規則保持一致,因此Local Index只有包含分區鍵時才能作為唯一索引。
Global Index與Local Index不同,Global Index是分區表父表上的單個索引,該索引映射到許多基礎資料表分區。 父表本身沒有單個統一的基礎儲存,因此,它必須從物理分布式表中檢索滿足索引約束的資料。Global Index將資料存放區在一個位置,因此可以一次訪問跨多個分區的資料,而不是分別查詢每個分區。
Global Index是建立在分區表父表上的一個Btree Index,分區上則沒有。它支援建立在分區表的非分區鍵上,也支援提供唯一約束。
全域分區索引(Global Partitioned Index)
Global Partitioned Index(GPI)旨在分區表上建立一個全域的分區索引。GPI常用於資料量較大的分區表,其特點是資料量大,構建索引更複雜。相比於全域索引,GPI有自己的分區策略,和分區表的分區並不一致。傳統的本地分區索引與子表一一對應,相當於子表的本地索引,而GPI既有本地索引的分區特性,又有全域索引的全域特性。
建立Global Index
文法
CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ] ON [ ONLY ] table_name [ USING method ]
( { column_name | ( expression ) } [ COLLATE collation ] [ opclass [ ( opclass_parameter = value [, ... ] ) ] ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
[ INCLUDE ( column_name [, ...] ) ]
[ WITH ( storage_parameter [= value] [, ... ] ) ]
[ GLOBAL/LOCAL/global_partitioned_index ]
[ TABLESPACE tablespace_name ]
[ WHERE predicate ]
global_partitioned_index:
GLOBAL PARTITION BY {RANGE (column_list) (index_partitioning_clause) |
HASH (column_list) (hash_partitions_by_quantity)}在分區表上建立一個Global Index。
CREATE unique index m_city_id_idx_global ON measurement(city_id) global;在分區表上建立一個Global Partitioned Index。
CREATE UNIQUE INDEX m_peaktemp_idx_global ON measurement(peaktemp) global
PARTITION BY range (peaktemp)
(
PARTITION m_peaktemp_idx_global_p1 values less than(1000),
PARTITION m_peaktemp_idx_global_p2 values less than(5000),
PARTITION m_peaktemp_idx_global_p3 values less than(MAXVALUE)
);說明
GLOBAL/LOCAL參數指定為GLOBAL即建立Global Index。如果不指定建立
GLOBAL/LOCAL參數,則預設建立Local Index。Global Index的CREATE文法支援使用CONCURRENTLY模式建立。
非分區表,包括分區表的子表上不支援建立Global Index。
Global Index不支援運算式索引。
Global Partitioned Index支援將Global Index進行range或hash分區,分區的文法和partition table類似。
Global Partitioned Index不支援按照List分區。
Global Partitioned Index只支援在分區表主表上建立,支援建立在分區鍵的列上。
Global Partitioned Index支援全域唯一約束。
Global Partitioned Index分區列必須指定索引列的左首碼。
Global Partitioned Range Index支援有序掃描。
Global Partitioned Hash Index不支援有序掃描。
Global Partitioned Index支援索引剪枝最佳化,即最佳化器剪枝和執行器剪枝。
Global Index Scan
Global Index和普通的Btree Index一樣,當使用索引鍵進行查詢時,Index Scan將會提升查詢的效能。
Global Index支援以下幾種Index Scan:
Global Index Scan
Global Partitioned Index Scan
Global Index Only Scan
Global Partitioned Index Only Scan
Global Index Bitmap Scan
Global Partitioned Index Bitmap Scan
Global Index Parallel Scan
Global Only Index Parallel Scan
樣本
Global Index Scan
explain select * from measurement where city_id = 5; QUERY PLAN ------------------------------------------------------------------------------------------------ Global Index Scan using m_city_id_idx_global on measurement (cost=0.12..8.14 rows=1 width=20) Index Cond: (city_id = 5) (2 rows)Global Partitioned Index Scan
explain select * from measurement where peaktemp = 5; QUERY PLAN ------------------------------------------------------------------------------------------------------------------ Global Partitioned Index Scan(1/3) using m_peaktemp_idx_global on measurement (cost=0.50..8.52 rows=1 width=20) Index Cond: (peaktemp = 5) (2 rows)Global Index Only Scan
explain select city_id from measurement where city_id = 5; QUERY PLAN ---------------------------------------------------------------------------------------------------- Global Index Only Scan using m_city_id_idx_global on measurement (cost=0.12..8.14 rows=1 width=4) Index Cond: (city_id = 5) (2 rows)Global Partitioned Index Only Scan
explain select peaktemp from measurement where peaktemp = 5; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Global Partitioned Index Only Scan(1/3) using m_peaktemp_idx_global on measurement (cost=0.12..8.14 rows=1 width=4) Index Cond: (peaktemp = 5) (2 rows)Global Index Bitmap Scan
explain select * from measurement where city_id > 5 and city_id < 10000; QUERY PLAN ------------------------------------------------------------------------------------------------ Append (cost=1.03..14.47 rows=32 width=20) -> Bitmap Heap Scan on measurement_y2023q1 (cost=1.03..3.58 rows=8 width=20) Recheck Cond: ((city_id > 5) AND (city_id < 10000)) Filter: ((city_id > 5) AND (city_id < 10000)) -> Bitmap Global Index Scan on m_city_id_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((city_id > 5) AND (city_id < 10000)) -> Bitmap Heap Scan on measurement_y2023q2 (cost=1.03..3.58 rows=8 width=20) Recheck Cond: ((city_id > 5) AND (city_id < 10000)) Filter: ((city_id > 5) AND (city_id < 10000)) -> Bitmap Global Index Scan on m_city_id_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((city_id > 5) AND (city_id < 10000)) -> Bitmap Heap Scan on measurement_y2023q3 (cost=1.03..3.58 rows=8 width=20) Recheck Cond: ((city_id > 5) AND (city_id < 10000)) Filter: ((city_id > 5) AND (city_id < 10000)) -> Bitmap Global Index Scan on m_city_id_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((city_id > 5) AND (city_id < 10000)) -> Bitmap Heap Scan on measurement_y2023q4 (cost=1.03..3.58 rows=8 width=20) Recheck Cond: ((city_id > 5) AND (city_id < 10000)) Filter: ((city_id > 5) AND (city_id < 10000)) -> Bitmap Global Index Scan on m_city_id_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((city_id > 5) AND (city_id < 10000)) (21 rows)Global Partitioned Index Bitmap Scan
explain select peaktemp from measurement where peaktemp > 5 and peaktemp< 1000; QUERY PLAN ------------------------------------------------------------------------------------------------------------------ Append (cost=1.03..14.47 rows=32 width=4) -> Bitmap Heap Scan on measurement_y2023q1 (cost=1.03..3.58 rows=8 width=4) Recheck Cond: ((peaktemp > 5) AND (peaktemp < 1000)) Filter: ((peaktemp > 5) AND (peaktemp < 1000)) -> Bitmap Global Partitioned Index Scan(1/3) on m_peaktemp_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((peaktemp > 5) AND (peaktemp < 1000)) -> Bitmap Heap Scan on measurement_y2023q2 (cost=1.03..3.58 rows=8 width=4) Recheck Cond: ((peaktemp > 5) AND (peaktemp < 1000)) Filter: ((peaktemp > 5) AND (peaktemp < 1000)) -> Bitmap Global Partitioned Index Scan(1/3) on m_peaktemp_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((peaktemp > 5) AND (peaktemp < 1000)) -> Bitmap Heap Scan on measurement_y2023q3 (cost=1.03..3.58 rows=8 width=4) Recheck Cond: ((peaktemp > 5) AND (peaktemp < 1000)) Filter: ((peaktemp > 5) AND (peaktemp < 1000)) -> Bitmap Global Partitioned Index Scan(1/3) on m_peaktemp_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((peaktemp > 5) AND (peaktemp < 1000)) -> Bitmap Heap Scan on measurement_y2023q4 (cost=1.03..3.58 rows=8 width=4) Recheck Cond: ((peaktemp > 5) AND (peaktemp < 1000)) Filter: ((peaktemp > 5) AND (peaktemp < 1000)) -> Bitmap Global Partitioned Index Scan(1/3) on m_peaktemp_idx_global (cost=0.00..4.13 rows=1 width=0) Index Cond: ((peaktemp > 5) AND (peaktemp < 1000)) (21 rows)
Global Index作為主鍵/外鍵引用
PolarDB PostgreSQL版(相容Oracle)支援分區表中的任意一列作為主鍵或者外鍵引用,該優勢依賴於Global Index。無論主鍵還是外鍵引用都需要有唯一索引,而Local Index作為唯一索引的前提是它的索引鍵必須包括分區鍵。如果沒有Global Index,主鍵必須包含分區鍵,外鍵引用也必須包括分區鍵。但Global Index沒有該限制,它支援在任意一列上建立主鍵或者作為外鍵引用。
您可以在建立分區表時指定主鍵,PolarDB PostgreSQL版(相容Oracle)將根據您指定的主鍵,智能的選擇使用Local Index還是Global Index。簡單來說,當您指定主鍵包含分區鍵時,PolarDB PostgreSQL版(相容Oracle)會使用Local Index為您建立唯一索引,否則將使用Global Index建立唯一索引。
樣本
分區表建立主鍵。
-- local index CREATE TABLE pk_rel(a int primary key, b int, c int, d int) PARTITION BY RANGE (a); postgres=# \d pk_rel Table "public.pk_rel" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- a | integer | | not null | b | integer | | | c | integer | | | d | integer | | | Partition key: RANGE (a) Indexes: "pk_rel_pkey" PRIMARY KEY, btree (a) Number of partitions: 0 --- global index CREATE TABLE pk_rel(a int, b int primary key, c int, d int) PARTITION BY RANGE (a); postgres=# \d pk_rel Table "public.pk_rel" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- a | integer | | | b | integer | | not null | c | integer | | | d | integer | | | Partition key: RANGE (a) Indexes: "pk_rel_pkey" PRIMARY KEY, btree (b) GLOBAL Number of partitions: 0說明您也可以使用語句
alter table xxx add primary key using index xxx顯式的指定某種索引作為主鍵。PolarDB PostgreSQL版(相容Oracle)可以使用分區表的任意列作為外鍵引用,像一個普通表一樣,前提是需要為這些列建立唯一索引或者主鍵。同樣的,如果這些列是分區鍵,則使用Local Index,否則使用Global Index。
分區表引用外鍵。
CREATE TABLE fk_rel(a int, b int REFERENCES pk_rel(b), c int, d int) PARTITION BY RANGE (a); postgres=# \d fk_rel Table "public.fk_rel" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- a | integer | | | b | integer | | | c | integer | | | d | integer | | | Partition key: RANGE (a) Foreign-key constraints: "fk_rel_b_fkey" FOREIGN KEY (b) REFERENCES pk_rel(b) Number of partitions: 0
效能測試
使用pgbench工具產生scale為80000的資料,分別建立分區表與非分區表。
TPS結果不能作為絕對標準,根據不同實體環境的不同而有差異,這裡的資料用於對比Global Index和Local Index的效能。
非分區鍵上的點查效能
類別 | TPS | |||||
Prepared Statement | 不使用 | 使用 | ||||
並發數 | 1 | 32 | 64 | 1 | 32 | 64 |
普通表 | 27,732 | 494,433 | 430,848 | 53,935 | 985,880 | 886,882 |
分區表+Local Index | 367 | 4,155 | 3,688 | 856 | 8,742 | 6,790 |
分區表+Global Index | 19,006 | 308,128 | 262,941 | 45,090 | 820,924 | 731,557 |
非分區鍵上的TPC-B效能
包含了點查和DML。
類別 | TPS | |||||
Prepared Statement | 不使用 | 使用 | ||||
並發數 | 1 | 32 | 64 | 1 | 32 | 64 |
普通表 | 1,115 | 51,025 | 60,409 | 4,822 | 90,312 | 100,802 |
分區表+Local Index | 271 | 2,903 | 2,524 | 550 | 5,276 | 4,237 |
分區表+Global Index | 3,453 | 36,320 | 39,941 | 4,334 | 69,040 | 75,232 |
結論
Global Index對分區表的點查和DML能夠帶來一個數量級的效能提升。