By digoal
PostgreSQL 12 introduces several key enhancements as stated in the official release notes. In particular, we will discuss the improvements in the B-tree index based on sorting duplicate key values using CTID.
Improve performance and space utilization of b-tree indexes with many duplicates (Peter Geoghegan, Heikki Linnakangas). Previously, duplicate index entries were stored unordered within their duplicate groups. This caused overhead during index inserts, wasted space due to excessive page splits, and reduced VACUUM's ability to recycle entire pages. Duplicate index entries are now sorted in heap-storage order.
Indexes pg_upgraded from previous releases will not have these benefits.
Improvements of PG 12 b-tree index storage structure:
1) Duplicate key values are sorted and stored by CTID to reduce the index page splitting. The key values were originally stored in disorder by duplicate groups.
2) During vacuum operation, the table is scanned first. Then, the garbage tuple ctid is stored. After the vacuum memory is full, the indexes are scanned to collect the garbage index tuple. Therefore, when the key values are stored by CTID in order, the vacuum operation generates fewer Write-ahead Logging (WAL) files (fewer index pages involved) if all the duplicate keys are recycled.
postgres=# create table test_bt (id int, c1 int);
CREATE TABLE
postgres=# create index idx_test_bt_1 on test_bt (c1);
CREATE INDEX
postgres=# insert into test_bt select generate_series(1,10000000), random()*10000;
INSERT 0 10000000
Time: 44992.592 ms (00:44.993)
postgres=# \di+ idx_test_bt_1 ;
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+---------------+-------+----------+---------+--------+-------------
public | idx_test_bt_1 | index | postgres | test_bt | 281 MB |
(1 row)
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test_bt where c1=2;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_test_bt_1 on public.test_bt (cost=0.38..849.01 rows=50000 width=8) (actual time=0.034..2.011 rows=1062 loops=1)
Output: id, c1
Index Cond: (test_bt.c1 = 2)
Buffers: shared hit=1059
Planning Time: 0.132 ms
Execution Time: 2.083 ms
(6 rows)
postgres=# create table test_bt (id int, c1 int);
CREATE TABLE
postgres=# create index idx_test_bt_1 on test_bt (c1);
CREATE INDEX
postgres=# insert into test_bt select generate_series(1,10000000), random()*10000;
INSERT 0 10000000
Time: 36449.404 ms (00:36.449)
postgres=# \di+ idx_test_bt_1
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+---------------+-------+----------+---------+--------+-------------
public | idx_test_bt_1 | index | postgres | test_bt | 235 MB |
(1 row)
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test_bt where c1=2;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_test_bt_1 on public.test_bt (cost=0.43..39395.43 rows=50000 width=8) (actual time=0.033..1.141 rows=1002 loops=1)
Output: id, c1
Index Cond: (test_bt.c1 = 2)
Buffers: shared hit=998
Planning Time: 0.074 ms
Execution Time: 1.226 ms
(6 rows)
PostGIS: How to Calculate Distance between Coordinates in Spherical and Projected Systems
PostgreSQL gzip Plug-in Function Interface: Compress and Decompress Text and Bytea Files
digoal - May 18, 2021
digoal - March 25, 2020
digoal - February 3, 2020
digoal - December 11, 2019
zhuodao - July 30, 2020
Alibaba Clouder - December 11, 2017
Alibaba Cloud PolarDB for PostgreSQL is an in-house relational database service 100% compatible with PostgreSQL and highly compatible with the Oracle syntax.
Learn MoreAn online MPP warehousing service based on the Greenplum Database open source program
Learn MoreAn on-demand database hosting service for PostgreSQL with automated monitoring, backup and disaster recovery capabilities
Learn MoreLeverage cloud-native database solutions dedicated for FinTech.
Learn MoreMore Posts by digoal