This topic explains the Generalized Inverted Index (GIN) in PolarDB.
GIN is an index that stores pairs of keys and posting lists. A key refers to a key value and a posting list is a set of positions where the key occurs. For example, the pair 'hello', '14:2 23:4' indicates that the key hello occurs at the positions of 14:2 and 23:4. These positions are tuple identifiers (TIDs), which are also known as row IDs. A row ID consists of a data block ID and an item point. The block ID is 32 bits in length, and the item point is 16 bits in length. GIN indexes allow you to find tuples that contain specified keywords. Therefore, GIN indexes are suitable for searching for multi-value elements.
Operators
Operator | Example |
<@ | select * from test where id <@ array[1,2]; |
@> | select * from test where id @> array[1,2]; |
= | select * from test where id = array[1,2]; |
&& | select * from test where id && array[1,2]; |
You can also install the btree_gin extension to support B-tree operators.
Index structure

Component structure | Description |
Entry | A GIN index element. |
Entry tree | A B-tree that is created based on the entry. |
Posting list | A list of physical locations of entries. |
Posting tree | A B-tree that is created based on the posting list. |
Pending list | A list for temporary storage of index tuples. The pending list stores the fast updates. |
Scenarios
Query multi-value elements
Create a table and insert data.
CREATE TABLE t_gin1 (id int, arr int[]); do language plpgsql $$ declare begin for i in 1..10000 loop insert into t_gin1 select i, array(select random()*1000 from generate_series(1,10)); end loop; end; $$;Query data.
select * from t_gin1 limit 3;id | arr 1 | {819,751,515,192,909,255,683,39,258,423} 2 | {844,43,834,710,447,344,480,749,814,18} 3 | {339,69,28,428,859,510,560,465,906,788}Create an index.
CREATE INDEX idx_t_gin1_1 ON t_gin1 USING gin (arr);View the execution plan.
EXPLAIN (analyze, verbose, timing, costs, buffers) SELECT * FROM t_gin1 WHERE arr && array[1,2];Bitmap Heap Scan on public.t_gin1 (cost=7.08..126.44 rows=204 width=65) (actual time=0.042..0.161 rows=204 loops=1) Output: id, arr Recheck Cond: (t_gin1.arr && '{1,2}'::integer[]) Heap Blocks: exact=106 Buffers: shared hit=111 (main=111 vm=0 fsm=0) -> Bitmap Index Scan on idx_t_gin1_1 (cost=0.00..7.03 rows=204 width=0) (actual time=0.026..0.026 rows=204 loops=1) Index Cond: (t_gin1.arr && '{1,2}'::integer[]) Buffers: shared hit=5 (main=5 vm=0 fsm=0) Query Identifier: -6536619930866726302 Planning: Buffers: shared hit=44 (main=44 vm=0 fsm=0) dirtied=2 (main=0 vm=0 fsm=0) Planning Time: 0.160 ms Execution Time: 0.196 ms
Query data by random columns
Create the extension.
create extension btree_gin;Create a table and insert data.
CREATE TABLE t_gin2 (id int, c1 int); INSERT INTO t_gin2 SELECT generate_series(1,100000), random()*10 ;Create an index.
CREATE INDEX idx_t_gin2_1 on t_gin2 using gin (c1);View the execution plan.
EXPLAIN (analyze,verbose,timing,costs,buffers) SELECT * FROM t_gin2 WHERE c1=1;Bitmap Heap Scan on public.t_gin2 (cost=81.66..745.49 rows=9827 width=8) (actual time=0.743..2.775 rows=9964 loops=1) Output: id, c1 Recheck Cond: (t_gin2.c1 = 1) Heap Blocks: exact=541 Buffers: shared hit=546 (main=546 vm=0 fsm=0) -> Bitmap Index Scan on idx_t_gin2_1 (cost=0.00..79.20 rows=9827 width=0) (actual time=0.691..0.692 rows=9964 loops=1) Index Cond: (t_gin2.c1 = 1) Buffers: shared hit=5 (main=5 vm=0 fsm=0) Query Identifier: -7119324855706379201 Planning: Buffers: shared hit=43 (main=43 vm=0 fsm=0) dirtied=1 (main=0 vm=0 fsm=0) Planning Time: 0.136 ms Execution Time: 3.490 ms
Query data that is sparsely distributed
Create a table and insert data.
CREATE TABLE t_gin3 (id int, c1 int, c2 int, c3 int, c4 int, c5 int, c6 int, c7 int, c8 int, c9 int); INSERT INTO t_gin3 SELECT generate_series(1,100000), random()*10, random()*20, random()*30, random()*40, random()*50, random()*60, random()*70, random()*80, random()*90;Create an index.
CREATE INDEX idx_t_gin3_1 ON t_gin3 USING gin (c1,c2,c3, c4, c5, c6,c7,c8,c9);View the execution plan.
EXPLAIN (analyze,verbose,timing,costs, buffers) SELECT * FROm t_gin3 WHERE c1=1 OR c2=1 AND c3=1 OR c4=1 AND (c6=1 OR c7=2) OR c8=9 OR c9=10;Bitmap Heap Scan on public.t_gin3 (cost=141.08..1445.89 rows=12051 width=40) (actual time=1.539..4.541 rows=12240 loops=1) Output: id, c1, c2, c3, c4, c5, c6, c7, c8, c9 Recheck Cond: ((t_gin3.c1 = 1) OR ((t_gin3.c2 = 1) AND (t_gin3.c3 = 1)) OR (((t_gin3.c4 = 1) AND (t_gin3.c6 = 1)) OR ((t_gin3.c4 = 1) AND (t_gin3.c7 = 2))) OR (t_gin3.c8 = 9) OR (t_gin3.c9 = 10)) Heap Blocks: exact=935 Buffers: shared hit=968 (main=968 vm=0 fsm=0) -> BitmapOr (cost=141.08..141.08 rows=12327 width=0) (actual time=1.441..1.445 rows=0 loops=1) Buffers: shared hit=33 (main=33 vm=0 fsm=0) -> Bitmap Index Scan on idx_t_gin3_1 (cost=0.00..79.12 rows=9670 width=0) (actual time=0.816..0.817 rows=9844 loops=1) Index Cond: (t_gin3.c1 = 1) Buffers: shared hit=6 (main=6 vm=0 fsm=0) -> Bitmap Index Scan on idx_t_gin3_1 (cost=0.00..7.09 rows=159 width=0) (actual time=0.262..0.262 rows=178 loops=1) Index Cond: ((t_gin3.c2 = 1) AND (t_gin3.c3 = 1)) Buffers: shared hit=8 (main=8 vm=0 fsm=0) -> BitmapOr (cost=17.85..17.85 rows=82 width=0) (actual time=0.222..0.224 rows=0 loops=1) Buffers: shared hit=13 (main=13 vm=0 fsm=0) -> Bitmap Index Scan on idx_t_gin3_1 (cost=0.00..5.94 rows=44 width=0) (actual time=0.119..0.119 rows=49 loops=1) Index Cond: ((t_gin3.c4 = 1) AND (t_gin3.c6 = 1)) Buffers: shared hit=6 (main=6 vm=0 fsm=0) -> Bitmap Index Scan on idx_t_gin3_1 (cost=0.00..5.88 rows=38 width=0) (actual time=0.103..0.103 rows=38 loops=1) Index Cond: ((t_gin3.c4 = 1) AND (t_gin3.c7 = 2)) Buffers: shared hit=7 (main=7 vm=0 fsm=0) -> Bitmap Index Scan on idx_t_gin3_1 (cost=0.00..12.93 rows=1283 width=0) (actual time=0.072..0.072 rows=1289 loops=1) Index Cond: (t_gin3.c8 = 9) Buffers: shared hit=3 (main=3 vm=0 fsm=0) -> Bitmap Index Scan on idx_t_gin3_1 (cost=0.00..11.80 rows=1133 width=0) (actual time=0.063..0.063 rows=1143 loops=1) Index Cond: (t_gin3.c9 = 10) Buffers: shared hit=3 (main=3 vm=0 fsm=0) Query Identifier: -7228854819162599043 Planning: Buffers: shared hit=111 (main=111 vm=0 fsm=0) dirtied=4 (main=0 vm=0 fsm=0) Planning Time: 0.220 ms Execution Time: 5.459 ms