All Products
Search
Document Center

PolarDB:Global secondary indexes (GSIs)

Last Updated:Apr 01, 2026

In a PolarDB-X distributed table, queries that don't include the partition key trigger a full scan across all data nodes. This collapses performance to the ceiling of a single node and eliminates linear scaling. A global secondary index (GSI) eliminates full partition scans by letting you define an independent partitioning scheme for an index — so any column can become an efficient query path.

How GSIs work

PolarDB-X stores table data across multiple data nodes, partitioned by a designated key. A query that includes the partition key is routed to the relevant partition directly. A query that doesn't include it fans out to every partition.

When a system has N data nodes, a partition-key query places an average load of 1/N on each node. A full partition scan places a load of 1 on every node — so the performance ceiling drops to that of a single node and the system loses linear scaling.

A GSI solves this by acting as a special partitioned table that duplicates specific columns from the primary table. It is horizontally partitioned by a user-defined key that is different from the primary table's partition key. When a query matches the GSI's partition key, PolarDB-X routes it to the relevant GSI partition, retrieves the matching rows, then fetches any additional columns from the primary table. This avoids the full partition scan.

GSI architecture diagram

PolarDB-X uses distributed transactions to maintain strong data consistency between the primary table and its GSIs.

GSIs also support:

  • Online DDL — create, modify, and drop GSIs without locking the table

  • Custom covering columns — store additional columns in the GSI to reduce or eliminate lookups on the primary table

  • Invisible indexes — test a GSI's effect on query plans without exposing it to the query optimizer, so you can validate index performance before enabling it in production

GSI types

GSI

A standard GSI partitions index data by a column other than the primary table's partition key. When a query matches the GSI's partition key, PolarDB-X routes the query to the relevant partition and looks up the primary table for any remaining columns.

Example: The user table is partitioned by user_id. To support efficient queries by name, create a GSI on name.

CREATE TABLE user(
  user_id bigint,
  name varchar(10),
  addr varchar(30),
  GLOBAL INDEX `g_i_name` (name) PARTITION BY HASH(name),
  PRIMARY KEY(user_id)
) PARTITION BY KEY(user_id);

With g_i_name in place, a query filtering by name routes to the matching partition of the index rather than scanning all partitions of user.

UGSI

A unique global secondary index (UGSI) has all the properties of a standard GSI and additionally enforces a global UNIQUE constraint across the entire distributed table.

Example: The user2 table is partitioned by user_id. To guarantee that phone is globally unique across all partitions, create a UGSI on phone.

CREATE TABLE user2(
  user_id bigint,
  phone varchar(20),
  addr varchar(30),
  UNIQUE GLOBAL INDEX `g_i_phone`(phone) PARTITION BY HASH(phone),
  PRIMARY KEY(user_id)
) PARTITION BY KEY(user_id);

Clustered GSI

By default, a clustered GSI stores all columns from the primary table — not just the index key and any covering columns. Because every column is present in the index, queries never need to look up the primary table, eliminating lookup overhead entirely.

The trade-off: a clustered GSI occupies the same amount of disk space as the primary table.

Example: The order_tbl table is partitioned by order_id. To support fast queries by user_id without any primary table lookups, create a clustered GSI on user_id.

CREATE TABLE order_tbl(
  order_id bigint,
  user_id bigint,
  addr varchar(30),
  info text,
  create_time datetime,
  CLUSTERED INDEX `cg_i_user`(user_id) PARTITION BY HASH(user_id),
  PRIMARY KEY(order_id)
) PARTITION BY KEY(order_id);

When querying by user_id, PolarDB-X routes the query to the matching partition of cg_i_user and reads all required data directly from the index.

When to use each type

Scenario Recommended type
Query by a non-partition-key column; primary table lookup is acceptable GSI
Query by a non-partition-key column; column must be globally unique UGSI
Query by a non-partition-key column; eliminating lookup overhead is a priority Clustered GSI

Performance

GSIs improve read performance and degrade write performance. The following Sysbench benchmarks illustrate the impact at scale.

Read performance

Table Threads select_random_ranges QPS Avg latency (ms) 95th percentile (ms) select_random_points QPS Avg latency (ms) 95th percentile (ms)
Partitioned table 128 2,769.17 46.21 99.33 5,226.99 24.48 42.61
256 3,415.64 144.97 144.97 5,476.76 46.73 82.96
512 3,272.46 156.31 257.95 5,290.67 96.72 179.94
1024 2,453.16 416.12 539.71 5,165.31 198.07 404.61
Partitioned table + GSI 128 9,662.11 13.24 25.28 22,584.89 5.66 9.73
256 10,431.73 24.52 51.02 25,558.26 10.01 17.95
512 15,634.51 32.72 73.13 27,116.56 18.86 39.65
1024 22,948.76 44.53 108.68 32,509.87 31.43 73.13

With a GSI on the index column:

  • Range query QPS: 3,415.64 → 22,948.76 — 571% improvement

  • Point query QPS: 5,476.76 → 32,509.87 — 493% improvement

Write performance

Table Threads Write-only QPS Avg latency (ms) 95th percentile (ms) Read-write QPS Avg latency (ms) 95th percentile (ms)
Partitioned table 128 86,548.12 8.87 10.27 113,655.28 22.52 26.2
256 115,774.71 13.26 19.29 149,677.52 34.19 44.17
512 143,928.94 20.51 34.95 14,555.16 70.28 112.67
1024 153,501.7 39.53 70.55 132,150.69 131.58 287.38
Partitioned table + GSI 128 52,069.22 14.25 18.28 90,074.59 28.41 33.72
256 66,250.79 23.17 32.53 114,420.32 44.73 57.87
512 75,700.74 39.1 59.99 111,093.61 92.09 142.39
1024 76,557.94 80.14 134.9 101,828.32 182.51 350.33

With a GSI on the index column:

  • Write-only QPS: 153,501.7 → 76,557.94 — 50% reduction

  • Mixed read-write QPS: 149,677.52 → 114,420.32 — 23% reduction

What's next