In a sharded database, data is distributed across physical shards based on a shard key. Queries that filter on a column other than the shard key force PolarDB-X to scan every shard — a full-shard scan that grows more expensive as data scales. A global secondary index (GSI) solves this by maintaining a separate index table sharded on the indexed column, so the optimizer can route queries directly to the relevant shard.
This topic explains how to create GSIs and use them to accelerate queries in PolarDB-X.
Prerequisites
Before you begin, ensure that you have:
-
MySQL 5.7 or later
-
PolarDB-X kernel minor version 5.4.1 or later
How GSIs work
A GSI consists of the following components:
| Component | Description |
|---|---|
| Index table | A separate distributed table created for the GSI, sharded independently from the primary table using the indexed column as its shard key. |
| Index column | The shard key of the index table. All columns referenced in the sharding clause become index columns. |
| Covering column | Additional columns stored in the index table beyond the index column. By default, the primary key and all shard keys of the primary table are included. When a query's required columns are all present in the index table, PolarDB-X reads directly from the index table without a primary table lookup — this is called a covering index read. |
| Sharding clause | Specifies how the index table is distributed across database and table partitions. The syntax is the same as the sharding clause in CREATE TABLE. |
When to use a GSI: Create a GSI when queries frequently filter on a column that is not the primary table's shard key, and a full-shard scan is causing performance problems. If most queries use the shard key directly, a GSI adds storage and write overhead without benefit.
Create a GSI
PolarDB-X extends MySQL data definition language (DDL) syntax to support GSI definition. The syntax follows the same pattern as standard MySQL index creation.
There are two ways to add a GSI:
-
Define a GSI inline when you create the table
-
Add a GSI to an existing table with
CREATE GLOBAL INDEX
The following examples use a t_order table sharded by order_id. The table also has buyer_id and seller_id columns that queries commonly filter on, making them good candidates for GSIs.
-- Define a GSI when creating the table
CREATE TABLE t_order (
`id` bigint(11) NOT NULL AUTO_INCREMENT,
`order_id` varchar(20) DEFAULT NULL,
`buyer_id` varchar(20) DEFAULT NULL,
`seller_id` varchar(20) DEFAULT NULL,
`order_snapshot` longtext DEFAULT NULL,
`order_detail` longtext DEFAULT NULL,
PRIMARY KEY (`id`),
GLOBAL INDEX `g_i_seller`(`seller_id`) COVERING (`id`, `order_id`, `buyer_id`, `order_snapshot`) dbpartition by hash(`seller_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 dbpartition by hash(`order_id`);
-- Add a GSI to an existing table
CREATE UNIQUE GLOBAL INDEX `g_i_buyer` ON `t_order`(`buyer_id`)
COVERING(`seller_id`, `order_snapshot`)
dbpartition by hash(`buyer_id`) tbpartition by hash(`buyer_id`) tbpartitions 3
For the full GSI creation syntax and options, see CREATE INDEX.
Use a GSI
After creating a GSI, queries can use it in three ways: automatic selection by the optimizer, explicit HINT syntax, or direct index table queries.
Automatic index selection
For queries on the primary table that have a matching GSI, PolarDB-X automatically selects the lowest-cost index. Only covering indexes are eligible for automatic selection — if the index table does not contain all columns required by the query, the optimizer will not select it automatically.
In the following example, the query filters on seller_id and selects id and order_snapshot. Because g_i_seller covers all three columns and seller_id is its shard key, the optimizer routes the query directly to the index table, avoiding a full scan of the t_order primary shards:
EXPLAIN SELECT t_order.id, t_order.order_snapshot FROM t_order WHERE t_order.seller_id = 's1';
IndexScan(tables="g_i_seller_sfL1_2", sql="SELECT `id`, `order_snapshot` FROM `g_i_seller` AS `g_i_seller` WHERE (`seller_id` = ?)")
Specify an index with HINT
Use index hints to override the optimizer's choice. PolarDB-X supports two HINT forms:
`FORCE INDEX` — Forces the query to use the named index:
SELECT a.*, b.order_id
FROM t_seller a
JOIN t_order b FORCE INDEX(g_i_seller) ON a.seller_id = b.seller_id
WHERE a.seller_nick = "abc";
TDDL hint — Specifies the index using a comment directive:
/*+TDDL:INDEX({table_name/table_alias}, {index_name})*/
Example:
/*+TDDL:index(a, g_i_buyer)*/ SELECT * FROM t_order a WHERE a.buyer_id = 123
If the query requires columns not stored in the index table, PolarDB-X performs a two-step lookup: it queries the index table first to retrieve the primary key and primary table shard keys, then queries the primary table for the remaining columns. For details, see INDEX HINT.
Query an index table directly
If the index table contains all the columns required by a query, you can query the index table directly to retrieve the result without involving the primary table.
Control index selection with USE INDEX and IGNORE INDEX
Use USE INDEX to restrict the optimizer to a specific set of indexes, or IGNORE INDEX to exclude indexes from consideration:
-- Prevent the optimizer from using g_i_seller
SELECT t_order.id, t_order.order_snapshot FROM t_order IGNORE INDEX(g_i_seller) WHERE t_order.seller_id = 's1';
-- Restrict the optimizer to only consider g_i_seller
SELECT t_order.id, t_order.order_snapshot FROM t_order USE INDEX(g_i_seller) WHERE t_order.seller_id = 's1';
Both accept a comma-separated list of index names: IGNORE INDEX(index1, index2, ...).
What's next
-
CREATE INDEX — Full GSI creation syntax reference
-
INDEX HINT — Detailed guide to index hint syntax and behavior