All Products
Search
Document Center

ApsaraDB for SelectDB:Index-based acceleration

Last Updated:Mar 28, 2026

Scanning full tables is a primary bottleneck for analytical queries on large datasets. ApsaraDB for SelectDB maintains two categories of indexes to minimize unnecessary data reads: built-in intelligent indexes that are created and managed automatically, and secondary indexes that you define for specific query patterns.

CategoryIndex types
Built-in intelligent indexesZone map index, prefix index
Secondary indexesInverted index, Bitmap index, Bloom filter index, NGram Bloom filter index

This topic covers the built-in intelligent indexes: how they work and how to get the most out of them.

Zone map index

The zone map index is automatically created and maintained for every column in the column-oriented storage format. For each column, SelectDB records:

  • Min: The minimum value in the column

  • Max: The maximum value in the column

  • NULL count: The number of NULL values in the column

When a query includes a range or equality filter, SelectDB compares the filter against each column's Min–Max bounds and skips any data where no rows can possibly match. This makes range queries and inequality filters significantly faster on large datasets.

No user action is needed — SelectDB manages zone map indexes automatically.

Prefix index

How the prefix index is built

ApsaraDB for SelectDB is an online analytical processing (OLAP) database with a massively parallel processing (MPP) architecture. Data is stored in a structure similar to a sorted string table (SSTable): rows are sorted and stored based on the key columns defined at table creation.

In the Aggregate Key, Unique Key, and Duplicate Key models, rows are sorted by the columns listed in the AGGREGATE KEY, UNIQUE KEY, or DUPLICATE KEY clause. The prefix index takes the first 36 bytes of each row's key columns and uses them as a shortcut lookup structure.

One constraint applies: if a key column is of type VARCHAR, the prefix index is truncated at that column's boundary, even if fewer than 36 bytes have been consumed.

Examples

Example 1: Prefix index spans multiple columns

Column nameType
user_idBIGINT
ageINT
messageVARCHAR(100)
max_dwell_timeDATETIME
min_dwell_timeDATETIME

The prefix index for this schema is user_id (8 bytes) + age (4 bytes) + message (prefix 20 bytes) = 36 bytes total.

A query that filters on the leading columns hits the prefix index and is significantly faster:

-- Hits the prefix index: user_id is the first key column
SELECT * FROM table WHERE user_id = 1829239 AND age = 20;

A query that skips the first key column does not hit the prefix index:

-- Does not hit the prefix index: user_id is not in the filter
SELECT * FROM table WHERE age = 20;

Example 2: Prefix index truncated by VARCHAR

Column nameType
user_nameVARCHAR(20)
ageINT
messageVARCHAR(100)
max_dwell_timeDATETIME
min_dwell_timeDATETIME

The prefix index for this schema is user_name (20 bytes) only. Because user_name is VARCHAR, the index is truncated after that column, even though only 20 bytes have been consumed. Subsequent columns (age, message) are not included in the prefix index.

Modify a prefix index using a materialized view

A table's key column order is fixed at creation, so a table has only one prefix index. To support a different column order — and therefore a different prefix index — create a materialized view that reorders the columns. For more information, see Materialized views.

What's next