The columnstore index pruner skips data blocks that cannot contain matching rows before a scan begins, reducing I/O and improving query performance on large tables. This topic describes the four pruner types, when to use each one, how to configure them, and how to verify they are working.
How pruners work
Columnstore index data is stored in single-column data blocks, with a default granularity of 64,000 rows per block. All column packs for a set of rows form a row group. A scan traverses every data block in a specified column unless a pruner can prove that a block contains no matching rows — in which case the block is skipped entirely.
Estimate the number of data blocks for a table with this formula:
Number of data blocks = total table rows / 64,000
For tables too large to fit in memory, skipping even a fraction of blocks produces a measurable reduction in scan time.
Pruner types
Four pruner types are supported. Minmax indexes for numeric types (INT, DECIMAL, DATETIME) are built automatically when you create a columnstore index. In PolarDB MySQL 8.0.1.1.35 or later, or 8.0.2.2.16 or later, pruners for string columns are also built automatically. On earlier versions, string pruners require explicit configuration.
| Pruner | How it filters | Best for | Not for |
|---|---|---|---|
| Bloom filter | Bit array that tests element membership | Equality (=) and IN conditions on high-cardinality string columns |
Numeric types; JSON, BLOB, TEXT columns |
| Minmax index | Stores min/max values per block; skips blocks outside the filter range | Range and equality filters on ordered or date columns | JSON, GEOMETRY columns |
| Token bloom filter | Splits strings at non-alphanumeric characters (for example, "I am IMCI" → I|am|IMCI) |
LIKE queries on natural-language text |
Numeric types |
| N-gram bloom filter | Splits strings into fixed-length substrings (for example, "storage" with N=3 → sto|tor|ora|rag|age) |
LIKE "%string%" substring searches |
Numeric types; queries where the search string is shorter than N |
Minmax indexes on numeric types are built automatically when you create a columnstore index. No additional configuration is needed for those columns.
When to use pruners
Use a pruner when:
-
Filter values are sparse across data blocks — most blocks contain no matching rows.
-
The column has strong data locality — similar values cluster together in insertion order (common with date columns or sorted IDs).
-
You are running
LIKE "%substring%"searches on large string columns.
Avoid a pruner when:
-
Matching values are distributed uniformly across all blocks — the pruner evaluates every block but cannot skip any, adding overhead without benefit.
-
You are applying a bloom filter to a column where distinct values are 3% or fewer of total rows and the data is uniformly distributed — effectiveness depends on locality, and performance may be worse than without the pruner.
-
The column type is not supported (see the table in Pruner types).
Adding a pruner to the wrong column can hurt performance rather than help it. A bloom filter cannot skip any block if the target value appears in every block. Test with realistic data before enabling pruners on production tables.
Memory usage
Enabling pruners on string columns adds memory overhead. Use these formulas to estimate usage before enabling pruners on large tables.
Bloom filter, token bloom filter, or n-gram bloom filter
-
When the distinct value count exceeds 3% of total rows:
Memory = 1.2 × (number of columns with bloom filter) × (total table rows) bytes -
When the distinct value count is 3% or fewer of total rows:
Memory = 1.2 × (number of columns with bloom filter) × (number of distinct values) bytes
Minmax index
Memory = 2 × (columns with minmax index) × (total rows / block size) × prefix length × character encoding length
Example: A table with 2 billion rows, minmax indexes on 10 columns, a prefix length of 20 characters, a block size of 64,000, and utf8mb4 encoding (4 bytes per character) uses approximately 46 MB of memory.
Usage notes
-
Strings must not contain
\0(for example,'polar\0db') for a pruner to be built on that column. -
Column-level
COMMENTattributes take precedence over table-levelCOMMENTattributes. -
To add or remove a pruner on an existing table, drop the columnstore index and recreate it. Update the
COMMENTattribute before rebuilding. -
In PolarDB MySQL version 8.0.1.1.35 or later, or 8.0.2.2.16 or later, pruners for string columns are built automatically when you create a columnstore index. Bloom filters use a Least Recently Used (LRU) cache for memory management. If you upgrade from an older version, rebuild the columnstore index to enable the pruner for string columns.
-
In PolarDB MySQL version 8.0.1.1.34 or earlier, or 8.0.2.2.15 or earlier, pruners stay in memory permanently and are not built automatically for string columns.
-
For PolarDB MySQL version 8.0.1.1.32 or earlier, or 8.0.2.2.13 or earlier, pruners are not built for data blocks that contain NULL values. Filtering with
IS NULLorIS NOT NULLis not supported. -
Before adding new pruners, check the existing pruner types on each column. See Check pruner status on table columns.
Build a pruner when creating a table
Pruner attributes are set in the COMMENT field of a CREATE TABLE statement. Table-level attributes apply to all supported columns; column-level attributes apply only to the specified column.
PRUNER_BLOOM,PRUNER_MINMAX, andPREFIX_LENrequire PolarDB MySQL 8.0.1.1.32 or later, or 8.0.2.2.13 or later.
Bloom filter
Build a bloom filter for all supported columns:
CREATE TABLE t1 (
id INT PRIMARY KEY,
str_col1 CHAR(10),
str_col2 VARCHAR(10)
) ENGINE=InnoDB COMMENT "COLUMNAR=1 PRUNER_BLOOM=1";
Build a bloom filter for a specific column only:
CREATE TABLE t1 (
id INT PRIMARY KEY,
str_col1 CHAR(10) COMMENT "PRUNER_BLOOM=1",
str_col2 VARCHAR(10)
) ENGINE=InnoDB COMMENT "COLUMNAR=1";
Minmax index
For long string columns, the index compares a string prefix against the stored min/max values to reduce memory usage. The default prefix length is 20 characters; the maximum is 255 characters. Character count is based on the number of characters, not the byte length. For example, the first 5 characters of "Alibaba Cloud PolarDB" are "Aliba".
Build minmax indexes for all string columns:
CREATE TABLE t1 (
id INT PRIMARY KEY,
str_col1 CHAR(10),
str_col2 VARCHAR(10)
) ENGINE=InnoDB COMMENT "COLUMNAR=1 PRUNER_MINMAX=1 PREFIX_LEN=30";
Build minmax indexes with different prefix lengths per column:
CREATE TABLE t1 (
id INT PRIMARY KEY,
str_col1 CHAR(10) COMMENT "PRUNER_MINMAX=1 PREFIX_LEN=30",
str_col2 VARCHAR(10) COMMENT "PRUNER_MINMAX=1 PREFIX_LEN=10"
) ENGINE=InnoDB COMMENT "COLUMNAR=1";
Prefix bloom filter
Requires PolarDB MySQL 8.0.1.1.42 or later, or 8.0.2.2.25 or later.
The prefix bloom filter builds on string prefixes to reduce performance and storage overhead caused by long strings.
Build a prefix bloom filter for all supported columns:
CREATE TABLE t1 (
id INT PRIMARY KEY,
str_col1 CHAR(10),
str_col2 VARCHAR(10)
) ENGINE=InnoDB COMMENT "COLUMNAR=1 PRUNER_PREFIX_BLOOM=1";
Build a prefix bloom filter for a specific column:
CREATE TABLE t1 (
id INT PRIMARY KEY,
str_col1 CHAR(10) COMMENT "PRUNER_PREFIX_BLOOM=1",
str_col2 VARCHAR(10)
) ENGINE=InnoDB COMMENT "COLUMNAR=1";
Token bloom filter
Requires PolarDB MySQL 8.0.1.1.39 or later, or 8.0.2.2.20 or later.
Build a token bloom filter for all supported columns:
CREATE TABLE t1 (
id INT PRIMARY KEY,
str_col1 CHAR(10),
str_col2 VARCHAR(10)
) ENGINE=InnoDB COMMENT "COLUMNAR=1 PRUNER_TOKEN_BLOOM=1";
Build a token bloom filter for a specific column:
CREATE TABLE t1 (
id INT PRIMARY KEY,
str_col1 CHAR(10) COMMENT "PRUNER_TOKEN_BLOOM=1",
str_col2 VARCHAR(10)
) ENGINE=InnoDB COMMENT "COLUMNAR=1";
N-gram bloom filter
Requires PolarDB MySQL 8.0.1.1.39 or later, or 8.0.2.2.20 or later.PRUNER_NGRAM_BLOOM=Nsets the substring length; N must be 2 or greater. Set N to a value less than or equal to the length of the search string inLIKE "%string%". If the search string is shorter than N, the n-gram bloom filter cannot be used.
Build an n-gram bloom filter for all supported columns (N=2):
CREATE TABLE t1 (
id INT PRIMARY KEY,
str_col1 CHAR(10),
str_col2 VARCHAR(10)
) ENGINE=InnoDB COMMENT "COLUMNAR=1 PRUNER_NGRAM_BLOOM=2";
Build an n-gram bloom filter for a specific column:
CREATE TABLE t1 (
id INT PRIMARY KEY,
str_col1 CHAR(10) COMMENT "PRUNER_NGRAM_BLOOM=3",
str_col2 VARCHAR(10)
) ENGINE=InnoDB COMMENT "COLUMNAR=1";
Add or remove a pruner on an existing table
To modify a pruner on an existing table, update the COMMENT attribute and rebuild the columnstore index by dropping and recreating it.
The following examples use PRUNER_MINMAX to illustrate the steps. The same pattern applies to other pruner types.
Add a pruner
Add minmax indexes to all string columns in table `t1`:
Assume the original table structure is:
CREATE TABLE `t1` (
`id` int(11) NOT NULL,
`str_col1` char(10) DEFAULT NULL,
`str_col2` varchar(10) DEFAULT NULL,
PRIMARY KEY (`id`),
COLUMNAR INDEX (`id`,`str_col1`,`str_col2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='COLUMNAR=1'
-
Drop the columnstore index:
ALTER TABLE t1 COMMENT = "COLUMNAR=0"; -
Add
PRUNER_MINMAXand rebuild the index:ALTER TABLE t1 COMMENT = "COLUMNAR=1 PRUNER_MINMAX=1"; -
Verify the updated table structure:
SHOW CREATE TABLE t1 FULL \GExpected output:
*************************** 1. row *************************** Table: t1 Create Table: CREATE TABLE `t1` ( `id` int(11) NOT NULL, `str_col1` char(10) DEFAULT NULL, `str_col2` varchar(10) DEFAULT NULL, PRIMARY KEY (`id`), COLUMNAR INDEX (`id`,`str_col1`,`str_col2`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='COLUMNAR=1 PRUNER_MINMAX=1'
The PRUNER_MINMAX attribute is now set on t1.
Add a minmax index to a specific column (`str_col1`) in table `t1`:
-
Add
PRUNER_MINMAXto the column comment:ALTER TABLE t1 MODIFY COLUMN str_col1 CHAR(10) COMMENT 'PRUNER_MINMAX=1'; -
Rebuild the columnstore index:
ALTER TABLE t1 COMMENT = "COLUMNAR=0"; ALTER TABLE t1 COMMENT = "COLUMNAR=1"; -
Verify the updated table structure:
SHOW CREATE TABLE t1 FULL \GExpected output:
*************************** 1. row *************************** Table: t1 Create Table: CREATE TABLE `t1` ( `id` int(11) NOT NULL, `str_col1` char(10) DEFAULT NULL COMMENT 'PRUNER_MINMAX=1', `str_col2` varchar(10) DEFAULT NULL, PRIMARY KEY (`id`), COLUMNAR INDEX (`id`,`str_col1`,`str_col2`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='COLUMNAR=1'
The str_col1 column now has the PRUNER_MINMAX attribute.
Remove a pruner
Remove the table-level `PRUNER_MINMAX` attribute:
Assume t1 has PRUNER_MINMAX=1 in the table comment.
-
Drop the columnstore index:
ALTER TABLE t1 COMMENT = "COLUMNAR=0"; -
Set
PRUNER_MINMAX=0and rebuild the index:ALTER TABLE t1 COMMENT = "COLUMNAR=1 PRUNER_MINMAX=0"; -
Verify the updated table structure:
SHOW CREATE TABLE t1 FULL \GExpected output:
*************************** 1. row *************************** Table: t1 Create Table: CREATE TABLE `t1` ( `id` int(11) NOT NULL, `str_col1` char(10) DEFAULT NULL, `str_col2` varchar(10) DEFAULT NULL, PRIMARY KEY (`id`), COLUMNAR INDEX (`id`,`str_col1`,`str_col2`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='COLUMNAR=1'
The PRUNER_MINMAX attribute has been removed from t1.
Remove the column-level `PRUNER_MINMAX` attribute from `str_col1`:
Assume str_col1 has COMMENT 'PRUNER_MINMAX=1'.
-
Set
PRUNER_MINMAX=0on the column:ALTER TABLE t1 MODIFY COLUMN str_col1 CHAR(10) COMMENT 'PRUNER_MINMAX=0'; -
Rebuild the columnstore index:
ALTER TABLE t1 COMMENT = "COLUMNAR=0"; ALTER TABLE t1 COMMENT = "COLUMNAR=1"; -
Verify the updated table structure:
SHOW CREATE TABLE t1 FULL \GExpected output:
*************************** 1. row *************************** Table: t1 Create Table: CREATE TABLE `t1` ( `id` int(11) NOT NULL, `str_col1` char(10) DEFAULT NULL, `str_col2` varchar(10) DEFAULT NULL, PRIMARY KEY (`id`), COLUMNAR INDEX (`id`,`str_col1`,`str_col2`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='COLUMNAR=1'
The PRUNER_MINMAX attribute has been removed from str_col1.
Check pruner status on table columns
Verify pruner effectiveness
Check using SHOW STATUS
Run SHOW STATUS LIKE 'imci_pruner%' before and after a query to measure how many data blocks the pruner skipped. Values are cumulative for the current session; for multi-table queries, results are summed across all tables.
| Status variable | Description |
|---|---|
imci_pruner_accepted |
Blocks that fully satisfy the filter condition. These blocks skip per-row filtering, though materialization may still access them. |
imci_pruner_rejected |
Blocks that fail the filter condition entirely. No I/O occurs for these blocks. |
Total skipped blocks = imci_pruner_accepted + imci_pruner_rejected.
Example
The following example shows how to verify that the pruner is effective for SELECT COUNT(1) FROM t1 WHERE str_col1='polardb' on a table with 10 data blocks, where matching rows are concentrated in one block.
The table schema:
CREATE TABLE `t1` (
`id` int(11) NOT NULL,
`str_col1` char(10) DEFAULT NULL,
`str_col2` varchar(10) DEFAULT NULL,
PRIMARY KEY (`id`),
COLUMNAR INDEX (`id`,`str_col1`,`str_col2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='COLUMNAR=1 PRUNER=1'
-
Check the baseline pruner status:
SHOW STATUS LIKE 'imci_pruner%';Output:
+----------------------+-------+ | Variable_name | Value | +----------------------+-------+ | imci_pruner_accepted | 0 | | imci_pruner_rejected | 0 | +----------------------+-------+ -
Run the query:
SELECT COUNT(1) FROM t1 WHERE str_col1='polardb';Output:
+----------+ | count(1) | +----------+ | 1 | +----------+ -
Check the pruner status again:
SHOW STATUS LIKE 'imci_pruner%';Output:
+----------------------+-------+ | Variable_name | Value | +----------------------+-------+ | imci_pruner_accepted | 0 | | imci_pruner_rejected | 9 | +----------------------+-------+
imci_pruner_rejected increased by 9, confirming that the pruner skipped 9 out of 10 blocks.
Check using imci_sql_profiling
Requires PolarDB MySQL 8.0.1.1.5 or later, or 8.0.2.2.29 or later.
information_schema.imci_sql_profiling shows table-level block filtering for a specific query, tied directly to that query's execution plan. Unlike SHOW STATUS, which is session-scoped and cumulative, this view gives per-query results without requiring manual delta calculation.
-
Enable query profiling:
SET imci_analyze_query = ON; -
Run the query:
SELECT COUNT(*) FROM t1 WHERE v1 > 100; -
Check the profiling table:
SELECT `Operator`, `Extra Info` FROM INFORMATION_SCHEMA.IMCI_SQL_PROFILING;Example output:
+----+--------------------------------------------------+-----------------------------------------------------------------------------------------------------------+ | ID | Operator | Extra Info | +----+--------------------------------------------------+-----------------------------------------------------------------------------------------------------------+ | 1 | Select Statement | IMCI Execution Plan (max_dop = 32, real_dop = 32, max_query_mem = unlimited, real_query_mem = unlimited) | | 2 | └─Aggregation | | | 3 | └─Table Scan Cond: (v1 > 100), Pruner Counter: [AC: 8, RE: 1, PA: 2] | | +----+--------------------------------------------------+-----------------------------------------------------------------------------------------------------------+
The Pruner Counter on the Table Scan row breaks down block filtering for this query:
| Counter | Value | Meaning |
|---|---|---|
AC |
8 | Accepted blocks (all rows match; skip per-row filtering) |
RE |
1 | Rejected blocks (no rows match; no I/O) |
PA |
2 | Partially matched blocks (require further per-row filtering) |
Performance benchmark
The following test was run on a table with 120 million rows (approximately 1,800 data blocks) on a 2-core, 4 GB cluster. The string column col had 80 million distinct values. The test query was:
SELECT COUNT(1) FROM t1 WHERE col = 'xxx'
| Configuration | Query time |
|---|---|
| With bloom filter | 0.15 s |
| Without bloom filter | 18.6 s |
With the bloom filter enabled, the pruner eliminated most of the 1,800 blocks. Only the blocks that could contain the target value were scanned, reducing query time from 18.6 s to 0.15 s.