A columnstore index lets PolarDB for PostgreSQL scan only the columns a query references—not every row—cutting analytical query time on wide tables by up to 30x compared with native parallel execution.
How it works
SaaS applications often store dozens or even hundreds of columns per row. Analytical queries typically touch only a few of those columns, which creates two problems when the row store engine handles them:
-
The engine reads every column for each matching row, including irrelevant ones, driving up I/O.
-
Composite indexes cover only the filter patterns they were built for. When query conditions change, existing indexes stop being effective.
A columnstore index stores each column independently on disk. Reading one column does not touch any other, and the index works regardless of which columns are filtered or the order of filter conditions.
Performance
With a 100-million-row dataset and a degree of parallelism (DOP) of 4:
| Query | Native parallel execution (PostgreSQL) | Columnstore index |
|---|---|---|
| Q1 | 243 s | 7.9 s |
The columnstore index delivers a 30x speedup over native parallel execution in PostgreSQL.
Prerequisites
Before you begin, make sure that:
-
Your cluster runs one of the following versions:
-
PostgreSQL 16, minor engine version 2.0.16.8.3.0 or later
-
PostgreSQL 14, minor engine version 2.0.14.10.20.0 or later
Check the minor engine version in the PolarDB console or by running
SHOW polardb_version;. If the version does not meet the requirement, upgrade the minor engine version. -
-
The source table has a primary key, and the primary key column is included in the columnstore index.
-
The
wal_levelparameter is set tologicalto enable logical replication support in the write-ahead log (WAL).ImportantChanging
wal_levelrestarts the cluster. Set the parameter in the console and plan for a maintenance window before making the change.
Step 1: Enable the columnstore index
The procedure varies by minor engine version.
Step 2: Create a wide table and a columnstore index
This example uses a 24-column table named widecolumntable. The query analyzes five columns—id_1, domain, consumption, start_time, and end_time—to calculate the total amount spent by each customer across multiple domains in the past year.
-
Create the table and insert your test data:
CREATE TABLE widecolumntable ( id_1 BIGINT NOT NULL PRIMARY KEY, id_2 BIGINT, id_3 BIGINT, id_4 BIGINT, id_5 BIGINT, id_6 BIGINT, version INT, domain TEXT, consumption DECIMAL(18,3), c_level CHARACTER VARYING(1) NOT NULL, priority BIGINT, operator TEXT, notify_policy TEXT, call_id UUID NOT NULL, provider_id BIGINT NOT NULL, name_1 TEXT NOT NULL, name_2 TEXT NOT NULL, name_3 TEXT, start_time TIMESTAMP WITH TIME ZONE NOT NULL, end_time TIMESTAMP WITH TIME ZONE NOT NULL, comment JSONB NOT NULL, description TEXT[] NOT NULL, created_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL, updated_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL ); -
Create a columnstore index on the five columns to be analyzed. Include the primary key column (
id_1).CREATE INDEX idx_wide_csi ON widecolumntable USING CSI(id_1, domain, consumption, start_time, end_time);
Step 3: Run the query
The following query calculates the total amount spent by each customer in each domain over the past year, sorted by total amount. Run it with the columnstore index enabled, then disabled, to compare performance.
With the columnstore index (DOP = 4):
-- Enable the columnstore index and set the degree of parallelism to 4.
SET polar_csi.enable_query TO on;
SET polar_csi.exec_parallel TO 4;
-- Q1
EXPLAIN ANALYZE
SELECT id_1, domain, SUM(consumption)
FROM widecolumntable
WHERE start_time > '20230101' AND end_time < '20240101'
GROUP BY id_1, domain
ORDER BY SUM(consumption);
With the row store engine (DOP = 4):
-- Disable the columnstore index and use the row store engine with DOP 4.
SET polar_csi.enable_query TO off;
SET max_parallel_workers_per_gather TO 4;
-- Q1
EXPLAIN ANALYZE
SELECT id_1, domain, SUM(consumption)
FROM widecolumntable
WHERE start_time > '20230101' AND end_time < '20240101'
GROUP BY id_1, domain
ORDER BY SUM(consumption);
Compare the Execution Time values in the EXPLAIN ANALYZE output. With the columnstore index, Q1 completes in 7.9 s versus 243 s for the row store engine—a 30x improvement.
Limitations and considerations
Be aware of the following constraints before deploying the columnstore index in production:
-
Primary key required: The source table must have a primary key. The primary key column must be included in the columnstore index definition.
-
`wal_level` change restarts the cluster: Setting
wal_leveltologicaltriggers a cluster restart. Schedule this change during a maintenance window. -
Single-node clusters: You cannot add a columnstore index read-only node (IMCI node) to a single-node cluster. The cluster must have at least one existing read-only node.
-
Memory share when using the pre-installed extension: Without a dedicated IMCI node, the columnstore engine uses only 25% of the node's memory. Under heavy AP workloads, this constraint limits throughput and can affect TP workloads running on the same node.
-
Database-level scope for `polar_csi`: On older minor engine versions that use the
polar_csiextension, the extension is scoped per database. Install it separately in every database where you need columnstore index support. -
Privileged account required for extension installation: Only a privileged database account can install the
polar_csiextension.




