Enabling index-only scans with covering indexes for query optimization - PolarDB

Index-only scans let PostgreSQL answer queries directly from an index, without touching the table's main data storage (called the *heap*). On read-heavy tables with stable data, this can dramatically reduce physical I/O and speed up queries.

How it works

All indexes in PostgreSQL are *secondary indexes*: each index is stored separately from the heap. A standard index scan therefore fetches data from two places — the index to find matching rows, then the heap to retrieve column values. Because matching index entries are usually clustered together while the corresponding heap rows can be scattered anywhere, that second fetch involves significant random I/O.

An index-only scan eliminates the heap fetch entirely by returning values directly from index entries. Instead of two reads, it needs only one.

Requirements

Two conditions must be met before the query planner can use an index-only scan.

The index type must support index-only scans.

Index type	Index-only scan support
B-tree	Always supported
GiST	Supported for some operator classes
SP-GiST	Supported for some operator classes
GIN	Not supported
Others	Not supported

The underlying requirement is that the index must physically store, or else be able to reconstruct, the original data value for each entry. GIN indexes cannot meet this requirement because each entry typically holds only part of the original value.

The query must reference only columns stored in the index.

Given an index on columns x and y of a table that also has column z, the following queries can use index-only scans:

SELECT x, y FROM tab WHERE x = 'key';
SELECT x FROM tab WHERE x = 'key' AND y < 42;

These queries cannot, because they reference z, which is not in the index:

SELECT x, z FROM tab WHERE x = 'key';
SELECT x FROM tab WHERE x = 'key' AND z < 42;

The visibility map check

Meeting the two requirements above makes an index-only scan *physically possible*, but there is one more factor: PostgreSQL must confirm that each retrieved row is visible to the current Multi-Version Concurrency Control (MVCC) snapshot. Visibility information lives only in heap entries, not in index entries.

To avoid a full heap fetch just for visibility checks, PostgreSQL maintains a *visibility map* — a compact bitmap that tracks, per heap page, whether all rows on that page are visible to all current and future transactions. The visibility map is four orders of magnitude smaller than the heap, so it stays cached in memory almost all the time.

When an index-only scan finds a matching entry, it checks the visibility map for the corresponding heap page:

Bit set — all rows on that page are known visible. The value is returned from the index with no heap access.
Bit not set — the heap page must be visited to determine visibility. The query falls back to a standard index scan for that row, with no performance gain.

Index-only scans are most effective on tables where most heap pages have their all-visible bit set — that is, tables whose rows change infrequently.

Verify that index-only scans are being used

Run EXPLAIN on your query and look for Index Only Scan in the output:

EXPLAIN SELECT x, y FROM tab WHERE x = 'key';

Index Only Scan using tab_x_y on tab  (cost=0.29..8.31 rows=1 width=8)
  Index Cond: (x = 'key'::text)

The cost values above are representative examples. Actual values depend on your table statistics and PostgreSQL configuration.

If you see Index Scan instead, the query is fetching from the heap. Check whether the query references columns outside the index, or whether the table has recently had many updates (which would clear visibility map bits).

Warning

Adding a column to the WHERE clause of a query that relies on an index-only scan can silently break the optimization. If the new column is not in the index, the query planner switches to a standard index scan. The query may still return fewer rows, but it will be slower — not faster. Always run EXPLAIN after modifying a query that uses an index-only scan.

Create a covering index with INCLUDE

A *covering index* is an index that contains all the columns a query needs, enabling an index-only scan for that query. Use the INCLUDE clause to add *payload columns* — columns returned by the query but not used for filtering — without making them part of the index key.

For a query that frequently runs as:

SELECT y FROM tab WHERE x = 'key';

An index on x alone would require a heap fetch to get y. A covering index avoids that:

CREATE INDEX tab_x_y ON tab(x) INCLUDE (y);

With this index, the query planner can serve the entire query from the index.

INCLUDE columns vs. key columns

INCLUDE columns differ from key columns in important ways:

Property	Key column	INCLUDE column
Used for index navigation	Yes	No
Affects index sort order	Yes	No
Enforced in uniqueness check	Yes	No
Stored in upper B-tree levels	Sometimes (suffix truncation may also remove trailing key columns)	No
Data type restrictions	Must be indexable	None

Because INCLUDE columns are not part of the index key, the y column in the example above does not need to be a type that the index machinery can process — it is simply stored at the leaf level.

Uniqueness applies only to key columns. This index:

CREATE UNIQUE INDEX tab_x_y ON tab(x) INCLUDE (y);

enforces uniqueness on x alone, not on the combination of x and y. The same applies to UNIQUE and PRIMARY KEY constraints that include an INCLUDE clause.

When to use INCLUDE

Add a column to INCLUDE when:

The query selects that column but does not filter or sort by it.
The table changes slowly enough that visibility map bits are usually set, making index-only scans likely to succeed without a heap fallback.

Skip INCLUDE when:

The table is heavily updated. If the heap must be visited anyway for visibility checks, there is no benefit to storing the column in the index.
The column is wide. INCLUDE columns duplicate data from the table and increase index size, which can slow index scans. If an index tuple exceeds the maximum size for the index type, inserts will fail.
Expressions are involved. Expressions are not supported as INCLUDE columns. Only B-tree, GiST, and SP-GiST indexes support INCLUDE columns.

Suffix truncation

B-tree indexes apply *suffix truncation* automatically: non-key (INCLUDE) columns are never stored in upper B-tree levels, only at the leaf level. The truncation process may also remove trailing key columns from upper levels when the remaining prefix is sufficient to navigate the tree.

This means that covering indexes without an INCLUDE clause — created by listing payload columns as ordinary key columns — may behave similarly in practice. However, defining payload columns with INCLUDE is the reliable way to keep upper-level tuples small and, critically, the only way to enforce uniqueness on the key columns alone.

Pre-INCLUDE workaround

Before the INCLUDE feature was available, a common workaround was to list payload columns as trailing key columns:

CREATE INDEX tab_x_y ON tab(x, y);

This works for index-only scans as long as y is a trailing column — making payload columns leading columns is unwise. However, it enforces uniqueness on both x and y if you use a UNIQUE index, which is usually not the intent.

Edge cases

Expression indexes

An index on f(x) can support an index-only scan for SELECT f(x) FROM tab WHERE f(x) < 1, in principle. In practice, the query planner currently requires all raw *columns* referenced by the query to be available from the index. Because x itself is not stored in the index — only f(x) is — the planner does not recognize this as a valid index-only scan candidate.

Work around this limitation by including x as a non-key column:

CREATE INDEX tab_f_x ON tab (f(x)) INCLUDE (x);

Note that the planner may not match uses of f(x) that appear outside of indexable WHERE clauses to the index column, particularly in queries involving joins.

Partial indexes

A partial index with a WHERE clause can support index-only scans even when the WHERE clause references a column not in the index. Consider:

CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target)
    WHERE success;

This index can serve:

SELECT target FROM tests WHERE subject = 'some-subject' AND success;

Because all entries in the index already satisfy success = true, the planner does not need to recheck that condition at runtime — and success does not need to be a result column. This behavior is supported in PostgreSQL 9.6 and later.