
PolarDB: Read-ahead and pre-extension

Last Updated: Mar 28, 2026

PolarDB for PostgreSQL optimizes I/O performance for PolarFileSystem (PFS) through three built-in features: heap table read-ahead, heap table pre-extension, and index creation pre-extension. Together, these features can double sequential scan and data loading throughput on PFS, and improve index creation performance by up to 30%.

Prerequisites

Before you begin, ensure that your PolarDB for PostgreSQL cluster runs one of the following engine versions:

  • PostgreSQL 18 (revision version 2.0.18.0.1.0 or later)

  • PostgreSQL 17 (revision version 2.0.17.2.1.0 or later)

  • PostgreSQL 16 (revision version 2.0.16.6.2.0 or later)

  • PostgreSQL 15 (revision version 2.0.15.12.4.0 or later)

  • PostgreSQL 14 (revision version 2.0.14.5.1.0 or later)

  • PostgreSQL 11 (revision version 2.0.11.2.0.0 or later)

To check your revision version, run SHOW polardb_version; or check the cluster details in the PolarDB console. Upgrade the version if necessary.
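
The version check can also be scripted. The following is a minimal sketch, assuming the revision version is available as a dotted string such as 2.0.14.5.1.0 (the exact output format of SHOW polardb_version is an assumption here). The names MIN_REVISION, parse_revision, and meets_minimum are illustrative and not part of PolarDB.

```python
# Minimum revision versions copied from the list above,
# keyed by PostgreSQL major version.
MIN_REVISION = {
    18: "2.0.18.0.1.0",
    17: "2.0.17.2.1.0",
    16: "2.0.16.6.2.0",
    15: "2.0.15.12.4.0",
    14: "2.0.14.5.1.0",
    11: "2.0.11.2.0.0",
}


def parse_revision(text):
    """Turn '2.0.14.5.1.0' into (2, 0, 14, 5, 1, 0) for numeric comparison."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(major, revision):
    """Return True if `revision` is at or above the minimum for `major`."""
    minimum = MIN_REVISION.get(major)
    if minimum is None:
        return False  # major version not listed in the prerequisites
    return parse_revision(revision) >= parse_revision(minimum)
```

Comparing integer tuples rather than raw strings matters here: as strings, "2.0.15.9.0.0" would sort after "2.0.15.12.4.0", although revision 9 is older than revision 12.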

Why PFS requires I/O optimization

PolarDB for PostgreSQL uses PolarFileSystem (PFS) as its underlying file system. Unlike standalone file systems such as ext4, PFS has two characteristics that affect I/O behavior:

  • Page extension overhead: PFS incurs high overhead for metadata updates during page extension. The minimum extension granularity in PFS is 4 MB, whereas standard PostgreSQL extends a relation one 8 KB page at a time. This mismatch causes performance degradation when writing tables or creating indexes.

  • Large I/O efficiency: PFS is more efficient when reading or writing large contiguous blocks of data rather than small individual pages.

The three features described in this topic address both characteristics to match PFS's optimal I/O patterns.

Feature overview

  • Heap table read-ahead: When a sequential read requires two or more pages, PolarDB reads 128 KB of data per I/O instead of 8 KB per page. This batches multiple small reads into a single large I/O, improving sequential scan and vacuum performance. Read-ahead also speeds up index creation by 18%.

  • Heap table pre-extension: Instead of extending a heap table one 8 KB page at a time, PolarDB extends it by 4 MB of pages in a single I/O operation. Extending N pages then no longer takes N I/O operations, which matters most for write-heavy workloads.

  • Index creation pre-extension: Similar to heap table pre-extension, but applied during index builds. PolarDB extends 4 MB of index pages per I/O, reducing the overhead of metadata updates that would otherwise slow down index creation. Supported index types: B-tree, GIN, GiST, SP-GiST, and Bloom.

PostgreSQL 17 does not support heap table read-ahead.
In PostgreSQL 11, index creation pre-extension supports only B-tree indexes. GIN, GiST, SP-GiST, and Bloom indexes are not supported.
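
To make the batching concrete, the following sketch counts I/O operations for the sizes quoted above (8 KB pages, 128 KB read-ahead, 4 MB pre-extension). The 1 GiB table size is an arbitrary illustration, and the numbers are operation counts only, not measured throughput.

```python
PAGE_SIZE = 8 * 1024            # page size used in this topic (8 KB)
READ_AHEAD_SIZE = 128 * 1024    # default read-ahead batch (128 KB)
EXTEND_SIZE = 4 * 1024 * 1024   # default pre-extension batch (4 MB)


def io_count(total_bytes, bytes_per_io):
    """Number of I/O operations needed to cover total_bytes, rounding up."""
    return -(-total_bytes // bytes_per_io)


pages_per_read = READ_AHEAD_SIZE // PAGE_SIZE    # 16 pages per read-ahead I/O
pages_per_extend = EXTEND_SIZE // PAGE_SIZE      # 512 pages per extension

one_gib = 1024 ** 3                                   # hypothetical table size
reads_per_page = io_count(one_gib, PAGE_SIZE)         # 131072 page-sized reads
reads_batched = io_count(one_gib, READ_AHEAD_SIZE)    # 8192 batched reads
extends_per_page = io_count(one_gib, PAGE_SIZE)       # 131072 single-page extensions
extends_batched = io_count(one_gib, EXTEND_SIZE)      # 256 batched extensions
```

Fewer operations do not translate one-to-one into speedup; the measured gains are the ones reported under Performance results.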

How it works

Heap table read-ahead

When two or more pages are needed, read-ahead is triggered automatically. The implementation proceeds in four steps:

  1. Allocate N buffers from the buffer pool.

  2. Use palloc to allocate a contiguous memory region of N × page size, referred to as p.

  3. Use PFS to read N × page size of data from the heap table into p in a single I/O.

  4. Copy the N pages from p into the N buffers allocated in step 1.

Subsequent read operations hit the buffer pool directly. The following diagram shows the data flow:

[Figure: data flow of heap table read-ahead]
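
The four steps can be sketched in Python as follows. This is a simulation under stated assumptions, not PolarDB's implementation: an in-memory io.BytesIO stands in for a heap file on PFS, a dict stands in for the buffer pool, and bulk_read is a hypothetical name.

```python
import io

PAGE_SIZE = 8 * 1024


def bulk_read(f, first_page, n_pages, buffer_pool):
    """Read n_pages with one large read, then fill one buffer per page."""
    # Step 1: allocate N buffers from the (simulated) buffer pool.
    buffers = [bytearray(PAGE_SIZE) for _ in range(n_pages)]
    # Step 2: allocate one contiguous staging region p of N x page size.
    p = bytearray(n_pages * PAGE_SIZE)
    # Step 3: fill p with a single read call instead of N page-sized reads.
    f.seek(first_page * PAGE_SIZE)
    got = f.readinto(p)
    # Step 4: copy each page that was read from p into its own buffer.
    pages_read = got // PAGE_SIZE
    for i in range(pages_read):
        buffers[i][:] = p[i * PAGE_SIZE:(i + 1) * PAGE_SIZE]
        buffer_pool[first_page + i] = buffers[i]
    return pages_read


# A fake 32-page heap file in which every byte of page k equals k.
heap = io.BytesIO(b"".join(bytes([k]) * PAGE_SIZE for k in range(32)))
pool = {}
n = bulk_read(heap, first_page=4, n_pages=16, buffer_pool=pool)
```

After the call, pages 4 through 19 are in the simulated pool, so later lookups for those pages need no further file access.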

Heap table pre-extension

Pre-extension separates buffer allocation from file system extension. The implementation proceeds in three steps:

  1. Allocate N buffers from the buffer pool without triggering page extension in the file system.

  2. Use the PFS file write interface to extend 4 MB of pages in a single batch write of all-zero pages.

  3. Initialize each page individually, record available space, and mark the pre-extension complete.
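
A minimal simulation of these three steps is shown below. It assumes an in-memory file that counts write calls in place of the PFS write interface, a dict in place of the free space bookkeeping, and a 4-byte "INIT" marker as a placeholder for a real page header. All names are hypothetical.

```python
import io

PAGE_SIZE = 8 * 1024
EXTEND_SIZE = 4 * 1024 * 1024
PAGES_PER_EXTEND = EXTEND_SIZE // PAGE_SIZE  # 512


class CountingFile(io.BytesIO):
    """In-memory file that counts write calls, standing in for a PFS file."""

    def __init__(self):
        super().__init__()
        self.write_calls = 0

    def write(self, data):
        self.write_calls += 1
        return super().write(data)


def bulk_extend(f, free_space_map):
    """Extend f by 4 MB in one write and return (first new page, buffers)."""
    # Step 1: reserve N buffers in memory only; the file is not touched yet.
    first_new_page = f.seek(0, io.SEEK_END) // PAGE_SIZE
    buffers = [bytearray(PAGE_SIZE) for _ in range(PAGES_PER_EXTEND)]
    # Step 2: grow the file by 4 MB with a single write of all-zero pages.
    f.write(bytes(EXTEND_SIZE))
    # Step 3: initialize each page and record its available space.
    for i, buf in enumerate(buffers):
        buf[0:4] = b"INIT"  # placeholder header, not a real page layout
        free_space_map[first_new_page + i] = PAGE_SIZE - 4
    return first_new_page, buffers


heap_file = CountingFile()
fsm = {}
start, new_buffers = bulk_extend(heap_file, fsm)
```

One call adds 512 pages to the file with one write, where page-at-a-time extension would issue 512 writes.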

Index creation pre-extension

Index creation pre-extension follows the same batching approach but does not require buffer allocation:

  1. Use the PFS file write interface to extend 4 MB of index pages in a single batch write of all-zero pages.

  2. Write the index pages built in the buffer pool to the file system.
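
The effect on the number of file extensions can be sketched as follows. This is an illustration only: io.BytesIO stands in for an index file on PFS, write_index_pages is a hypothetical helper, and the 1,000-page index is an arbitrary example.

```python
import io

PAGE_SIZE = 8 * 1024
EXTEND_SIZE = 4 * 1024 * 1024


def write_index_pages(f, pages, bulk=True):
    """Write built index pages to f; return how many times the file grew."""
    extensions = 0
    for page_no, page in enumerate(pages):
        offset = page_no * PAGE_SIZE
        file_size = f.seek(0, io.SEEK_END)
        if offset >= file_size:
            # The file is too short for this page: append zero-filled space,
            # 4 MB at a time with pre-extension, one page at a time without.
            f.write(bytes(EXTEND_SIZE if bulk else PAGE_SIZE))
            extensions += 1
        # Overwrite the zero-filled slot with the real index page.
        f.seek(offset)
        f.write(page)
    return extensions


# 1,000 fake index pages; every byte of page k equals k modulo 256.
pages = [bytes([k % 256]) * PAGE_SIZE for k in range(1000)]
bulk_file = io.BytesIO()
bulk_ext = write_index_pages(bulk_file, pages, bulk=True)
single_file = io.BytesIO()
single_ext = write_index_pages(single_file, pages, bulk=False)
```

In this sketch the pre-extended file grows twice (2 x 512 pages covers 1,000 pages), while the page-at-a-time variant grows 1,000 times. Both files hold identical page contents.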

Configure read-ahead and pre-extension

All three features are enabled by default. The default values are tuned for PFS and do not require adjustment under standard configurations.

128 KB matches PFS's optimal I/O unit size for sequential reads. 4 MB matches PFS's minimum metadata-efficient extension size for writes. Increasing these values beyond their defaults does not produce measurable performance gains.

Heap table read-ahead

The polar_bulk_read_size parameter controls the read-ahead batch size.

To disable heap table read-ahead:

ALTER SYSTEM SET polar_bulk_read_size = 0;
SELECT pg_reload_conf();

To enable heap table read-ahead at the default 128 KB:

ALTER SYSTEM SET polar_bulk_read_size = '128kB';
SELECT pg_reload_conf();

This setting is most impactful during sequential scans and vacuum operations on large tables.

Heap table pre-extension

The polar_heap_bulk_extend_size parameter controls the pre-extension batch size. In PostgreSQL 11, the equivalent parameter is polar_bulk_extend_size.

To disable heap table pre-extension:

ALTER SYSTEM SET polar_heap_bulk_extend_size = 0;
SELECT pg_reload_conf();

To enable heap table pre-extension at the default 4 MB:

ALTER SYSTEM SET polar_heap_bulk_extend_size = '4 MB';
SELECT pg_reload_conf();

This setting is most impactful during bulk data loading, for example, when running COPY on large tables.

Index creation pre-extension

The polar_index_bulk_extend_size parameter controls the index pre-extension batch size. In PostgreSQL 11, the equivalent parameter is polar_index_create_bulk_extend_size.

In PostgreSQL 11, index creation pre-extension supports only B-tree indexes, regardless of the parameter value. Setting the parameter to a non-zero value on PostgreSQL 11 has no effect on GIN, GiST, SP-GiST, or Bloom indexes.

To disable index creation pre-extension:

ALTER SYSTEM SET polar_index_bulk_extend_size = 0;
SELECT pg_reload_conf();

To enable index creation pre-extension at the default 4 MB:

ALTER SYSTEM SET polar_index_bulk_extend_size = '4 MB';
SELECT pg_reload_conf();

This setting is most impactful during CREATE INDEX on large tables.

Performance results

The following results were measured on a PolarDB for PostgreSQL cluster running PostgreSQL 14, with 8 cores, 32 GB of memory, and a 400 GB pgbench dataset.

Heap table read-ahead

  • Performance comparison for vacuum on a 400 GB table: [Figure: vacuum performance comparison]

  • Performance comparison for sequential scan on a 400 GB table: [Figure: sequential scan performance comparison]

Conclusions:

  • Heap table read-ahead improves vacuum and sequential scan performance by a factor of two to three.

  • Increasing polar_bulk_read_size beyond 128 KB produces no significant additional improvement.

Heap table pre-extension

Performance comparison for data loading on a 400 GB table: [Figure: data loading performance comparison]

Conclusions:

  • Heap table pre-extension doubles data loading performance.

  • Increasing polar_heap_bulk_extend_size beyond 4 MB produces no significant additional improvement.

Index creation pre-extension

Performance comparison for creating indexes on a 400 GB table: [Figure: index creation performance comparison]

Conclusions:

  • Index creation pre-extension improves index creation performance by 30%.

  • Increasing polar_index_bulk_extend_size beyond 4 MB produces no significant additional improvement.

Parameter reference

| Feature | Parameter | Default | PostgreSQL 11 parameter | Enabled by default |
| --- | --- | --- | --- | --- |
| Heap table read-ahead | polar_bulk_read_size | 128 KB | polar_bulk_read_size | Yes |
| Heap table pre-extension | polar_heap_bulk_extend_size | 4 MB | polar_bulk_extend_size | Yes |
| Index creation pre-extension | polar_index_bulk_extend_size | 4 MB | polar_index_create_bulk_extend_size | Yes |
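
Because two of the parameter names differ on PostgreSQL 11, scripts that manage several clusters may want to look the name up by engine version. The following is a minimal sketch of such a lookup. The feature keys and function names are hypothetical; only the parameter names and the SQL statements come from this topic.

```python
# Parameter names copied from the table above. The feature keys are
# illustrative labels, not PolarDB identifiers.
PARAMS = {
    "heap_read_ahead": {
        "default": "polar_bulk_read_size",
        "pg11": "polar_bulk_read_size",
    },
    "heap_pre_extension": {
        "default": "polar_heap_bulk_extend_size",
        "pg11": "polar_bulk_extend_size",
    },
    "index_pre_extension": {
        "default": "polar_index_bulk_extend_size",
        "pg11": "polar_index_create_bulk_extend_size",
    },
}


def parameter_name(feature, major_version):
    """Return the parameter that controls `feature` on the given major version."""
    names = PARAMS[feature]
    return names["pg11"] if major_version == 11 else names["default"]


def disable_statements(feature, major_version):
    """Return the statements that switch a feature off and reload the config."""
    name = parameter_name(feature, major_version)
    return [f"ALTER SYSTEM SET {name} = 0;", "SELECT pg_reload_conf();"]
```

For example, disable_statements("heap_pre_extension", 11) yields a statement that sets polar_bulk_extend_size, while the same call for version 14 sets polar_heap_bulk_extend_size.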