PolarDB for PostgreSQL (Compatible with Oracle) clusters include single-row and bulk insert interfaces in their Table AM definitions. When you insert many tuples, you can enable the bulk insert interface to improve data insertion efficiency.
Prerequisites
Before you begin, ensure that you have:
Oracle syntax compatible 2.0 with a minor engine version of 2.0.14.13.28.0 or later
To check your minor engine version, you can view the minor engine version in the console or by running:
SHOW polardb_version;If the version does not meet the requirement, upgrade the minor engine version.
Enable the bulk insert interface
The polar_enable_tableam_multi_insert parameter controls the bulk insert Table AM interface. It is enabled by default:
SHOW polar_enable_tableam_multi_insert;
polar_enable_tableam_multi_insert
-----------------------------------
on
(1 row)Supported syntaxes
When wal_level is set to logical, the following syntaxes use the bulk insert interface:
Whenwal_levelis set to any value other thanlogical, these syntaxes fall back to the bulk import infrastructure for bulk import optimization.
How it works
Background
An Access Method (AM) defines how PostgreSQL stores and retrieves data in tables and indexes. The executor passes data rows to the Table AM and Index AM without needing to know the internal storage implementation.
PostgreSQL 12 introduced an extensible Table AM with built-in support for the Heap AM. The Heap AM originally provided two insert interfaces:
Bulk insert interface: processes multiple rows at once; used only by
COPY FROMSingle-row insert interface: processes one row at a time; used by all other insert syntaxes
The single-row interface is significantly less efficient for large-scale inserts. PolarDB extends the Table AM with a new optional bulk insert interface for the Heap AM, providing complete lifecycle control for bulk inserts and a more efficient insertion algorithm and Write-Ahead Log (WAL) format. Both the single-row and bulk insert interfaces support logical replication for heap tables.
Buffer layer algorithm
Single-row insert algorithm
Single-row insert processes one tuple per page lock cycle:
Fill the tuple information.
Calculate the free space required for the tuple.
Find a page that can accommodate the tuple.
Lock the page and place the tuple on it.
Mark the page as dirty.
Generate a WAL record for the page modification.
Release the page lock.
Bulk insert algorithm
Bulk insert stages a batch of tuples and fills multiple tuples per page lock cycle:
Stage a batch of tuples in memory.
Fill the tuple information for the batch.
Find a page that can accommodate the tuples.
Lock the page and place the tuples on it.
If the page has enough space, continue placing tuples until the page is full or all staged tuples are inserted.
Mark the page as dirty.
Generate a WAL record for the page modification.
Release the page lock.
The bulk insert algorithm significantly reduces the frequency of page locks. It also reduces the number of WAL records generated and makes the log content more compact.
WAL log format
Single-row insertion generates one Heap INSERT record per row:
-[ RECORD 1 ]----+--------------------------------------------
start_lsn | 0/40BE24E0
end_lsn | 0/40BE2520
prev_lsn | 0/40BE24B8
xid | 792
resource_manager | Heap
record_type | INSERT
record_length | 61
main_data_length | 3
fpi_length | 0
description | off: 8, flags: 0x00
block_ref | blkref #0: rel 1663/5/16412 fork main blk 0Bulk insertion generates one Heap2 MULTI_INSERT record that covers multiple rows inserted on the same page:
-[ RECORD 1 ]----+------------------------------------------------------------------
start_lsn | 0/40BE2548
end_lsn | 0/40BE2610
prev_lsn | 0/40BE2520
xid | 793
resource_manager | Heap2
record_type | MULTI_INSERT
record_length | 194
main_data_length | 20
fpi_length | 0
description | ntuples: 8, flags: 0x02, offsets: [9, 10, 11, 12, 13, 14, 15, 16]
block_ref | blkref #0: rel 1663/5/16412 fork main blk 0The ntuples: 8 field shows that eight rows are captured in a single WAL record, compared to eight separate INSERT records with single-row insertion. This significantly reduces the number of WAL records generated for large inserts.
See also
: the bulk import path used when
wal_levelis not set tologicalVersion management: view and upgrade your minor engine version