By default, IMCI uses the primary key to synchronize data updates and deletions. However, this strategy is inefficient if a table lacks a primary key or has an unordered one, such as a UUID. These scenarios cause excessive random I/O during data synchronization, which degrades performance. This guide explains how to define a custom replication key to improve data synchronization performance in these scenarios.
Prerequisites
The table must have a column with monotonically increasing values in physical storage.
Considerations
Before you begin, consider the following:
Must be based on a unique index: A custom replication key must be based on a unique index, specified with the
USING INDEXclause.Index columns must be non-null: All columns in the unique index for the replication key must be defined as
NOT NULL.Use an ordered index for optimal performance: For optimal performance, we strongly recommend building the unique index on a monotonically increasing column, such as an auto-incrementing column of the
SERIALorBIGSERIALtype.
How it works
REPLICA IDENTITY mechanism
The REPLICA IDENTITY property determines what information is written to the WAL (write-ahead log) during logical replication to identify changed rows. IMCI uses this mechanism to locate and update data in the columnar store. REPLICA IDENTITY supports the following four modes:
Mode | Description | Use case |
| Uses the primary key as the replication identity. | The default and recommended mode. No change is needed if the table has a primary key. |
| Uses a specified unique index as the replication identity. | Specify an ordered unique index to improve efficiency when the table lacks a primary key or has an unordered primary key. |
| Writes the values of all columns in the row to the WAL. | Use only when a table has neither a primary key nor a unique index but must support UPDATE or DELETE operations. |
| Records no identifying information. | Use only for insert-only tables that do not support updates or deletes. |
Ordered unique index as replication key
The core principle is to use the USING INDEX mode to set the replication key to a unique index on an ordered column, such as an auto-incrementing column.
The workflow is as follows:
Create an ordered unique index: Create a unique index on a physically ordered column, such as an auto-incrementing ID of the
SERIALtype.Change the replication identity: Run the
ALTER TABLE ... REPLICA IDENTITY USING INDEX ...command to switch the table's replication identity to the newly created unique index.Efficient logical decoding: During logical replication, the system efficiently decodes the ordered index key values from the WAL.
Accelerate data synchronization: IMCI uses these ordered key values to locate and update data in the columnar store. Because the key values are ordered, operations become efficient sequential I/O, which significantly boosts data synchronization performance.
Procedure
Step 1: Create a unique index
Create a unique index on the ordered auto-incrementing column.
-- Assuming table t has an auto-incrementing column named serial_id CREATE UNIQUE INDEX idx_t_serial_id ON t(serial_id);Step 2: Change the replication identity
Switch the table's
REPLICA IDENTITYto the unique index created in the previous step.ALTER TABLE t REPLICA IDENTITY USING INDEX idx_t_serial_id;Step 3: Create or rebuild the columnar store index
Ensure the columnar store index (CSI) includes the column used for the replication key. If a columnar store index already exists, we recommend rebuilding it for optimal performance.
-- Create a columnar store index that includes all columns CREATE INDEX ON t USING csi;Step 4: Compare performance
We conducted an experiment to verify the performance improvement.
Test setup:
Create a table with an unordered primary key (
random_id) and an ordered auto-incrementing column (serial_id).create table t (random_id text primary key, serial_id serial, a text);Insert 10 million rows of data.
insert into t(random_id, a) select md5(i::text), i::text from generate_series(1, 10000000) i;Create a columnar store index and an ordered unique index,
idx_a, on theserial_idcolumn.create unique index idx_a on t(serial_id); create index on t using csi;
Test procedure:
Default mode test (using the unordered primary key
random_idas the replication key):-- Update 10,000 rows UPDATE t SET a = '0' WHERE serial_id <= 10000;Custom mode test (switching the replication key to the ordered unique index
idx_a):ALTER TABLE t REPLICA IDENTITY USING INDEX idx_a; -- Update another 10,000 rows UPDATE t SET a = '1' WHERE serial_id <= 10000;
Test results:
Replication key type
Default (unordered primary key)
Custom (ordered unique index)
Synchronization time (10,000 rows)
~0.5 seconds
~0.01 seconds
Conclusion
The approximately 50-fold improvement in performance demonstrates that a custom, ordered replication key is critical for optimizing IMCI data synchronization efficiency.