All Products
Search
Document Center

PolarDB:Custom data replication key

Last Updated:Mar 28, 2026

By default, IMCI uses the primary key to synchronize data updates and deletions. However, this strategy is inefficient if a table lacks a primary key or has an unordered one, such as a UUID. These scenarios cause excessive random I/O during data synchronization, which degrades performance. This guide explains how to define a custom replication key to improve data synchronization performance in these scenarios.

Prerequisites

The table must have a column with monotonically increasing values in physical storage.

Considerations

Before you begin, consider the following:

  • Must be based on a unique index: A custom replication key must be based on a unique index, specified with the USING INDEX clause.

  • Index columns must be non-null: All columns in the unique index for the replication key must be defined as NOT NULL.

  • Use an ordered index for optimal performance: For optimal performance, we strongly recommend building the unique index on a monotonically increasing column, such as an auto-incrementing column of the SERIAL or BIGSERIAL type.

How it works

REPLICA IDENTITY mechanism

The REPLICA IDENTITY property determines what information is written to the WAL (write-ahead log) during logical replication to identify changed rows. IMCI uses this mechanism to locate and update data in the columnar store. REPLICA IDENTITY supports the following four modes:

Mode

Description

Use case

DEFAULT

Uses the primary key as the replication identity.

The default and recommended mode. No change is needed if the table has a primary key.

USING INDEX

Uses a specified unique index as the replication identity.

Specify an ordered unique index to improve efficiency when the table lacks a primary key or has an unordered primary key.

FULL

Writes the values of all columns in the row to the WAL.

Use only when a table has neither a primary key nor a unique index but must support UPDATE or DELETE operations.

NOTHING

Records no identifying information.

Use only for insert-only tables that do not support updates or deletes.

Ordered unique index as replication key

The core principle is to use the USING INDEX mode to set the replication key to a unique index on an ordered column, such as an auto-incrementing column.

The workflow is as follows:

  1. Create an ordered unique index: Create a unique index on a physically ordered column, such as an auto-incrementing ID of the SERIAL type.

  2. Change the replication identity: Run the ALTER TABLE ... REPLICA IDENTITY USING INDEX ... command to switch the table's replication identity to the newly created unique index.

  3. Efficient logical decoding: During logical replication, the system efficiently decodes the ordered index key values from the WAL.

  4. Accelerate data synchronization: IMCI uses these ordered key values to locate and update data in the columnar store. Because the key values are ordered, operations become efficient sequential I/O, which significantly boosts data synchronization performance.

Procedure

  • Step 1: Create a unique index

    Create a unique index on the ordered auto-incrementing column.

    -- Assuming table t has an auto-incrementing column named serial_id
    CREATE UNIQUE INDEX idx_t_serial_id ON t(serial_id);
  • Step 2: Change the replication identity

    Switch the table's REPLICA IDENTITY to the unique index created in the previous step.

    ALTER TABLE t REPLICA IDENTITY USING INDEX idx_t_serial_id;
  • Step 3: Create or rebuild the columnar store index

    Ensure the columnar store index (CSI) includes the column used for the replication key. If a columnar store index already exists, we recommend rebuilding it for optimal performance.

    -- Create a columnar store index that includes all columns
    CREATE INDEX ON t USING csi;
  • Step 4: Compare performance

    We conducted an experiment to verify the performance improvement.

    • Test setup:

      1. Create a table with an unordered primary key (random_id) and an ordered auto-incrementing column (serial_id).

        create table t (random_id text primary key, serial_id serial, a text);
      2. Insert 10 million rows of data.

        insert into t(random_id, a)
          select md5(i::text), i::text
            from generate_series(1, 10000000) i;
      3. Create a columnar store index and an ordered unique index, idx_a, on the serial_id column.

        create unique index idx_a on t(serial_id);
        create index on t using csi;
    • Test procedure:

      1. Default mode test (using the unordered primary key random_id as the replication key):

        -- Update 10,000 rows
        UPDATE t SET a = '0' WHERE serial_id <= 10000;
      2. Custom mode test (switching the replication key to the ordered unique index idx_a):

        ALTER TABLE t REPLICA IDENTITY USING INDEX idx_a;
        -- Update another 10,000 rows
        UPDATE t SET a = '1' WHERE serial_id <= 10000;
    • Test results:

      Replication key type

      Default (unordered primary key)

      Custom (ordered unique index)

      Synchronization time (10,000 rows)

      ~0.5 seconds

      ~0.01 seconds

Conclusion

The approximately 50-fold improvement in performance demonstrates that a custom, ordered replication key is critical for optimizing IMCI data synchronization efficiency.