Diagnose WAL Replay Lag and LSN Latency in PolarDB - PolarDB

PolarDB tracks four write-ahead log (WAL) sequence positions that together describe the health of replication and crash recovery. Monitoring the gaps between these positions — and knowing what each gap indicates — lets you catch and resolve latency problems before they affect read consistency, WAL storage, or cluster availability.

Note

To use the polar_* functions described in this topic, install the polar_monitor plug-in first.

Available monitoring metrics

The following table lists all metrics you can query. Use them together to get a complete picture of cluster health.

Metric	Query	Unit	Description
Write LSN (WP)	`pg_current_wal_insert_lsn()`	LSN	Current WAL insert position on the primary node
Apply LSN (AP)	`polar_oldest_apply_lsn()`	LSN	Minimum WAL replay position across all secondary nodes
Consistent LSN (CP)	`polar_consistent_lsn()`	LSN	LSN up to which the primary has flushed all dirty pages. See Consistency LSNs.
Checkpoint REDO LSN (REDO)	`redo_lsn` column of `pg_control_checkpoint()`	LSN	Starting WAL position for log replay on crash recovery
Dirty page count	`SELECT size FROM polar_flushlist`	Pages	Number of dirty pages in the buffer pool
Copy buffer pool usage	`SELECT copy - release AS cbuf_size FROM polar_cbuf`	Buffers	Number of buffers currently held in the copy buffer pool

Key concepts

The differences between the four LSN positions are the primary diagnostic signals:

Gap	High value indicates	Primary risk
`WP − AP`	Secondary nodes are falling behind on WAL replay	Read staleness; WAL storage growth; read/write splitting degraded
`WP − CP`	The primary cannot flush dirty pages fast enough	Checkpoint delays; longer crash recovery time
`WP − REDO`	Checkpoint execution is lagging	More WAL to replay on crash recovery; reduced cluster availability
`AP − CP`	Consistent LSN is lagging behind apply LSN	Checkpoint blocked by replication lag

Monitor LSN positions

Run the following query to read all four positions and compute the gaps in human-readable format:

SELECT *,
  pg_size_pretty(pg_wal_lsn_diff(WP, AP))       AS "WP-AP",
  pg_size_pretty(pg_wal_lsn_diff(AP, CP))       AS "AP-CP",
  pg_size_pretty(pg_wal_lsn_diff(WP, CP))       AS "WP-CP",
  pg_size_pretty(pg_wal_lsn_diff(WP, redo_lsn)) AS "WP-REDO"
FROM (
  SELECT
    pg_current_wal_insert_lsn() AS WP,
    polar_oldest_apply_lsn()    AS AP,
    polar_consistent_lsn()      AS CP,
    redo_lsn
  FROM pg_control_checkpoint()
) AS lsn_info;

Use the gap values to identify which layer needs attention, then follow the corresponding troubleshooting steps below.

Troubleshoot secondary node replay latency (high WP−AP)

A large WP-AP value means secondary nodes are not keeping up with WAL generated on the primary. This causes:

WAL accumulates on the primary because it cannot be reused — storage usage grows.
Secondary nodes serve stale data; queries against them miss recent primary updates.
A read/write splitting proxy may fall back to routing all traffic to the primary.
Consistent LSN and checkpoint latency increases, reducing cluster availability.

Step 1: Check whether primary/secondary synchronization is active.

SELECT * FROM polar_replication_slots;

If the active column is f for any replication slot, that slot's synchronization has stopped. Investigate the root cause before proceeding.

Note

PolarDB has two replication slot types, identified by the slot_name field:

replica — Shares the same data directory as the primary node. A disruption here has an immediate and severe impact on the primary. Resolve replica synchronization issues immediately.
standby — Has its own independent data, similar to a traditional PostgreSQL primary/standby relationship.

Step 2: If synchronization is active, identify where the lag originates.

SELECT
  pid,
  usename,
  application_name,
  client_addr,
  state,
  pg_wal_lsn_diff(pg_current_wal_insert_lsn(), sent_lsn)   AS sent_delay,
  pg_wal_lsn_diff(pg_current_wal_insert_lsn(), write_lsn)  AS write_delay,
  pg_wal_lsn_diff(pg_current_wal_insert_lsn(), flush_lsn)  AS flush_delay,
  pg_wal_lsn_diff(pg_current_wal_insert_lsn(), replay_lsn) AS replay_delay
FROM pg_stat_replication;

Use the delay columns to pinpoint the bottleneck:

Column	Large value means
`sent_delay`	The primary node is slow to send logs to the secondary
`write_delay`	The secondary is slow to write received WAL to storage
`flush_delay`	The secondary is slow to flush received WAL to storage
`replay_delay`	The secondary is slow to replay WAL

Focus on the secondary node with the highest delay values. If replay_delay is large, log in to the host where that secondary node runs and check the status of its startup process for further diagnosis.

Troubleshoot consistent LSN latency (high WP−CP)

A large WP-CP value means dirty pages are not being flushed fast enough. The consistent LSN cannot advance until dirty pages are written to storage, so this directly delays checkpoints and extends crash recovery time.

Common causes and remedies:

Secondary node lag — If WP-AP is also high, resolve the secondary node replay latency first. Secondary lag often blocks the consistent LSN from advancing.
Bulk data operations — Large imports or batch updates generate a surge of dirty pages. The cluster may not flush them immediately, causing a temporary spike. This is expected behavior; the latency decreases after the operation completes.

Buffer pool size exceeds flushing capacity — A large shared_buffer value increases the buffer pool, which means more dirty pages can accumulate. If the rate at which dirty pages are generated exceeds the peak flushing rate (bounded by disk I/O and parallel flusher capacity), the consistent LSN falls progressively further behind. To improve flushing throughput, adjust the following parameters: Increase polar_parallel_bgwriter_workers and decrease polar_parallel_bgwriter_delay to raise flushing throughput.

Warning

Too many flushing processes consume significant I/O bandwidth. This can reduce WAL write throughput and degrade overall SQL execution performance across the cluster. Tune incrementally and monitor the effect.

Parameter	Default	Description
`polar_parallel_bgwriter_workers`	5	Number of parallel flushing processes. The actual number of processes running at any time is `min(2 × polar_parallel_bgwriter_workers, 16)`, because PolarDB dynamically scales the pool based on replication lag.
`polar_parallel_bgwriter_delay`	10 ms	Wait interval between flush cycles for each parallel flushing process.

Troubleshoot checkpoint latency (high WP−REDO)

A large WP-REDO value means checkpoints are not executing frequently enough. The more WAL that accumulates between checkpoints, the longer the cluster takes to recover from a crash.

Common causes:

Consistent LSN latency — Checkpoints cannot advance past the consistent LSN. If WP-CP is also high, resolve that first.
Bulk data operations and flushing inefficiency — The same conditions that cause consistent LSN latency also delay checkpoints.
Long checkpoint intervals — If many data modifications occur within a single checkpoint window, the accumulated WAL grows large. Decrease checkpoint_timeout or max_wal_size to trigger checkpoints more frequently.

Other monitoring metrics

Beyond the LSN gaps, two additional metrics help characterize buffer pool behavior:

Dirty page count

SELECT size FROM polar_flushlist;

polar_flushlist records partial statistics of the flush list. The size field is the number of dirty pages currently in the buffer pool. To convert to bytes: size × page size (the default page size is 8 KB).

Copy buffer pool usage

SELECT copy - release AS cbuf_size FROM polar_cbuf;

polar_cbuf tracks the copy buffer pool. copy is the total number of buffers copied into the pool; release is the number released. The difference is the number of buffers currently held in the copy buffer pool.