This topic describes the table size caching feature of PolarDB for PostgreSQL and .
Background information
When a database executes an SQL statement, it queries system tables, user tables, and their indexes multiple times. During this process, PolarDB for PostgreSQL and frequently use the smgrnblocks interface of the storage management layer (smgr) to retrieve the number of blocks in a table file. The smgrnblocks interface retrieves the file length using the lseek interface of the file system. It then calculates the number of blocks based on the block size. The smgrnblocks interface is on the critical path for SQL processing. Frequent calls to the file system interface can severely affect database performance. PolarDB for PostgreSQL and use the distributed file system PolarFileSystem (PolarFS). The latency for calling PolarFS lseek is much higher than for the lseek interface on a standalone file system. This slows down the running time of SQL statements.
To reduce calls to the file system, PolarDB for PostgreSQL and implement a cache for the number of blocks in a table file at the storage management layer. This cache is called the Relation Size Cache (RSC). The RSC caches the number of blocks of a table in shared memory. When the number of blocks in a table changes, the cached value in shared memory is updated. Queries for the number of blocks in a table file use the cache first. This reduces requests to the file system and speeds up SQL execution.
Applicability
This feature is supported on the following versions of PolarDB for PostgreSQL:
PostgreSQL 17 (minor engine version 2.0.17.2.1.0 or later)
PostgreSQL 16 (minor engine version 2.0.16.3.1.1 or later)
PostgreSQL 15 (minor engine version 2.0.15.7.1.1 or later)
PostgreSQL 14 (minor engine version 2.0.14.12.23.1 or later)
You can use the console to view the minor engine version number or run the SHOW polardb_version; statement. If the minor engine version requirement is not met, upgrade the minor engine version.
Feature introduction
RSC query
The RSC consists of a one-dimensional array and a hash table in shared memory:
Each cache entry in the array stores the number of blocks for a relation, which can be a table or an index.
The hash table records the mapping between the unique storage layer identifier of a relation, RelFileNode, and its cache entry in the array.
All processes query the RSC using the following two levels of indexes:
Level 1 index: This caches the pointer and generation (version number) of the last accessed RSC cache entry. If the current table is the same as the last one and the generation of the pointed-to RSC cache entry has not changed, the process can directly retrieve the table's block count from the pointer. In read-heavy scenarios for a specific table, the level 1 index has a high hit rate. When a mapping in the RSC hash table is modified, the generation of the corresponding RSC cache entry is incremented, which invalidates the level 1 index.
Level 2 index: When the level 1 index misses, the process queries the hash table in shared memory to find the RSC cache entry for the current table. It retrieves the block count and updates the level 1 index.
If both index levels fail to find the table and its block count in the RSC, cache eviction is triggered. A Segmented Least Recently Used (SLRU) algorithm selects a less recently used RSC cache entry to be cleared. Then, the file system's lseek interface is called. The resulting block count is then used to update the cleared RSC cache entry, and both index levels are also updated.
RSC updates on primary and standby nodes
After you enable the RSC feature, storage management layer interfaces that change a table file's size must also update the corresponding RSC cache entry in shared memory. This ensures the cached block count matches the actual block count in the file system.
The interface for extending a table file updates the block count of the extended table file in the RSC.
The interface for truncating a table file updates the block count of the truncated table file in the RSC.
Standby nodes use physical replication to synchronize with the primary node. During Write-Ahead Log (WAL) replay, they call the same storage management layer interfaces to modify table file lengths. Therefore, the RSC update method is the same as on the primary node.
RSC updates on replica nodes
Replica nodes share physical storage with the primary node and use the LogIndex mechanism to synchronize. Replica nodes do not write to storage. Therefore, they cannot update RSC cache values in memory through the storage management layer interfaces.
Replica nodes parse the data block numbers referenced in WAL logs and compare them with the current block count in the RSC. This determines if the file length in shared storage has changed. If a block number from a WAL log is greater than the current cached block count, the RSC cache value is updated to the new block count. If a WAL log that indicates a file truncation is parsed, the cached block count in the RSC is discarded. The cache will be repopulated on the next file system call.
Usage guide
GUC parameters
The following Grand Unified Configuration (GUC) parameters are available for RSC:
Parameter | Description |
polar_enable_rel_size_cache | Specifies whether to enable the RSC feature. Valid values:
|
polar_enable_replica_rel_size_cache | Specifies whether to enable the RSC feature on replica nodes. Valid values:
|
polar_enable_standby_rel_size_cache | Specifies whether to enable the RSC feature on standby nodes. Valid values:
|
Performance test
When RSC is disabled, the request latency for the block count of a 32 GB table is about 55 us:
SHOW polar_enable_rel_size_cache; polar_enable_rel_size_cache ----------------------------- off (1 row) SELECT polar_smgrperf_nblocks(32, true, false); NOTICE: testing logical file length with 32 GB INFO: iops=18341.1/s, lat=54.52us INFO: iops=17504.0/s, lat=57.13us INFO: iops=17960.8/s, lat=55.68us INFO: iops=17973.0/s, lat=55.64us INFO: iops=17603.5/s, lat=56.81us INFO: iops=17403.8/s, lat=57.46us INFO: iops=17506.2/s, lat=57.12us INFO: iops=18061.7/s, lat=55.37usWhen RSC is enabled, the request latency for the block count of a 32 GB table is about 0.07 us:
SHOW polar_enable_rel_size_cache; polar_enable_rel_size_cache ----------------------------- on (1 row) SELECT polar_smgrperf_nblocks(32, true, false); NOTICE: testing logical file length with 32 GB INFO: iops=14155515.6/s, lat=0.07us INFO: iops=13897273.6/s, lat=0.07us INFO: iops=13869926.3/s, lat=0.07us INFO: iops=13779602.7/s, lat=0.07us INFO: iops=14159120.5/s, lat=0.07us INFO: iops=14147065.6/s, lat=0.07us INFO: iops=14124141.9/s, lat=0.07us INFO: iops=14162773.3/s, lat=0.07us