This topic describes the global RelCache.

Background information

A RelCache is used exclusively by a session and does not have an eviction mechanism. As a result, when many connections or tables are involved, a RelCache may consume too much memory and even face the risk of out-of-memory (OOM).

PolarDB for PostgreSQL uses a two-layer cache architecture: global RelCache and local RelCache. Some system tables are stored only in the local RelCache. Other common user tables and most system tables are automatically loaded to the global RelCache. The local RelCache only requires a relatively small capacity and can load data from the global RelCache. If the maximum capacity of the local RelCache is reached, the local RelCache is evicted.

Overview

In PostgreSQL, a RelCache stores the metadata of all tables and is used exclusively by a process. A RelCache may consume too much memory if a lot of connections are established. The global RelCache occupies the shared memory and can be shared by all processes to improve memory usage.

Parameters

The following GUC parameters related to the global RelCache are provided. You can use these parameters to enable or disable the global RelCache and local RelCache features and improve performance.

ParameterDescriptionLevel
polar_local_relcache_sizeSpecifies the capacity of the local RelCache. Valid values: 0 to INT_MAX. Default value: 0, which indicates that the capacity is not limited and eviction is not enabled. Unit: KB.
Note We recommend that you set this parameter based on the number of active tables. In general, you can use the default value. You can also set this parameter after checking the query hit ratio described in the Monitoring section.
PGC_USERSET
polar_enable_global_relcacheSpecifies whether to enable the global RelCache feature. Valid values: on and off. Default value: on.
Note
  • Typically, the global RelCache feature has almost no impact on performance. It may still have detectable performance overhead in some extreme cases. If you are sure that the RelCache does not consume too much memory, you can use this parameter to disable the global RelCache feature to avoid performance overhead.
  • Although the global RelCache feature is also disabled when the polar_global_relcache_size is set to 0, the modified polar_global_relcache_size value can only take effect after the cluster is restarted. Therefore, this parameter is not suitable for enabling or disabling the global RelCache feature.
  • After this parameter is modified, the new value takes effect on each session when the next transaction starts.
PGC_SIGHUP
polar_global_relcache_sizeSpecifies the capacity of the global RelCache. Valid values: 0 to INT_MAX. Default value: 1024. Unit: MB.
Note We recommend that you set this parameter to be sufficient to accommodate all tables, including indexes, toast tables, and views. This memory part is shared by all sessions and is initialized when the cluster starts. Therefore, you must restart the cluster to validate the value of this parameter.
PGC_POSTMASTER

O&M

All monitoring interfaces are stored in the polar_grc extension. To use the monitoring interfaces, execute the create extension polar_grc; statement first. You can execute the create/drop extension statement to implement O&M.

  • Local RelCache forced eviction

    The local RelCache is evicted in a passive and gradual manner. If a session has scheduled tasks or sudden fluctuation of very high and zero traffic, the timing mechanism of the local RelCache can be used to quickly reduce memory usage. However, you may need to manually intervene the memory usage of the local RelCache for a session in some scenarios.

    For example, you may find that the specified capacity of the local RelCache is very high and that too much memory is consumed. After you revise down the capacity of the local RelCache, the session has already occupied excess memory and is now in the idle state. Then, you can use this interface to manually intervene the memory usage of the local RelCache.
    Note In PostgreSQL, config reload uses the lazy mode. After the SIGHUP signal is received, the config file may be actually reloaded upon the execution of the next SQL statement. This is reasonable for PostgreSQL, because the modifications are not used during this period.
    You can execute the following statement to call the polar_relcache_evict function, which can evict all relations that can be removed from the local RelCache to reduce memory usage.
    select polar_relcache_evict($backend_pid);
  • Reset monitoring data
    Most monitoring data use ever-increasing or decreasing counters. The counters can be reset by using the reset parameter in the function that is used to obtain monitoring data, or by using an additional reset interface. You also can specify the parameter to reset monitoring data when you use the polar_relcache_get_stat or polar_relcache_get_partition_stat function to obtain monitoring data.
    CREATE FUNCTION polar_relcache_reset_stat(
            OUT num_reset int4)
    RETURNS integer
    AS 'MODULE_PATHNAME', 'polar_relcache_reset_stat'
    LANGUAGE C;
    
    CREATE FUNCTION polar_grc_reset_stat(
            OUT num_reset int4)
    RETURNS integer
    AS 'MODULE_PATHNAME', 'polar_grc_reset_stat'
    LANGUAGE C;

Monitoring

The global RelCache and local RelCache provide several functions and views. Functions are supersets of views. Views are created only for ease of use. All functions and views are defined in the polar_grc extension.

  • Local RelCache
    • polar_relcache_get_stat function and polar_relcache_stat view

      The the polar_relcache_get_stat function returns exactly the same output as the polar_relcache_stat view. They are mainly used to obtain various states and statistics of the local RelCache.

      The difference is that the reset parameter can be specified in the function to reset the statistical counter. Most statistics use counters. The counters continue to accumulate before you reset them. If you want to obtain the statistics within a specified time range, you can reset the counters.

      Metrics
      MetricDescription
      resetSpecifies whether to reset all statistic counters. Valid values:
      • true: resets all statistic counters. All counters are reset after the statistics are returned for the current task.
      • false: does not reset all statistic counters.
      backend_pidThe backend process for the statistics.
      num_entryThe number of relations that are stored in the current local RelCache.
      count_insertThe number of insert operations on the local RelCache since the last counter reset.
      count_deleteThe number of delete operations on the local RelCache since the last counter reset.
      count_nailed_lookupThe number of nailed table queries on the local RelCache since the last counter reset.
      count_lookupThe number of queries on the local RelCache since the last counter reset.
      Note Only the number of non-nailed table queries is included. The total number of queries is the sum of the values of this metric and the count_nailed_lookup metric.
      count_lookup_missThe number of query misses on the local RelCache since the last counter reset.
      Note The count_nailed_lookup and count_lookup metrics contain the number of hits and misses. This metric indicates the number of misses. You can use the formula of (count_lookup - count_lookup_miss)/count_lookup to calculate the query hit rate for nailed tables, and use the formula of (count_lookup + count_nailed_lookup-count_lookup_miss)/count_lookup to calculate the hit rate for all queries on the local RelCache. In general, the preceding hit rate metric is more useful.
      count_evictThe total number of relations that are evicted by the local RelCache since the last counter reset.
      count_forced_evictThe total number of relations that are evicted by the local RelCache and using the polar_relcache_evict function since the last counter reset.
      count_timeout_evictThe total number of relations that are evicted by the local RelCache and using the timeout mechanism since the last counter reset.
      Note For active connections, the value of this metric is generally 0. The sum of count_timeout_evict and count_evict reflects the number of relations that are normally evicted by the local RelCache.
      count_evictcycleThe total number of eviction tasks on the local RelCache since the last counter reset.
      count_empty_evictcycleThe total number of invalid eviction tasks on the local RelCache since the last counter reset. The value of this metric is generally 0.
      avg_step_lengthThe average number of relations that are evicted each time by the local RelCache since the last counter reset.
      Note In general, the value of this metric is not very large, because the eviction algorithm tries to keep an even eviction process.
      max_step_lengthThe maximum number of relations that are evicted by the local RelCache at a time since the last counter reset.
      Note The value of this metric may be very large in the case of sudden traffic surge.
      avg_idletimeThe average idle period for each relation that is evicted by the local RelCache since the last counter reset.
      Note A smaller value indicates that the current local RelCache is under great pressure to evict relations and that the current capacity specified by polar_local_relcache_size is low. If possible, increase the capacity.
      max_idletimeThe maximum idle period among all relations that are evicted by the local RelCache since the last counter reset.
      Note This metric reflects the pressure to evict relations on the current local RelCache. If the value of this metric is large, the current capacity is high. You can decrease the capacity. If the value of this metric is very small, the current capacity is low.
      avg_evict_timeThe average time consumed for each eviction task on the local RelCache since the last counter reset.
      max_evict_timeThe maximum time consumed among all eviction tasks on the local RelCache since the last counter reset.
      Note This metric and avg_evict_time together reflect the performance of the eviction algorithm. In general, the values of the two metrics are very low. If a relatively large value is found, troubleshoot exceptions.
      memory_context_total_spaceThe total space of memory context for the current RelCache. Unit: bytes.
      memory_context_free_spaceThe free space of memory context for the current RelCache. It includes unused blocks and idle chunks.
      Note This metric and memory_context_total_space are used together to evaluate the current memory usage, as well as internal fragments.
    • polar_relcache_get_relation function and polar_relcache_relation view

      The polar_relcache_get_relation function and polar_relcache_relation view are used to obtain the relations stored in the local RelCache. The local RelCache is used exclusive by a session. Therefore, only the relations for the current session are returned. The main purposes include debugging, troubleshooting, and fault discovery.

  • Global RelCache

    polar_grc_get_stat function and polar_grc_stat view

    Similar to polar_relcache_stat, the polar_grc_get_stat function and polar_grc_stat view are mainly used to obtain various states and statistics of the global RelCache. The the polar_grc_get_stat function returns exactly the same output as the polar_grc_stat view. The difference is that the reset parameter can be specified in the function to reset the statistical counter.
    CREATE FUNCTION polar_grc_get_stat(
            IN    reset boolean,
            OUT num_entry int8,
            OUT count_insert int8,
            OUT count_update int8,
            OUT count_delete int8,
            OUT count_lookup int8,
            OUT count_lookup_miss int8,
            OUT count_OOM int8,
            OUT    count_invalid_lookup int8,
            OUT    count_error_clean int8,
            OUT num_partitions int4,
            OUT shm_free_space int8,
            OUT shmaset_total_space int8,
            OUT shmaset_free_space int8
    )
    RETURNS record
    AS 'MODULE_PATHNAME', 'polar_grc_get_stat'
    LANGUAGE C;
    
    CREATE VIEW polar_grc_stat AS
        SELECT * FROM polar_grc_get_stat(false);
    Metrics
    MetricDescription
    num_entryThe number of relations that are stored in the global RelCache, including all databases.
    count_insertThe number of insert operations on the global RelCache since the last counter reset.
    count_updateThe number of update operations on the global RelCache since the last counter reset.
    count_deleteThe number of delete operations on the global RelCache since the last counter reset.
    count_lookupThe number of queries on the global RelCache since the last counter reset.
    count_lookup_missThe number of query misses on the global RelCache since the last counter reset.
    Note You can use the formula of (count_lookup - count_lookup_miss)/count_lookup to calculate the query hit rate for the global RelCache. Generally, the query hit rate for the global RelCache remains a high value after cache prefetching.
    count_OOMThe number of out-of-memory (OOM) errors on the global RelCache since the last counter reset.
    Note The shared memory is initialized when the cluster starts and has a fixed capacity. Therefore, if the capacity of the shared memory is low and becomes insufficient during use, insert and update operations fail on the global RelCache. This metric records the number of OOM errors.
    count_invalid_lookupThe number of query hits on the global RelCache.
    Note If the current relation is being concurrently updated by other sessions and in the invalid state, this query is invalid. Typically, the value of this metric is 0 or very small. If a relatively large value is found and you are not using a special business scenario, troubleshoot exceptions.
    count_error_cleanIf a process crashes or an error occurs when transactions are committed, the relations in the global RelCache are cleared before the rollback or exit of the process. This metric records the number of such clear operations.
    Note If a very large value is found, the cluster is abnormal.
    num_partitionsThe number of hash-partitioned tables used by the global RelCache.
    Note Typically, the value of this metric is equal to the polar_global_relcache_partitions value. If the value of this metric is smaller, an OOM error occurs and some hash-partitioned tables are disabled.
    shm_free_spaceThe free space of the shared memory for the global RelCache.
    Note If the value of this metric is very small, the capacity specified by polar_global_relcache_size is low. We recommend that you increase the capacity. Otherwise, an OOM error may occur and the performance may be degraded.
    shmaset_total_spaceThe total space of memory context for the global RelCache.
    Note shmaset_total_space + shm_free_space = polar_global_relcache_size
    shmaset_free_spaceThe free space of memory context for the global RelCache.
    Note This metric and shmaset_total_space are used together to indicate the current shared memory usage, as well as internal fragments.