This topic describes the global RelCache.
Background information
A RelCache is used exclusively by a session and does not have an eviction mechanism. As a result, when many connections or tables are involved, a RelCache may consume too much memory and even face the risk of out-of-memory (OOM).
PolarDB for PostgreSQL uses a two-layer cache architecture: global RelCache and local RelCache. Some system tables are stored only in the local RelCache. Other common user tables and most system tables are automatically loaded to the global RelCache. The local RelCache only requires a relatively small capacity and can load data from the global RelCache. If the maximum capacity of the local RelCache is reached, the local RelCache is evicted.
Overview
In PostgreSQL, a RelCache stores the metadata of all tables and is used exclusively by a process. A RelCache may consume too much memory if a lot of connections are established. The global RelCache occupies the shared memory and can be shared by all processes to improve memory usage.
Parameters
The following GUC parameters related to the global RelCache are provided. You can use these parameters to enable or disable the global RelCache and local RelCache features and improve performance.
Parameter | Description | Level |
---|---|---|
polar_local_relcache_size | Specifies the capacity of the local RelCache. Valid values: 0 to INT_MAX. Default value: 0, which indicates that the capacity is not limited and eviction is not enabled. Unit: KB. Note We recommend that you set this parameter based on the number of active tables. In general, you can use the default value. You can also set this parameter after checking the query hit ratio described in the Monitoring section. | PGC_USERSET |
polar_enable_global_relcache | Specifies whether to enable the global RelCache feature. Valid values: on and off. Default value: on. Note
| PGC_SIGHUP |
polar_global_relcache_size | Specifies the capacity of the global RelCache. Valid values: 0 to INT_MAX. Default value: 1024. Unit: MB. Note We recommend that you set this parameter to be sufficient to accommodate all tables, including indexes, toast tables, and views. This memory part is shared by all sessions and is initialized when the cluster starts. Therefore, you must restart the cluster to validate the value of this parameter. | PGC_POSTMASTER |
O&M
All monitoring interfaces are stored in the polar_grc extension. To use the monitoring interfaces, execute the create extension polar_grc;
statement first. You can execute the create/drop extension
statement to implement O&M.
- Local RelCache forced eviction
The local RelCache is evicted in a passive and gradual manner. If a session has scheduled tasks or sudden fluctuation of very high and zero traffic, the timing mechanism of the local RelCache can be used to quickly reduce memory usage. However, you may need to manually intervene the memory usage of the local RelCache for a session in some scenarios.
For example, you may find that the specified capacity of the local RelCache is very high and that too much memory is consumed. After you revise down the capacity of the local RelCache, the session has already occupied excess memory and is now in the idle state. Then, you can use this interface to manually intervene the memory usage of the local RelCache.Note In PostgreSQL,config reload
uses the lazy mode. After the SIGHUP signal is received, theconfig file
may be actually reloaded upon the execution of the next SQL statement. This is reasonable for PostgreSQL, because the modifications are not used during this period.You can execute the following statement to call the polar_relcache_evict function, which can evict all relations that can be removed from the local RelCache to reduce memory usage.select polar_relcache_evict($backend_pid);
- Reset monitoring dataMost monitoring data use ever-increasing or decreasing counters. The counters can be reset by using the
reset
parameter in the function that is used to obtain monitoring data, or by using an additional reset interface. You also can specify the parameter to reset monitoring data when you use thepolar_relcache_get_stat
orpolar_relcache_get_partition_stat
function to obtain monitoring data.CREATE FUNCTION polar_relcache_reset_stat( OUT num_reset int4) RETURNS integer AS 'MODULE_PATHNAME', 'polar_relcache_reset_stat' LANGUAGE C; CREATE FUNCTION polar_grc_reset_stat( OUT num_reset int4) RETURNS integer AS 'MODULE_PATHNAME', 'polar_grc_reset_stat' LANGUAGE C;
Monitoring
The global RelCache and local RelCache provide several functions and views. Functions are supersets of views. Views are created only for ease of use. All functions and views are defined in the polar_grc extension.
- Local RelCache
- polar_relcache_get_stat function and polar_relcache_stat view
The the polar_relcache_get_stat function returns exactly the same output as the polar_relcache_stat view. They are mainly used to obtain various states and statistics of the local RelCache.
The difference is that the reset parameter can be specified in the function to reset the statistical counter. Most statistics use counters. The counters continue to accumulate before you reset them. If you want to obtain the statistics within a specified time range, you can reset the counters.
MetricsMetric Description reset Specifies whether to reset all statistic counters. Valid values: - true: resets all statistic counters. All counters are reset after the statistics are returned for the current task.
- false: does not reset all statistic counters.
backend_pid The backend process for the statistics. num_entry The number of relations that are stored in the current local RelCache. count_insert The number of insert operations on the local RelCache since the last counter reset. count_delete The number of delete operations on the local RelCache since the last counter reset. count_nailed_lookup The number of nailed table queries on the local RelCache since the last counter reset. count_lookup The number of queries on the local RelCache since the last counter reset. Note Only the number of non-nailed table queries is included. The total number of queries is the sum of the values of this metric and the count_nailed_lookup metric.count_lookup_miss The number of query misses on the local RelCache since the last counter reset. Note The count_nailed_lookup and count_lookup metrics contain the number of hits and misses. This metric indicates the number of misses. You can use the formula of (count_lookup - count_lookup_miss)/count_lookup to calculate the query hit rate for nailed tables, and use the formula of (count_lookup + count_nailed_lookup-count_lookup_miss)/count_lookup to calculate the hit rate for all queries on the local RelCache. In general, the preceding hit rate metric is more useful.count_evict The total number of relations that are evicted by the local RelCache since the last counter reset. count_forced_evict The total number of relations that are evicted by the local RelCache and using the polar_relcache_evict function since the last counter reset. count_timeout_evict The total number of relations that are evicted by the local RelCache and using the timeout mechanism since the last counter reset. Note For active connections, the value of this metric is generally 0. The sum of count_timeout_evict and count_evict reflects the number of relations that are normally evicted by the local RelCache.count_evictcycle The total number of eviction tasks on the local RelCache since the last counter reset. count_empty_evictcycle The total number of invalid eviction tasks on the local RelCache since the last counter reset. The value of this metric is generally 0. avg_step_length The average number of relations that are evicted each time by the local RelCache since the last counter reset. Note In general, the value of this metric is not very large, because the eviction algorithm tries to keep an even eviction process.max_step_length The maximum number of relations that are evicted by the local RelCache at a time since the last counter reset. Note The value of this metric may be very large in the case of sudden traffic surge.avg_idletime The average idle period for each relation that is evicted by the local RelCache since the last counter reset. Note A smaller value indicates that the current local RelCache is under great pressure to evict relations and that the current capacity specified by polar_local_relcache_size is low. If possible, increase the capacity.max_idletime The maximum idle period among all relations that are evicted by the local RelCache since the last counter reset. Note This metric reflects the pressure to evict relations on the current local RelCache. If the value of this metric is large, the current capacity is high. You can decrease the capacity. If the value of this metric is very small, the current capacity is low.avg_evict_time The average time consumed for each eviction task on the local RelCache since the last counter reset. max_evict_time The maximum time consumed among all eviction tasks on the local RelCache since the last counter reset. Note This metric and avg_evict_time together reflect the performance of the eviction algorithm. In general, the values of the two metrics are very low. If a relatively large value is found, troubleshoot exceptions.memory_context_total_space The total space of memory context for the current RelCache. Unit: bytes. memory_context_free_space The free space of memory context for the current RelCache. It includes unused blocks and idle chunks. Note This metric and memory_context_total_space are used together to evaluate the current memory usage, as well as internal fragments. - polar_relcache_get_relation function and polar_relcache_relation view
The polar_relcache_get_relation function and polar_relcache_relation view are used to obtain the relations stored in the local RelCache. The local RelCache is used exclusive by a session. Therefore, only the relations for the current session are returned. The main purposes include debugging, troubleshooting, and fault discovery.
- polar_relcache_get_stat function and polar_relcache_stat view
- Global RelCache
polar_grc_get_stat function and polar_grc_stat view
Similar to polar_relcache_stat, the polar_grc_get_stat function and polar_grc_stat view are mainly used to obtain various states and statistics of the global RelCache. The the polar_grc_get_stat function returns exactly the same output as the polar_grc_stat view. The difference is that the reset parameter can be specified in the function to reset the statistical counter.
MetricsCREATE FUNCTION polar_grc_get_stat( IN reset boolean, OUT num_entry int8, OUT count_insert int8, OUT count_update int8, OUT count_delete int8, OUT count_lookup int8, OUT count_lookup_miss int8, OUT count_OOM int8, OUT count_invalid_lookup int8, OUT count_error_clean int8, OUT num_partitions int4, OUT shm_free_space int8, OUT shmaset_total_space int8, OUT shmaset_free_space int8 ) RETURNS record AS 'MODULE_PATHNAME', 'polar_grc_get_stat' LANGUAGE C; CREATE VIEW polar_grc_stat AS SELECT * FROM polar_grc_get_stat(false);
Metric Description num_entry The number of relations that are stored in the global RelCache, including all databases. count_insert The number of insert operations on the global RelCache since the last counter reset. count_update The number of update operations on the global RelCache since the last counter reset. count_delete The number of delete operations on the global RelCache since the last counter reset. count_lookup The number of queries on the global RelCache since the last counter reset. count_lookup_miss The number of query misses on the global RelCache since the last counter reset. Note You can use the formula of (count_lookup - count_lookup_miss)/count_lookup to calculate the query hit rate for the global RelCache. Generally, the query hit rate for the global RelCache remains a high value after cache prefetching.count_OOM The number of out-of-memory (OOM) errors on the global RelCache since the last counter reset. Note The shared memory is initialized when the cluster starts and has a fixed capacity. Therefore, if the capacity of the shared memory is low and becomes insufficient during use, insert and update operations fail on the global RelCache. This metric records the number of OOM errors.count_invalid_lookup The number of query hits on the global RelCache. Note If the current relation is being concurrently updated by other sessions and in the invalid state, this query is invalid. Typically, the value of this metric is 0 or very small. If a relatively large value is found and you are not using a special business scenario, troubleshoot exceptions.count_error_clean If a process crashes or an error occurs when transactions are committed, the relations in the global RelCache are cleared before the rollback or exit of the process. This metric records the number of such clear operations. Note If a very large value is found, the cluster is abnormal.num_partitions The number of hash-partitioned tables used by the global RelCache. Note Typically, the value of this metric is equal to thepolar_global_relcache_partitions
value. If the value of this metric is smaller, an OOM error occurs and some hash-partitioned tables are disabled.shm_free_space The free space of the shared memory for the global RelCache. Note If the value of this metric is very small, the capacity specified bypolar_global_relcache_size
is low. We recommend that you increase the capacity. Otherwise, an OOM error may occur and the performance may be degraded.shmaset_total_space The total space of memory context for the global RelCache. Note shmaset_total_space + shm_free_space = polar_global_relcache_sizeshmaset_free_space The free space of memory context for the global RelCache. Note This metric and shmaset_total_space are used together to indicate the current shared memory usage, as well as internal fragments.