Problem description
The number of slow SQL queries on an ApsaraDB RDS for PostgreSQL instance increases because the stats collector process consumes a large number of CPU resources and causes heavy I/O loads.
On the Enhanced Monitoring tab, you can use the os.cpu_process.pgstat metric to monitor the CPU utilization of the stats collector process. For more information, see View the Enhanced Monitoring metrics.
Causes
If your RDS instance runs PostgreSQL 14 or earlier, the stats collector process is used to write the statistics of the RDS instance to files, and the autovacuum worker process is used to frequently update the statistics. The autovacuum worker process reads the statistics, updates the statistics, and then sends the updated statistics to the stats collector process. Then, the stats collector process writes the updated statistics to files. If autovacuum is frequently called, the autovacuum worker and stats collector processes consume a large number of CPU resources and cause heavy I/O loads. As a result, a large number of slow SQL queries are generated. If autovacuum is called to clean up dead tuples, the cleanup efficiency decreases. This also increases the number of slow SQL queries.
If the number of tables on an RDS instance increases, the size of statistics to be collected increases. As a result, the efficiencies of statistics writes and reads are affected, and a large number of slow SQL queries occur.
Solutions
Short-term solutions
Execute the
VACUUM FREEZEstatement at regular intervals to reduce the frequency to callautovacuum.Increase the values of parameters such as
autovacuum_vacuum_scale_factorandautovacuum_vacuum_insert_scale_factorto reduce the frequency to callautovacuum. For more information, see Modify the parameters of an ApsaraDB RDS for PostgreSQL instance.Use general Enterprise SSDs (ESSDs) to improve the I/O throughput. For more information, see General ESSDs. You can change the storage type from ESSD to general ESSD. For more information, see Change the storage type from ESSD to general ESSD.
Long-term solution
Upgrade the major engine version of your RDS instance to PostgreSQL 15 or later. If your RDS instance runs PostgreSQL 15 or later, the statistics of the RDS instance are stored in the shared memory and the stats collector process is removed. This prevents the consumption of physical I/O resources. For more information about how to upgrade the major engine version of an RDS instance, see Upgrade the major engine version.