PolarDB-X collects statistics in the background so the cost-based optimizer (CBO) can generate accurate execution plans. Stale or missing statistics lead to suboptimal plans and slower queries.
This topic describes the three statistics collection policies, explains when each policy triggers, and shows how to control or stop collection.
Statistics collection policies
PolarDB-X runs three collection policies, each with a different trigger condition.
| Policy | Trigger | SQL pattern | Parameter |
|---|---|---|---|
| Auto analyze | Once a week, between 02:00 and 05:00 | SQL with a WHERE (RAND() < 3.9623395E-4) clause; SQL with a null IP address in the comment; SQL that calls the HYPERLOGLOG function | ENABLE_BACKGROUND_STATISTIC_COLLECTION |
| Queries on INFORMATION_SCHEMA | Once a day, between 02:00 and 05:00 | SELECT table_schema, table_name, table_rows FROM information_schema.tables WHERE table_schema IN (...) AND table_name IN (...) | N/A |
| Real-time feedback collection | When INSERT or DELETE operations change more than 20% of the rows previously stored in the table | Same SQL patterns as auto analyze | N/A |
IOPS impact
IOPS increase during the collection window
For large tables, IOPS on data nodes increases during the 02:00–05:00 collection window. This is expected behavior.
IOPS increase outside the collection window
If IOPS rises outside the scheduled window, one of the following is happening:
Collection task running overtime: The table is large enough that a single logical table takes longer than the scheduled window to process. The task started at 02:00 but is still running past 05:00.
Real-time feedback collection triggered: A large INSERT or DELETE operation exceeded the 20% threshold, and the resulting collection task is running for an extended period.
Disable background statistics collection
If you need to prevent collection from running during early morning hours, set ENABLE_BACKGROUND_STATISTIC_COLLECTION to False on the Parameter Settings page in the PolarDB-X console.
This setting disables both auto analyze and real-time feedback collection. After disabling, run the following statement manually during off-peak hours to keep statistics current:
ANALYZE TABLE <table_name>;Run ANALYZE TABLE regularly if your tables are frequently updated. For tables with infrequent or predictable updates, you can schedule it less often.
Stop a running collection task
To stop a collection task that is already running:
View the SQL statements currently running on the data node:
SHOW FULL PHYSICAL_PROCESSLIST WHERE INFO LIKE '%rand%';Identify the statement collecting statistics, then run a
KILLstatement to stop it:KILL '<node_id>-<shard_id>-<statement_id>';The kill identifier uses the following format:
Field Description Example node_idID of the data node 0shard_idID of the database shard 0statement_idID of the SQL statement to stop 222For example,
KILL '0-0-222';stops statement222on node0, shard0.