Accurate statistics are necessary for the query optimizer to choose an efficient query plan. If the statistics are invalid or outdated, the query optimizer may generate an inefficient query plan. You can use the ANALYZE statement to collect statistics.
Execute the ANALYZE statement to collect statistics:
ANALYZE [VERBOSE] [ROOTPARTITION [ALL] ] [table [ (column [, ...] ) ]]
AUTO ANALYZE
Auto-analyze automatically executes the ANALYZE statement. Auto-analyze checks for tables that have a large number of INSERT, UPDATE, or DELETE operations. When necessary, auto-analyze executes the ANALYZE statement on the tables to collect their statistics after the operations. By default, when those operations are performed on more than 10% of rows in a table, auto-analyze executes the ANALYZE statement on the table.
If an AnalyticDB for PostgreSQL instance has multiple coordinator nodes, auto-analyze checks for change operations only on the primary coordinator node. Auto-analyze cannot be triggered if change operations are performed on secondary coordinator nodes.
Generate statistics by table or column
The execution of the ANALYZE statement without parameters updates statistics for all tables in a database. This can be a long-running process. You can specify a table or column to collect its statistics. When your data changes, we recommend that you execute the ANALYZE statement on tables with changed data.
It takes a long time to execute the ANALYZE statement on a large table. To reduce
the amount of time needed, you can execute only the ANALYZE table(column, ...)
statement to generate statistics for selected columns, such as columns used in joins,
WHERE clauses, SORT clauses, GROUP BY clauses, or HAVING clauses. For a partition
table, you can find its name in the pg_partitions system catalog by executing the
following statement and then execute the ANALYZE statement on partitions with changed
data:
SELECT partitiontablename from pg_partitions WHERE tablename='parent_table';
When to execute the ANALYZE statement
Execute the ANALYZE statement after the following operations:
- Data loading
- CREATE INDEX operations
- A large number of INSERT, UPDATE, and DELETE operations
The ANALYZE statement requires only a read lock on a table and can be executed in parallel with other database activities. Do not execute the ANALYZE statement when you import data or execute INSERT, UPDATE, DELETE, or CREATE INDEX statements.