Accurate statistics are necessary for the query optimizer to choose an efficient query plan. If the statistics are invalid or outdated, the query optimizer may generate an inefficient query plan. You can use the ANALYZE statement to collect statistics.

Execute the ANALYZE statement to collect statistics:

ANALYZE [VERBOSE] [ROOTPARTITION [ALL] ] [table [ (column [, ...] ) ]]

AUTO ANALYZE

Auto-analyze automatically executes the ANALYZE statement. Auto-analyze checks for tables that have a large number of INSERT, UPDATE, or DELETE operations. When necessary, auto-analyze executes the ANALYZE statement on the tables to collect their statistics after the operations. By default, when those operations are performed on more than 10% of rows in a table, auto-analyze executes the ANALYZE statement on the table.

If an AnalyticDB for PostgreSQL instance has multiple coordinator nodes, auto-analyze checks for change operations only on the primary coordinator node. Auto-analyze cannot be triggered if change operations are performed on secondary coordinator nodes.

Note Auto-analyze is available only for AnalyticDB for PostgreSQL 20210527 and later minor versions. For more information about how to upgrade the minor version, see Update the minor kernel version of an AnalyticDB for PostgreSQL instance.

Generate statistics by table or column

The execution of the ANALYZE statement without parameters updates statistics for all tables in a database. This can be a long-running process. You can specify a table or column to collect its statistics. When your data changes, we recommend that you execute the ANALYZE statement on tables with changed data.

It takes a long time to execute the ANALYZE statement on a large table. To reduce the amount of time needed, you can execute only the ANALYZE table(column, ...) statement to generate statistics for selected columns, such as columns used in joins, WHERE clauses, SORT clauses, GROUP BY clauses, or HAVING clauses. For a partition table, you can find its name in the pg_partitions system catalog by executing the following statement and then execute the ANALYZE statement on partitions with changed data:

SELECT partitiontablename from pg_partitions WHERE tablename='parent_table';

When to execute the ANALYZE statement

Execute the ANALYZE statement after the following operations:

  • Data loading
  • CREATE INDEX operations
  • A large number of INSERT, UPDATE, and DELETE operations

The ANALYZE statement requires only a read lock on a table and can be executed in parallel with other database activities. Do not execute the ANALYZE statement when you import data or execute INSERT, UPDATE, DELETE, or CREATE INDEX statements.