Accurate statistics are necessary for the query optimizer to choose an efficient query plan. If the statistics are incorrect or outdated, the query optimizer may generate an inefficient query plan. You can use the ANALYZE statement to update statistics.

The syntax is as follows:

ANALYZE [VERBOSE] [ROOTPARTITION [ALL] ] [table [ (column [, ...] ) ]]

Generate statistics selectively

The execution of an ANALYZE statement with no parameters updates statistics for all tables in a database. This can be a very long-running process. You can specify a table or column to collect its statistics. If data is changed, you can only execute ANALYZE statements for tables with changed data.

The execution of an ANALYZE statement on a large table takes a long time. Therefore, to shorten the time needed, you can execute only the ANALYZE table(column, ...) statement to generate statistics for selected columns, such as columns used in joins, WHERE clauses, SORT clauses, GROUP BY clauses, or HAVING clauses. For a partitioned table, you can find its name in the pg_partitions system catalog by executing the following statement and then execute ANALYZE statements on partitions with changed data:

SELECT partitiontablename from pg_partitions WHERE tablename='parent_table';

When to execute ANALYZE statements

Execute ANALYZE statements after:

  • Data loading
  • CREATE INDEX operations
  • A large number of INSERT, UPDATE, and DELETE operations

An ANALYZE statement requires only a read lock on a table. Therefore, it can be executed in parallel with other database activities. Do not execute ANALYZE statements when you import data or execute INSERT, UPDATE, DELETE, or CREATE INDEX statements.