Automatic index recommendation - PolarDB - Alibaba Cloud Documentation Center

This topic describes the background information and usage of the automatic index recommendation feature.

Prerequisites

Your PolarDB for PostgreSQL (Compatible with Oracle) cluster runs the following engine:

PolarDB for PostgreSQL (Compatible with Oracle) 2.0 whose revision version is 2.0.14.19.0 or later.

Note

You can execute the following statement to view the revision version of your PolarDB for PostgreSQL (Compatible with Oracle) cluster:

show polar_version;

Background information

Database performance optimization is key to database management. Index plays an important role in improving the performance of SQL queries. Optimized indexes allow the database to quickly locate a specific row in a table and obtain sorted data. This greatly reduces the query time.

However, not all indexes can improve database performance. An appropriate index can greatly improve the query performance. An inappropriate index, however, may cause additional maintenance overheads that reduce the overall performance of the database. Invalid indexes also occupy additional storage space. You need to perform in-depth analysis of the execution paths of queries and data distribution to determine when to create indexes on which columns. This requires strong technical skills and rich experience in this area.

PolarDB for PostgreSQL (Compatible with Oracle) provides the automatic index recommendation feature to help you overcome this challenge and optimize database performance. This feature simplifies the process of creating indexes and reduces the difficulty in developing and maintaining indexes. You can execute EXPLAIN statements to analyze the performance of slow SQL statements. The system can provide suggestions on how to create appropriate indexes and estimation on the query performance improvements based on the analysis.

The automatic index recommendation feature provides the following main benefits:

Precise recommendation: uses the statistics that are automatically collected by the database and the optimizer cost model to recommend appropriate index configurations for specific SQL queries. This can help avoid the performance burden and trial and error costs that are caused by invalid indexes.
Efficiency improvements: reduces the time that is required by manual analysis of slow queries. This can help accelerate the identification and resolution of performance issues.
Easy to use: uses a simple EXPLAIN statement to provide the index recommendation information, SQL statements for creating the index, and estimated performance improvements. This optimizes subsequent operations.

Usage

Note

If the recommended indexes cannot meet your requirements or no indexes are recommended, you can contact us for troubleshooting and optimization.
We recommend that you use the PostgreSQL client to perform the following operations, because the DMS client does not display the notice prompt.

Configure automatic index recommendation

Add polar_advisor to session_preload_libraries to load polar_advisor for all users.

-- The value of session_preload_libraries is empty by default. In this example, polar_advisor is added.
alter role all set session_preload_libraries to 'polar_advisor';

Enable index recommendation on specific nodes for all users. We recommend that you enable index recommendation only on read-only nodes to avoid consuming computing resources on the primary node.

-- Enable index recommendation only on read-only nodes (Recommend).
alter role all set polar_advisor_type to 'index(ro)';

-- Enable index recommendation on all nodes.
alter role all set polar_advisor_type to 'index';

Use automatic index recommendation

Execute the EXPLAIN statement without the ANALYZE clause to generate an execution plan for a slow SQL query. The automatic index recommendation feature automatically runs and returns the following information if a recommended index is detected.

advise: the statement that is used to create the recommended index.
old cost: the cost of the current plan.
new cost: the estimated cost of the plan after the recommended index is created.
saved cost: the cost saved by the new plan compared with the old plan.
estimate: the estimated acceleration.

The automatic index recommendation feature supports the following types of operators and can recommend only B-tree indexes. Examples:

Use the following statements to create a table and insert data.

create table t( a int,b int);
insert into t select i,i from generate_series(1,10000)i;
analyze t;

Recommend a single-column index based on an equivalence condition.

/*FORCE_SLAVE*/ explain select * from t where a = 1;
INFO:  IND ADV: advise "CREATE INDEX CONCURRENTLY ON public.t USING btree (a)"
NOTICE:  IND ADV:   old cost 0.00..170.00
NOTICE:  IND ADV:   new cost 0.04..2.25
NOTICE:  IND ADV: saved cost -0.04..167.75
NOTICE:  IND ADV: estimate 75.5 times faster
                    QUERY PLAN
---------------------------------------------------
 Seq Scan on t  (cost=0.00..170.00 rows=1 width=8)
   Filter: (a = 1)
(2 rows)

Recommend a single-column index based on a comparison condition.

/*FORCE_SLAVE*/ explain select * from t where a > 10; -- The amount of returned data is small. No index is recommended.
                      QUERY PLAN
------------------------------------------------------
 Seq Scan on t  (cost=0.00..170.00 rows=9990 width=8)
   Filter: (a > 10)
(2 rows)

 /*FORCE_SLAVE*/ explain select * from t where a < 10;
INFO:  IND ADV: advise "CREATE INDEX CONCURRENTLY ON public.t USING btree (a)"
NOTICE:  IND ADV:   old cost 0.00..170.00
NOTICE:  IND ADV:   new cost 0.04..2.39
NOTICE:  IND ADV: saved cost -0.04..167.61
NOTICE:  IND ADV: estimate 71.1 times faster
                    QUERY PLAN
---------------------------------------------------
 Seq Scan on t  (cost=0.00..170.00 rows=9 width=8)
   Filter: (a < 10)
(2 rows)

Recommend a combined index based on multiple equivalence conditions or comparison conditions.

/*FORCE_SLAVE*/ explain select * from t where a = 1 and b = 1;
INFO:  IND ADV: advise "CREATE INDEX CONCURRENTLY ON public.t USING btree (b, a)"
NOTICE:  IND ADV:   old cost 0.00..195.00
NOTICE:  IND ADV:   new cost 0.04..1.16
NOTICE:  IND ADV: saved cost -0.04..193.84
NOTICE:  IND ADV: estimate 168.8 times faster
                    QUERY PLAN
---------------------------------------------------
 Seq Scan on t  (cost=0.00..195.00 rows=1 width=8)
   Filter: ((a = 1) AND (b = 1))
(2 rows)

Recommend a single-column index based on the Order BY operator.

 /*FORCE_SLAVE*/ explain select * from t order by a limit 10;
INFO:  IND ADV: advise "CREATE INDEX CONCURRENTLY ON public.t USING btree (a)"
NOTICE:  IND ADV:   old cost 361.10..361.12
NOTICE:  IND ADV:   new cost 0.04..0.26
NOTICE:  IND ADV: saved cost 361.06..360.86
NOTICE:  IND ADV: estimate 1366.8 times faster
                            QUERY PLAN
-------------------------------------------------------------------
 Limit  (cost=361.10..361.12 rows=10 width=8)
   ->  Sort  (cost=361.10..386.10 rows=10000 width=8)
         Sort Key: a
         ->  Seq Scan on t  (cost=0.00..145.00 rows=10000 width=8)
(4 rows)

Recommend a single-column index based on the Group BY operator.

/*FORCE_SLAVE*/ explain select a,sum(b) from t group by a having a < 10 ;
INFO:  IND ADV: advise "CREATE INDEX CONCURRENTLY ON public.t USING btree (a)"
NOTICE:  IND ADV:   old cost 170.14..170.30
NOTICE:  IND ADV:   new cost 0.04..2.53
NOTICE:  IND ADV: saved cost 170.11..167.77
NOTICE:  IND ADV: estimate 67.4 times faster
                          QUERY PLAN
---------------------------------------------------------------
 GroupAggregate  (cost=170.14..170.30 rows=9 width=12)
   Group Key: a
   ->  Sort  (cost=170.14..170.17 rows=9 width=8)
         Sort Key: a
         ->  Seq Scan on t  (cost=0.00..170.00 rows=9 width=8)
               Filter: (a < 10)
(6 rows)

Recommend a single-column index based on an equi join operation.

/*FORCE_SLAVE*/ explain select * from t t1,t t2 where t1.a = t2.a limit 10;
INFO:  IND ADV: advise "CREATE INDEX CONCURRENTLY ON public.t USING btree (a)"
NOTICE:  IND ADV:   old cost 270.00..270.28
NOTICE:  IND ADV:   new cost 0.07..0.70
NOTICE:  IND ADV: saved cost 269.93..269.58
NOTICE:  IND ADV: estimate 384.3 times faster
                                 QUERY PLAN
----------------------------------------------------------------------------
 Limit  (cost=270.00..270.28 rows=10 width=16)
   ->  Hash Join  (cost=270.00..552.50 rows=10000 width=16)
         Hash Cond: (t1.a = t2.a)
         ->  Seq Scan on t t1  (cost=0.00..145.00 rows=10000 width=8)
         ->  Hash  (cost=145.00..145.00 rows=10000 width=8)
               ->  Seq Scan on t t2  (cost=0.00..145.00 rows=10000 width=8)
(6 rows)