This topic describes how to use the index_adviser plug-in on an ApsaraDB RDS for PostgreSQL instance. This plug-in helps you determine the columns on which you need to create indexes to improve query performance for specified workloads. This plug-in can recognize only single-column or composite B-tree indexes. This plug-in cannot recognize other types of indexes that can improve performance. For example, this plug-in cannot recognize GIN, GiST, or Hash indexes.

Prerequisites

The minor engine version of the RDS instance is 20220130 or later. For more information about how to view and update the minor engine version of an RDS instance, see Update the minor engine version of an ApsaraDB RDS for PostgreSQL instance.

Components of the index_adviser plug-in

When you execute the statement that is used to create the index_adviser plug-in, the index_advisory table, show_index_advisory() function, and select_index_advisory view are also created.

Component Description
index_advisory A table that is created when the index_adviser plug-in is created. This table is used to record indexing suggestions.
show_index_advisory() A PL/pgSQL function that interprets and displays the suggestions made during a specific session. The session is identified by its backend process ID.
select_index_advisory A view that is created by the index_adviser plug-in based on the information stored in the index_advisory table during query analysis. The format of the view is the same as the format of the output of the show_index_advisory() function. The view contains all indexing suggestions for the specified session.

Usage

  1. Create an index_adviser plug-in.
    postgres=# create extension index_adviser;
    CREATE EXTENSION
  2. Load the index_adviser plug-in.
    postgres=# LOAD 'index_adviser';
    LOAD
    Note The preceding statement is valid only for the current session. If you want all sessions to load the index_adviser plug-in by default, you must configure the shared_preload_libraries parameter in the index_adviser plug-in and restart the RDS instance. However, this may affect the performance of the RDS instance. Perform the following configuration:
    shared_preload_libraries='index_adviser'

Examples

  • Create a table
    CREATE TABLE t( a INT, b INT );
    INSERT INTO t SELECT s, 99999 - s FROM generate_series(0,99999) AS s;
    ANALYZE t;
    The table contains the following rows:
       a   |   b
    -------+-------
         0 | 99999
         1 | 99998
         2 | 99997
         3 | 99996
           .
           .
           .
     99997 |     2
     99998 |     1
     99999 |     0
  • Query the indexing suggestions for a single SQL statement.
    • If you want to use the index_adviser plug-in to analyze a query and obtain the indexing suggestions but you do not want to execute the query, use the EXPLAIN keyword as the prefix of the SQL statement. Example:
      postgres=# EXPLAIN SELECT * FROM t WHERE a < 10000;
                                         QUERY PLAN
      ---------------------------------------------------------------------------------
       Seq Scan on t  (cost=0.00..1693.00 rows=9983 width=8)
         Filter: (a < 10000)
       Result  (cost=0.00..0.00 rows=0 width=0)
         One-Time Filter: '** plan (using Index Adviser) **'::text
         ->  Index Scan using "<1>t_a_idx" on t  (cost=0.42..256.52 rows=9983 width=8)
               Index Cond: (a < 10000)
      (6 rows)
      postgres=# EXPLAIN SELECT * FROM t WHERE a = 100;
                                       QUERY PLAN
      ----------------------------------------------------------------------------
       Seq Scan on t  (cost=0.00..1693.00 rows=1 width=8)
         Filter: (a = 100)
       Result  (cost=0.00..0.00 rows=0 width=0)
         One-Time Filter: '** plan (using Index Adviser) **'::text
         ->  Index Scan using "<1>t_a_idx" on t  (cost=0.42..2.64 rows=1 width=8)
               Index Cond: (a = 100)
      (6 rows)
      postgres=# EXPLAIN SELECT * FROM t WHERE b = 10000;
                                       QUERY PLAN
      ----------------------------------------------------------------------------
       Seq Scan on t  (cost=0.00..1693.00 rows=1 width=8)
         Filter: (b = 10000)
       Result  (cost=0.00..0.00 rows=0 width=0)
         One-Time Filter: '** plan (using Index Adviser) **'::text
         ->  Index Scan using "<1>t_b_idx" on t  (cost=0.42..2.64 rows=1 width=8)
               Index Cond: (b = 10000)
      (6 rows)
    • You can use the PostgreSQL CLI to query indexing suggestions from the index_advisory table. Example:
      postgres=# SELECT * FROM index_advisory;
       reloid | relname | attrs | benefit | original_cost | new_cost | index_size | backend_pid |            timestamp
      --------+---------+-------+---------+---------------+----------+------------+-------------+----------------------------------
        16438 | t       | {1}   | 1337.43 |          1693 |  355.575 |       2624 |       79370 | 18-JUN-21 08:55:51.492388 +00:00
        16438 | t       | {1}   | 1684.56 |          1693 |    8.435 |       2624 |       79370 | 18-JUN-21 08:59:00.319336 +00:00
        16438 | t       | {2}   | 1684.56 |          1693 |    8.435 |       2624 |       79370 | 18-JUN-21 08:59:07.814453 +00:00
      (3 rows)
      Field Type Description
      reloid oid The OID of the table for the index.
      relname name The name of the table for the index.
      attrs integer[] The column to which the indexing suggestion is generated. The column is identified by a number.
      benefit real The benefit of using the index to accelerate the query.
      original_cost real The average amount of time that is required to execute the SQL statement before you use the index to accelerate the query.
      new_cost real The average amount of time that is required to execute the SQL statement after you use the index to accelerate the query.
      index_size integer The estimated index size in the disk page.
      backend_pid integer The ID of the process that generated this suggestion.
      timestamp timestamp The date and time when this suggestion was generated.
    • If the SQL statement is not prefixed with the EXPLAIN keyword, the index_adviser plug-in analyzes the SQL statement when the query is being executed and records indexing suggestions.
    Note You cannot use the index_adviser plug-in in read-only transactions.
  • Query the indexing suggestions for a specified workload.
    • Obtain the indexing suggestions for a session by using the show_index_advisory() function.
      This function is used to obtain the indexing suggestions for a session. The session is identified by its backend process ID. You can call this function by specifying the process ID of the session.
      SELECT show_index_advisory( pid );
      Note pid indicates the process ID of the current session. You can obtain the process ID by using the backend_pid parameter in the index_advisory table. You can also specify null as a passed value to return the result set for the current session.
      postgres=# SELECT show_index_advisory(null);
                                                                   show_index_advisory
      ----------------------------------------------------------------------------------------------------------------------------------------------------
       create index idx_t_a on public.t(a);/* size: 2624 KB, benefit: 3021.99, gain: 1.15167301457103, original_cost: 1693, new_cost: 182.005006313324 */
       create index idx_t_b on public.t(b);/* size: 2624 KB, benefit: 1684.56, gain: 0.641983590474943, original_cost: 1693, new_cost: 8.4350004196167 */
      (2 rows)
      Note The following description shows the meaning of each row in the result set:
      • The SQL statement that is used to create an index from the indexing suggestions.
      • The estimated size of the index page.
      • The benefit of using the index to accelerate the query.
      • The gain of using the index. The following formula is used to calculate the gain of the index: Gain of using the index = Benefit of using the index/Consumed size of the index.
      • The average amount of time that is required to execute the SQL statement before you use the index to accelerate the query.
      • The average amount of time that is required to execute the SQL statement after you use the index to accelerate the query.
    • Obtain the indexing suggestions for a session by using the select_index_advisory view.
      This view contains calculated metrics and CREATE INDEX statements and provides indexing suggestions for all sessions in the index_advisory table. The following example shows the indexing suggestions for Column a and Column b of Table t:
      postgres=# SELECT * FROM select_index_advisory;
       backend_pid |                                                             show_index_advisory
      -------------+----------------------------------------------------------------------------------------------------------------------------------------------------
             79370 | create index t_a_idx on public.t(a);/* size: 2624 KB, benefit: 3021.99, gain: 1.15167301457103, original_cost: 1693, new_cost: 182.005006313324 */
             79370 | create index t_b_idx on public.t(b);/* size: 2624 KB, benefit: 1684.56, gain: 0.641983590474943, original_cost: 1693, new_cost: 8.4350004196167 */
      (2 rows)
      In each session, the results of all queries that benefit from the same indexing suggestion are combined into a set of metrics for the indexing suggestion. The metric is represented by a field named benefit and a field named gain. The following formula shows how to calculate the values of the two fields:
      size = MAX(index size of all queries)
      benefit = SUM(benefit of each query)
      gain = SUM(benefit of each query) / MAX(index size of all queries)
      Note If the indexing suggestions recommend that you create multiple indexes for a single SQL statement, the new_cost field of the index_advisory table records the cost after multiple indexes are created.

      The gain field is useful for comparing the advantages between different recommended indexes during the specified session. A larger value of the gain field indicates a higher benefit of the recommended index. The benefit can offset the disk space that the recommended index may consume.