In most cases, index optimization requires database administrators or developers to learn how queries are executed and optimized by database engines. PolarDB-X provides the index recommendation feature to improve user experience and make query optimization easier. This feature uses a cost-based optimizer to analyze queries and recommend indexes. This reduces the round-trip time (RTT) and improves the database performance.

Precautions

The index recommendation feature analyzes only the SQL queries that you specify and then recommends indexes. Before you create the recommended index, evaluate the impacts of the index on other queries.

How index recommendation works

TPC-H is a benchmark that is commonly used in the industry. TPC-H is defined and released by the Transaction Processing Performance Council to evaluate the capability of a database engine to analyze queries. The TPC-H benchmark involves 8 tables and 22 complex SQL queries (from Q1 to Q22). The following figure shows the information returned for the Q17 query. In the TPC-H benchmark, Q17 is used to query the revenue of small quantity orders. The returned information shows that 28.76 seconds is consumed to handle the query. This topic describes how to use the intelligent index recommendation feature to accelerate the Q17 query.

456789
  1. Query the recommended index information
    To query the index information recommended for a specific query, you need only to specify EXPLAIN ADVISOR at the start of the query. The following code block provides an example:
    EXPLAIN ADVISOR
    SELECT sum(l_extendedprice) / 7.0 AS avg_yearly
    FROM lineitem,
         part
    WHERE p_partkey = l_partkey
      AND p_brand = 'Brand#23'
      AND p_container = 'MED BOX'
      AND l_quantity <
        (SELECT 0.2 * avg(`l_quantity`)
         FROM lineitem
         WHERE l_partkey = p_partkey);
    After PolarDB-X processes the preceding query, PolarDB-X returns the index information, including the recommended index creation statement and the costs before and after the recommended index is created. The following code block shows the index information and comments.
    Note
    • In this example, the disk I/O cost is reduced by 3024.7%. This indicates that the recommended index significantly improves the performance.
    • If PolarDB-X fails to recommend indexes, PolarDB-X returns a message that prompts you to execute the ANALYZE TABLE statement on the base table during off-peak hours. The ANALYZE TABLE statement updates key distribution information and consumes a large amount of I/O resources. After the key distribution information is updated, you can obtain more accurate information about the recommended index.
      IMPROVE_VALUE: 2465.3%        # The estimated percentage of overall cost reduction.
        IMPROVE_CPU: 59377.4%       # The estimated percentage of CPU cost reduction.
        IMPROVE_MEM: 0.4%           # The estimated percentage of memory cost reduction.
         IMPROVE_IO: 3024.7%        # The estimated percentage of disk I/O cost reduction.
        IMPROVE_NET: 2011.1%        # The estimated percentage of reduction in the network transmission cost.
       BEFORE_VALUE: 4.711359845E8  # The estimated overall cost before the recommended index is used.
         BEFORE_CPU: 1.19405577E7   # The estimated CPU cost before the recommended index is used.
         BEFORE_MEM: 426811.2       # The estimated memory cost before the recommended index is used.
          BEFORE_IO: 44339          # The estimated disk I/O cost before the recommended index is used.
         BEFORE_NET: 47.5           # The estimated network transmission cost before the recommended index is used.
        AFTER_VALUE: 1.83655008E7   # The estimated overall cost after the recommended index is used.
          AFTER_CPU: 20075.8        # The estimated CPU cost after the recommended index is used.
          AFTER_MEM: 425016         # The estimated memory cost after the recommended index is used.
           AFTER_IO: 1419           # The estimated disk I/O cost after the recommended index is used.
          AFTER_NET: 2.2            # The estimated network transmission cost after the recommended index is used.
       ADVISE_INDEX: ALTER TABLE `lineitem` ADD  INDEX `__advise_index_lineiteml_partkey`(`l_partkey`);
      /* The value of ADVISE_INDEX is the recommended index creation statement. */
           NEW_PLAN:                # The execution plan that uses the recommended index.
      Project(avg_yearly="$f0 / ?0")
        HashAgg($f0="SUM(l_extendedprice)")
          Filter(condition="l_quantity < $16 * f17w0$o0")
            SortWindow(p_partkey="p_partkey", l_partkey="l_partkey", l_quantity="l_quantity", l_extendedprice="l_extendedprice", $16="$16", f5w0$o0="window#0AVG($2)", Reference Windows="window#0=window(partition {1} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED PRECEDING aggs [AVG($2)])")
              MemSort(sort="l_partkey ASC")
                BKAJoin(condition="l_partkey = p_partkey", type="inner")
                  Gather(concurrent=true)
                    LogicalView(tables="[0000,0001].part", shardCount=2, sql="SELECT `p_partkey` FROM `part` AS `part` WHERE ((`p_brand` = ?) AND (`p_container` = ?))")
                  Gather(concurrent=true)
                    LogicalView(tables="[0000,0001].lineitem", shardCount=2, sql="SELECT `l_partkey`, `l_quantity`, `l_extendedprice`, ? AS `$16` FROM `lineitem` AS `lineitem` WHERE (`l_partkey` IN (...))")
      
               INFO: LOCAL_INDEX    # Other information.
  2. Create the recommended index
    1. Evaluate the performance improvements that are enabled by the recommended index. Then, create the recommended index by executing the SQL statement returned in ADVISE_INDEX.
      ALTER TABLE `lineitem` ADD  INDEX `__advise_index_lineiteml_partkey`(`l_partkey`);
    2. Execute the Q17 query again. Only 1.41 seconds is consumed to handle the query. The query efficiency is greatly improved. 456789