In most cases, index optimization requires database administrators or developers to understand how database engines execute and optimize queries. To make query optimization easier, PolarDB-X 1.0 provides the index recommendation feature, which uses a cost-based optimizer to analyze queries and recommend indexes. A suitable index reduces the query response time (RT) and improves database performance.

Usage notes

The index recommendation feature analyzes only the SQL statements that you specify and then recommends indexes. Before you create the recommended index, evaluate the impacts of the index on other queries.

How index recommendation works

TPC-H is a benchmark that is commonly used in the industry. It is defined and released by the Transaction Processing Performance Council to evaluate how well a database engine handles analytical queries. The TPC-H benchmark involves 8 tables and 22 complex SQL queries (Q1 to Q22). Q17 queries the revenue of small-quantity orders. The following figure shows the information returned for Q17: the query takes 28.76 seconds to execute. In this topic, the intelligent index recommendation feature is used to accelerate the query.

Note For more information about the query statements in TPC-H, see the TPC-H documentation.

Figure 1. Before intelligent index optimization

Step 1: Query the information about an intelligent index recommendation

To query the intelligent index recommendation for a query statement, prefix the statement with EXPLAIN ADVISOR. The following code shows how to query the intelligent index recommendation for Q17:

EXPLAIN ADVISOR
SELECT sum(l_extendedprice) / 7.0 AS avg_yearly
FROM lineitem,
     part
WHERE p_partkey = l_partkey
  AND p_brand = 'Brand#23'
  AND p_container = 'MED BOX'
  AND l_quantity <
    (SELECT 0.2 * avg(`l_quantity`)
     FROM lineitem
     WHERE l_partkey = p_partkey);
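
If the statements that you want to analyze are collected by a script, the prefix can be added programmatically. The following Python snippet is a minimal sketch of that idea. The helper name with_index_advisor and the sample statement are hypothetical and are not part of PolarDB-X 1.0. The sketch only builds the statement text and does not connect to a database.

```python
def with_index_advisor(statement: str) -> str:
    """Return the statement prefixed with EXPLAIN ADVISOR (hypothetical helper)."""
    body = statement.strip()
    # Leave the statement unchanged if it is already an advisor request.
    if body.upper().startswith("EXPLAIN ADVISOR"):
        return body
    return "EXPLAIN ADVISOR\n" + body


# Illustrative statement only; any query that you want analyzed can be used.
sample = "SELECT count(*) FROM lineitem WHERE l_partkey = 42;"
print(with_index_advisor(sample))
```

The guard against an existing prefix keeps the helper safe to call twice on the same statement.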

PolarDB-X 1.0 returns the index recommendation, including the statement used to create the recommended index and the costs before and after the index is created. The following code provides an example of the returned information and related comments:

  • In this example, IMPROVE_IO is 3024.7%: the estimated disk I/O cost drops from 44339 to 1419, so the cost before the index is created is about 31 times the cost after the index is created. This indicates that the recommended index significantly improves performance.
  • If PolarDB-X 1.0 cannot recommend an index, it returns a message that recommends executing the ANALYZE TABLE statement during off-peak hours. The statement refreshes the statistics of the table on which you want to create an index, and it consumes a large amount of I/O resources. After the key distribution information is updated, the index recommendation is more accurate.
IMPROVE_VALUE: 2465.3%        # The estimated percentage of overall cost reduction.
  IMPROVE_CPU: 59377.4%       # The estimated percentage of CPU cost reduction.
  IMPROVE_MEM: 0.4%           # The estimated percentage of memory cost reduction.
   IMPROVE_IO: 3024.7%        # The estimated percentage of disk I/O cost reduction.
  IMPROVE_NET: 2011.1%        # The estimated percentage of reduction in the network transmission cost.
 BEFORE_VALUE: 4.711359845E8  # The estimated overall cost before the recommended index is used.
   BEFORE_CPU: 1.19405577E7   # The estimated CPU cost before the recommended index is used.
   BEFORE_MEM: 426811.2       # The estimated memory cost before the recommended index is used.
    BEFORE_IO: 44339          # The estimated disk I/O cost before the recommended index is used.
   BEFORE_NET: 47.5           # The estimated network transmission cost before the recommended index is used.
  AFTER_VALUE: 1.83655008E7   # The estimated overall cost after the recommended index is used.
    AFTER_CPU: 20075.8        # The estimated CPU cost after the recommended index is used.
    AFTER_MEM: 425016         # The estimated memory cost after the recommended index is used.
     AFTER_IO: 1419           # The estimated disk I/O cost after the recommended index is used.
    AFTER_NET: 2.2            # The estimated network transmission cost after the recommended index is used.
 ADVISE_INDEX: ALTER TABLE `lineitem` ADD  INDEX `__advise_index_lineiteml_partkey`(`l_partkey`);
/* The value of ADVISE_INDEX is the recommended statement to create the index. */
     NEW_PLAN:                # The execution plan that uses the recommended index.
Project(avg_yearly="$f0 / ?0")
    Filter(condition="l_quantity < $16 * f17w0$o0")
      SortWindow(p_partkey="p_partkey", l_partkey="l_partkey", l_quantity="l_quantity", l_extendedprice="l_extendedprice", $16="$16", f5w0$o0="window#0AVG($2)", Reference Windows="window#0=window(partition {1} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED PRECEDING aggs [AVG($2)])")
        MemSort(sort="l_partkey ASC")
          BKAJoin(condition="l_partkey = p_partkey", type="inner")
              LogicalView(tables="[0000,0001].part", shardCount=2, sql="SELECT `p_partkey` FROM `part` AS `part` WHERE ((`p_brand` = ?) AND (`p_container` = ?))")
              LogicalView(tables="[0000,0001].lineitem", shardCount=2, sql="SELECT `l_partkey`, `l_quantity`, `l_extendedprice`, ? AS `$16` FROM `lineitem` AS `lineitem` WHERE (`l_partkey` IN (...))")

         INFO: LOCAL_INDEX    # The other information.
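
The IMPROVE_* percentages in this sample exceed 100%, so they cannot be reductions relative to the BEFORE_* costs. The following Python snippet is a minimal sketch that assumes each percentage is the saved cost divided by the AFTER_* cost. This formula is inferred from the sample numbers only and is not taken from PolarDB-X documentation. The function name improvement_percent is hypothetical.

```python
# Costs copied from the sample EXPLAIN ADVISOR output above.
before = {"VALUE": 4.711359845e8, "CPU": 1.19405577e7,
          "MEM": 426811.2, "IO": 44339, "NET": 47.5}
after = {"VALUE": 1.83655008e7, "CPU": 20075.8,
         "MEM": 425016, "IO": 1419, "NET": 2.2}


def improvement_percent(before_cost: float, after_cost: float) -> float:
    """Assumed formula: saved cost expressed relative to the cost AFTER the index."""
    return (before_cost - after_cost) / after_cost * 100


for name in before:
    pct = improvement_percent(before[name], after[name])
    print(f"IMPROVE_{name}: {pct:.1f}%")
```

Under this assumption, the sketch reproduces the reported overall, CPU, memory, and I/O percentages to one decimal place. For the network cost it yields about 2059% instead of the reported 2011.1%, presumably because the displayed BEFORE_NET and AFTER_NET values are rounded.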

Step 2: Create the index based on the recommendation

  1. Evaluate the benefits of creating the index. Then, create the index by executing the returned SQL statement that is indicated by ADVISE_INDEX.
    ALTER TABLE `lineitem` ADD  INDEX `__advise_index_lineiteml_partkey`(`l_partkey`);
  2. Execute the Q17 statement (the Small-Quantity-Order Revenue Query in TPC-H) again. The execution time drops from 28.76 seconds to 1.41 seconds, which is a significant improvement in query efficiency.
    Figure 2. After intelligent index optimization
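
To put the two measurements in this topic side by side, the following Python snippet is a small worked calculation. It uses only the execution times reported before and after the index is created. The variable names are illustrative.

```python
# Execution times of Q17 reported in this topic, in seconds.
before_seconds = 28.76
after_seconds = 1.41

# How many times faster the query runs with the recommended index.
speedup = before_seconds / after_seconds

# Share of the original execution time that is saved.
reduction_percent = (before_seconds - after_seconds) / before_seconds * 100

print(f"Speedup: {speedup:.1f}x")
print(f"Execution time saved: {reduction_percent:.1f}%")
```

In this example, the query runs roughly 20 times faster, and about 95% of the original execution time is saved. The measured speedup is a wall-clock result, whereas the values in the recommendation are optimizer estimates, so the two are not expected to match exactly.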