Are You Using the MySQL Index Correctly?

Starting from a slow SQL problem encountered in production, this article analyzes the TDDL optimizer, MySQL index selection, and the choice of shard keys for table partitioning.

By Chengcheng

Background

Recently, while reworking the application for multi-tenancy, we found some problems with it. The HSF consumer success rate of the application was not always 100%; it occasionally dropped to 99.99%. The application log showed that a slow SQL statement was causing the service calls to time out.

The specific statement was located as follows:

(
  select
    `iii_sss_msg`.`id`,
    `iii_sss_msg`.`message_id`,
    `iii_sss_msg`.`title`,
    `iii_sss_msg`.`content`,
    `iii_sss_msg`.`id_seller`,
    `iii_sss_msg`.`id_user`,
    `iii_sss_msg`.`gmt_create`,
    `iii_sss_msg`.`gmt_modified`,
    `iii_sss_msg`.`is_read`,
    `iii_sss_msg`.`category`,
    `iii_sss_msg`.`sub_category`,
    `iii_sss_msg`.`description`,
    `iii_sss_msg`.`need_side_notice`,
    `iii_sss_msg`.`link_url`,
    `iii_sss_msg`.`btn_name`,
    `iii_sss_msg`.`gmt_create_l`,
    `iii_sss_msg`.`mobile_content`,
    `iii_sss_msg`.`tier`,
    `iii_sss_msg`.`requirement_id`,
    `iii_sss_msg`.`fk_template_id`,
    `iii_sss_msg`.`business_part`,
    `iii_sss_msg`.`business_id`
  from
    `iii_sss_msg_29` `iii_sss_msg`
  WHERE
    (
      (
        (
          `iii_sss_msg`.`gmt_create` >= '2023-07-24 00:00:00'
        )
        AND (
          `iii_sss_msg`.`gmt_create` < '2023-07-31 15:46:45.684'
        )
        AND (`iii_sss_msg`.`id_user` = 500173482096)
        AND (`iii_sss_msg`.`tier` IN ('S', 'A'))
        AND (
          `iii_sss_msg`.`sub_category` IN (1000305, 1000306, 1000501, 1000502)
        )
      )
      OR (
        (`iii_sss_msg`.`category` IN (10003, 10005))
        AND (
          `iii_sss_msg`.`gmt_create` >= '2023-07-24 00:00:00'
        )
        AND (
          `iii_sss_msg`.`gmt_create` < '2023-07-31 15:46:45.684'
        )
        AND (`iii_sss_msg`.`id_user` = ***)
        AND (
          `iii_sss_msg`.`sub_category` IN (1000305, 1000306, 1000501, 1000502)
        )
      )
    )
  order by
    `iii_sss_msg`.`gmt_create` desc
  limit
    0, 5
)union all ...

The statement is relatively complicated. In short, it finds the five most recent messages of a given merchant within the last seven days that either belong to certain categories or have tier 'S' or 'A'. The iii_sss_msg table is split into 31 tables by the day of the month of the creation time. The routing rule is as follows:

<property name="tbRuleArray">
    <value>"iii_sss_msg_" + getCalendar(#gmt_create,1_date,31#).get(Calendar.DAY_OF_MONTH)</value>
</property>

Therefore, the query spans seven tables (iii_sss_msg_29, iii_sss_msg_30, ...) whose results are combined with union all. (This table partitioning rule is quite unreasonable, as analyzed later in this article.)
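The effect of this routing rule can be sketched in a few lines of Python. This is only an illustration of the rule as written above, not TDDL code: the function names and the seven-day window are hypothetical, and the table suffix is assumed to be the plain day of the month without zero padding.

```python
from datetime import datetime, timedelta

TABLE_PREFIX = "iii_sss_msg_"  # prefix taken from the routing rule above


def route_table(gmt_create: datetime) -> str:
    """Mimic the rule: table suffix = day of month of gmt_create (1..31)."""
    return f"{TABLE_PREFIX}{gmt_create.day}"


def tables_for_window(start: datetime, end: datetime) -> list:
    """List every physical table that the half-open window [start, end) touches."""
    tables = []
    day = start.replace(hour=0, minute=0, second=0, microsecond=0)
    while day < end:
        name = route_table(day)
        if name not in tables:  # windows longer than a month revisit tables
            tables.append(name)
        day += timedelta(days=1)
    return tables


# Hypothetical seven-day window, chosen only for illustration.
window = tables_for_window(datetime(2023, 7, 25), datetime(2023, 8, 1))
print(len(window), window)
```

Every additional day in the time window adds one more physical table to the query, regardless of how few rows the merchant actually has.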

Let's look at the XML file corresponding to MyBatis.

<select id="selectByQuery" resultMap="webMsgResultMap" parameterType="map">
    select
    <include refid="Base_Column_List" />
    from iii_sss_msg
    where
    id_user = #{userId}
    <if test="startTime != null">
        and gmt_create <![CDATA[>=]]> #{startTime}
    </if>
    <if test="endTime != null">
        and gmt_create <![CDATA[<]]> #{endTime}
    </if>
    ...

The order of the where conditions in the MyBatis statement differs from the order in the final SQL statement shown earlier. Did something rewrite the statement?

TDDL Optimizer

Since our application uses TDDL, the SQL generated by MyBatis is rewritten by TDDL, which parses and optimizes the SQL at the Matrix layer. Its workflow is as follows.

So the answer to the question above is that TDDL changed the statement in order to optimize the query. The next question is: did it achieve that purpose?

Let's continue the analysis. The table index information is as follows:

The table has an index named idx_user, a composite index on the four fields gmt_create, id_user, category, and sub_category. All of these fields appear in the where condition of the slow SQL above. It seems that TDDL reordered the conditions so that this index could be used.

We use the execution plan to analyze the statement. The result is as follows:

The index used is not idx_user(gmt_create,id_user,category,sub_category) but idx_uer_query(id_user, category). It seems that TDDL's reordering was in vain. Why does MySQL choose the index idx_uer_query(id_user, category)?

To investigate, we add FORCE INDEX (idx_user) to the statement to force MySQL to use idx_user(gmt_create,id_user,category,sub_category).

Comparing the two execution plans, the number of scanned rows with the index idx_uer_query(id_user, category) is 13,948, while the number of scanned rows with idx_user(gmt_create,id_user,category,sub_category) is 1,552,218, more than 100 times as many. How does MySQL know that idx_uer_query is the better index?

How to Select the Index in MySQL

Selecting an index is the job of the optimizer, which aims to find the optimal execution plan and execute the statement at the lowest cost. The optimizer considers the following conditions:

1. Conditions in the query statement: MySQL selects the most appropriate index based on the conditions in the query statement to locate rows that meet the conditions as quickly as possible. If the query condition includes multiple columns, we can consider creating a multi-column index to make matching more efficient.
2. Selectivity (cardinality) of the index: MySQL selects the most appropriate index based on the selectivity of the index. Selectivity is the ratio of the number of distinct values in an indexed column to the total number of records in the table. The higher the selectivity is, the more efficient the index is.
3. Size and data type of the index: The size and data type of the index will also affect the choice of index. Smaller indexes are usually more efficient than larger indexes and different data types also affect the efficiency of indexes.
4. Size of data blocks: MySQL uses data blocks (or pages) to store index data and table data. The size of data blocks also affects the choice of index. Smaller blocks increase cache efficiency and reduce disk I/O operations.
5. Coverage of the index: If a query can be satisfied by using a covering index, MySQL usually prefers to use a covering index, which means that all columns required in a query statement are included in the index. Therefore, you do not need to read data from the table again.

Judging by points 1 and 5, idx_user(gmt_create,id_user,category,sub_category) looks better, but what about point 2?

We use:

SHOW INDEX FROM `iii_sss_msg_29`

to check the cardinality of the two indexes:

The cardinality of the leading column gmt_create of idx_user(gmt_create,id_user,category,sub_category) is 99,933, while the cardinality of the leading column id_user of idx_uer_query(id_user, category) is 286,528, so the latter is more selective.

Therefore, the optimizer uses the idx_uer_query (id_user, category) index.
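The selectivity comparison can be worked through with the two cardinalities above. The following is a minimal sketch, not the optimizer's actual cost model. The total row count is an assumption: the article only mentions later that the primary key cardinality is more than 3 million, so 3,000,000 is used as a placeholder. The cardinality values reported by SHOW INDEX are themselves estimates.

```python
def selectivity(cardinality: int, total_rows: int) -> float:
    """Selectivity = distinct values in the indexed column / total rows."""
    return cardinality / total_rows


# Cardinalities reported by SHOW INDEX in the text above.
CARD_GMT_CREATE = 99_933   # leading column of idx_user
CARD_ID_USER = 286_528     # leading column of idx_uer_query

# ASSUMPTION: placeholder row count; the article only says "more than 3 million".
ASSUMED_TOTAL_ROWS = 3_000_000

sel_gmt_create = selectivity(CARD_GMT_CREATE, ASSUMED_TOTAL_ROWS)
sel_id_user = selectivity(CARD_ID_USER, ASSUMED_TOTAL_ROWS)

# The ratio of the two selectivities does not depend on the assumed row count.
ratio = sel_id_user / sel_gmt_create

print(f"gmt_create: {sel_gmt_create:.4f}, id_user: {sel_id_user:.4f}")
print(f"id_user is about {ratio:.2f} times as selective as gmt_create")
```

Whatever the real row count is, the leading column id_user distinguishes rows a little under three times better than gmt_create, which is a noticeable but not overwhelming difference.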

Does MySQL Select the Wrong Index?

The cardinalities of the two indexes analyzed above do not differ greatly, and the optimizer's estimates can be wrong. Moreover, idx_user(gmt_create,id_user,category,sub_category) matches more of the conditions in the query, and because it is already ordered by gmt_create, the order by gmt_create needs no extra sort: the index only has to be traversed. The optimizer may therefore judge its cost to be lower even though more rows are scanned. Does this actually happen?

Let's run the statement several times and look at the execution plans on the other physical tables. This time, the optimizer does use the index idx_user. The combination of TDDL's rewrite and the existence of idx_user(gmt_create,id_user,category,sub_category) leads the MySQL optimizer to pick the wrong index, and the execution plan that scans more than one million rows is selected.

Execution with the index idx_user(gmt_create,id_user,category,sub_category) takes 995 milliseconds:

Execution with the index idx_uer_query(id_user,category) takes 95 milliseconds:

This is likely the cause of the slow SQL: after TDDL's rewrite, the MySQL optimizer chooses the wrong index.

How to Create an Index?

Let's look at the following index:

Why was the index idx_user(gmt_create,id_user,category,sub_category) created? Probably because the business has many statements with the conditions above. But is the index reasonable?

The preceding figure shows the B+ tree structure of the index idx_user(gmt_create,id_user,category,sub_category). The entries are ordered by the first column gmt_create; each later column is ordered only among entries that share the same values in the columns before it, so id_user, category, and sub_category are not ordered across the index as a whole. According to the table partitioning rule, iii_sss_msg_29 only holds rows created on the 29th day of a month, for example between 2023-07-29 00:00:00 and 2023-07-29 23:59:59.

Now consider the where conditions of the earlier SQL statement:

WHERE
    (
      (
        (
          `iii_sss_msg`.`gmt_create` >= '2023-07-24 00:00:00'
        )
        AND (
          `iii_sss_msg`.`gmt_create` < '2023-07-31 15:46:45.684'
        )

With this index, the range condition on gmt_create excludes hardly any of the rows created on 2023-07-29, so essentially all of the corresponding idx_user leaf nodes are scanned. Earlier, when the query was forced to use this index, more than 1 million rows were scanned, while the cardinality of the primary key index on id is more than 3 million; the two are of the same order of magnitude.

The index idx_user(gmt_create,id_user,category,sub_category) is therefore unreasonable for the SQL scenario above, especially combined with a table partitioning rule based on the creation time. The author changed it to idx_user(id_user,category,sub_category,gmt_create,is_read); is_read is included because whether a message has been read is also a common business condition. After the change, checking the execution plan again shows that the optimizer consistently selects idx_user(id_user,category,sub_category,gmt_create,is_read) and scans only a little over 10,000 rows. Compared with idx_uer_query(id_user,category), this index also reduces the number of base-table rows that have to be read.
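Why the column order matters so much can be shown with a toy model in which a sorted list of tuples stands in for the leaf level of a composite index. This is a simplified sketch, not how InnoDB is implemented: the data is made up (50 users, one message per user per minute for 100 minutes on one day), only one category value is used, and sub_category and is_read are left out.

```python
import bisect
from datetime import datetime, timedelta

# Hypothetical rows of one physical table: (gmt_create, id_user, category).
DAY = datetime(2023, 7, 29)
rows = [
    (DAY + timedelta(minutes=m), user, 10003)
    for m in range(100)
    for user in range(50)
]

# Index A leads with gmt_create; index B leads with id_user.
index_a = sorted((t, u, c) for t, u, c in rows)
index_b = sorted((u, c, t) for t, u, c in rows)

start, end = datetime(2023, 7, 24), datetime(2023, 7, 31, 15, 46, 45)
target_user = 7

# Index A: only the gmt_create range can bound the scan, because id_user
# is ordered only among entries that share the same gmt_create.
lo_a = bisect.bisect_left(index_a, (start,))
hi_a = bisect.bisect_left(index_a, (end,))
scanned_a = hi_a - lo_a

# Index B: the equalities on id_user and category narrow the scan first,
# and the gmt_create range is then applied inside that narrow slice.
lo_b = bisect.bisect_left(index_b, (target_user, 10003, start))
hi_b = bisect.bisect_left(index_b, (target_user, 10003, end))
scanned_b = hi_b - lo_b

print(scanned_a, scanned_b)
```

In this toy table the time range covers every row, so the index that leads with gmt_create has to walk all entries, while the index that leads with id_user only touches the entries of one merchant.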

The indexes of a MySQL table cannot all be settled once and for all when the table is created. An index exists to speed up queries, so it is tightly coupled to the business scenario. Indexes should therefore be adjusted as the business queries change, because unreasonable indexes hinder queries and mislead the optimizer.

How to Choose Shard Keys for Table Partitioning

The iii_sss_msg table is split on gmt_create. Is this a reasonable shard key?

The choice of shard keys should focus on:

The shard key should distribute data as evenly as possible across the physical tables, achieving load balancing and uniform partitioning. The best approach is the modulo operation on an auto-incrementing primary key. However, the primary key may not be auto-incrementing, or may not even be numeric, in which case a perfectly uniform distribution cannot be achieved.
The shard key cannot be null and must have an index.
The where clause of every query should include this field. If that is not possible, there must be another way to route the query to a specific physical table.
Related data should be partitioned into the same table.

The most common scenario in our application is to check the messages of certain categories for a given merchant within seven days. However, the iii_sss_msg message table is split on gmt_create, so this scenario constantly triggers cross-table queries: each query retrieves data from seven tables. If we later want to show the messages of the last month, we will have to retrieve data from thirty tables. This shard key is clearly unreasonable. For this business scenario, id_user is a better choice as the shard key.
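As a rough illustration of routing on id_user instead of gmt_create, the sketch below uses a simple modulo rule. The shard count of 32 and the function name are hypothetical and are not taken from the application; the id_user value is the one that appears in the slow SQL above.

```python
SHARD_COUNT = 32  # ASSUMPTION: hypothetical number of physical tables
TABLE_PREFIX = "iii_sss_msg_"


def route_by_user(id_user: int) -> str:
    """Route every row of one merchant to the same physical table."""
    return f"{TABLE_PREFIX}{id_user % SHARD_COUNT}"


# id_user value taken from the slow SQL shown earlier.
table = route_by_user(500173482096)
print(table)
```

With this kind of rule the time window no longer affects routing, so a seven-day or a thirty-day query for one merchant reads a single table. Whether a plain modulo spreads the data evenly enough depends on how message volume is distributed across merchants, which would need to be checked against real data.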

Summary

This article starts with the problems encountered and analyzes the TDDL optimizer, the MySQL index, and the selection of shard keys for table partitioning. The following is a summary:

TDDL parses and optimizes SQL at the Matrix layer.
Selecting an index is the job of the optimizer, which aims to find the optimal execution plan and execute the statement at the lowest cost. The optimizer mainly considers the following conditions: the conditions in the query statement, the selectivity (cardinality) of the index, the size and data type of the index, the size of the data block, and the coverage of the index.
The MySQL optimizer may select the wrong index. If the optimizer misjudges, we can use force index in the application to specify the index explicitly, modify statements to guide the optimizer, or add or delete indexes to work around the problem.
The choice of shard keys should focus on distributing data as evenly as possible across the physical tables, ensuring that the shard key is never null and is indexed, making sure that the where clause of every query includes this field, and ensuring that related data is partitioned into the same table.

Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.
