All Products
Search
Document Center

ApsaraDB RDS:Select or create an optimal index for faster data access

Last Updated:Mar 12, 2024

This topic describes how to select or create an optimal index to improve the read performance of an ApsaraDB RDS for MySQL instance.

Background information

When you use an RDS instance, the instance may be faulty due to SQL performance issues. Index issues such as the lack of indexes, implicit conversion, and inappropriate indexing are a common type of SQL performance issue.

  • Lack of indexes: If you want to use SQL statements to access a table that is not indexed, the RDS instance performs a full table scan. If the amount of data in the table is large, the SQL statements are executed at low speeds and the connections to the RDS instance are consumed. When the maximum number of connections that can be established to the RDS instance is reached, new requests from the application are rejected and errors occur.

  • Implicit conversion: If the data type that is specified in the SQL statements is different from the data type of the fields in the table, indexing fails. As a result, slow SQL statements are accumulated, and the number of connections that are established to the RDS instance exceeds the upper limit.

Precautions

  • If you use the like keyword and add a percent sign (%) as a prefix, indexing fails.

  • In most cases, an index is not created on a column that contains null values. Columns that contain null values are excluded from the columns that are eligible for indexing.

  • If you use the or keyword and the field to the left or right of the or keyword is not indexed, indexing fails.

  • If you use the != operator, indexing fails. In this case, the indexed range is uncertain, and the RDS instance automatically performs full table scans. As a result, indexes cannot help significantly improve query performance.

  • Do not perform operations on indexed fields.

  • When you use a composite index to improve query performance, you must follow the leftmost prefix matching principle and use the first field of the composite index. Otherwise, indexing fails. Make sure that the order of fields is consistent with the order of indexes.

  • To prevent implicit conversions, make sure that the data type that is specified in the SQL statements is consistent with the data type of fields in the table.

Usage notes

  • Create indexes on fields that are frequently queried but are not frequently updated by add, delete, and modify operations.

  • Use indexed fields directly after ORDER BY and GROUP BY statements.

  • Make sure that the number of indexes that are created on a table does not exceed 6.

  • Make sure that the length of an indexed field is fixed and short.

  • Make sure that the number of duplicate fields among indexed fields is small.

  • Create indexes on fields of high filterability.

Example 1: Optimization for a table that has no indexes

  1. Execute the SHOW CREATE TABLE customers; statement to view the schema of the table.

    Note

    The customers table contains 20 entries.

     CREATE TABLE `customers` (
       `cust_id` int(11) NOT NULL AUTO_INCREMENT,
       `cust_name` char(50) NOT NULL,
       `cust_address` char(50) DEFAULT NULL,
       `cust_city` char(50) DEFAULT NULL,
       `cust_state` char(5) DEFAULT NULL,
       `cust_zip` char(10) DEFAULT NULL,
       `cust_country` char(50) DEFAULT NULL,
       `cust_contact` char(50) DEFAULT NULL,
       `cust_email` char(255) DEFAULT NULL,
       PRIMARY KEY (`cust_id`)
     ) ENGINE=InnoDB AUTO_INCREMENT=10006 DEFAULT CHARSET=utf8;
  2. Execute the EXPLAIN SELECT * FROM customers where cust_zip = '44444' limit 0,1; statement to view the execution plan of the SQL statement that is used for the data query.

                id: 1
      select_type: SIMPLE
            table: customers
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 20
         filtered: 10.00
            Extra: Using where
    Note

    In the execution plan, the value of the type parameter is ALL, which indicates a full table scan. A total of 20 rows of data are scanned each time. This significantly deteriorates database performance.

  3. Execute the ALTER TABLE customers add index idx_cus(cust_zip); statement to create an index.

  4. Re-execute the EXPLAIN SELECT * FROM customers where cust_zip = '44444' limit 0,1; statement to view the new execution plan of the SQL statement that is used for the data query.

               id: 1
      select_type: SIMPLE
            table: customers
       partitions: NULL
             type: ref
    possible_keys: idx_cus
              key: idx_cus
          key_len: 31
              ref: const
             rows: 1
         filtered: 100.00
            Extra: NULL
    Note

    In the new execution plan, the value of the type parameter is ref. This setting specifies that the RDS instance performs index-based equivalent queries or equi-joins between tables. A row of data is scanned. This significantly improves query performance.

Example 2: Optimization for a table that has no indexes

  1. Execute the DROP TABLE if exists customers; statement to delete the previously created table named customers.

  2. Execute the following statement in the database to create a test table named customers:

     CREATE TABLE `customers` (
       `cust_id` int(11) NOT NULL AUTO_INCREMENT,
       `cust_name` char(50) NOT NULL,
       `cust_address` char(50) DEFAULT NULL,
       `cust_city` char(50) DEFAULT NULL,
       `cust_state` char(5) DEFAULT NULL,
       `cust_zip` char(10) DEFAULT NULL,
       `cust_country` char(50) DEFAULT NULL,
       `cust_contact` char(50) DEFAULT NULL,
       `cust_email` char(255) DEFAULT NULL,
       PRIMARY KEY (`cust_id`)
     ) ENGINE=InnoDB AUTO_INCREMENT=10006 DEFAULT CHARSET=utf8;
  3. Insert 20 data entries into the customers table.

    INSERT INTO customers (cust_name, cust_address, cust_city, cust_state, cust_zip, cust_country, cust_contact, cust_email)
    VALUES
    ('John Smith', '123 Main St', 'New York', 'NY', '10001', 'USA', 'John Doe', 'john@example.com'),
    ('Jane Doe', '456 Elm St', 'Los Angeles', 'CA', '90001', 'USA', 'Jane Smith', 'jane@example.com'),
    ('Bob Johnson', '789 Oak St', 'Chicago', 'IL', '60601', 'USA', 'Bob Smith', 'bob@example.com'),
    ('Alice Brown', '987 Maple Ave', 'Houston', 'TX', '77001', 'USA', 'Alice Smith', 'alice@example.com'),
    ('Michael Davis', '654 Pine St', 'San Francisco', 'CA', '94101', 'USA', 'Michael Smith', 'michael@example.com'),
    ('Sarah Wilson', '321 Cedar St', 'Seattle', 'WA', '98101', 'USA', 'Sarah Smith', 'sarah@example.com'),
    ('David Lee', '876 Birch St', 'Boston', 'MA', '02101', 'USA', 'David Smith', 'david@example.com'),
    ('Karen Taylor', '543 Willow St', 'Miami', 'FL', '33101', 'USA', 'Karen Smith', 'karen@example.com'),
    ('Steven Miller', '210 Oak St', 'Denver', 'CO', '80201', 'USA', 'Steven Smith', 'steven@example.com'),
    ('Lisa Anderson', '876 Elm St', 'Atlanta', 'GA', '30301', 'USA', 'Lisa Smith', 'lisa@example.com'),
    ('Matthew Wilson', '567 Pine St', 'Dallas', 'TX', '75201', 'USA', 'Matthew Smith', 'matthew@example.com'),
    ('Emily Johnson', '654 Cedar St', 'Phoenix', 'AZ', '85001', 'USA', 'Emily Smith', 'emily@example.com'),
    ('James Davis', '321 Birch St', 'San Diego', 'CA', '92101', 'USA', 'James Smith', 'james@example.com'),
    ('Olivia Brown', '987 Willow St', 'Portland', 'OR', '97201', 'USA', 'Olivia Smith', 'olivia@example.com'),
    ('Daniel Wilson', '543 Oak St', 'Las Vegas', 'NV', '89101', 'USA', 'Daniel Smith', 'daniel@example.com'),
    ('Emma Taylor', '210 Elm St', 'Philadelphia', 'PA', '19101', 'USA', 'Emma Smith', 'emma@example.com'),
    ('Christopher Miller', '876 Maple Ave', 'Austin', 'TX', '78701', 'USA', 'Christopher Smith', 'christopher@example.com'),
    ('Sophia Anderson', '567 Pine St', 'San Antonio', 'TX', '78201', 'USA', 'Sophia Smith', 'sophia@example.com'),
    ('Jacob Wilson', '654 Cedar St', 'Nashville', 'TN', '37201', 'USA', 'Jacob Smith', 'jacob@example.com'),
    ('Ava Johnson', '321 Birch St', 'Charlotte', 'NC', '28201', 'USA', 'Ava Smith', 'ava@example.com');

  4. Execute the EXPLAIN SELECT cust_id,cust_name,cust_zip from customers where cust_zip = '42222'order by cust_zip,cust_name; statement to view the new execution plan of the SQL statement that is used for the data query.

               id: 1
      select_type: SIMPLE
            table: customers
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 20
         filtered: 10.00
            Extra: Using where; Using filesort
  5. Execute the ALTER TABLE customers add index idx_cu_zip_name(cust_zip,cust_name); statement to create an index.

  6. Re-execute the EXPLAIN SELECT cust_id,cust_name,cust_zip from customers where cust_zip = '42222'order by cust_zip,cust_name; statement to view the new execution plan of the SQL statement that is used for the data query.

               id: 1
      select_type: SIMPLE
            table: customers
       partitions: NULL
             type: ref
    possible_keys: idx_cu_zip_name
              key: idx_cu_zip_name
          key_len: 31
              ref: const
             rows: 1
         filtered: 100.00
            Extra: Using where; Using index

Example 1: Optimization for a table on which implicit conversions occur

  1. Execute the DROP TABLE if exists customers; statement to delete the previously created table named customers.

  2. Execute the following statement in the database to create a test table named customers:

     CREATE TABLE `customers` (
       `cust_id` int(11) NOT NULL AUTO_INCREMENT,
       `cust_name` char(50) NOT NULL,
       `cust_address` char(50) DEFAULT NULL,
       `cust_city` char(50) DEFAULT NULL,
       `cust_state` char(5) DEFAULT NULL,
       `cust_zip` char(10) DEFAULT NULL,
       `cust_country` char(50) DEFAULT NULL,
       `cust_contact` char(50) DEFAULT NULL,
       `cust_email` char(255) DEFAULT NULL,
       PRIMARY KEY (`cust_id`)
     ) ENGINE=InnoDB AUTO_INCREMENT=10006 DEFAULT CHARSET=utf8;
  3. Insert 20 data entries into the customers table.

    INSERT INTO customers (cust_name, cust_address, cust_city, cust_state, cust_zip, cust_country, cust_contact, cust_email)
    VALUES
    ('John Smith', '123 Main St', 'New York', 'NY', '10001', 'USA', 'John Doe', 'john@example.com'),
    ('Jane Doe', '456 Elm St', 'Los Angeles', 'CA', '90001', 'USA', 'Jane Smith', 'jane@example.com'),
    ('Bob Johnson', '789 Oak St', 'Chicago', 'IL', '60601', 'USA', 'Bob Smith', 'bob@example.com'),
    ('Alice Brown', '987 Maple Ave', 'Houston', 'TX', '77001', 'USA', 'Alice Smith', 'alice@example.com'),
    ('Michael Davis', '654 Pine St', 'San Francisco', 'CA', '94101', 'USA', 'Michael Smith', 'michael@example.com'),
    ('Sarah Wilson', '321 Cedar St', 'Seattle', 'WA', '98101', 'USA', 'Sarah Smith', 'sarah@example.com'),
    ('David Lee', '876 Birch St', 'Boston', 'MA', '02101', 'USA', 'David Smith', 'david@example.com'),
    ('Karen Taylor', '543 Willow St', 'Miami', 'FL', '33101', 'USA', 'Karen Smith', 'karen@example.com'),
    ('Steven Miller', '210 Oak St', 'Denver', 'CO', '80201', 'USA', 'Steven Smith', 'steven@example.com'),
    ('Lisa Anderson', '876 Elm St', 'Atlanta', 'GA', '30301', 'USA', 'Lisa Smith', 'lisa@example.com'),
    ('Matthew Wilson', '567 Pine St', 'Dallas', 'TX', '75201', 'USA', 'Matthew Smith', 'matthew@example.com'),
    ('Emily Johnson', '654 Cedar St', 'Phoenix', 'AZ', '85001', 'USA', 'Emily Smith', 'emily@example.com'),
    ('James Davis', '321 Birch St', 'San Diego', 'CA', '92101', 'USA', 'James Smith', 'james@example.com'),
    ('Olivia Brown', '987 Willow St', 'Portland', 'OR', '97201', 'USA', 'Olivia Smith', 'olivia@example.com'),
    ('Daniel Wilson', '543 Oak St', 'Las Vegas', 'NV', '89101', 'USA', 'Daniel Smith', 'daniel@example.com'),
    ('Emma Taylor', '210 Elm St', 'Philadelphia', 'PA', '19101', 'USA', 'Emma Smith', 'emma@example.com'),
    ('Christopher Miller', '876 Maple Ave', 'Austin', 'TX', '78701', 'USA', 'Christopher Smith', 'christopher@example.com'),
    ('Sophia Anderson', '567 Pine St', 'San Antonio', 'TX', '78201', 'USA', 'Sophia Smith', 'sophia@example.com'),
    ('Jacob Wilson', '654 Cedar St', 'Nashville', 'TN', '37201', 'USA', 'Jacob Smith', 'jacob@example.com'),
    ('Ava Johnson', '321 Birch St', 'Charlotte', 'NC', '28201', 'USA', 'Ava Smith', 'ava@example.com');

  4. Execute the EXPLAIN SELECT * FROM customers where cust_zip = 44444 limit 0,1; statement to view the new execution plan of the SQL statement that is used for the data query.

               id: 1
      select_type: SIMPLE
            table: customers
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 20
         filtered: 10.00
            Extra: Using where

Example 2: Optimization for two tables on which implicit conversions occur

  1. Execute the SHOW CREATE TABLE customers1; statement and the SHOW CREATE TABLE customers2; statement to view the schemas of the customers 1 table and the customers 2 table.

     CREATE TABLE `customers1` (
       `cust_id` varchar(10) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
       `cust_name` char(50) NOT NULL,
       KEY `idx_cu_id` (`cust_id`)
     ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
    CREATE TABLE `customers2` (
       `cust_id` varchar(10) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
       `cust_name` char(50) NOT NULL,
       KEY `idx_cu_id` (`cust_id`)
     ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
  2. Execute the EXPLAIN SELECT customers1.* from customers2 left join customers1 on customers1.cust_id=customers2.cust_id where customers2.cust_id='x';statement to view the new execution plan of the SQL statement that is used for the data query.

     *************************** 1. row ***************************
                id: 1
       select_type: SIMPLE
             table: customers2
              type: ref
     possible_keys: idx_cu_id
               key: idx_cu_id
           key_len: 33
               ref: const
              rows: 1
             Extra: Using where; Using index
     *************************** 2. row ***************************
                id: 1
       select_type: SIMPLE
             table: customers1
              type: ALL
     possible_keys: NULL
               key: NULL
           key_len: NULL
               ref: NULL
              rows: 1
             Extra: Using where; Using join buffer (Block Nested Loop)
    Note

    The preceding results show that the data types that are specified in the character set of the cust_id field in two tables are different. As a result, an implicit conversion occurred, and indexes cannot be used.

  3. Execute the ALTER TABLE customers1 modify column cust_id varchar(10) COLLATE utf8_bin; statement to change the data type of the character set of the cust_id field in the customers 1 table to UTF-8. After the modification, the data types of the character set of the cust_id field in two tables are consistent.

    Note

    If you execute this statement, the data type of the character set of the cust_id field in the customers 2 table is modified to UTF-8.

  4. Re-execute the EXPLAIN SELECT customers1.* from customers2 left join customers1 on customers1.cust_id=customers2.cust_id where customers2.cust_id='x'; statement to view the new execution plan of the SQL statement that is used for the data query.

     *************************** 1. row ***************************
                id: 1
       select_type: SIMPLE
             table: customers2
              type: ref
     possible_keys: idx_cu_id
               key: idx_cu_id
           key_len: 33
               ref: const
              rows: 1
             Extra: Using where; Using index
     *************************** 2. row ***************************
                id: 1
       select_type: SIMPLE
             table: customers1
              type: ref
     possible_keys: idx_cu_id
               key: idx_cu_id
           key_len: 33
               ref: const
              rows: 1
             Extra: Using where
    Note

    After the data types are consistent, the index is used and improves database performance.