This topic describes how to select or create an optimal index to improve the read performance of an ApsaraDB RDS for MySQL instance.
Background information
When you use an RDS instance, the instance may be faulty due to SQL performance issues. Index issues such as the lack of indexes, implicit conversion, and inappropriate indexing are a common type of SQL performance issue.
Lack of indexes: If you want to use SQL statements to access a table that is not indexed, the RDS instance performs a full table scan. If the amount of data in the table is large, the SQL statements are executed at low speeds and the connections to the RDS instance are consumed. When the maximum number of connections that can be established to the RDS instance is reached, new requests from the application are rejected and errors occur.
Implicit conversion: If the data type that is specified in the SQL statements is different from the data type of the fields in the table, indexing fails. As a result, slow SQL statements are accumulated, and the number of connections that are established to the RDS instance exceeds the upper limit.
Precautions
If you use the like keyword and add a percent sign (%) as a prefix, indexing fails.
In most cases, an index is not created on a column that contains null values. Columns that contain null values are excluded from the columns that are eligible for indexing.
If you use the or keyword and the field to the left or right of the or keyword is not indexed, indexing fails.
If you use the != operator, indexing fails. In this case, the indexed range is uncertain, and the RDS instance automatically performs full table scans. As a result, indexes cannot help significantly improve query performance.
Do not perform operations on indexed fields.
When you use a composite index to improve query performance, you must follow the leftmost prefix matching principle and use the first field of the composite index. Otherwise, indexing fails. Make sure that the order of fields is consistent with the order of indexes.
To prevent implicit conversions, make sure that the data type that is specified in the SQL statements is consistent with the data type of fields in the table.
Usage notes
Create indexes on fields that are frequently queried but are not frequently updated by add, delete, and modify operations.
Use indexed fields directly after ORDER BY and GROUP BY statements.
Make sure that the number of indexes that are created on a table does not exceed 6.
Make sure that the length of an indexed field is fixed and short.
Make sure that the number of duplicate fields among indexed fields is small.
Create indexes on fields of high filterability.
Example 1: Optimization for a table that has no indexes
Execute the
SHOW CREATE TABLE customers;
statement to view the schema of the table.NoteThe customers table contains 20 entries.
CREATE TABLE `customers` ( `cust_id` int(11) NOT NULL AUTO_INCREMENT, `cust_name` char(50) NOT NULL, `cust_address` char(50) DEFAULT NULL, `cust_city` char(50) DEFAULT NULL, `cust_state` char(5) DEFAULT NULL, `cust_zip` char(10) DEFAULT NULL, `cust_country` char(50) DEFAULT NULL, `cust_contact` char(50) DEFAULT NULL, `cust_email` char(255) DEFAULT NULL, PRIMARY KEY (`cust_id`) ) ENGINE=InnoDB AUTO_INCREMENT=10006 DEFAULT CHARSET=utf8;
Execute the
EXPLAIN SELECT * FROM customers where cust_zip = '44444' limit 0,1;
statement to view the execution plan of the SQL statement that is used for the data query.id: 1 select_type: SIMPLE table: customers partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 20 filtered: 10.00 Extra: Using where
NoteIn the execution plan, the value of the type parameter is ALL, which indicates a full table scan. A total of 20 rows of data are scanned each time. This significantly deteriorates database performance.
Execute the
ALTER TABLE customers add index idx_cus(cust_zip);
statement to create an index.Re-execute the
EXPLAIN SELECT * FROM customers where cust_zip = '44444' limit 0,1;
statement to view the new execution plan of the SQL statement that is used for the data query.id: 1 select_type: SIMPLE table: customers partitions: NULL type: ref possible_keys: idx_cus key: idx_cus key_len: 31 ref: const rows: 1 filtered: 100.00 Extra: NULL
NoteIn the new execution plan, the value of the type parameter is ref. This setting specifies that the RDS instance performs index-based equivalent queries or equi-joins between tables. A row of data is scanned. This significantly improves query performance.
Example 2: Optimization for a table that has no indexes
Execute the
DROP TABLE if exists customers;
statement to delete the previously created table named customers.Execute the following statement in the database to create a test table named customers:
CREATE TABLE `customers` ( `cust_id` int(11) NOT NULL AUTO_INCREMENT, `cust_name` char(50) NOT NULL, `cust_address` char(50) DEFAULT NULL, `cust_city` char(50) DEFAULT NULL, `cust_state` char(5) DEFAULT NULL, `cust_zip` char(10) DEFAULT NULL, `cust_country` char(50) DEFAULT NULL, `cust_contact` char(50) DEFAULT NULL, `cust_email` char(255) DEFAULT NULL, PRIMARY KEY (`cust_id`) ) ENGINE=InnoDB AUTO_INCREMENT=10006 DEFAULT CHARSET=utf8;
Insert 20 data entries into the customers table.
INSERT INTO customers (cust_name, cust_address, cust_city, cust_state, cust_zip, cust_country, cust_contact, cust_email) VALUES ('John Smith', '123 Main St', 'New York', 'NY', '10001', 'USA', 'John Doe', 'john@example.com'), ('Jane Doe', '456 Elm St', 'Los Angeles', 'CA', '90001', 'USA', 'Jane Smith', 'jane@example.com'), ('Bob Johnson', '789 Oak St', 'Chicago', 'IL', '60601', 'USA', 'Bob Smith', 'bob@example.com'), ('Alice Brown', '987 Maple Ave', 'Houston', 'TX', '77001', 'USA', 'Alice Smith', 'alice@example.com'), ('Michael Davis', '654 Pine St', 'San Francisco', 'CA', '94101', 'USA', 'Michael Smith', 'michael@example.com'), ('Sarah Wilson', '321 Cedar St', 'Seattle', 'WA', '98101', 'USA', 'Sarah Smith', 'sarah@example.com'), ('David Lee', '876 Birch St', 'Boston', 'MA', '02101', 'USA', 'David Smith', 'david@example.com'), ('Karen Taylor', '543 Willow St', 'Miami', 'FL', '33101', 'USA', 'Karen Smith', 'karen@example.com'), ('Steven Miller', '210 Oak St', 'Denver', 'CO', '80201', 'USA', 'Steven Smith', 'steven@example.com'), ('Lisa Anderson', '876 Elm St', 'Atlanta', 'GA', '30301', 'USA', 'Lisa Smith', 'lisa@example.com'), ('Matthew Wilson', '567 Pine St', 'Dallas', 'TX', '75201', 'USA', 'Matthew Smith', 'matthew@example.com'), ('Emily Johnson', '654 Cedar St', 'Phoenix', 'AZ', '85001', 'USA', 'Emily Smith', 'emily@example.com'), ('James Davis', '321 Birch St', 'San Diego', 'CA', '92101', 'USA', 'James Smith', 'james@example.com'), ('Olivia Brown', '987 Willow St', 'Portland', 'OR', '97201', 'USA', 'Olivia Smith', 'olivia@example.com'), ('Daniel Wilson', '543 Oak St', 'Las Vegas', 'NV', '89101', 'USA', 'Daniel Smith', 'daniel@example.com'), ('Emma Taylor', '210 Elm St', 'Philadelphia', 'PA', '19101', 'USA', 'Emma Smith', 'emma@example.com'), ('Christopher Miller', '876 Maple Ave', 'Austin', 'TX', '78701', 'USA', 'Christopher Smith', 'christopher@example.com'), ('Sophia Anderson', '567 Pine St', 'San Antonio', 'TX', '78201', 'USA', 'Sophia Smith', 'sophia@example.com'), ('Jacob Wilson', '654 Cedar St', 'Nashville', 'TN', '37201', 'USA', 'Jacob Smith', 'jacob@example.com'), ('Ava Johnson', '321 Birch St', 'Charlotte', 'NC', '28201', 'USA', 'Ava Smith', 'ava@example.com');
Execute the
EXPLAIN SELECT cust_id,cust_name,cust_zip from customers where cust_zip = '42222'order by cust_zip,cust_name;
statement to view the new execution plan of the SQL statement that is used for the data query.id: 1 select_type: SIMPLE table: customers partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 20 filtered: 10.00 Extra: Using where; Using filesort
Execute the
ALTER TABLE customers add index idx_cu_zip_name(cust_zip,cust_name);
statement to create an index.Re-execute the
EXPLAIN SELECT cust_id,cust_name,cust_zip from customers where cust_zip = '42222'order by cust_zip,cust_name;
statement to view the new execution plan of the SQL statement that is used for the data query.id: 1 select_type: SIMPLE table: customers partitions: NULL type: ref possible_keys: idx_cu_zip_name key: idx_cu_zip_name key_len: 31 ref: const rows: 1 filtered: 100.00 Extra: Using where; Using index
Example 1: Optimization for a table on which implicit conversions occur
Execute the
DROP TABLE if exists customers;
statement to delete the previously created table named customers.Execute the following statement in the database to create a test table named customers:
CREATE TABLE `customers` ( `cust_id` int(11) NOT NULL AUTO_INCREMENT, `cust_name` char(50) NOT NULL, `cust_address` char(50) DEFAULT NULL, `cust_city` char(50) DEFAULT NULL, `cust_state` char(5) DEFAULT NULL, `cust_zip` char(10) DEFAULT NULL, `cust_country` char(50) DEFAULT NULL, `cust_contact` char(50) DEFAULT NULL, `cust_email` char(255) DEFAULT NULL, PRIMARY KEY (`cust_id`) ) ENGINE=InnoDB AUTO_INCREMENT=10006 DEFAULT CHARSET=utf8;
Insert 20 data entries into the customers table.
INSERT INTO customers (cust_name, cust_address, cust_city, cust_state, cust_zip, cust_country, cust_contact, cust_email) VALUES ('John Smith', '123 Main St', 'New York', 'NY', '10001', 'USA', 'John Doe', 'john@example.com'), ('Jane Doe', '456 Elm St', 'Los Angeles', 'CA', '90001', 'USA', 'Jane Smith', 'jane@example.com'), ('Bob Johnson', '789 Oak St', 'Chicago', 'IL', '60601', 'USA', 'Bob Smith', 'bob@example.com'), ('Alice Brown', '987 Maple Ave', 'Houston', 'TX', '77001', 'USA', 'Alice Smith', 'alice@example.com'), ('Michael Davis', '654 Pine St', 'San Francisco', 'CA', '94101', 'USA', 'Michael Smith', 'michael@example.com'), ('Sarah Wilson', '321 Cedar St', 'Seattle', 'WA', '98101', 'USA', 'Sarah Smith', 'sarah@example.com'), ('David Lee', '876 Birch St', 'Boston', 'MA', '02101', 'USA', 'David Smith', 'david@example.com'), ('Karen Taylor', '543 Willow St', 'Miami', 'FL', '33101', 'USA', 'Karen Smith', 'karen@example.com'), ('Steven Miller', '210 Oak St', 'Denver', 'CO', '80201', 'USA', 'Steven Smith', 'steven@example.com'), ('Lisa Anderson', '876 Elm St', 'Atlanta', 'GA', '30301', 'USA', 'Lisa Smith', 'lisa@example.com'), ('Matthew Wilson', '567 Pine St', 'Dallas', 'TX', '75201', 'USA', 'Matthew Smith', 'matthew@example.com'), ('Emily Johnson', '654 Cedar St', 'Phoenix', 'AZ', '85001', 'USA', 'Emily Smith', 'emily@example.com'), ('James Davis', '321 Birch St', 'San Diego', 'CA', '92101', 'USA', 'James Smith', 'james@example.com'), ('Olivia Brown', '987 Willow St', 'Portland', 'OR', '97201', 'USA', 'Olivia Smith', 'olivia@example.com'), ('Daniel Wilson', '543 Oak St', 'Las Vegas', 'NV', '89101', 'USA', 'Daniel Smith', 'daniel@example.com'), ('Emma Taylor', '210 Elm St', 'Philadelphia', 'PA', '19101', 'USA', 'Emma Smith', 'emma@example.com'), ('Christopher Miller', '876 Maple Ave', 'Austin', 'TX', '78701', 'USA', 'Christopher Smith', 'christopher@example.com'), ('Sophia Anderson', '567 Pine St', 'San Antonio', 'TX', '78201', 'USA', 'Sophia Smith', 'sophia@example.com'), ('Jacob Wilson', '654 Cedar St', 'Nashville', 'TN', '37201', 'USA', 'Jacob Smith', 'jacob@example.com'), ('Ava Johnson', '321 Birch St', 'Charlotte', 'NC', '28201', 'USA', 'Ava Smith', 'ava@example.com');
Execute the
EXPLAIN SELECT * FROM customers where cust_zip = 44444 limit 0,1;
statement to view the new execution plan of the SQL statement that is used for the data query.id: 1 select_type: SIMPLE table: customers partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 20 filtered: 10.00 Extra: Using where
Example 2: Optimization for two tables on which implicit conversions occur
Execute the
SHOW CREATE TABLE customers1;
statement and theSHOW CREATE TABLE customers2;
statement to view the schemas of the customers 1 table and the customers 2 table.CREATE TABLE `customers1` ( `cust_id` varchar(10) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL, `cust_name` char(50) NOT NULL, KEY `idx_cu_id` (`cust_id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `customers2` ( `cust_id` varchar(10) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL, `cust_name` char(50) NOT NULL, KEY `idx_cu_id` (`cust_id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Execute the
EXPLAIN SELECT customers1.* from customers2 left join customers1 on customers1.cust_id=customers2.cust_id where customers2.cust_id='x';
statement to view the new execution plan of the SQL statement that is used for the data query.*************************** 1. row *************************** id: 1 select_type: SIMPLE table: customers2 type: ref possible_keys: idx_cu_id key: idx_cu_id key_len: 33 ref: const rows: 1 Extra: Using where; Using index
*************************** 2. row *************************** id: 1 select_type: SIMPLE table: customers1 type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1 Extra: Using where; Using join buffer (Block Nested Loop)
NoteThe preceding results show that the data types that are specified in the character set of the cust_id field in two tables are different. As a result, an implicit conversion occurred, and indexes cannot be used.
Execute the
ALTER TABLE customers1 modify column cust_id varchar(10) COLLATE utf8_bin;
statement to change the data type of the character set of the cust_id field in the customers 1 table to UTF-8. After the modification, the data types of the character set of the cust_id field in two tables are consistent.NoteIf you execute this statement, the data type of the character set of the cust_id field in the customers 2 table is modified to UTF-8.
Re-execute the
EXPLAIN SELECT customers1.* from customers2 left join customers1 on customers1.cust_id=customers2.cust_id where customers2.cust_id='x';
statement to view the new execution plan of the SQL statement that is used for the data query.*************************** 1. row *************************** id: 1 select_type: SIMPLE table: customers2 type: ref possible_keys: idx_cu_id key: idx_cu_id key_len: 33 ref: const rows: 1 Extra: Using where; Using index
*************************** 2. row *************************** id: 1 select_type: SIMPLE table: customers1 type: ref possible_keys: idx_cu_id key: idx_cu_id key_len: 33 ref: const rows: 1 Extra: Using where
NoteAfter the data types are consistent, the index is used and improves database performance.