Index diagnostics - PolarDB - Alibaba Cloud Documentation Center

This topic describes the index diagnostics feature of PolarDB-X. You can use this feature to diagnose and identify inefficient indexes in your database and take measures to improve database performance.

Background information

Indexing is a method that is commonly used in databases to speed up query operations and greatly enhance the efficiency of data retrieval. However, each time data in a table is updated, the corresponding indexes must also be updated to reflect the changes. This can slow down write operations and reduce the overall efficiency of the database. In addition, indexes occupy the storage space of databases. Therefore, you must carefully consider the use of indexes and avoid creating an excessive number of indexes that may be unnecessary.

The index diagnostics feature of PolarDB-X can help you identify unnecessary and inefficient indexes (including local indexes and global secondary indexes) in your database. This feature can also provide optimization suggestions to help you make necessary adjustments to the relevant indexes, such as deleting or rebuilding the indexes. This improves write performance and saves storage space without compromising query performance.

Note

To maintain good database performance, we recommend that you perform index diagnostics on a regular basis.

Prerequisites

The index diagnostics feature is supported in PolarDB-X 5.4.17-16859297 and later. For information about how to view instance versions, see View and update the version of an instance.

Usage notes

The index diagnostics feature can be used for databases in AUTO mode and DRDS mode.

Syntax

INSPECT [FULL] INDEX [FROM table_name] [MODE= {STATIC|DYNAMIC|MIXED}]

Parameters

MODE: the diagnostics mode. Three diagnostics modes are supported. If you do not specify the MODE parameter, the STATIC mode is used.
- STATIC: This mode is used to diagnose and identify static issues, such as duplicate indexes. These types of issues can be detected as soon as indexes are created.
- DYNAMIC: This mode is used to identify dynamic issues, such as less selective column indexes or unused indexes. This mode requires collection of data on index usage over a period of time to provide accurate diagnostic results. For more information, see DYNAMIC mode.
- MIXED: This mode is used to diagnose both static and dynamic issues.
FROM: the data tables that you want to diagnose.
- If you do not specify the FROM parameter, all tables in the current database are diagnosed.
- If you specify the FROM parameter, only the specified data tables are diagnosed.
FULL: the level of detail of the diagnostic output that is generated during a diagnostic operation.
- If you specify the FULL parameter, a comprehensive diagnostic report about indexes is generated, including detailed information on identified issues and specific suggestions.
- If you do not specify the FULL parameter, the diagnostic output includes only suggestions.

STATIC mode

Execute the following statement to create the tb1 table:

create table tb1(
  id int,
  name varchar(20),
  code varchar(50),
  primary key(id),
  global index idx_name(`name`) partition by key(name),
  global index idx_name_code(`name`, `code`) partition by key(name, code),
  global index idx_id(`id`) partition by key(id)
) partition by key(id);

Run the INSPECT FULL INDEX FROM tb1\G command to return the diagnostic results in STATIC mode.

inspect full index from tb1\G
*************************** 1. row ***************************
          SCHEMA: d5
           TABLE: tb1
           INDEX: idx_id
      INDEX_TYPE: GLOBAL INDEX
    INDEX_COLUMN: id
 COVERING_COLUMN:
       USE_COUNT: 0
LAST_ACCESS_TIME: NULL
  DISCRIMINATION: 0.0
         PROBLEM: ineffective gsi `idx_id` because it has the same rule as primary table
  ADVICE (STEP1): alter table `tb1` alter index `idx_id` invisible;
  ADVICE (STEP2): alter table `tb1` drop index `idx_id`;
alter table `tb1` add local index `idx_id` (`id`);
*************************** 2. row ***************************
          SCHEMA: d5
           TABLE: tb1
           INDEX: idx_name
      INDEX_TYPE: GLOBAL INDEX
    INDEX_COLUMN: name
 COVERING_COLUMN: id
       USE_COUNT: 0
LAST_ACCESS_TIME: NULL
  DISCRIMINATION: 0.0
         PROBLEM: index columns duplicate: idx_name, idx_name_code;
  ADVICE (STEP1): alter table `tb1` alter index `idx_name` invisible;
  ADVICE (STEP2): alter table `tb1` drop index `idx_name`;
alter table `tb1` add local index `idx_name` (`name`);
*************************** 3. row ***************************
          SCHEMA: d5
           TABLE: tb1
           INDEX: idx_name_code
      INDEX_TYPE: GLOBAL INDEX
    INDEX_COLUMN: name,code
 COVERING_COLUMN: id
       USE_COUNT: 0
LAST_ACCESS_TIME: NULL
  DISCRIMINATION: 0.0
         PROBLEM: None
  ADVICE (STEP1): None
  ADVICE (STEP2): None
3 rows in set (0.42 sec)

The following section describes the diagnostic results:

The PROBLEM line shows issues with indexes.
- Issue with the idx_id global index: The idx_id index is reported as ineffective because it has the same partitioning rule as the primary table.
- Issue with the idx_name global index: The idx_name index is reported as redundant because the indexed column of idx_name is a duplicate of a column that is already included in the indexed column of idx_name_code.
The ADVICE lines provide suggestions for optimizing the performance of problematic indexes. The suggestions are presented in two steps and can vary based on the specific scenario.
1. STEP1 of the optimization advice suggests that you use the INVISIBLE INDEX statement to hide the problematic index. The statement hides the index from the optimizer. This way, your business SQL queries do not utilize or rely on the index for query execution. By hiding the index, you can evaluate the impact of index deletion without the need to permanently delete the index.
  Note
  For more information about the INVISIBLE INDEX feature and how to evaluate the impact of index deletion on your business, see INVISIBLE INDEX.
2. STEP2 of the optimization advice provided by PolarDB-X may include suggestions such as deleting the problematic index or replacing the problematic index with an optimized index.
  Warning
  To prevent unintended consequences or negative impacts on the database system, we recommend that you evaluate the impact of index deletion before you delete an index.

DYNAMIC mode

Index diagnostics in DYNAMIC mode rely on index usage data. Before you perform the diagnostics, we recommend that you collect a complete set of usage data that covers a full business cycle, such as one day or one week, based on your business characteristics.

Usage

Run the set global GSI_STATISTICS_COLLECTION=true command to enable index usage statistics collection in your database. This option allows for the collection of usage data over a period of time.
Run the inspect full index mode=dynamic\G command to perform index diagnostics in DYNAMIC mode.
Enabling the GSI_STATISTICS_COLLECTION option can have a minor impact on the query performance of the database (a potential decrease of approximately 1%). After index diagnostics are completed, run the set global GSI_STATISTICS_COLLECTION=false command to disable this option.

Scenarios

Improve testing performance during the business stress testing phase. Perform index diagnostics in DYNAMIC mode before business stress testing begins. This allows you to identify inefficient indexes and make adjustments based on the diagnostic suggestions before stress testing begins.
Optimize application performance during runtime. Collect index usage information over a complete business cycle and then perform index diagnostics in DYNAMIC mode to identify areas for optimization. You can adjust the indexes based on the diagnostic results to optimize the overall performance of the application.

Others

PolarDB-X provides the INFORMATION_SCHEMA.GLOBAL_INDEXES view to facilitate the optimization of global indexes. This view allows you to easily access and examine the usage information of global indexes in PolarDB-X.

Execute the following statement to view the global index usage:

select * from information_schema.global_indexes where table="tb1"\G
*************************** 1. row ***************************
             SCHEMA: testdb
              TABLE: tb1
         NON_UNIQUE: 1
           KEY_NAME: idx_id_$3449
        INDEX_NAMES: id
     COVERING_NAMES:
         INDEX_TYPE: NULL
   DB_PARTITION_KEY:
DB_PARTITION_POLICY:
 DB_PARTITION_COUNT: NULL
   TB_PARTITION_KEY:
TB_PARTITION_POLICY:
 TB_PARTITION_COUNT: NULL
             STATUS: PUBLIC
         SIZE_IN_MB: 10.03
          USE_COUNT: 5
   LAST_ACCESS_TIME: 2023-06-08 10:06:33
        CARDINALITY: 389090
          ROW_COUNT: 404508
*************************** 2. row ***************************
             SCHEMA: testdb
              TABLE: tb1
         NON_UNIQUE: 1
           KEY_NAME: idx_name_code_$2986
        INDEX_NAMES: name, code
     COVERING_NAMES: id
         INDEX_TYPE: NULL
   DB_PARTITION_KEY:
DB_PARTITION_POLICY:
 DB_PARTITION_COUNT: NULL
   TB_PARTITION_KEY:
TB_PARTITION_POLICY:
 TB_PARTITION_COUNT: NULL
             STATUS: PUBLIC
         SIZE_IN_MB: 0.03
          USE_COUNT: 15
   LAST_ACCESS_TIME: 2023-06-08 10:10:06
        CARDINALITY: -1
          ROW_COUNT: 404508
2 rows in set (0.10 sec)

Response parameters

SIZE_IN_MB: the amount of storage space that is occupied by the index.
USE_COUNT: the number of times that the index has been used since the GSI_STATISTICS_COLLECTION option was enabled.
LAST_ACCESS_TIME: the last time when the index was used since the GSI_STATISTICS_COLLECTION option was enabled.
CARDINALITY: the cardinality of the index.
ROW_COUNT: the number of rows in the indexed table.