This topic describes how to troubleshoot the issues that cause excessive active threads on an ApsaraDB RDS for MySQL instance.

Background information

The number of active threads or the number of active connections is an important metric that is used to measure the loads on your RDS instance. In most cases, the number is less than 10 if your RDS instance is healthy, and may increase to 20 to 30 if your RDS instance provides high specifications and high queries per second (QPS). If the number exceeds 100, the speed at which your RDS instance responds to queries is low due to an SQL query pile-up. In severe cases, your RDS instance does not respond and stops processing SQL queries.

View the number of active threads

The ApsaraDB RDS console provides various methods that can be used to view the number of active threads:

  • Monitoring and alerting

    In the left-side navigation pane, click Monitoring and Alerts. On the page that appears, click the Standard Monitoring tab. On the Standard Monitoring tab, click Engine Monitoring. Then, you can view the number of active threads.

  • Autonomy service

    In the left-side navigation pane, choose Autonomy Service > Dashboard. On the page that appears, click the Performance Trends tab. Then, you can view the number of active threads. If the number is excessively high, some sessions are blocked.

Troubleshoot piled-up slow SQL queries

  • Symptom

    If the number of active threads is excessively high, you can execute the SHOW PROCESSLIST statement to check for slow SQL queries. If a large number of SQL queries require ApsaraDB RDS to scan an excessively large number of rows, the number of active threads may increase.

    In the left-side navigation pane, choose Autonomy Service > Diagnostics. On the page that appears, click the Session Management tab. Then, you can view the SQL queries that are in progress.

  • Solution

    Enable the SQL throttling feature or terminate sessions. This mitigates the impact of slow SQL queries. For more information, see SQL throttling.

Troubleshoot table cache issues

  • Symptom

    If your RDS instance implements excessively high QPS or processes a large number of tables, a large number of SQL queries switch to the Opening table state due to an insufficient table cache size.

  • Solution

    Increase the values of the table_open_cache and table_open_cache_instances parameters. The reconfiguration of the table_open_cache parameter does not require a restart of your RDS instance. However, the reconfiguration of the table_open_cache_instances parameter requires a restart of your RDS instance.

Troubleshoot metadata locking issues

  • Symptom

    In the Prepare and Commit phases, data definition language (DDL) statements need to acquire metadata locks on tables. If the tables are involved in uncommitted transactions or slow SQL queries, these DDL statements are blocked. This in turn blocks more SQL queries. All the blocked SQL queries switch to the Waiting for table metadata lock state. As a result, the number of active threads increases.

  • Solution

    Abort all the uncommitted transactions, slow SQL queries, and ongoing DDL statements.

Troubleshoot row lock conflicts

  • Symptom

    If the values of the Innodb_row_lock_waits and Innodb_row_lock_time metrics are abnormally large, row lock conflicts may occur.

    In the left-side navigation pane, choose Autonomy Service > Dashboard. On the page that appears, click the Performance Trends tab. Then, you can view the metrics in the RowLock section.

  • Solution

    Execute the SHOW ENGINE INNODB STATUS statement to check whether a large number of sessions are in the Lock wait state. If a large number of sessions are in the Lock wait state, severe row lock conflicts may occur. In this case, mitigate row lock conflicts by using all the suggested methods. For example, you can optimize hot data updates, reduce transaction sizes, and reduce the time that is required to commit transactions.