This topic describes how to identify and troubleshoot database failures.

Check whether the system bottleneck issue occurs on the database

  • Use PROCESSLIST
    Execute the following statement to display all SQL statements that are being executed in the PolarDB-X instance:
    SHOW PROCESSLIST WHERE INFO IS NOT NULL

    In most cases, statement stacking occurs when the database responds at a low speed. If the returned result does not contain a large number of statements whose execution time is greater than 0, the issue does not exist on the database. Otherwise, the bottleneck issue may occur on the database.

  • Use stack information
    An application interacts with a database over TCP. If a bottleneck issue occurs on the database, the database does not return results after the application sends a request to the database by using a socket. In this case, the request that is sent by using the socket cannot be processed by using the read() method. The stack information about the application can be used to determine whether blocking occurs on the database. The following section provides an example. In this example, a Java application is used.
    1. Run the jstack command to dump the stack information.
    2. In the dumped stack information, search for the stack in which the MySQL driver waits for the response to the request. The following code block provides the stack content:
              at java.net.SocketInputStream.socketRead0(Native Method)
              at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
              at java.net.SocketInputStream.read(SocketInputStream.java:171)
              at java.net.SocketInputStream.read(SocketInputStream.java:141)
              at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)
              at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)
              at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)
              - locked <0x00000002eb8f2d98> (a com.mysql.jdbc.util.ReadAheadInputStream)
              at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3183)
              at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3659)
              at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3649)
              at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4090)
              at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:972)
              at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:2497)
              at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2870)
              at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2806)

    If the stacks of a large number of threads exist, as shown in the preceding example, a large number of threads are blocked and are waiting for responses from the database. This indicates that the bottleneck issue occurs on the database. Otherwise, focus on checking whether the bottleneck issue exists on the application.

Resolve database issues

After you use the preceding methods to determine that the bottleneck issue occurs on the database, we recommend that you use one of the following methods to resolve the bottleneck issue:

Method 1: Kill all statements

If the returned result of the PROCESSLIST statement shows that a large number of SQL statements are stacked, we recommend that you immediately kill all statements that are being executed. PolarDB-X provides the following statement to allow you to kill all statements that are being executed:

KILL "ALL"

The statement can be used to kill all connections between compute nodes and data nodes to ensure that all statements are killed.

Method 2: Restart the application

If the statements are stacked even after you use Method 1 and wait for a specific period of time, we recommend that you restart the application. This prevents the application from repeatedly attempting to execute SQL statements that incur high costs because the application is in the incorrect state.

Method 3: Throttle the number of SQL statements that can be executed in period of time

If the issue persists after you use Method 2, we recommend that you use the CCL_RULES throttling feature provided by PolarDB-X.

  1. Execute the SHOW FULL PROCESSLIST statement to find the template IDs of the SQL statements that consume a large proportion of resources.
    +----+---------------+-----------------+----------+-------------------------------+------+-------+-----------------------+-----------------+
    | ID | USER          | HOST            | DB       | COMMAND                       | TIME | STATE | INFO                  | SQL_TEMPLATE_ID |
    +----+---------------+-----------------+----------+-------------------------------+------+-------+-----------------------+-----------------+
    |  2 | polardbx_root | ***.*.*.*:62787 | polardbx | Query                         |    0 |       | show full processlist | NULL            |
    |  1 | polardbx_root | ***.*.*.*:62775 | polardbx | Query(Waiting-selectrulereal) |   12 |       | select 1              | 9037e5e2        |
    +----+---------------+-----------------+----------+-------------------------------+------+-------+-----------------------+-----------------+
    2 rows in set (0.08 sec)
  2. Use the template IDs to throttle the number of this type of SQL statements that can be executed in a period of time, as shown in the following example:
    CREATE CCL_RULE IF NOT EXISTS `test` ON *.* TO 'ccltest'@'%'
    FOR SELECT
    FILTER BY TEMPLATE('9037e5e2')
    WITH MAX_CONCURRENCY=10;
Method 4: Restart the database

If the preceding methods are invalid, restart the database.