This topic describes how to use the hot row optimization feature to improve the performance of database systems.

Prerequisites

Your PolarDB cluster must be one of the following versions:

  • A cluster of ApsaraDB PolarDB MySQL-compatible edition 5.6 whose minor engine version is 20200601 or later.
  • A cluster of ApsaraDB PolarDB MySQL-compatible edition 8.0 whose minor engine version is 8.0.1.1.10 or later.
Note For information about how to upgrade the minor engine version of your cluster, see Upgrade versions.

Background

Pain points:
  • Hot rows refer to the rows in a database that are frequently added, deleted, modified, or queried. When a transaction updates data in a row, the row is locked until the transaction is committed or rolled back. Each row can be updated by only one transaction at a time. Other transactions are pending. Requests to update a hot row are processed in serial order. Therefore, traditional database sharding and table sharding policies are ineffective to improve database performance.
  • On e-commerce platforms, limited-time sales and flash sales are frequently used for promotion. In these scenarios, a large number of requests to update a hot row are sent to the database within a short period of time. This results in row lock contention and queues for acquiring row locks. This degrades system performance. If an update request waits for a long period of time before the update request is processed, services are adversely affected.
Improvements to hardware cannot fix the preceding issue or reduce latency. To fix the issues, PolarDB optimizes the kernels of databases. The optimized kernels automatically identify the requests to update hot rows and group the requests to update the same hot row made within a specific period of time. Different groups of requests are processed in parallel in pipelines. This way, system performance is improved. Solutions:
  • Convert serial processing to parallel processing

    Parallel processing is the easiest way to improve the performance of a database system. However, it is difficult for the system to process all the requests to update the same hot row in parallel. PolarDB uses pipelines to process the update requests to improve the performance of parallel processing for hot rows to the maximum extent.

    1

    The SQL statements that are executed to update hot rows are attached with the autocommit or commit_on_success label. The optimized MySQL kernel can automatically identify update requests that are attached with the autocommit or commit_on_success label. The kernel can also be used to hash the collected update requests based on primary keys or unique keys. Then, the update requests that are allocated to the same bucket are processed and committed in groups and batches based on the arrival time.

    To process these update requests in pipelines, two execution units are required to group these requests. When the update requests in the first group are collected and are ready to commit, the second group of update requests are collected. When the update requests in the second group are collected and are ready to commit, the update requests in the first group are committed. Then, the next batch of update requests starts to be collected in the first group. This ensures that update requests are processed in parallel.

    Multi-core CPUs are widely used. This pipelined processing method can make full use of hardware resources, enhance CPU utilization, and improve the parallel processing capability of database systems. This maximizes the throughput of database systems.

  • Eliminate the need to wait for acquiring row locks

    To ensure the logical consistency of data, a row must be locked when it is updated. If requests to acquire row locks cannot be immediately processed, the requests wait in a queue. This increases request processing time and triggers deadlock detection. This consumes additional resources.

    The requests to update the same row are grouped in chronological order. The first update request in a group is the leader. The leader reads data in the row and locks the row. The other update requests in the same group are followers. If the row is locked by the leader, the followers can immediately acquire the row locks.

    This reduces the number of times and time consumed for row locks, and improves the performance of the database system.

  • Reduce the traversals of B-tree indexes

    MySQL manages data based on B-tree indexes. During each query, MySQL traverses all indexes to query the required row. If the table is large and has multiple index levels, a long period of time is required to traverse indexes.

    If the grouping mechanism is used, only the leader of each group needs to traverse indexes to query the required row. Then, the leader caches the updated row to the memory. The followers in the same group can read the row from the memory after they acquire the row locks. Followers do not need to traverse all of the indexes again.

    This reduces the number of index traversals and the time consumed.

Usage limits

The hot row optimization feature is not supported in the following scenarios:
  • Hot rows are in a partitioned table.
  • A trigger is defined on the table to which the hot rows belong.

Use the hot row optimization feature

  • Before you use the hot row optimization feature to improve the performance of a database, you must configure the following parameters. For more information about how to configure the parameters, see Specify cluster parameters.
    Parameter Description
    hotspot Specifies whether to enable the hot row optimization feature. Valid values: ON and OFF. Default value: OFF.
    hotspot_for_autocommit Specifies whether to enable the hot row optimization feature to optimize the UPDATE statement that is in auto-commit mode. Valid values: ON and OFF. Default value: OFF.
    Note You can configure this parameter only when the hotspot parameter is set to ON.
    hotspot_update_max_wait_time The time that is required for the leader in a group to wait for the followers to join the group in the process of request grouping. Unit: microseconds. Default value: 100 μs.
    hotspot_lock_type Specifies whether to use a new type of row lock when the system processes requests to update the same data row in a group. Valid values: ON and OFF. Default value: OFF.
    Note If this parameter is set to ON, the requests to update the same hot row do not need to wait to acquire row locks. This improves the performance of the database system.
    Note
    • The hotspot_for_autocommit, hotspot_update_max_wait_time, and hotspot_lock_type parameters are not available. To use these parameters, Submit a ticket.
    • To use the hot row optimization feature, you must enable the binary logging feature. Therefore, if you try to set the hotspot parameter to ON when the binary logging feature is disabled, an error is returned.
      If you execute the MySQL [(none)]> set global hotspot=ON; statement to set the hotspot parameter to ON when the binary logging feature is disabled, the following error is returned:
      ERROR 3042 (HY000):
      Variable 'hotspot' cannot be enabled because 'bin logging' is enabled/disabled

      For more information about how to enable binary logging, see Enable binary logging.

    • When the binary logging feature is enabled globally but disabled for sessions, you can set the hotspot parameter to ON. However, hot row optimization does not take effect on the UPDATE statement.
    • If you execute the MySQL [(none)]> set global hotspot=ON; statement to set the hotspot parameter to ON when the rds_ic_reduce_hint_enable parameter is set to ON, the following error is returned:
      ERROR 3042 (HY000): Variable 'hotspot' cannot be enabled because 'rds_ic_reduce_hint_enable' is enabled/disabled
  • You can execute the following statement to check the parameter settings:
    show variables like "hotspot%";

    The following result may be returned:

    +------------------------------+-------+
    |
    Variable_name                | Value |
    +------------------------------+-------+
    |
    hotspot                      | OFF   |
    |
    hotspot_for_autocommit       | OFF   |
    |
    hotspot_lock_type            | OFF   |
    |
    hotspot_update_max_wait_time | 100   |
    +------------------------------+-------+
  • The hotspot row optimization feature involves three new optimizer hints:
    hint Required Description
    COMMIT_ON_SUCCESS Yes Commits a transaction when the update is successful.
    ROLLBACK_ON_FAIL No Rolls back a transaction when the update fails.
    TARGET_AFFECT_ROW(1) No Explicitly specifies that the request updates only one row. If more than one row is updated, the update fails.
  • You can execute the following statement to view the statistics of hot row optimization:
    show global status like 'Group_update%';

Performance testing

  • You can execute the following statements to conduct a performance test.
    • Create a table
      CREATE TABLE sbtest(id INT UNSIGNED NOT NULL, c BIGINT UNSIGNED NOT NULL) PRIMARY KEY (id)
    • Perform an UPDATE operation
      UPDATE COMMIT_ON_SUCCESS ROLLBACK_ON_FAIL TARGET_AFFECT_ROW 1 sbtest SET c=c+1 WHERE id = 1
  • Test tool: SysBench
  • Scenarios and test results
    • Scenario 1: one hot row and an 8-core CPU

      In this scenario, an 8-core CPU is used and a large number of concurrent requests are sent to update a hot row. After hot row optimization is enabled, the performance of the database is improved by about 64 times.

      2
    • Scenario 2: one hot row and a 32-core CPU

      In this scenario, a 32-core CPU is used and concurrent requests are sent to update a hot row. After hot row optimization is enabled, the peak queries per second (QPS) is increased by 63 times. When the number of concurrent requests reaches 8,000, the performance of the database is improved by 46 times.

      2
    • Scenario 3: eight hot rows and a 32-core CPU

      In this scenario, a 32-core CPU is used and concurrent requests are sent to update eight hot rows. After hot row optimization is enabled, the maximum of QPS is improved by 20 times.

      If the number of concurrent requests reaches 16,000, database failures occur and no results are returned when hot row optimization is disabled. However, when hot row optimization is enabled, row data is updated as expected and QPS remains stable.

      4