This topic describes the Lizard transaction system that is launched for ApsaraDB RDS for MySQL 8.0 instances.

Background information

The performance of MySQL 8.0 is significantly enhanced thanks to the continuous improvement in the locking service and redo logging mechanism by the MySQL community. However, the transaction system of the InnoDB storage engine suffers problems such as interference between read and write requests and unstable XA transactions. To increase the throughput of ApsaraDB RDS for MySQL databases and allow them to support distributed transactions and global consistency, Alibaba Cloud launches the Lizard transaction system for ApsaraDB RDS for MySQL 8.0.22 (with a minor version of 20201231). The following section describes the benefits of the Lizard system:

Improved performance in high-concurrency scenarios

The InnoDB engine of the official MySQL version uses a global transaction system. Changes of the transaction status in data manipulation language (DML) processes and read views in queries all require access to the global transaction system. This causes severe read/write interference and limits the system throughput. However, Lizard bypasses the transaction system by not maintaining the read views used to implement multi-version concurrency control (MVCC) during queries. This allows MySQL databases to better utilize multi-core CPU resources and substantially enhances the transaction throughput in high-concurrency scenarios that involve both read and write requests.

In the following section, SysBench tests are performed to compare the performance of the official MySQL version and ApsaraDB RDS for MySQL. The test environment and test results are described.

  • Test environment
    • CPU: Intel(R) Xeon(R) Platinum 8163, with 96 cores and a 2.5 GHz base frequency
    • MySQL configuration:
      • innodb_buffer_pool_size: 50 GB
      • sync_binlog = 0: When a transaction is committed, binlogs are written to the cache, but not immediately to the disk. The system determines when to write binlogs to the disk.
      • innodb_flush_log_at_trx_commit = 2: When a transaction is committed, redo logs are written to the cache, but not immediately to the disk. A flush operation is performed once per second to write redo logs to the disk.
      Note sync_binlog = 0 and innodb_flush_log_at_trx_commit = 2 can reduce the interference of I/O subsystems, facilitate the utilization of multi-core CPU resources in high-concurrency scenarios, and eliminate concurrent contentions.
    • Test data amount: 30 GB (20 tables with 5 million records in each table)
  • Test scenario 1: SysBench OLTP Read_onlytest1
    Note The queries per second (QPS) of ApsaraDB RDS for MySQL is up to 42.2% higher than that of the official MySQL version.
  • Test scenario 2: SysBench OLTP Read_writetest2
    Note The QPS of ApsaraDB RDS for MySQL is up to 69.9% higher than that of the official MySQL version.
  • Test scenario 3: SysBench OLTP Write_onlytest3
    Note The QPS of ApsaraDB RDS for MySQL is up to 48.1% higher than that of the official MySQL version.

According to the preceding test results, ApsaraDB RDS for MySQL that uses Lizard can eliminate concurrent contentions that occur in high-concurrency scenarios to improve the system throughput.

Native flashback query

During daily O&M of ApsaraDB RDS for MySQL databases, you may encounter accidental changes or unexpected data modifications. After you commit a transaction, you may not be able to roll it back. This issue can be solved by using Lizard. Its native flashback query feature allows you to have a consistent view of data as of an earlier point in time.

The following code shows the flashback query syntax:

SELECT ... FROM tablename
  AS OF [SCN | TIMESTAMP] expr;

Examples

  • Initialize data
    mysql> CREATE TABLE tab (
        -> id int PRIMARY KEY AUTO_INCREMENT,
        -> version int,
        -> gmt_modify timestamp
        -> );
    
    mysql> INSERT INTO tab VALUES (1, 1, now()),(2, 1, now());
    
    mysql> COMMIT;
    
    mysql> SELECT * FROM tab;
    +----+---------+---------------------+
    | id | version | gmt_modify          |
    +----+---------+---------------------+
    |  1 |       1 | 2020-12-17 16:40:38 |
    |  2 |       1 | 2020-12-17 16:40:39 |
    +----+---------+---------------------+
  • Update data
    mysql> UPDATE tab SET version = version + 1, gmt_modify = now();
    
    mysql> COMMIT;
    
    mysql> SELECT * FROM tab;
    +----+---------+---------------------+
    | id | version | gmt_modify          |
    +----+---------+---------------------+
    |  1 |       2 | 2020-12-17 16:40:54 |
    |  2 |       2 | 2020-12-17 16:40:54 |
    +----+---------+---------------------+
  • Retrieve data by using flashback query
    mysql> SELECT * FROM tab AS OF TIMESTAMP '2020-12-17 16:40:40';
    +----+---------+---------------------+
    | id | version | gmt_modify          |
    +----+---------+---------------------+
    |  1 |       1 | 2020-12-17 16:40:38 |
    |  2 |       1 | 2020-12-17 16:40:39 |
    +----+---------+---------------------+
    
    mysql> SELECT * FROM tab AS OF TIMESTAMP '2020-12-17 16:40:55';
    +----+---------+---------------------+
    | id | version | gmt_modify          |
    +----+---------+---------------------+
    |  1 |       2 | 2020-12-17 16:40:54 |
    |  2 |       2 | 2020-12-17 16:40:54 |
    +----+---------+---------------------+
    Note If you query data as of an earlier point in time when undo records have been truncated, an ERROR 7546 (HY000): Snapshot too old error is returned.

ApsaraDB RDS for MySQL provides the following parameters to facilitate the management of flashback query.

Parameter INNODB_UNDO_RETENTION INNODB_UNDO_SPACE_SUPREMUM_SIZE INNODB_UNDO_SPACE_RESERVED_SIZE
Description The maximum period of time for which InnoDB retains the undo records. The value must be greater than 0. Unit: seconds.

The longer the period, the earlier the records supported by flashback query, and the more space the undo tables occupy.

The maximum size of undo tablespaces used by InnoDB. Unit: MB.

When the specified threshold is reached, the undo records are forcibly deleted regardless of the INNODB_UNDO_RETENTION value.

The space reserved for the undo tablespaces used by InnoDB. Unit: MB.

Within the period specified by INNODB_UNDO_RETENTION, the space reserved by INNODB_UNDO_SPACE_RESERVED_SIZE is used to store as many undo records as possible.

Command syntax --innodb-undo-retention=# --innodb-undo-space-supremum-size=# --innodb-undo-space-reserved-size=#
Parameter range Global Global Global
Dynamic Yes Yes Yes
Prompts for value change No No No
Data type Integer Integer Integer
Default value 0 102400 0
Valid values 0 – 4294967295 0 – 4294967295 0 – 4294967295

Support for XA transactions and global consistency

To provide comprehensive support for XA transactions and global consistency, Lizard supports the assignment of global commit numbers (GCNs). The following section describes the methods to use this feature:

  • During the commit phase, a transaction can be assigned a GCN and committed by using the COMMT/XA COMMIT by GCN statement.
  • During the query phase, a transaction can be assigned a GCN and committed by using the SELECT by GCN statement.

The following figure shows a simple distributed architecture and scenario. In the example, a global Time Sharing Option (TSO) node is maintained and three ApsaraDB RDS for MySQL instances on High-availability Edition are created.

test5

The following section shows the sample commands:

  • Node 1
    XA BEGIN $xid;
    UPDATE account SET balance = balance + 10 WHERE user = 'Johnson';
    XA END;
    XA PREPARE $xid;
    XA COMMIT $xid $GCN;
  • Node 2
    XA BEGIN $xid
    UPDATE account SET balance = balance - 10 WHERE user = 'Lisa';
    XA END;
    XA PREPARE $xid;
    XA COMMIT $xid $GCN;
  • Query statements
    • Node 1
      SELECT * FROM account AS OF GCN $GCN where user ='Johnson';
    • Node 2
      SELECT * FROM account AS OF GCN $GCN where user ='Lisa';
Note ApsaraDB RDS for MySQL supports GCN-related syntax. This feature is in public preview. To use this feature, submit a ticket.