Table locks that result from scheme changes may cause great damage to business. To resolve this issue, the data definition language (DDL)-based lockless change feature introduces a triggerless design and allows you to change the schemas of large tables without locking the tables. This design is not perfect but provides substantial benefits.

Background information

For example, in versions earlier than MySQL 5.6, online DDL operations cannot be performed on MySQL databases. In most cases, to alter a table, you must lock the table, copy data from the original table to a temporary table, name the temporary table after the original table, and then unlock the original table. A larger table indicates the longer duration for which the table is locked. This results in difficulty in altering tables in an online environment.

In MySQL 5.6 and later, online DDL operations can be performed on InnoDB tables. In most cases, you can alter tables without locking the tables. However, you cannot perform online DDL operations to add or delete columns, or change column data types.

Comparison of solutions

We recommend that you do not perform DDL operations on tables in an online environment. You can use the following solutions to alter tables without affecting your business:

  • Alter tables during off-peak hours.

    To alter tables without affecting business, you must specify a long enough period of time for altering the tables during off-peak hours. If an excessively large table fails to be altered within the specified period of time, your business will be affected.

  • Alter tables in the secondary database and perform a switchover between the primary and secondary databases.

    You must have primary and secondary databases. You must switch the databases in an appropriate period of time.

  • Use online schema change tools, such as pt-online-schema-change, OnlineSchemaChange, and gh-ost.

    These tools perform DDL operations on table copies. This way, tables can be altered without being locked and data can be read from or written to the tables.

    • How the pt-online-schema-change tool works: Assume that you want to alter Table A. This tool creates a temporary table named Table A_gst and performs DDL operations on Table A_gst. Then, this tool creates a data manipulation language (DML) trigger on Table A and copies data from Table A to Table A_gst. In the process of copying data from Table A to Table A_gst, the DML trigger captures incremental changes to Table A and synchronously applies the changes to Table A_gst. After the data is copied from Table A to Table A_gst, this tool renames Table A_gst after Table A.
    • How the OnlineSchemaChange tool works: This tool basically works the same way as the pt-online-schema-change tool. The difference is that this tool adopts an asynchronous design. A log table is created based on Table A_gst. Changes to Table A are captured by the DML trigger and recorded in the log table. Then, backend processes apply the recorded changes in the log table to Table A_gst. The entire process is asynchronous. Therefore, you can manage the process of applying the changes to Table A_gst.
    • How the gh-ost tool works: This tool basically works the same way as the preceding two tools. However, this tool does not use triggers. Instead, this tool uses binary logs to capture incremental table changes. This tool reads the changes to Table A that are recorded in the binary logs. Then, it analyzes and applies the changes to Table A_gst. The binary logs contain changes only to Table A. Therefore, these changes can be obtained and applied to Table A_gst with ease.

Trigger-based issues

Trigger-based tools have simple code logic. You can use triggers to process data in most cases. For example, you can use triggers to process data in an implicit way and convert data types. This simplifies the complex process of migrating table data in real time. However, trigger-based tools may cause the following issues:

  • Increased database overheads

    A trigger is a stored procedure. As business develops, more DML operations are required. This increases the overheads that are generated by using triggers, particularly in peak hours.

  • Intensified lock contention

    A trigger groups the operations that are performed on the original table and the temporary table in the same transaction. In this case, concurrent operations on the two tables may acquire the lock on the same object. This intensifies lock contention. In addition, trigger-based tools must concurrently copy data and change data. This further intensifies lock contention.

  • Inability to suspend triggers

    In a trigger-based design, triggers must keep running and cannot be suspended. Despite peak hours, exceptions, or latency in synchronization between primary and secondary databases, triggers cannot be canceled at any time during a schema change. Forced cancellation will interrupt the schema change or cause data loss. This affects data accuracy in Table A_gst. The pt-online-schema-change and OnlineSchemaChange tools are trigger-based tools. The tools provide the throttling feature. However, the throttling feature is limited to specific extent. For example, you can use this feature to suspend or slow down a data copy process, but the trigger is still operating and generating overheads.

  • Inability to test reliability

    To verify a solution, you may want to know the duration of a task. If a trigger-based solution executes SQL statements to replicate data, you can create and use triggers to simulate data replication on a secondary database. However, when data is replicated based on rows, no trigger is required on the secondary database. This is because the triggers are operating only on the primary database and the secondary database directly applies the changes. In addition, even though SQL statements are executed to replicate data, you cannot simulate concurrent replication processes on the primary database. This is because a MySQL database uses a single thread to replay the replication process. Therefore, concurrency and locking issues cannot be verified or tested.

Triggerless design

The key benefit of a triggerless design is the decoupling of database load. A trigger-based design synchronizes each DML operation on the original table to the temporary table in a synchronous or an asynchronous manner. However, this process is decoupled in the triggerless design. To be specific, the process of writing data to the temporary table is separated from that of writing data to the original table. This resolves the issues that are caused by the trigger-based design and has the following benefits:

  • Reduce database overheads

    A triggerless tool imposes as a replicating server. It reads the binary logs of primary and secondary databases and applies the events of the original table to the temporary table. This process makes no changes to the original table and requires no stored procedure on the database to interfere with the writes to the temporary table. Therefore, no trigger-based overhead is generated.

  • Reduce lock contention

    The process of writing data to the temporary table is separated from that of writing data to the original table. This prevents lock contention. In addition, to logically avoid and reduce lock contention, data is copied to and updated in the temporary table at different time periods. This affects the efficiency in altering the table but significantly reduces database load.

  • Suspend or slow down threads at any time

    The decoupling feature allows you to suspend or slow down the threads that are used to acquire binary logs at any time. During peak hours or when the latency in synchronization between the primary and secondary databases is large, throttling can be enabled for the applications that are running. This helps avoid further issues.

  • Support reliability tests

    When you migrate data by using binary logs, you can perform online operations on the primary database or secondary database to achieve the same effects. This avoids resource contention or interference with online services. In addition, after you simulate the data change process in the secondary database, the original table and temporary table are not switched. This way, you can continue to verify the reliability by checking the data in both the original table and temporary table.

Despite the preceding benefits, the triggerless design has the following issues:

  • Complex code

    The trigger-based design uses databases and triggers to synchronize data. Tools play a small part in synchronizing data. The triggerless design uses binary logs to synchronize data. This method is flexible but complex. A triggerless tool must be registered as a replicating server, obtain the events of the original table, convert the events to SQL statements, and then execute the SQL statements to apply the events to the temporary table. Code of the triggerless tool incorporates handling methods for exceptions such as connection failures, replication latency, data type mismatch, inappropriate load on applications, and other uncontrollable exceptions. Therefore, the triggerless tool has a larger codebase and more complex logic for concurrency control.

  • High network traffic

    Compared with the trigger-based design where data is processed within a database, the triggerless design uses binary logs to capture changes to the original table and applies the recorded changes to the temporary table. This consumes traffic between hosts and occupies MySQL processes. In addition, to ensure robust and stable code logic, the code must be rigorously developed and bountiful tests must be carried out. Despite the preceding challenges, the triggerless design provides substantial benefits. For example, you can specify when to exchange the names of the temporary and original tables. You can manage the process of data replication and implement throttling.