All Products
Search
Document Center

Appendix to changing schemas without locking tables

Last Updated: Jul 29, 2020

Data Management Service (DMS) supports changing schemas without locking tables. This helps avoid service unavailability or failure caused by tables being locked.
For example, MySQL earlier than 5.6 does not support online data definition language (DDL) operations. In most cases, to alter a table, you need to lock the table, create a temporary copy of the table, alter the temporary copy, copy data from the original table to the temporary copy, exchange the names of the temporary copy and the original table, and then unlock the original table.A table larger in size will be locked for a longer period of time. Consequently, it is extremely difficult to alter a table in an online environment.
MySQL 5.6 or later supports online DDL operations, allowing most ALTER TABLE operations to be performed without locking tables. However, some operations, such as adding or deleting columns and changing column data types, still need to be performed on a copy of a table that you want to alter.
Are there any other plans for changing schemas of large tables without locking them?

Comparison of plans

We recommend that you do not perform certain DDL operations directly on tables in an online environment. The following are several common ways to alter tables without affecting business:

  • Alter tables during off-peak hours.

    To alter tables without affecting business, you need to specify a long enough period of time during off-peak hours. If an excessively large table fails to be altered within the specified period of time, your business will be affected.

  • Alter tables in the secondary database and then perform a switchover between the primary and secondary databases.

    You need a primary database and a secondary database and you need to switch them in an appropriate period of time.

  • Use online schema change tools, such as pt-online-schema-change, OnlineSchemaChange, and gh-ost.

    These three tools perform DDL operations on copies of tables so that tables can be altered without being locked. In this way, the read/write operations on the tables will not be blocked.

  • pt-online-schema-change: Assume that you want to alter Table A. This tool works by creating Table A_gst, which is an empty copy of Table A, performing DDL operations on Table A_gst, and then copying data from Table A to Table A_gst.This tool creates a data manipulation language (DML) trigger on Table A. During the process of copying data from Table A to Table A_gst,the DML trigger will capture incremental changes to Table A and synchronously apply the changes to Table A_gst. When the tool finishes copying data to Table A_gst,it uses a RENAME TABLE operation to simultaneously rename the original and temporary tables.

  • OnlineSchemaChange: This tool works basically the same way as the pt-online-schema-change tool does.
    The difference is that this tool adopts an asynchronous design. A log table is created based on Table A_gst.Changes to data in Table A that are captured by the DML trigger will be recorded in the log table. Background processes will apply the recorded changes in the log table to Table A_gst.The entire process is asynchronous. Therefore, you can customize the process of applying the changes to the destination table.
  • gh-ost: This tool works basically the same way as the preceding two tools do. However, this tool does not use triggers.Instead, this tool uses binlogs to capture table changes. It reads changes to Table A that are recorded in the binlogs andanalyzes and applies the changes to Table A_gst. The binlogs include only changes to Table A.Therefore, these changes can be obtained and applied to Table A_gst more easily.

Issues with trigger-based design

Trigger-based tools have relatively simple code logic. Triggers are responsible for most data processing operations, such as implicit processing and data type conversion. This simplifies the complex process of real-time table migration. However, trigger-based tools may cause the following issues.

  • Increased database overhead

    A trigger is a stored procedure. With business continuously requiring more DML operations, the execution overhead of triggers will increase, especially in peak hours.

  • Intensified lock contention

    A trigger groups operations on the source and destination tables in the same transaction. Therefore, concurrent operations on the two tables try to acquire the lock on the same object, intensifying lock contention. In trigger-based design, full data replication and incremental data change must be performed concurrently, which undoubtedly intensifies lock contention.

  • Inability to suspend triggers

    In trigger-based design, triggers are always operating and cannot be suspended. Even when the server becomes busy, the secondary database cannot keep up with the updates occurring in the primary database, or exceptions occur, triggers cannot be canceled at any stage during the schema change process. Forced cancellation of triggers will interrupt the schema change process or cause data loss, leading to inaccurate data in Table A_gst. Although the pt-online-schema-change and OnlineSchemaChange tools both support the notion of throttling, the throttling feature designed for trigger-based tools is largely limited. For example, the process of copying data from a table can be suspended or slowed down, but the trigger is still operating and causing overhead.

  • Inability to test reliability

    When experimenting with a solution, you may want to know in advance how much time a task will take. A trigger-based solution allows you to create and simulate triggers on a secondary database, provided that the solution uses statement-based replication. With row-based replication, there is no option at all to use triggers on a secondary database, because the triggers only execute on the primary database and the secondary database directly applies the changes. Even with statement-based replication, because the MySQL replication replay process is single-threaded on a primary database, you cannot get a reliable representation of the replication process as it would have executed on the primary database. Therefore, concurrency and locking issues cannot be verified or tested.

Triggerless design

The biggest advantage of triggerless design is the decoupling of database load. With triggers, either synchronous or asynchronous, every DML operation on the source table is synchronized to the destination table. Triggerless design decouples these two tables. That is, writes to the destination table and writes to the source table are unrelated. This design will perfectly resolve several issues caused by trigger-based design.

  • Reduce database overhead.

    A triggerless tool imposes as a replicating server. It reads binlogs of primary and secondary databases and applies events of the source table to the destination table. This process does not affect the source table or require any stored procedure on the database to interfere with the writes to the destination table. Therefore, the problem of triggers causing overhead does not exist.

  • Reduce lock contention.

    Writes to the source table and writes to the destination table are unrelated. Therefore, lock contention does not exist. Regarding the lock contention that occurs during the processes of copying data from the source table to the destination table and updating the destination table, the cross-execution method is used to logically avoid and reduce the lock contention. Although this method may affect the efficiency of altering tables, it obviously reduces the database load.

  • Suspend or slow down threads at any time.

    The threads that are used to acquire binlogs can be suspended or slowed down at any time. When the system is busy or the replication lag between the primary and secondary databases is large, throttling can be enabled for applications that are currently running. This helps avoid further issues.

  • Support reliability testing.

    Another advantage of binlog-based migration is that there is no difference between performing online operations on the primary and secondary databases. This avoids resource contention or interference with online business. On the other hand, after data changes are simulated on the secondary database, the source table and destination table are not switched. The simulation is followed by a checksum of the entire table data, on both source and destination tables, to continuously verify the reliability. However, there is no perfect one-size-fits-all solution for all scenarios. Triggerless design also has some disadvantages.

Issues with triggerless design

  • Complex code

    In trigger-based design, most tasks are performed by the database and triggers are responsible for synchronization. Therefore, the role of tools themselves is relatively small. Triggerless design relies on binlogs. This brings about high flexibility but causes great complexity. A triggerless tool needs to register as a replicating server, obtain events of a source table, and convert the events to SQL statements to apply the events to a destination table. Code of triggerless tools needs to incorporate handling methods for exceptions such as connection failures, replication lag, data type mismatch, inappropriate load on applications, and other uncontrollable exceptions. Therefore, triggerless tools have a larger codebase and more complicated concurrency control logic.

  • High network traffic

    Compared with trigger-based design where data is processed within a database, triggerless solutions require tools to use binlogs to capture changes to the source table and apply the recorded changes to the destination table. This will generate traffic between hosts and consume MySQL processes. To ensure the robustness and stability of the complex code, rigorous algorithm logic and a complete set of test cases are required. Despite higher requirements, the complex code of triggerless tools brings more additional features. For example, you can specify when to exchange the names of the temporary and original tables, when to copy tables, or implement throttling for synchronizing incremental changes.

Summary

Table locking caused by changing schemas can cause great damage to business. To resolve this issue, you can use triggerless tools to change schemas of large tables without locking the tables. Triggerless design is not perfect, but its benefits are obvious.