[MySQL]When Mixed format + DML limit collide
Created#More Posted time:Sep 1, 2016 13:38 PM
1. MySQL V5.6.16
2. DB schema (Cascade) M --> New_M (log_slave_update) --> S
3. I believe that DB splitting, server room moving and DB upgrading will all use a similar schema.
4. Here comes the question: during recent data verification, I found there is more data in New_S than M and New_M. Isn’t it strange?
1. How is data verification realized? Through pt-table-checksum, or a self-composed comparison program.
2. After verification, there are 300 records in Table A on M, S and New_M instances, but there are 500 records on New_S and Bak_etl instances.
Why does the New_M instance have less data than New_S?
In such cases, we can only compare the binlog. But the binlog data size is so big, how can be checked?
In concrete analysis, the data in M, S and New_M instances are the same, indicating well-performing data synchronization between them.
New_S and Bak_etl have the same data, indicating a well-performing binlog between them.
So that leaves the M instance binlog, and it is inconsistent with the New_M binlog.
In other words, N records are deleted using the delete statement on M instance, but only delete N-200 records are logged in New_M binlog. As a result, the slave database contains 200 more records than the master database.
Next let’s verify it.
It is obvious that New_M has much fewer binlog events than M does. Some binlog records (starting from 214568151) are all lost.
Now the problem has been identified. But since SQL statements have been forcefully converted to the ROW mode, how can we locate the SQL statement?
Thankfully we have Performance Schema, a new feature of MySQL5.6, and the problematic SQL statements can be quickly located.
Now we have a few questions:
1) Why is binlog missing?
2) In what circumstances will it be missing?
We conducted repeated tests on the several questions and drew the following conclusion.
• Why is binlog missing?
Condition 1: When the binlog format is set as mixed, general copy works. But cascading copy may cause binlog loss in some special cases.
Condition 2: When there is a large number of data (around 4 million records) scanning updates, deletes and inserts and when indeterminate DML statements exist (such as delete from table where data<'N' limit).
When both Condition 1 and Condition 2 are met, the problem will occur. The above cases are the evidence.
Next, let’s test our theory. This time, we are not using delete + limit, but insert into select instead.
It is strongly recommended that you set MySQL binlog format to the ROW mode to ensure master-slave data consistency.
set global binlog_format=ROW;
Disable indeterminate DML statements, especially DML + limit with no order by.