Elizabeth
Engineer
Engineer
  • UID625
  • Fans4
  • Follows1
  • Posts68
Reads:2768Replies:0

[MySQL]Explore MySQL replication filters

Created#
More Posted time:Sep 5, 2016 11:10 AM
I. Background
Suppose we have such a request: the master instance has four databases: Database A, Database B, Database C and Database D. For some reason, we need to split Database B and Database C from the instance and put them onto another instance.  If you are handling the request, what would you do?  The common approach is: set up an instance with only Database B and Database C, then replicate Database B and Database C from the master instance and filter out Database A and Database D. That is how replication filters come into being, such as replicate-*-do-DB/table.
II. Why
To structure such an environment (for replicating Database B and Database C from the master instance only), some may make the following configurations in my.cnf file:
--replicate-do-db=B,C

Conventional wisdom deems it right. Unfortunately, a disaster has been harbored.
Official Documentation:
Warning
To specify multiple databases you must use multiple instances of this option. Because database names can contain commas, if you supply a comma separated list then the list will be treated as the name of a single database.


MySQL will regard B,C a database name, instead of two databases.
However, there is far more than only one replication filter argument here. It actually involves:
--replicate-do-db
--replicate-ignore-db
--replicate-do-table
--replicate-wild-do-table
--replicate-ignore-table
--replicate-wild-ignore-table


The arguments can be put in combinations at will, and different combinations have different meanings. To further understand their direct relationship, let’s dive a bit deeper.
III. Database-level replication flowchart
Attention 1: Database-level rules only apply to binlog_format=‘STATEMENT or MIXED’
Attention 2: If binlog_format=‘ROW’, database-level rules do not apply, but table-level rules apply.



IV. Table-level replication flowchart


V. Key points
Note:  All the following tests and conclusions are made on the premise that row_format=’MIXED’.
• In DB level, when binlog-format=statement, filtering is dominated by use DB statements (cross-database not allowed). When binlog_format=‘ROW’, filtering is not dominated by use DB statements (cross-database allowed)
• However, in table level, no matter whether binlog format = statement or = rows, table-level replication filtering is not dominated by use DB statements (cross-database allowed).
• In general, first, identify whether it is DB-level replication, if yes, exit the operation. If DB-level judgment completes and the operation is not exited, judge the table-level.
• In the DB-level flow, if replicate-do-db options exist, it will judge replicate-do-db and won’t go through replicate-ignore-db.  If replicate-do-db is judged to be matching the condition, it moves to the table-level.  Otherwise, the operation is existed.
• In the DB-level flow, if no replicate-do-db options are detected, but replicate-ignore-db exists,  the flow will be: if the replicate-ignore-db rule is observed, exit the operation. Otherwise, it moves to table-level for further judgment.
• **In table-level flow, the judgment logic order is (from top to bottom): replicate-do-table -> replicate-ignore-table -> replicate-wild-do-table -> replicate-wild-ignore-table **
• In the table-level flow, from the first stage (replicate-do-table), if the replicate-do-table judgment rule is observed, the operation exits. Otherwise, the flow jumps to the next step (replicate-ignore-table).  The rest can be done in the same manner until the last step (replicate-wild-ignore-table). If none of them meet the condition, judge whether replicate-do-table or replicate-wild-do-table exists. If yes, ignore & exit the operation. If no, execute & exit the operation.
VI. Test
Note:  All the following tests and conclusions are made on the premise that row_format=’MIXED’.
6.1 Special cases
Note: All the following tests are based on the statement format.  For the rows format, the principles can be proved in the same way. I will not illustrate it here.
• Case 1: Set replicate_do_DB=A,B


Result: Replication of Database A and Database B is not executed on the slave instance. Because MySQL treats ‘A,B’ as a database name.
6.2 Database-level rules
Note:
• do-db
replicate_do_DB=A
replicate_do_DB=B


• ignoare-db
replicate_ignore_DB=A
replicate_ignore_DB=B


• do-db & ignore-db
replicate_do_DB=A
replicate_do_DB=B
replicate-ignore-db=mysql
replicate-ignore-db=test


6.3 Table-level rules
• do-table
• ignore-table
• wild-do-table
• wild-ignore-table
• do-table & ignore-table
• do-table & wild-ignore-table
• wild-do-table & wild-ignore-table
6.4 Mixed database-level and table-level rules
• do-DB & do-table
replicate_do_DB=A
replicate_do_DB=B
replicate-do-table=table1
replicate-do-table=table2


• do-DB & wild-do-table
replicate_do_DB=A
replicate_do_DB=B
replicate-wild-do-table=mysql.%
replicate-wild-do-table=test.%


• do-DB & ignore-table
replicate_do_DB=A
replicate_do_DB=B
replicate-ignore-table=table1
replicate-ignore-table=table2


• do-DB & wild-ignore-table
replicate_do_DB=A
replicate_do_DB=B
replicate-wild-ignore-table=mysql.%
replicate-wild-ignore-table=test.%


• Most common scenarios: db-db & do-ignore-db & wild-do-table & wild-ignore-table
* Common scenarios: Split Database A and Database B on the master instance to a new instance.
* Feature:
    1) The slave instance won’t replicate the MySQL test database of the master instance.
    2) The slave instance only replicates all the operations to Database A and Database B on the master instance.

replicate-ignore-db=mysql
replicate-ignore-db=test
replicate-wild-ignore-table=mysql.%
replicate-wild-ignore-table=test.%

replicate_do_DB=A
replicate_do_DB=B
replicate-wild-do-table=A.%
replicate-wild-do-table=B.%

* Misunderstandings:
    1) If my default database is neither Database A nor Database B, the following operations won’t be executed on the slave instance, hence the tragedy.
    master> use C;insert into A.id values(1);

    2) So the above configurations in the .cnf file only apply to the case where the default database is either Database A or Database B.

* The configuration should go like this [Premise: developers have no permission to log in to MySQL test database]:

replicate-ignore-db=mysql
replicate-ignore-db=test
replicate-wild-ignore-table=mysql.%
replicate-wild-ignore-table=test.%

replicate-wild-do-table=A.%
replicate-wild-do-table=B.%


• **Practice: wild-do-table & ignore-table & wild-ignore-table **
Request:  Migrate a database from the old server to a new instance

old_master [Database:  A, B , mysql] ---->(Sync) new_master[A]

        * Verify the correctness of Database (A) replication:  Rule => slave: Only replicate Database A, do not replicate Database B
            my.cnf
                Replicate_Wild_Do_Table: A.%
                Replicate_Ignore_DB: mysql
                Replicate_Wild_Ignore_Table: mysql.%

        A) use A/B; insert A.a select B.b from B ;  --err: Synchronization error. No content of Database B on the slave instance
        b) use A/B; insert A.a select B.b from A,B where A.b=B.b;   --err: Synchronization error. No content of Database B on the slave instance
        c) use mysql; insert into A.a values('a'); --err:  No synchronization errors, but the binlog of the old master instance is not executed on the slave instance, because Replicate_Ignore_DB:  mysql,Replicate_Wild_Ignore_Table:  mysql.%

        d) use a blank database; insert into A.a values('a');  --ok:  to replicate the log.
        e) use B;   insert into A.a values('a');  --ok:  to replicate the log.


There are too many combinations to name all of them.
Concluding remarks
Numerous scenarios and combinations can be derived following the above principles. As long as the principles are well understood, it is easy.
Guest