You can use MySQL Binlog Reader to read data from tables in your MySQL database in real time by subscribing to binary logs (binlogs). This topic describes how to configure MySQL Binlog Reader. This topic also describes the network environment and permissions that you must prepare before you configure MySQL Binlog Reader.
- Prepare the data source: A MySQL data source is created.
- Plan and prepare resources: An exclusive resource group for Data Integration is purchased and configured. For more information, see Plan and configure resources.
- Evaluate and plan the network environment: Before you add the data source to DataWorks,
connect the data source to an exclusive resource group for Data Integration based
on your business requirements. After you connect the data source to the exclusive
resource group for Data Integration, configure network access settings such as a vSwitch
and a whitelist.
- If data sources and exclusive resource groups for Data Integration reside in the same region and virtual private cloud (VPC), they are automatically connected.
- If data sources and exclusive resource groups for Data Integration reside in different network environments, you must connect data sources and resource groups by using methods such as a VPN gateway.
- Prepare an account and grant permissions to the account:
You must create an account that can be used to access the data sources, read data from the source, and write data to the destination during the data synchronization process.For more information, see Create and authorize an account.
- Enable the MySQL binary logging feature. This feature is required only for real-time
data synchronization. For more information about real-time data synchronization, see
Overview of real-time synchronization.
If the source data source is a MySQL database, you must enable the binary logging feature. Binary logs record changes to all table schemas and modifications on table data. You can execute statements such as CREATE and ALTER to perform operations on table schemas. You can execute statements such as INSERT, UPDATE, and DELETE to perform operations on table data. You can use binary logs to view the change history of the database, back up incremental data and restore data in the database, and replicate data from the primary database to secondary databases.Formats of binary logs:For more information about how to enable the binary logging feature, see Enable the binary logging feature.
- Statement: SQL statement-based replication. Binary logs in this format record the SQL statements that are executed to modify data entries.
- Row: row-based replication. Binary logs in this format record only modification details about data entries in rows.
- Mixed: replication in mixed mode. This mode combines the statement and row formats. In most cases, binary logs in the statement format are used to record the SQL statements that are executed to modify data entries, such as functions. If data replication from the primary database to secondary databases cannot be implemented by using binary logs in this format, switch to the row format. MySQL determines which format to use based on each SQL statement that is executed.
- Data Integration does not support data synchronization from a read-only database by using MySQL Binlog Reader.
- MySQL Binlog Reader reads data in your MySQL database in real time by subscribing
to binlogs. MySQL Binlog Reader supports data synchronization only from ApsaraDB RDS
8.x. MySQL Binlog Reader does not support data synchronization from Distributed Relational Database Service (DRDS).
- Go to the DataStudio page.
- Log on to the DataWorks console.
- In the left-side navigation pane, click Workspaces.
- Select the region where the required workspace resides, find the workspace, and then click Data Analytics.
- Move the pointer over the icon and choose .Alternatively, you can click the required workflow, right-click Data Integration, and then choose .
- In the Create Node dialog box, set the Node Name and Location parameters.Notice The node name must be 1 to 128 characters in length. It can contain letters, digits, underscores (_), and periods (.).
- Click Commit.
- On the configuration tab of the real-time sync node, choose and drag MySQL Binlog to the canvas on the right.
- Click the MySQL Binlog component node. In the panel that appears, set the parameters in the Node configuration section.
Parameter Description Data source The MySQL Binlog data source that you have added. You can select only a MySQL Binlog data source.
If no data source is available, click New data source to add one on the Data Source page.
Table The name of the table from which you want to read data. You can click Data preview on the right to preview the selected table.In the database sharding scenario, MySQL Binlog Reader can read data from multiple tables and databases in real time.Notice To prevent errors, make sure that the tables use the same schema. Output field The fields that you want to synchronize. Valid values:
- Manage fields: the additional fields that are automatically added when the fields in the source tables are synchronized to the destination. These fields facilitate data management, sorting, and deduplication.
- Data fields: the fields in the source tables that you want to synchronize.
For more information, see Fields used for real-time synchronization.MySQL Binlog Reader supports the sharding feature. To use this feature, click Add a database and table data source and select the data sources and tables from the drop-down lists to synchronize data from multiple tables and databases.Notice To prevent errors, make sure that the selected tables use the same schema.
- Click the icon in the toolbar.