All Products
Search
Document Center

DataWorks:Configure and manage a real-time synchronization node

Last Updated:Jan 15, 2024

After you prepare data sources, network environments, and resources, you can create a real-time synchronization node to synchronize data to AnalyticDB for MySQL. This topic describes how to create a real-time synchronization node and view the status of the node.

Prerequisites

  1. The data sources that you want to use are prepared. Before you configure a data synchronization node, you must prepare the data sources from which you want to read data and to which you want to write data. This way, when you configure a data synchronization node, you can select the data sources. For information about the data source types, readers, and writers that are supported by real-time synchronization, see Data source types that support real-time synchronization.

    Note

    For information about the items that you need to understand before you prepare a data source, see Overview.

  2. An exclusive resource group for Data Integration that meets your business requirements is purchased. For more information, see Create and use an exclusive resource group for Data Integration.

  3. Network connections are established between the exclusive resource group for Data Integration and the data sources. For more information, see Establish a network connection between a resource group and a data source.

  4. The data source environments are prepared. You must create an account that can be used to access a database in the source and an account that can be used to access a database in the destination. You must also grant the accounts the permissions required to perform specific operations on the databases based on your configurations for data synchronization. For more information, see Overview.

Limits

  • You can use only exclusive resource groups for Data Integration to run real-time synchronization nodes.

  • You can use a real-time synchronization node to synchronize data to an AnalyticDB for MySQL data source only from a PolarDB, MySQL, or ApsaraDB for OceanBase data source.

  • A real-time data synchronization node cannot be used to synchronize data from a table that has no primary key.

Create a real-time data synchronization node

  1. Create a real-time synchronization node to synchronize all data in a database.

  2. Configure an exclusive resource group for Data Integration.

  3. Configure the source and synchronization rules.

    1. In the Data Source section of the Configure Source and Synchronization Rules step, configure the Type and Data source parameters.

    2. Select the tables from which you want to read data.

      In the Source Table section, all tables in the selected data source are displayed in the Source Table list. You can select all or some tables from the Source Table list and click the Icon icon to move the tables to the Selected Source Table list.

      Important

      If a selected table does not have a primary key, the table cannot be synchronized in real time.

    3. In the Mapping Rules for Table Names section, click Add Rule, select a rule type, and then configure a mapping rule of the selected type.

      By default, data in the source tables is written to the destination tables that are named the same as the source tables. You can also configure mapping rules to specify the names of the destination tables to which you want to write data. Data Integration allows you to use a regular expression to configure a mapping rule to specify the names of the destination tables. You can also concatenate built-in variables to specify the names of the destination tables. You can configure a mapping rule to synchronize data from multiple tables in the source to the same table in the destination. You can also configure a mapping rule to synchronize data from source tables whose names start with a specified prefix to the destination tables whose names start with another specified prefix. For more information about the configuration logic, see Configure the source and synchronization rules.

  4. Configure the destination tables.

    1. In the Set Destination Table step, configure the Target AnalyticDB for MySQL 3.0 data source parameter.

    2. Confirm the setting of the Write Mode parameter.

      The Write Mode parameter supports only Replay. If a data record is inserted into a source table, the same data record is also inserted into the related destination table. If data records in a source table are updated or deleted, the same data records are also updated in or deleted from the related destination table.

    3. Refresh mappings between the source tables and destination tables.

      You can click Refresh source table and AnalyticDB for MySQL 3.0 Table mapping to map the source tables and destination tables based on the mapping rules that you configured in the Mapping Rules for Table Names section. If no mapping rule is configured in the Mapping Rules for Table Names section, data in the source tables is written to the destination tables that are named the same as the source tables. If no such destination table exists, the system automatically creates the tables in the destination. You can also modify the table generation method and add additional fields to the destination tables.

      Note

      The names of the destination tables are automatically generated based on the mapping rules that you configured in the Mapping Rules for Table Names section.

      Operation

      Description

      Synchronize a source table that does not have a primary key

      The current solution cannot be used to synchronize data from a source table that does not have a primary key. If you want to synchronize data from a source table that does not have a primary key, you must click the Edit icon in the Synchronized Primary Key column of the table to specify a primary key for the table. You can use a field or a combination of multiple fields as the primary key of the source table. The system removes duplicate data based on the primary key during data synchronization.

      Select a table generation method

      You can select Create Table or Use Existing Table from the drop-down list in the Table creation method column. If you select Create Table, the name of the table that is automatically created appears in the AnalyticDB for MySQL 3.0 Table column. If you select Use Existing Table, you can select a destination table from the drop-down list in the AnalyticDB for MySQL 3.0 Table column.

      Specify a table type

      In the Table Type column, you can view and modify the destination table type. For more information about partitioned tables and dimension tables, see Terms.

      Select a distribution key

      You can click the Edit icon in the Distribution Field Column to select an appropriate distribution key for the destination table. By default, the primary key of a source table is used as the distribution key of the related destination AnalyticDB for MySQL 3.0 table.

      Note

      The distribution field is the distribution key of the destination AnalyticDB for MySQL 3.0 table. It is used to specify the distribution of the data in the destination AnalyticDB for MySQL 3.0 table based on the hash value of the distribution field.

      Add additional fields to a destination table and assign values to the fields

      You can click Edit additional fields in the Actions column of a destination table to add additional fields to the table and assign values to the fields. You can assign constants and variables to the additional fields as values.

      Note

      You can add additional fields to a destination table only if you select Create Table from the drop-down list in the Table creation method column of the table.

      The following additional variables are supported by Data Integration:

      EXECUTE_TIME: the execution time
      UPDATE_TIME: the update time
      DB_NAME_SRC: the name of the original database
      DB_NAME_SRC_TRANSED: the converted name of the database
      DATASOURCE_NAME_SRC: the name of the source
      DATASOURCE_NAME_DEST: the name of the destination
      DB_NAME_DEST: the name of the destination database
      TABLE_NAME_DEST: the name of the destination table
      TABLE_NAME_SRC: the name of the source table

      Edit a destination table

      The destination table schema is generated based on the source table schema, and field types may be converted during data synchronization. If the data type of a field in the source is not supported in the destination, when the synchronization node creates the destination table, the synchronization node automatically converts the data type of the field in the source to a data type supported in the destination. You can click the destination table name in the AnalyticDB for MySQL 3.0 Table Name column to modify field type mappings of the destination table.

    4. Click Next.

      If you select Create Table from the drop-down list in the Table creation method column, you must click Start table building in the Create tables automatically dialog box to create destination AnalyticDB for MySQL tables.

  5. Configure rules for processing DDL messages.

    DDL operations are performed on a source. Data Integration provides default rules to process DDL messages. You can also configure processing rules for different DDL messages based on your business requirements. For more information, see Rules for processing DDL messages.

  6. Configure the resources required to run the data synchronization node.
    1. In the Configure Resources step, configure the parameters.
      ParameterDescription
      Maximum number of connections supported by source readThe maximum number of Java Database Connectivity (JDBC) connections that are allowed for the source. Configure this parameter based on the resources of the source database. Default value: 15.
      Maximum number of parallel threads allowed to read by destinationThe maximum number of parallel threads that the synchronization node uses to read data from the source table or write data to the destination. Maximum value: 32. Specify an appropriate number based on the specifications of the exclusive resource group for Data Integration and the data write capabilities of the destination.
    2. Click Complete Configuration.

Commit and deploy the real-time synchronization node

Commit and deploy the node.

  1. Click the Save icon in the top toolbar to save the node.

  2. Click the Submit icon in the top toolbar to commit the node.

  3. In the Commit Node dialog box, configure the Change description parameter.

  4. Click OK.

If you use a workspace in standard mode, you must deploy the node in the production environment after you commit the node. On the left side of the top navigation bar, click Deploy. For more information, see Deploy nodes.

What to do next

After the real-time synchronization node is configured, you can start and manage the node on the Real Time DI page in Operation Center. To go to the Real Time DI page, perform the following operations: Log on to the DataWorks console and go to the Operation Center page. In the left-side navigation pane of the Operation Center page, choose RealTime Task > RealTime DI. For more information, see O&M for real-time synchronization nodes.