DataWorks supports real-time synchronization. This topic describes how to create, configure, commit, and manage real-time sync nodes.
Prerequisites
Create a real-time sync node
Configure the real-time sync node
The operations that you can perform on the configuration tab of the real-time sync
node vary based on the synchronization method you selected.
- To configure the real-time sync node for which Sync Method is set to End-to-end ETL, perform the following steps:
- Double-click the real-time sync node. On the node configuration tab that appears,
click the Basic configuration tab in the right-side navigation pane. On the Basic configuration tab, select the
desired resource group from the Resource Group drop-down list.
No.  Description
1    The left-side navigation tree. This pane consists of the Input, Output, and Conversion sections.
2    The configuration canvas of the real-time sync node. You can drag components from the navigation tree to the canvas.
3    The property configuration pane of the real-time sync node. This pane appears after you click a node on the canvas or click the Basic configuration tab in the right-side navigation pane.
Notice: You must select a resource group before you commit the node. Otherwise, the system returns an error when you commit the node. Real-time sync nodes can be run only on an exclusive resource group for Data Integration. For more information, see Use exclusive resource groups for data integration.
- Drag components from the navigation tree to the canvas, and drag directed lines to connect the nodes on the canvas. Data will be synchronized from upstream nodes to downstream nodes based on the connection.
- Click each node. In the configuration pane that appears, set the required parameters
in the Node configuration section. For more information, see Supported data stores.
- Click the icon in the toolbar.
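The upstream-to-downstream flow created by the directed lines on the canvas can be pictured as a directed graph. The following is a minimal sketch of that idea, not DataWorks code: the component names and the `edges` list are hypothetical, and the function only illustrates that every upstream component must come before the components it feeds.

```python
from graphlib import TopologicalSorter, CycleError


def sync_order(edges):
    """Return components so that each upstream one precedes its downstream ones.

    `edges` is a list of (upstream, downstream) pairs, one per directed line.
    Raises CycleError if the connections form a loop.
    """
    sorter = TopologicalSorter()
    for upstream, downstream in edges:
        # The downstream component depends on its upstream component.
        sorter.add(downstream, upstream)
    return list(sorter.static_order())


# Hypothetical canvas: one input, one conversion, one output.
edges = [("mysql_input", "filter"), ("filter", "hologres_output")]
order = sync_order(edges)
print(order)
```

A loop in the connections (for example, two components that feed each other) raises `CycleError`, which is a simple way to spot a canvas that has no valid data flow.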
- To configure the real-time sync node for which Sync Method is set to Migration to Hologres, perform the following steps:
- Double-click the real-time sync node. On the node configuration tab that appears, click the Basic configuration tab in the right-side navigation pane. On the Basic configuration tab, select the desired resource group from the Resource Group drop-down list.
Notice: You must select a resource group before you commit the node. Otherwise, the system returns an error when you commit the node. Real-time sync nodes can be run only on an exclusive resource group for Data Integration. For more information, see Use exclusive resource groups for data integration.
- In the Data source section, set the Type and Data source parameters.
- In the Select the source table for synchronization section, select the tables to be synchronized in the SOURCE Table list and click the icon to move the tables to the Selected Source table list.
The SOURCE Table list displays all the tables in the source database. You can select all or some of the tables and synchronize them at the same time.
Notice: If a selected table does not have a primary key, the table cannot be synchronized in real time.
- Optional: In the Set synchronization rules section, click Add rule and select an option to configure naming rules for destination tables.
Supported options include Table name conversion rules and Target table name rule.
- Table name conversion rules: the rules for converting the names of source tables to the names of destination tables.
- Target table name rule: the rule for adding a prefix and suffix to the converted names of destination tables.
- Click Next Step.
- In the Set target table step, set the Target Hologres data source and Schema parameters.
- Click Reload source table and Hologres Table mapping to configure the mappings between the source tables and destination Hologres tables.
- Check the source and destination tables after the mappings are created, and click
Next Step.
No.  Description
1    The mapping progress between the source and destination tables.
     Note: The mapping may take a long time if a large number of source tables are to be synchronized.
2    The destination tables to which data is written. The tables can be existing tables or tables that are automatically created.
     Note: An error message appears if a selected source table does not have a primary key. The synchronization can still be performed if at least one of the selected source tables has a primary key. Source tables without primary keys are ignored during the synchronization.
3    The method of creating a destination table. The message that appears in the Hologres Table name column varies depending on the method that you select.
- If you select Create tables automatically, the Create tables automatically dialog box appears after you click Next Step. Click Start table building in the dialog box, and then click Close after the table is created. You can click the table name to view and modify the table creation statements.
- If you select Use existing Table, you must select a table from the drop-down list in the Hologres Table name column.
- In the Run resource settings step, set the Maximum number of connections supported by source read and Number of concurrent writes on the target side parameters, and then click the icon in the toolbar.
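The two naming rules in the Set synchronization rules section are applied in sequence: a source table name is first converted, and a prefix and suffix are then added to the converted name. The following sketch illustrates that order only. It assumes that a conversion rule can be expressed as a regular-expression replacement, which is an assumption and not something the console is documented to do; the function name, parameters, and sample names are hypothetical.

```python
import re


def destination_name(source_table, conversions=(), prefix="", suffix=""):
    """Derive a destination table name from a source table name.

    conversions: (pattern, replacement) pairs applied in order
                 (hypothetical stand-in for Table name conversion rules).
    prefix, suffix: added to the converted name
                 (hypothetical stand-in for Target table name rule).
    """
    name = source_table
    for pattern, replacement in conversions:
        name = re.sub(pattern, replacement, name)
    return f"{prefix}{name}{suffix}"


# Strip a leading "t_", then wrap the result with a prefix and suffix.
result = destination_name(
    "t_orders", conversions=[(r"^t_", "")], prefix="ods_", suffix="_rt"
)
print(result)  # ods_orders_rt
```

With no rules configured, the function returns the source name unchanged, which mirrors the fact that this step is optional.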
- To configure the real-time sync node for which Sync Method is set to Migration to MaxCompute, perform the following steps:
- Double-click the real-time sync node. On the node configuration tab that appears, click the Basic configuration tab in the right-side navigation pane. On the Basic configuration tab, select the desired resource group from the Resource Group drop-down list.
- In the Data source section, set the Type and Data source parameters.
- In the Select the source table for synchronization section, select the tables to be synchronized in the SOURCE Table list and click the icon to move the tables to the Selected Source table list.
The SOURCE Table list displays all the tables in the source database. You can select all or some of the tables and synchronize them at the same time.
Notice: If a selected table does not have a primary key, the table cannot be synchronized in real time.
- Optional: In the Set synchronization rules section, click Add rule and select an option to configure naming rules for destination tables.
Supported options include Table name conversion rules and Target table name rule.
- Table name conversion rules: the rules for converting the names of source tables to the names of destination tables.
- Target table name rule: the rule for adding a prefix and suffix to the converted names of destination tables.
- Click Next Step.
- In the Set target table step, select a connection from the Target MaxCompute (ODPS) data source drop-down list and click the icon next to MaxCompute (ODPS) time automatic partition settings. In the Edit dialog box, set the partition interval of tables in MaxCompute to day or hour.
- Click Reload source table and MaxCompute (ODPS) Table mapping to configure the mappings between the source tables and destination MaxCompute tables.
- Check the source and destination tables after the mappings are created, and click
Next Step.
No.  Description
1    The mapping progress between the source and destination tables.
     Note: The mapping may take a long time if a large number of source tables are to be synchronized.
2    The destination tables to which data is written. The tables can be existing tables or tables that are automatically created.
     Note: An error message appears if a selected source table does not have a primary key. The synchronization can still be performed if at least one of the selected source tables has a primary key. Source tables without primary keys are ignored during the synchronization.
3    The method of creating a destination table. The message that appears in the MaxCompute (ODPS) Table name column varies depending on the method that you select.
- If you select Create tables automatically, the Create tables automatically dialog box appears after you click Next Step. Click Start table building in the dialog box, and then click Close after the table is created. You can click the table name to view and modify the table creation statements.
- If you select Use existing Table, you must select a table from the drop-down list in the MaxCompute (ODPS) Table name column.
- In the Run resource settings step, set the Maximum number of connections supported by source read and Number of concurrent writes on the target side parameters, and then click the icon in the toolbar.
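The day or hour partition interval chosen in the Edit dialog box determines how finely synchronized records are grouped by time. The following sketch shows one way to picture the difference. The partition value formats (`YYYYMMDD` for day, `YYYYMMDDHH` for hour) and the function name are assumptions made for illustration; the actual partition columns and formats that DataWorks creates are not stated in this topic.

```python
from datetime import datetime

# Assumed formats, for illustration only.
_FORMATS = {"day": "%Y%m%d", "hour": "%Y%m%d%H"}


def partition_value(event_time, interval):
    """Return the partition value a record would fall into for the interval."""
    if interval not in _FORMATS:
        raise ValueError("interval must be 'day' or 'hour'")
    return event_time.strftime(_FORMATS[interval])


ts = datetime(2021, 3, 5, 14, 30)
print(partition_value(ts, "day"))   # 20210305
print(partition_value(ts, "hour"))  # 2021030514
```

Under these assumptions, an hourly interval produces up to 24 times as many partitions per table as a daily interval, which is worth weighing against how the data will be queried.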
- To configure the real-time sync node for which Sync Method is set to Migration to Datahub, perform the following steps:
- Double-click the real-time sync node. On the node configuration tab that appears, click the Basic configuration tab in the right-side navigation pane. On the Basic configuration tab, select the desired resource group from the Resource Group drop-down list.
- In the Data source section, set the Type and Data source parameters.
- In the Select the source table for synchronization section, select the tables to be synchronized in the SOURCE Table list and click the icon to move the tables to the Selected Source table list.
The SOURCE Table list displays all the tables in the source database. You can select all or some of the tables and synchronize them at the same time.
Notice: If a selected table does not have a primary key, the table cannot be synchronized in real time.
- In the Set synchronization rules section, click Add rule and then select an option to configure naming rules for destination topics.
Two options are supported: SOURCE table name and Topic conversion rules, and Target Topic rules.
- Click Next Step.
- In the Set target table step, select a connection from the Target DataHub data source drop-down list and then click Reload source table and DataHub Topic mapping to configure the mappings between the source tables and destination DataHub topics.
- Check the source tables and destination topics after the mappings are created, and
click Next Step.
No.  Description
1    The mapping progress between the source tables and destination topics.
     Note: The mapping may take a long time if a large number of source tables are to be synchronized.
2    The destination topics to which data is written. The topics can be existing topics or topics that are automatically created.
3    The method of creating a destination topic. The message that appears in the Topic column varies depending on the method that you select.
- If you select Create tables automatically, the Create tables automatically dialog box appears after you click Next Step. Click Start table building in the dialog box, and then click Close after the topic is created.
- If you select Use existing Topic, you must select a topic from the drop-down list in the Topic column.
- In the Run resource settings step, set the Maximum number of connections supported by source read and Number of concurrent writes on the target side parameters, and then click the icon in the toolbar.
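Because a table without a primary key cannot be synchronized in real time, it can help to check the candidate tables before selecting them. The following sketch assumes the primary key columns of each table have already been collected into a dictionary by some other means (for example, from the source database's metadata); the function name, the dictionary shape, and the sample tables are hypothetical.

```python
def split_by_primary_key(tables):
    """Split tables into those that can and cannot be synchronized in real time.

    tables: mapping of table name -> list of primary key column names
            (an empty list means the table has no primary key).
    Returns (syncable, not_syncable), each sorted by table name.
    """
    syncable = sorted(name for name, pk in tables.items() if pk)
    not_syncable = sorted(name for name, pk in tables.items() if not pk)
    return syncable, not_syncable


# Hypothetical source tables.
candidates = {
    "orders": ["order_id"],
    "audit_log": [],
    "order_items": ["order_id", "line_no"],
}
ok, skipped = split_by_primary_key(candidates)
print(ok)       # ['order_items', 'orders']
print(skipped)  # ['audit_log']
```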