This topic describes how to use the codeless user interface (UI) to configure a batch synchronization node that is periodically scheduled and how to commit and deploy the node.
Prerequisites
- The required data sources are configured. Before you configure a batch synchronization node, you must configure the data sources that you want to read data from and write data to, so that you can select them when you configure the node. For information about the data source types, Reader plug-ins, and Writer plug-ins that are supported by batch synchronization, see Supported data source types, Reader plug-ins, and Writer plug-ins.
  Note: For information about the items that you must understand before you configure a data source, see Overview.
- An exclusive resource group for Data Integration that meets your business requirements is purchased. For more information, see Create and use an exclusive resource group for Data Integration.
- Network connections between the exclusive resource group for Data Integration and the data sources are established. For more information, see Establish a network connection between a resource group and a data source.
Go to the DataStudio page
- Log on to the DataWorks console.
- In the left-side navigation pane, click Workspaces.
- In the top navigation bar, select the region where the desired workspace resides. On the Workspaces page, find the workspace and click DataStudio in the Actions column. The DataStudio page appears.
Procedure
- Step 1: Create a batch synchronization node
- Step 2: Configure the batch synchronization node
- Establish network connections between the exclusive resource group for Data Integration and the data sources
- Select the tables from which you want to read data and the tables to which you want to write data, and specify a filter condition when you configure the source
- Configure field mappings
- Configure channel control policies, such as the maximum transmission rate and settings for dirty data records
- Configure scheduling properties for the batch synchronization node
- Step 3: Commit and deploy the batch synchronization node
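To make the Step 2 substeps more concrete, the following Python sketch illustrates what a field mapping and a dirty data limit mean conceptually. It is not DataWorks code and does not call any DataWorks API: `map_fields_by_name`, `sync_rows`, `is_valid`, and `max_dirty_records` are hypothetical names chosen for illustration, and the assumption that a node fails once the number of dirty data records exceeds a limit should be checked against the topic for the Reader or Writer plug-in that you use.

```python
from __future__ import annotations

from typing import Callable


def map_fields_by_name(source_fields: list[str], dest_fields: list[str]) -> dict[str, str]:
    """Pair each source field with the destination field that has the same name."""
    dest = set(dest_fields)
    return {name: name for name in source_fields if name in dest}


def sync_rows(
    rows: list[dict],
    mapping: dict[str, str],
    is_valid: Callable[[dict], bool],
    max_dirty_records: int,
) -> tuple[list[dict], int]:
    """Apply the mapping to each row and stop once too many rows are dirty."""
    written: list[dict] = []
    dirty = 0
    for row in rows:
        # Keep only the mapped fields, renamed to their destination names.
        out = {dst: row.get(src) for src, dst in mapping.items()}
        if is_valid(out):
            written.append(out)
            continue
        # A row that cannot be written counts as a dirty data record (assumption).
        dirty += 1
        if dirty > max_dirty_records:
            raise RuntimeError(f"dirty data limit exceeded: {dirty} > {max_dirty_records}")
    return written, dirty


# Hypothetical field names and rows, for illustration only.
mapping = map_fields_by_name(["id", "name", "legacy_flag"], ["id", "name", "created_at"])
rows = [{"id": 1, "name": "a"}, {"id": None, "name": "b"}]
written, dirty = sync_rows(rows, mapping, lambda r: r["id"] is not None, max_dirty_records=1)
print(mapping)
print(written, dirty)
```

In this sketch, fields without a same-name counterpart are left unmapped, and the second row is counted as dirty without failing the run because the limit of one record is not exceeded.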
Step 1: Create a batch synchronization node
- Create a workflow. For more information, see Manage workflows.
- Create a batch synchronization node. You can use one of the following methods to create a batch synchronization node:
- Method 1: Log on to the DataWorks console. In the left-side navigation pane, click Workspaces. On the Workspaces page, find the workspace in which you want to create a batch synchronization node and click DataStudio in the Actions column. In the Scheduled Workflow pane of the DataStudio page, find the created workflow and click its name. Right-click Data Integration and choose .
- Method 2: Log on to the DataWorks console. In the left-side navigation pane, click Workspaces. On the Workspaces page, find the workspace in which you want to create a batch synchronization node and click DataStudio in the Actions column. In the Scheduled Workflow pane of the DataStudio page, find the created workflow and double-click its name. In the Data Integration section of the workflow editing tab that appears, click Batch Synchronization.
- In the Create Node dialog box, configure the parameters to create a batch synchronization node.
Step 2: Configure the batch synchronization node
- Establish network connections between the exclusive resource group for Data Integration and the data sources. Select the source, destination, and exclusive resource group for Data Integration, and establish network connections between the resource group and the data sources.
- You can use a batch synchronization node to synchronize data from tables in sharded databases to a single table. For more information, see Scenario: Configure a batch synchronization node to synchronize data from tables in sharded databases.
- If network connections between the exclusive resource group for Data Integration and the data sources cannot be established, configure network connectivity as prompted. For more information, see Establish a network connection between a resource group and a data source.
Important: The items that you must configure vary based on the Reader or Writer plug-in. The following tables describe the common configuration items that are required when you configure a batch synchronization node. For information about the configuration items supported by a Reader or Writer plug-in and how to configure the items, see the topic for the related Reader or Writer plug-in. For more information about the data source types, Reader plug-ins, and Writer plug-ins that are supported by batch synchronization, see Supported data source types, Reader plug-ins, and Writer plug-ins.
- Click Next Step to configure the source and destination for the batch synchronization node.
- Click Next Step to configure scheduling properties for the batch synchronization node. If you want DataWorks to periodically schedule your batch synchronization node, you must configure scheduling properties for the node. This substep describes how to configure scheduling properties for a batch synchronization node. For information about how to use scheduling parameters, see Description for using scheduling parameters in data synchronization.
- Configure scheduling parameters: If you use variables in the configurations of the batch synchronization node, you can assign scheduling parameters to the variables as values.
- Configure time properties: The time properties define how the batch synchronization node is scheduled in the production environment. You can configure attributes such as the instance generation mode, scheduling type, and scheduling cycle for the node.
- Configure the resource property: The resource property defines the exclusive resource group for scheduling that is used to issue the batch synchronization node to the related exclusive resource group for Data Integration. You can select the exclusive resource group for scheduling that you want to use.
  Note: DataWorks uses resource groups for scheduling to issue batch synchronization nodes in Data Integration to resource groups for Data Integration and uses the resource groups for Data Integration to run the nodes. You are charged for using the resource groups for scheduling to schedule batch synchronization nodes. For more information about the node issuing mechanism, see Mechanism for issuing nodes.
- Click Complete Configuration.
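The idea of assigning scheduling parameters to variables can be sketched in a few lines of Python. This is only a conceptual illustration, not how DataWorks resolves parameters internally. It assumes a variable named `bizdate` that appears in a filter condition as `${bizdate}` and resolves to the day before the scheduled run in `yyyymmdd` format. The function `resolve_filter` and the column name `ds` are hypothetical.

```python
from datetime import date, timedelta
from string import Template


def resolve_filter(filter_template: str, scheduled_day: date) -> str:
    """Replace ${bizdate} in a filter condition with a concrete date string."""
    # Assumption: the variable resolves to the day before the scheduled run.
    bizdate = (scheduled_day - timedelta(days=1)).strftime("%Y%m%d")
    return Template(filter_template).substitute(bizdate=bizdate)


# Hypothetical filter condition on a partition-like column named ds.
resolved = resolve_filter("ds = '${bizdate}'", date(2024, 5, 10))
print(resolved)  # ds = '20240509'
```

Because the filter is written once with a variable, each scheduled run reads a different slice of the source data without any change to the node configuration.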
Step 3: Commit and deploy the batch synchronization node
If you want DataWorks to periodically run the batch synchronization node, you must deploy the node to the production environment. For more information about how to deploy a node, see Deploy nodes.