After a batch synchronization node that is created in DataStudio is committed and deployed to the production environment, you can manage the node in Operation Center. For example, you can monitor the status of the node, change the resource group that is used to run the node, and view the run logs of the node. These operations help ensure that the synchronization node runs as expected. This topic describes the common O&M operations that you can perform on a batch synchronization node.

Prerequisites

A batch synchronization node is created, deployed, and run as expected. For more information, see Configure a batch synchronization node by using the codeless UI and Configure a batch synchronization node by using the code editor.

Usage notes

  • The O&M operations that can be performed on batch synchronization nodes are the same as the O&M operations that can be performed on other types of auto triggered nodes. This topic describes how to perform common O&M operations on batch synchronization nodes. For more information about O&M for auto triggered nodes, see O&M overview of auto triggered nodes.
  • To ensure that a batch synchronization node can be run as expected after you deploy the node, you can go to the Cycle Task page in Operation Center in the production environment to check whether the configurations of the node in the production environment meet your requirements. The configurations include the code of the node and the resource groups for scheduling and for Data Integration used to run the node.
  • Batch synchronization nodes are issued to a resource group for Data Integration by using a resource group for scheduling. Therefore, execution of batch synchronization nodes requires both a resource group for Data Integration and a resource group for scheduling. If you use an exclusive resource group for scheduling, you are charged for scheduling instances. For more information, see Mechanism for issuing nodes.
  • Workspaces in standard mode support isolation of data sources.
    • Before a node is deployed to the production environment, the system by default accesses the databases or data warehouses in the development environment that correspond to the data sources that you added to the node.
    • After a node is deployed to the production environment, the system by default accesses the databases or data warehouses in the production environment that correspond to the data sources that you added to the node.
    For more information, see Isolate a data source in the development and production environments.

Schedule and manage a batch synchronization node

DataWorks provides powerful scheduling capabilities for you to run batch synchronization nodes. You can configure scheduling parameters for a batch synchronization node to write incremental and full data to a specific partition of a destination table. The O&M operations that can be performed on batch synchronization nodes are the same as the O&M operations that can be performed on other types of auto triggered nodes. You can also manually run a batch synchronization node.

Operation: Run a batch synchronization node
Description: After you deploy a batch synchronization node to the production environment, you can go to the Cycle Task page in Operation Center in the production environment to view the node. The scheduling system runs the node based on the configurations of the scheduling parameters. You can also manually run the node.
  • Automatic scheduling of nodes: After you deploy a batch synchronization node, the scheduling system generates auto triggered node instances for the node based on the value of the Instance Generation Mode parameter that is configured on the Properties tab in DataStudio and automatically schedules the auto triggered node instances to run. You can go to the Cycle Instance page in Operation Center to view the status of the instances.
    Note After you commit and deploy a batch synchronization node to the production environment, whether the node is run on the current day depends on the value of the Instance Generation Mode parameter. For more information, see Modes in which instances take effect.
  • Manual running of nodes: After you deploy a batch synchronization node, you can manually run the node to test the node or backfill data for the node. In this case, test instances or data backfill instances are generated.
    • Test an auto triggered node: You can perform this operation to check whether an auto triggered node can be run as expected.
    • Backfill data for an auto triggered node: You can perform this operation to backfill data of a historical period of time for an auto triggered node. For more information, see Synchronize historical data.

Operation: Suspend scheduling of a batch synchronization node
Description: On the Cycle Task page in Operation Center, you can freeze an auto triggered node for a period of time. After you freeze the auto triggered node, the auto triggered node and its descendant nodes cannot be run.
Note Instances are generated for an auto triggered node after the node is run. If an auto triggered node instance and its descendant instances do not need to be run, you can freeze the current auto triggered node instance.

Operation: Resume scheduling of a batch synchronization node
Description: On the Cycle Task page in Operation Center, you can unfreeze an auto triggered node. After you unfreeze the auto triggered node, the node can be run as expected.
Note Instances generated for a frozen auto triggered node are also frozen. If you want to run a frozen auto triggered node instance and its descendant instances, you can unfreeze the current auto triggered node instance.
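
The effect of freezing a node on its descendant nodes can be pictured with a toy dependency graph. The following sketch is only an illustration under stated assumptions: the node names, the children mapping, and the blocked_nodes helper are hypothetical and are not part of any DataWorks API. The sketch computes which nodes cannot be run when a specific node is frozen.

```python
from collections import deque
from typing import Dict, List, Set


def blocked_nodes(children: Dict[str, List[str]], frozen: Set[str]) -> Set[str]:
    """Return the frozen nodes plus every node that descends from them."""
    blocked = set(frozen)
    queue = deque(frozen)
    while queue:
        node = queue.popleft()
        # A descendant of a blocked node is blocked as well.
        for child in children.get(node, []):
            if child not in blocked:
                blocked.add(child)
                queue.append(child)
    return blocked


if __name__ == "__main__":
    # Hypothetical workflow: two synchronization nodes feed a report node.
    dag = {
        "sync_orders": ["clean_orders"],
        "clean_orders": ["daily_report"],
        "sync_users": ["daily_report"],
    }
    print(sorted(blocked_nodes(dag, {"sync_orders"})))
    # prints ['clean_orders', 'daily_report', 'sync_orders']
```

In this model, unfreezing a node corresponds to removing the node from the frozen set. If the set is empty, no node is blocked.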

Synchronize historical data

DataWorks allows you to synchronize historical data to a specified table or partition in the destination database or data warehouse based on the scheduling parameter configurations and data backfill configurations of a batch synchronization node. If you want to configure a batch synchronization node to synchronize incremental data and historical data to a specified partition in the destination table, you must configure the data backfill settings for the node. When you backfill data for the node, the system assigns the value that you specify for the Data Timestamp parameter to the variable of the related scheduling parameter. For more information about how to backfill data for a node, see View and manage data backfill instances.
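
The assignment of the data timestamp to a scheduling parameter variable can be sketched as follows. This is a minimal illustration, not DataWorks code. It assumes a hypothetical partition expression pt=${bizdate}, hypothetical helper names, and a data timestamp in the yyyymmdd format. For each data timestamp in the backfill range, the value replaces the variable, and the result is the partition to which the node would write data.

```python
from datetime import date, timedelta
from typing import List


def render_partition(expression: str, variable: str, data_timestamp: date) -> str:
    """Replace ${variable} in the expression with the data timestamp (yyyymmdd)."""
    placeholder = "${" + variable + "}"
    return expression.replace(placeholder, data_timestamp.strftime("%Y%m%d"))


def partitions_for_backfill(expression: str, variable: str,
                            start: date, end: date) -> List[str]:
    """Return one rendered partition for each data timestamp from start to end."""
    days = (end - start).days
    if days < 0:
        raise ValueError("end must not be earlier than start")
    return [
        render_partition(expression, variable, start + timedelta(days=offset))
        for offset in range(days + 1)
    ]


if __name__ == "__main__":
    # Hypothetical backfill range: January 30, 2024 to February 1, 2024.
    for partition in partitions_for_backfill(
        "pt=${bizdate}", "bizdate", date(2024, 1, 30), date(2024, 2, 1)
    ):
        print(partition)
    # prints pt=20240130, pt=20240131, and pt=20240201 on separate lines
```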

Monitor the status of a batch synchronization node

You can create an alert rule on the Rule Management page to monitor the status of an auto triggered node. To go to the Rule Management page, choose Alarm > Rule Management in the left-side navigation pane of the Operation Center page. An alert notification is sent if the node is in a specified state, such as Completed, Uncompleted, Error, or Overtime. For more information, see Overview.

Perform O&M operations on resource groups

  • Monitor resource groups: On the Resource page of Operation Center, you can monitor the usage of resource groups that are used to run nodes. For more information, see Use the resource O&M feature.
  • Change resource groups: You can change the resource group that is used to run nodes to another resource group by using one of the methods described in the following table.
    Note Before you change a resource group, make sure that network connections are established between the resource group that you want to select and the required data sources. If you do not establish the required network connections, nodes fail to run.

    Operating environment: Production environment
    Supported operation: Change the resource groups for multiple nodes at the same time
    Entry point: Go to the Operation Center page. In the left-side navigation pane, choose Cycle Task Maintenance > Cycle Task. Select the nodes for which you want to change the resource groups and click Modify Data Integration Resource Group at the bottom of the Cycle Task page.

    Operating environment: Development environment
    Note After you change the resource group for a node in the development environment, you must commit and deploy the node to the production environment again.
    Supported operations:
    • Change the resource group for a single node
    • Change the resource groups for multiple nodes at the same time
    Entry point: Go to the DataStudio page.
    • Change the resource group for a single node: Go to the configuration tab of the node for which you want to change the resource group and click Resource Group configuration in the right-side navigation pane. On the Resource Group configuration tab, you can change the resource group for the node.
    • Change the resource groups for multiple nodes at the same time: Click the Batch Operation icon. On the batch operation tab, select the nodes for which you want to change the resource groups and click Modify Data Integration Resource Group at the bottom of the batch operation tab.

Monitor the quality of table data

On the Data Quality page, you can configure monitoring rules for tables in the destinations that support monitoring rules to monitor the quality of the data in the tables. After you configure monitoring rules for a table, the rules are triggered each time the scheduling node with which you associate the table is successfully run. If exceptions are detected, Data Quality determines whether to fail the node and block its descendant nodes based on the check result and the rule settings, such as the rule type. This prevents dirty data from being passed to downstream nodes. For more information about the destinations that support monitoring rules and how to use Data Quality, see Overview.

Note If you want to configure monitoring rules for tables generated by a batch synchronization node, make sure that a network connection is established between the resource group for scheduling that you use to run the node and the destination.
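
The decision that Data Quality makes from the check result and the rule type can be sketched as a small function. The following example is a simplified model and not the Data Quality implementation. The enumeration names and the should_block_descendants helper are hypothetical. The example also assumes that only a strong rule whose check ends in the error state fails the node and blocks the descendant nodes. Check the Data Quality documentation for the actual behavior of each rule type.

```python
from enum import Enum


class RuleType(Enum):
    STRONG = "strong"
    WEAK = "weak"


class CheckResult(Enum):
    NORMAL = "normal"
    WARNING = "warning"
    ERROR = "error"


def should_block_descendants(rule_type: RuleType, result: CheckResult) -> bool:
    """Decide whether the node fails and its descendant nodes are blocked.

    Assumption for illustration: only a strong rule that ends in the
    error state blocks downstream nodes. Other combinations do not block.
    """
    return rule_type is RuleType.STRONG and result is CheckResult.ERROR


if __name__ == "__main__":
    for rule_type in RuleType:
        for result in CheckResult:
            blocked = should_block_descendants(rule_type, result)
            print(rule_type.value, result.value, blocked)
```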

View the run logs of a batch synchronization node

After an auto triggered node instance, a data backfill instance, or a test instance is successfully run, you can go to the DAG page in Operation Center to view the run logs of the instance. For more information, see Appendix: Use the features provided in a DAG.
Note For more information about the parameters in the run logs, see View run logs generated for a batch synchronization node.

View the statistics on batch synchronization nodes

On the Batch Sync tab of the Data integration tab under Overview in Operation Center, you can view the statistics on node execution, such as execution status distribution, data synchronization progress, synchronized data volume, and details of synchronization nodes. You can search for the desired synchronization node based on the filter conditions such as Source Name, Destination Name, and Whether there is public network traffic. For more information, see View the statistics on the Overview page.

Appendix