DataWorks: Dependencies for workflows with batch sync nodes

Last Updated: Mar 02, 2026

DataWorks does not parse the code of batch synchronization nodes, so it does not automatically add scheduling dependencies for them. In a workflow that contains a batch synchronization node, if a downstream node depends on a table that the batch synchronization node creates, you must manually add this output table to the outputs of the batch synchronization node. This allows the automatic parsing feature to find the correct upstream batch synchronization node when a downstream node queries the table.

Common mistakes

If you do not manually add the output table of a batch synchronization node to its outputs, the automatic parsing feature cannot find the node. When you submit an SQL node that queries this table, an error message appears indicating that the output name of the dependent parent node does not exist. (Figure: batch synchronization error message.)

This error occurs because the upstream dependency, which is automatically parsed from the downstream node, cannot be matched to the corresponding upstream batch synchronization node. For a detailed explanation, see Detailed explanation of the cause. To avoid this error, configure scheduling dependencies for workflows that contain batch synchronization nodes using one of the following two methods:

Method 1: Manually add the output table as a node output

The upstream dependency parsed from a downstream node must appear in the outputs of the upstream node. Therefore, after you configure the workflow, go to the Scheduling Configuration page of the batch synchronization node and manually add the output table as a node output. Use the projectname.tablename format, for example doctest.table_1. (Figure: manually adding the output table as a node output.)
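The effect of Method 1 can be illustrated with a small Python model. This is only a sketch of the matching rule, not DataWorks code: the dictionary and the `unmatched_dependencies` helper are hypothetical, and the names `doctest`, `user_1`, and `table_1` are borrowed from the example later in this topic.

```python
# Toy model: each node ID maps to the set of output names it declares.
# This mimics the matching rule described in this topic; it is not a DataWorks API.
node_outputs = {
    "user_1": {"doctest.user_1"},  # auto-generated output of the batch synchronization node
}

# Upstream dependency that automatic parsing adds to the downstream SQL node.
sql_1_dependencies = ["doctest.table_1"]


def unmatched_dependencies(dependencies, outputs_by_node):
    """Return the dependencies that no node declares as an output."""
    declared = set().union(*outputs_by_node.values())
    return [dep for dep in dependencies if dep not in declared]


before_fix = unmatched_dependencies(sql_1_dependencies, node_outputs)

# Method 1: manually add the output table to the batch synchronization node's outputs.
node_outputs["user_1"].add("doctest.table_1")

after_fix = unmatched_dependencies(sql_1_dependencies, node_outputs)

print(before_fix)  # ['doctest.table_1']
print(after_fix)   # []
```

Before the manual addition, the parsed dependency has no matching output. Afterward, every dependency of the SQL node can be matched.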

Method 2: Keep the node name and output table name the same

This method works based on the following logic:

  • When you create a batch synchronization node, DataWorks automatically generates an output for it. In the Outputs of this node section, the output name follows the projectname.nodename naming convention.

  • When an SQL node references the output table of a batch synchronization node, automatic parsing adds an upstream dependency (Ancestor Node Dependency) to the SQL node. The dependency name follows the projectname.tablename naming convention.

  • To prevent errors, the name of the upstream dependency in the SQL node must match the name of an output of the batch synchronization node.

Therefore, if you give the batch synchronization node the same name as its output table when you create it (`nodename` equals `tablename`), the two names match and no error occurs when you submit the node.
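The naming logic behind Method 2 can be sketched in a few lines of Python. The two helper functions are hypothetical and only restate the naming conventions listed above. The project, node, and table names come from the example later in this topic.

```python
def auto_output_name(project: str, node_name: str) -> str:
    """Output generated when a node is created: projectname.nodename."""
    return f"{project}.{node_name}"


def parsed_dependency_name(project: str, table_name: str) -> str:
    """Upstream dependency parsed from SQL: projectname.tablename."""
    return f"{project}.{table_name}"


# Node named differently from its output table: the names do not match.
mismatch = auto_output_name("doctest", "user_1") == parsed_dependency_name("doctest", "table_1")

# Node created with the same name as its output table: the names match.
match = auto_output_name("doctest", "table_1") == parsed_dependency_name("doctest", "table_1")

print(mismatch, match)  # False True
```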

Note

The projectname.nodename output is generated once, when you create the node. If you rename the node later, the automatically generated output keeps its original name. Method 2 therefore works only if you choose the matching name when you create the batch synchronization node. Renaming an existing node to match its output table name does not resolve the issue described in this topic.
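The behavior in this note can be modeled as follows. `SyncNode` is a hypothetical class written for illustration, assuming only what the note states: the output is fixed at creation and a rename does not touch it.

```python
from dataclasses import dataclass, field


@dataclass
class SyncNode:
    project: str
    name: str
    outputs: set = field(init=False)

    def __post_init__(self):
        # The projectname.nodename output is generated once, at creation time.
        self.outputs = {f"{self.project}.{self.name}"}

    def rename(self, new_name: str) -> None:
        # Renaming changes only the node name, not the generated output.
        self.name = new_name


node = SyncNode("doctest", "user_1")
node.rename("table_1")

print(node.name)     # table_1
print(node.outputs)  # {'doctest.user_1'}
```

After the rename, the node still has no output named doctest.table_1, so a parsed dependency with that name remains unmatched.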

Detailed explanation of the cause

The following figure and table show the typical steps to create nodes and configure dependencies for a workflow that contains a batch synchronization node. (Figure: workflow that contains a batch synchronization node.)

Step 1

Details: Create each node according to the workflow plan. For this example, create a virtual node, a batch synchronization node, and an ODPS node.

Scheduling dependency configuration: After you create a node, DataWorks automatically generates two outputs for it. One output name has the _out suffix. The other follows the projectname.nodename format.

For example, after you create the `user_1` batch synchronization node in the figure, the node has two outputs:

  • An output named *******_out.

  • An output named doctest.user_1.

Step 2

Details: Connect the nodes to define the upstream and downstream dependencies based on the logical execution order of the workflow.

Scheduling dependency configuration: After you connect the nodes on the workflow page, DataWorks automatically configures the dependencies.

For example, the `user_1` batch synchronization node is upstream of the `sql_1` ODPS node. DataWorks automatically adds the *******_out output of `user_1` as an upstream dependency for `sql_1`.

Step 3

Details: Develop the task code for each node.

Scheduling dependency configuration: When you develop the code, DataWorks automatically parses it. Based on the input and output statements in the code, DataWorks adds node outputs or upstream dependencies.

For example, the `sql_1` ODPS node needs to use data from the `table_1` table, which is generated by the `user_1` batch synchronization node. If the code contains a statement such as `select * from table_1`, DataWorks automatically adds an upstream dependency to `sql_1`. The name of this dependency follows the projectname.tablename format. In this example, the name is doctest.table_1.
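The naming rule in step 3 can be illustrated with a deliberately naive Python sketch. It is not how DataWorks parses code: the regular expression and the `parsed_upstream_dependencies` function are assumptions made for illustration, and they handle only simple FROM and JOIN clauses.

```python
import re

# Naive pattern: captures the identifier that follows FROM or JOIN.
# Real SQL parsing (subqueries, aliases, comments, qualified names) is far more involved.
TABLE_PATTERN = re.compile(r"\b(?:from|join)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)


def parsed_upstream_dependencies(project: str, sql: str) -> list:
    """Return projectname.tablename for each table the statement reads."""
    tables = TABLE_PATTERN.findall(sql)
    # dict.fromkeys keeps first-seen order and drops duplicates.
    return [f"{project}.{table}" for table in dict.fromkeys(tables)]


deps = parsed_upstream_dependencies("doctest", "select * from table_1")
print(deps)  # ['doctest.table_1']
```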

After you complete these steps, submitting the node fails unless you manually added the output table to the outputs of the batch synchronization node, because batch synchronization nodes are not automatically parsed. The error message indicates that the output name of the dependent parent node does not exist. (Figure: batch synchronization error message.)

This error occurs for the following reasons:

  • Batch synchronization nodes do not support automatic parsing. Therefore, the output table `table_1` is not automatically added to the outputs of the batch synchronization node. As a result, the `user_1` node does not have an output named doctest.table_1.

  • Automatic parsing adds an upstream dependency to the downstream node `sql_1`. Following the projectname.tablename naming convention for Dependent ancestor nodes, this dependency is named doctest.table_1. However, because doctest.table_1 is not an output of `user_1`, this dependency in `sql_1` cannot be matched to the node ID of `user_1`.

  • When you submit the `sql_1` node, the system detects the doctest.table_1 upstream dependency. Because this dependency is not associated with a node ID, the system cannot find the corresponding upstream node and reports an error that the parent node output name does not exist.
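The three reasons above can be traced in a short Python simulation. Everything in it is hypothetical: the `ParentOutputNotFound` exception stands in for the reported error, and the index from output name to node ID is an assumed model of how a dependency is matched, not the actual DataWorks implementation. The masked name *******_out is reproduced as it appears in this topic.

```python
class ParentOutputNotFound(Exception):
    """Hypothetical stand-in for the 'parent node output name does not exist' error."""


def build_output_index(outputs_by_node):
    """Map every declared output name to the ID of the node that declares it."""
    return {out: node_id for node_id, outs in outputs_by_node.items() for out in outs}


def resolve_parents(dependencies, outputs_by_node):
    """Resolve each upstream dependency name to a node ID, or fail as a submit would."""
    index = build_output_index(outputs_by_node)
    parents = []
    for dep in dependencies:
        if dep not in index:
            raise ParentOutputNotFound(f"parent node output name does not exist: {dep}")
        parents.append(index[dep])
    return parents


# Outputs that user_1 has after creation; table_1 is not among them.
outputs_by_node = {"user_1": ["*******_out", "doctest.user_1"]}

# The dependency added by connecting the nodes resolves to user_1.
resolved = resolve_parents(["*******_out"], outputs_by_node)

# The dependency added by automatic parsing has no matching output.
try:
    resolve_parents(["*******_out", "doctest.table_1"], outputs_by_node)
    error = None
except ParentOutputNotFound as exc:
    error = str(exc)

print(resolved)  # ['user_1']
print(error)     # parent node output name does not exist: doctest.table_1
```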