
DataWorks: Dependencies for workflows with batch sync nodes

Last Updated: Mar 27, 2026

Batch synchronization nodes in DataWorks do not support automatic dependency parsing. Unlike SQL nodes — where DataWorks reads code and automatically registers output tables — batch synchronization nodes require you to manually add their output tables as node outputs. If you skip this step and a downstream SQL node depends on a table produced by a batch synchronization node, submitting the SQL node fails with an error.

This topic explains why the error occurs and describes two methods to fix it.

Why the error occurs

Building a workflow that mixes batch synchronization nodes with SQL nodes typically involves three steps. The table below shows what DataWorks configures automatically at each step and what names each node ends up with.

| Step | Action | Auto-generated scheduling configuration |
| --- | --- | --- |
| 1. Create nodes | Create the virtual node, batch synchronization node, and ODPS node. | DataWorks generates two outputs for each node: one with the _out suffix (for example, *******_out) and one following the projectname.nodename format (for example, doctest.user_1). |
| 2. Connect nodes | Draw connections on the workflow canvas to define execution order. | DataWorks automatically adds the *******_out output of the upstream node as an upstream dependency for the downstream node. |
| 3. Write task code | Write the code for each node. | DataWorks parses the code and registers dependencies automatically. For example, if sql_1 contains select * from table_1, DataWorks adds doctest.table_1 as an upstream dependency for sql_1, following the projectname.tablename format. |
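The two naming formats from steps 1 and 3 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the regular expression is a toy stand-in for code parsing, not the parser DataWorks actually uses. The _out output is left out because its prefix is masked in the example above.

```python
import re

PROJECT = "doctest"  # project name taken from the example in this topic


def default_output(project: str, node_name: str) -> str:
    """Build an output name in the projectname.nodename format (step 1)."""
    return f"{project}.{node_name}"


def parsed_dependencies(project: str, sql: str) -> list[str]:
    """Toy stand-in for code parsing (step 3).

    Collects identifiers that follow FROM or JOIN and returns them in the
    projectname.tablename format, without duplicates.
    """
    tables = re.findall(
        r"\b(?:from|join)\s+([A-Za-z_][A-Za-z0-9_]*)", sql, flags=re.IGNORECASE
    )
    names: list[str] = []
    for table in tables:
        name = f"{project}.{table}"
        if name not in names:
            names.append(name)
    return names


print(default_output(PROJECT, "user_1"))                      # doctest.user_1
print(parsed_dependencies(PROJECT, "select * from table_1"))  # ['doctest.table_1']
```

The two printed names differ, which is the mismatch the rest of this topic deals with.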

After step 3, a mismatch exists:

  • sql_1 has an upstream dependency named doctest.table_1 (registered by automatic code parsing). An upstream dependency is the output name that DataWorks looks for among the outputs of upstream nodes when a downstream node references a table.

  • user_1 has outputs named *******_out and doctest.user_1, but not doctest.table_1. A node output is a name that DataWorks associates with a node so that downstream nodes can depend on it.

When you submit sql_1, DataWorks searches for a node whose outputs include doctest.table_1. It finds none, because batch synchronization nodes do not support automatic parsing and table_1 was never added as an output of user_1. DataWorks reports:

Submission error: output name of the dependent parent node does not exist
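The lookup that fails here can be modeled as a simple search over node outputs. The sketch below is a simplified model rather than DataWorks internals: find_producer and submit are hypothetical helpers, the masked _out outputs are omitted, and the dependency name appended to the message is added for illustration only.

```python
# Node outputs as they stand after step 3 (masked *_out outputs omitted).
outputs_by_node = {
    "user_1": {"doctest.user_1"},  # batch sync node: no table parsed from its code
    "sql_1": {"doctest.sql_1"},
}


def find_producer(dependency, outputs_by_node):
    """Return the node that declares `dependency` as an output, or None."""
    for node, outputs in outputs_by_node.items():
        if dependency in outputs:
            return node
    return None


def submit(upstream_dependencies, outputs_by_node):
    """Hypothetical submit check: each dependency must match some node output."""
    for dependency in upstream_dependencies:
        if find_producer(dependency, outputs_by_node) is None:
            return (
                "Submission error: output name of the dependent parent node "
                f"does not exist ({dependency})"
            )
    return "ok"


# sql_1 depends on doctest.table_1, which no node declares as an output.
result = submit(["doctest.table_1"], outputs_by_node)
print(result)
```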

Fix the problem

Two methods resolve the mismatch. Choose the method that fits your situation.

Method 1: Manually add the output table as a node output (recommended)

On the Scheduling Configuration page for the batch synchronization node, add the output table as a node output. This makes doctest.table_1 an explicit output of user_1, so the upstream dependency in sql_1 resolves correctly.

[Figure: Scheduling Configuration page showing the output table added as a node output]

This method works regardless of when you apply it — during initial setup or after you encounter the error.

To verify the fix, resubmit sql_1 after adding the output. The upstream dependency doctest.table_1 now matches an output of user_1, and submission succeeds.
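As a rough model of why Method 1 works, the sketch below compares the unresolved dependencies of sql_1 before and after the manual step. The unresolved helper and the dictionary layout are assumptions made for illustration; in DataWorks the output is added on the Scheduling Configuration page, not through code like this.

```python
def unresolved(dependencies, outputs_by_node):
    """Return the dependencies that no node declares as an output."""
    declared = set().union(*outputs_by_node.values())
    return sorted(set(dependencies) - declared)


# State before the fix: user_1 exposes only its projectname.nodename output.
outputs_by_node = {"user_1": {"doctest.user_1"}}
sql_1_dependencies = ["doctest.table_1"]

before = unresolved(sql_1_dependencies, outputs_by_node)

# Method 1: manually add the output table as an output of user_1.
outputs_by_node["user_1"].add("doctest.table_1")

after = unresolved(sql_1_dependencies, outputs_by_node)

print(before)  # ['doctest.table_1']
print(after)   # []
```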

Method 2: Keep the node name and output table name the same at node creation time

When you create a batch synchronization node, DataWorks automatically generates an output named projectname.nodename. When downstream code references a table, DataWorks creates an upstream dependency named projectname.tablename. If nodename equals tablename, both names resolve to the same string and the dependency matches automatically.

For example, name the batch synchronization node table_1 instead of user_1. DataWorks generates the output doctest.table_1 at creation time. When sql_1 queries table_1, the auto-parsed upstream dependency doctest.table_1 matches the node output, and submission succeeds.

To verify the fix, submit sql_1 immediately after setting up the workflow. If each batch synchronization node name matches its output table name, submission succeeds without any manual scheduling configuration.

Important

This method only works at node creation time. The projectname.nodename output is generated once when the node is created and does not update if you rename the node later. If you have already created the node, use Method 1 instead.
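The creation-time behavior described for Method 2 can be sketched with a small class. The Node class and its rename method are hypothetical and only model the rule stated above: the projectname.nodename output is generated once and is not updated by a later rename.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    project: str
    name: str
    outputs: set = field(init=False)

    def __post_init__(self):
        # The projectname.nodename output is generated once, at creation time.
        self.outputs = {f"{self.project}.{self.name}"}

    def rename(self, new_name):
        # Renaming changes the node name only; existing outputs stay as they are.
        self.name = new_name


dependency = "doctest.table_1"  # parsed from: select * from table_1

# Node named after its output table from the start: the names match.
matching = Node("doctest", "table_1")

# Node created as user_1 and renamed later: the output is still doctest.user_1.
renamed = Node("doctest", "user_1")
renamed.rename("table_1")

print(dependency in matching.outputs)  # True
print(dependency in renamed.outputs)   # False
```

In this model the renamed node still needs its output added by hand, which corresponds to falling back to Method 1.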

The workflow with correct scheduling dependencies configured:

[Figure: Workflow diagram showing a virtual node, batch synchronization node, and ODPS node with correct dependencies configured]