DataWorks:Use case 1: Configure dependencies for batch sync nodes

Last Updated:Jun 20, 2026

In DataWorks, batch synchronization nodes do not support automatic parsing to add scheduling dependencies. If a workflow includes a batch synchronization node whose generated table is a dependency for a descendant node, you must manually add that table to the batch synchronization node's output. This allows the automatic parsing feature to identify the correct ancestor node when the descendant node queries the table.

Common pitfalls

If you do not add the generated table of a batch synchronization node to its output, automatic parsing cannot match the table to this node. Consequently, when you try to commit an SQL node that references this table, the system returns error code 1201111368 with the following message: "The output name of the dependent ancestor node of the current node: test.table_1 does not exist. The current node cannot be committed. Make sure that the ancestor node that has this output name has been committed!"
This error occurs because the upstream dependency that is automatically parsed for the descendant node does not match any output from the ancestor batch synchronization node. For a detailed explanation, see Error analysis. To prevent this error, configure the scheduling dependency using one of the following methods:

Method 1: Manually add table to output

The upstream dependency parsed from the descendant node must appear in the ancestor node's output. To add it, go to the scheduling configuration page of the batch synchronization node, enter the generated table name, such as test.table_1, in the input field for the node's output, and click Add. The table then appears in the output list, with its addition method displayed as Manually Added.

Method 2: Align node and table names

This method works as follows:
  • When you create a batch synchronization node, DataWorks automatically generates an Output for it in the format projectname.nodename.
  • When an SQL node references the generated table of the batch synchronization node, DataWorks automatically generates a Parent Nodes entry for the SQL node in the format projectname.tablename.
  • To prevent errors, the name of the Parent Nodes entry in the SQL node must match the name of an Output of the batch synchronization node.
Therefore, you can use the same name for the batch synchronization node (nodename) and its generated table (tablename). This ensures that no error occurs when you commit the node.
Note: The automatically generated Output named projectname.nodename is created at the same time as the node. If you rename the node after it is created, the name of this Output does not change. This method therefore works only when you initially create the batch synchronization node. It does not work if you try to align the names by renaming an existing node.
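The naming rule and the renaming caveat above can be sketched with a small toy model. This is only an illustration, not a DataWorks API: the SyncNode class and the parsed_parent helper are hypothetical names, and the model tracks only the projectname.nodename output (the second, _out-suffixed output is left out).

```python
from dataclasses import dataclass, field


@dataclass
class SyncNode:
    """Toy model of a batch synchronization node (hypothetical, not a DataWorks API)."""
    project: str
    name: str
    outputs: list = field(default_factory=list)

    def __post_init__(self):
        # The projectname.nodename output is generated once, at creation time.
        self.outputs.append(f"{self.project}.{self.name}")

    def rename(self, new_name: str) -> None:
        # Renaming changes the node name only; existing outputs are untouched.
        self.name = new_name


def parsed_parent(project: str, table: str) -> str:
    """Dependency name that automatic parsing derives for an SQL node."""
    return f"{project}.{table}"


# Node created with the same name as the table it generates.
aligned = SyncNode("doctest", "table_1")

# Node created as user_1 and renamed afterwards.
renamed = SyncNode("doctest", "user_1")
renamed.rename("table_1")

wanted = parsed_parent("doctest", "table_1")
print(wanted in aligned.outputs)  # True
print(wanted in renamed.outputs)  # False
```

In this model, the renamed node still exposes only doctest.user_1, so the parsed dependency doctest.table_1 has nothing to match.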

Error analysis

Consider a typical workflow that contains a batch synchronization node. The following steps outline the node creation and dependency configuration process. Each step lists what you do (Description) and how DataWorks responds (Scheduling dependency).

Step 1
Description: Create the required nodes based on your workflow plan. For example, create a virtual node, a batch synchronization node, and a MaxCompute node.
Scheduling dependency: After you create the nodes, DataWorks automatically generates two Output configurations for each node: one named in the projectname.nodename format and the other suffixed with _out. For example, for the batch synchronization node user_1, the system generates the following outputs after the node is created:
  • An output named *******_out.
  • An output named doctest.user_1.

Step 2
Description: Connect the nodes with lines to define their running order and dependencies.
Scheduling dependency: After you connect the nodes on the workflow page, DataWorks automatically adds dependency configurations based on the connections. For example, once connected, the MaxCompute node sql_1 becomes a descendant node of the batch synchronization node user_1. DataWorks automatically adds the output of user_1 that is named *******_out as a Parent Nodes entry for sql_1.

Step 3
Description: Develop task code for each node.
Scheduling dependency: As you write code for a node, DataWorks automatically parses I/O commands and adds a corresponding Output or Parent Nodes entry. For example, if the MaxCompute node sql_1 needs to use data from the table table_1, which is generated by the batch synchronization node user_1, and the code for sql_1 contains a statement such as select * from table_1, DataWorks automatically adds a Parent Nodes entry for sql_1. The name of this dependency is based on the ancestor node's output name, which follows the projectname.tablename format. In this example, the name is doctest.table_1.

If you complete these steps and commit the descendant node without having manually added the generated table to the Output of the batch synchronization node, the system reports that the output of the dependent ancestor node does not exist.
This error occurs for these reasons:
  • The batch synchronization node user_1 does not support automatic parsing. Therefore, the generated table, table_1, is not automatically added to the Output of node user_1. This means that node user_1 does not have an output named doctest.table_1.
  • For the descendant node sql_1, automatic parsing adds a Parent Nodes entry with a name that follows the projectname.tablename format. In this example, the name is doctest.table_1. However, because doctest.table_1 is not an output of user_1, this dependency cannot be matched to the node ID of user_1.
  • When you commit the sql_1 node, the system detects its upstream dependency on doctest.table_1. Because this dependency is not associated with a node ID, the system cannot find the ancestor node and reports that the output name of the dependent ancestor node does not exist.
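The failure described in this section can be reproduced in a simplified simulation. The sketch below is an assumption-laden model rather than DataWorks code: parse_parents, commit, OutputNotFound, and the registry dictionary are all hypothetical, the regular expression is only a rough stand-in for automatic parsing, and node names stand in for node IDs.

```python
import re


class OutputNotFound(Exception):
    """Raised when a parsed dependency matches no known output (hypothetical)."""


def parse_parents(project: str, sql: str) -> list:
    # Rough stand-in for automatic parsing: pick up table names after FROM/JOIN
    # and qualify unqualified names as projectname.tablename.
    tables = re.findall(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", sql, flags=re.IGNORECASE)
    return [t if "." in t else f"{project}.{t}" for t in tables]


def commit(project: str, sql: str, registry: dict) -> dict:
    """Resolve each parsed parent to its owning node, or fail like the commit check."""
    resolved = {}
    for parent in parse_parents(project, sql):
        if parent not in registry:
            raise OutputNotFound(f"output name {parent} does not exist")
        resolved[parent] = registry[parent]
    return resolved


# Output name -> owning node. Only the projectname.nodename output of user_1
# is modeled here; its _out-suffixed output is omitted.
registry = {"doctest.user_1": "user_1"}
sql = "select * from table_1"

try:
    commit("doctest", sql, registry)
except OutputNotFound as err:
    print(err)  # output name doctest.table_1 does not exist

# Method 1: manually add the generated table to the outputs of user_1.
registry["doctest.table_1"] = "user_1"
print(commit("doctest", sql, registry))  # {'doctest.table_1': 'user_1'}
```

The first commit fails because doctest.table_1 maps to no node. Once the table is registered as an output of user_1, the same lookup succeeds.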