DataWorks: Merge node

Last Updated: Jun 21, 2026

This topic describes the merge node, explains how to create one and define its merge logic, and uses a practical example to demonstrate its scheduling configuration and runtime details.

What is a merge node?

  • A merge node is a type of logical control node in Data Studio. It consolidates the run statuses of upstream nodes to resolve dependency and triggering issues for nodes that are downstream from a branch node.

  • Currently, you cannot define the resulting status of the merge node. When its conditions are met, a merge node always reports a successful status. This allows downstream nodes to depend on the merge node without ambiguity.

    For example, consider a branch node C with two mutually exclusive branches, C1 and C2. Each branch uses different logic to write to the same MaxCompute table, and a downstream node, B, depends on the output of this table. If you make node B depend on both C1 and C2 directly, a problem occurs: because the branches are mutually exclusive, one of them always has the status Branch Not Selected. Node B inherits this status from the unselected branch, so it is also skipped with the status Branch Not Selected. Node B does not actually execute, and the skipped status propagates to all subsequent downstream nodes. To avoid this, use a merge node, J, to consolidate branches C1 and C2, and then make node B depend on merge node J.
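
The propagation problem in this example can be sketched in a few lines of Python. This is a simplified model for illustration only, not DataWorks code: the `Status` enum and the functions `ordinary_node_status` and `merge_node_status` are hypothetical names, and the merge condition shown (every branch is either Successful or Branch Not Selected) is an assumption chosen to match the C1/C2 scenario.

```python
from enum import Enum
from typing import Iterable, Optional


class Status(Enum):
    SUCCESSFUL = "Successful"
    BRANCH_NOT_SELECTED = "Branch Not Selected"


def ordinary_node_status(upstream: Iterable[Status]) -> Status:
    """Simplified model of an ordinary node such as B.

    The node is skipped if any of its upstream nodes was skipped.
    Assumption: otherwise the node's own task runs without error.
    """
    if any(s is Status.BRANCH_NOT_SELECTED for s in upstream):
        return Status.BRANCH_NOT_SELECTED
    return Status.SUCCESSFUL


def merge_node_status(upstream: Iterable[Status]) -> Optional[Status]:
    """Simplified model of merge node J.

    Assumed condition: every branch is Successful or Branch Not Selected.
    When the condition is met, the merge node reports Successful.
    """
    accepted = {Status.SUCCESSFUL, Status.BRANCH_NOT_SELECTED}
    if all(s in accepted for s in upstream):
        return Status.SUCCESSFUL
    return None  # condition not met; outcome is outside this sketch


# Branch node C selects C1, so C2 is marked Branch Not Selected.
c1, c2 = Status.SUCCESSFUL, Status.BRANCH_NOT_SELECTED

b_direct = ordinary_node_status([c1, c2])   # B depends on C1 and C2 directly
j = merge_node_status([c1, c2])             # J consolidates C1 and C2
b_via_merge = ordinary_node_status([j])     # B depends only on J

print(b_direct.value)     # Branch Not Selected
print(b_via_merge.value)  # Successful
```

In this model, node B is skipped when it depends on the branches directly, but runs when it depends on merge node J, because J hides the Branch Not Selected status behind a single Successful status.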

Prerequisites

  • The RAM user that you want to use is added to your workspace.

    If you want to use a RAM user to develop tasks, you must add the RAM user to your workspace as a member and assign the Develop or Workspace Administrator role to the RAM user. The Workspace Administrator role has more permissions than necessary. Exercise caution when you assign the Workspace Administrator role. For more information about how to add a member and assign roles to the member, see Add members to a workspace.

  • A serverless resource group is associated with your workspace. For more information, see the topics in the Use serverless resource groups directory.

  • A merge node has been created. For more information, see Create nodes for a scheduling workflow.

Usage notes

The merge node feature is available only in DataWorks Standard Edition and higher. To purchase or upgrade your DataWorks edition, see DataWorks edition features.

Step 1: Develop a merge node

After you create a merge node, go to the node's configuration page to define its merge logic.

  1. In the merge logic definition section, search for and add the nodes that you want to merge. You can search by node output, node ID, or node name.

  2. After you find a node, click the icon next to it to add the node to the Merge Condition Settings section.

    Note

    If you need to merge multiple branch nodes, repeat this step for each node.

  3. In MERGE Condition, configure the merge conditions for the branch nodes.

    • Merge logic conditions include:

      • AND: All upstream branch nodes must reach a terminal state (that is, they finish running), and all must meet their configured run statuses. Then, the status you set in Result is applied to the current node.

      • OR: All upstream branch nodes must reach a terminal state (that is, they finish running), and at least one of them must meet its configured run status. Then, the status you set in Result is applied to the current node.

    • Node completion statuses include:

      • Successful: The node ran successfully.

      • Failed: The node failed to run.

      • Branch Not Selected: The node is on a branch that was not selected. It is marked as successful but does not actually run (a dry run).

        Note

        This status applies only if the upstream node is a branch node.

  4. In the Result section, set the run status for the current node.

    Note

    Currently, you can only set the node's status to Successful.

    For example:

    • Add nodes branch1 and branch2 as upstream nodes of the current merge node.

    • Set the run status of branch1 to Successful, Branch Not Selected, or Failed. This means that branch1 only needs to finish running.

    • Set the run status of branch2 to Successful or Branch Not Selected. This means that branch2 must finish running without failing.

    • Set the merge logic condition to AND.

    With this configuration, the merge node reports the Successful status only after branch1 finishes running in any state and branch2 finishes running without failing.

  5. After you develop the merge node, click Scheduling Settings on the right side of the node configuration page to configure its scheduling properties. For more information, see Configure scheduling properties for a node.
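
The merge logic configured in steps 3 and 4 can be sketched as a small Python function. This is a minimal illustration under stated assumptions, not the DataWorks implementation: `Status`, `evaluate_merge`, and the dictionary layout are hypothetical, an upstream node that has not reached a terminal state is represented by `None`, and the function returns `None` when the condition is not met because the behavior in that case is outside the scope of this sketch.

```python
from enum import Enum
from typing import Dict, Optional, Set


class Status(Enum):
    SUCCESSFUL = "Successful"
    FAILED = "Failed"
    BRANCH_NOT_SELECTED = "Branch Not Selected"


def evaluate_merge(
    logic: str,
    accepted: Dict[str, Set[Status]],
    actual: Dict[str, Optional[Status]],
) -> Optional[Status]:
    """Return Successful if the merge condition is met, otherwise None.

    logic    -- "AND" or "OR"
    accepted -- per upstream node, the run statuses configured as acceptable
    actual   -- per upstream node, its terminal status, or None if still running
    """
    if logic not in ("AND", "OR"):
        raise ValueError(f"unsupported merge logic: {logic}")

    # Both AND and OR require every upstream node to reach a terminal state.
    if any(actual.get(name) is None for name in accepted):
        return None

    matches = [actual[name] in allowed for name, allowed in accepted.items()]
    met = all(matches) if logic == "AND" else any(matches)

    # Result can currently only be set to Successful.
    return Status.SUCCESSFUL if met else None


# Configuration from the example in step 4.
accepted = {
    "branch1": {Status.SUCCESSFUL, Status.BRANCH_NOT_SELECTED, Status.FAILED},
    "branch2": {Status.SUCCESSFUL, Status.BRANCH_NOT_SELECTED},
}

ok = evaluate_merge(
    "AND", accepted,
    {"branch1": Status.FAILED, "branch2": Status.SUCCESSFUL},
)
blocked = evaluate_merge(
    "AND", accepted,
    {"branch1": Status.SUCCESSFUL, "branch2": Status.FAILED},
)
```

Here `ok` is `Status.SUCCESSFUL` because branch1 finished (in any state) and branch2 did not fail, while `blocked` is `None` because branch2 failed. With `"OR"` and the same inputs as `blocked`, the condition would be met, because branch1 alone satisfies its configured statuses.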

Step 2: Publish and maintain the node

  1. After you configure the scheduling properties, submit and publish the merge node to the production environment. For more information, see Deploy a node or workflow.

  2. After the node is deployed, it runs periodically according to your scheduling configuration. Go to Operation Center > Node O&M > Auto Triggered Task O&M > Auto Triggered Task to view and manage the deployed periodic task. For more information, see Get started with Operation Center.