Use a Merge Node to Unblock Branch Dependencies in DataWorks

When a branch node routes execution down one of several mutually exclusive paths, descendant nodes that depend directly on multiple branch outputs encounter a status conflict: the branch paths that were not selected always have the status Branch Not Run, which causes dependent nodes to inherit that status and be skipped. A merge node resolves this by consolidating the run statuses of all upstream branch paths into a single dependency point, so descendant nodes proceed regardless of which branch ran.

How it works

Merge nodes are a type of logical control node in DataStudio. In a DataStudio workflow, a branch node evaluates a condition and routes execution to exactly one of its output paths. Nodes on every other path receive the status Branch Not Run: they are marked as dry-run (skipped, not executed).

If a downstream node depends directly on both branch outputs, one output will always be Branch Not Run. A node with any upstream dependency in Branch Not Run status also receives that status and is skipped.

A merge node sits between the branch outputs and the downstream node. It evaluates the run statuses of all upstream branch nodes against conditions you define, and outputs Successful when those conditions are met. The downstream node depends only on the merge node and runs normally.

Example: Branch node C has two mutually exclusive outputs, C1 and C2, both writing to the same MaxCompute table. Without a merge node, downstream node B cannot depend on both C1 and C2 — one will always be Branch Not Run, and B is skipped. Adding merge node J between C1/C2 and B lets node B depend only on J, which handles the status consolidation.
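
The status propagation in this example can be sketched in a few lines of Python. This is a simplified model for illustration only, not DataWorks code: the `Status` enum and the functions `plain_node_status` and `merge_node_status` are hypothetical names, and merge node J is assumed to accept either Successful or Branch Not Run from each branch path.

```python
from enum import Enum


class Status(Enum):
    SUCCESSFUL = "Successful"
    FAILED = "Failed"
    BRANCH_NOT_RUN = "Branch Not Run"


def plain_node_status(upstream):
    """Simplified model of an ordinary node such as B."""
    # Any upstream dependency in Branch Not Run causes the node to be skipped.
    if Status.BRANCH_NOT_RUN in upstream:
        return Status.BRANCH_NOT_RUN
    # Assumption: every remaining upstream node succeeded and the node itself runs cleanly.
    return Status.SUCCESSFUL


def merge_node_status(upstream, accepted):
    """Simplified model of a merge node such as J."""
    # The merge node outputs Successful when every upstream status is accepted.
    if all(status in accepted for status in upstream):
        return Status.SUCCESSFUL
    return None  # condition not met; that case is not modeled here


# Branch node C selected C1, so C2 was not run.
c1, c2 = Status.SUCCESSFUL, Status.BRANCH_NOT_RUN

b_without_merge = plain_node_status([c1, c2])
j = merge_node_status([c1, c2], {Status.SUCCESSFUL, Status.BRANCH_NOT_RUN})
b_with_merge = plain_node_status([j])

print(b_without_merge.value)  # Branch Not Run
print(b_with_merge.value)     # Successful
```

In this model, B is skipped when it depends on C1 and C2 directly, and runs when it depends only on J.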

Run statuses you will work with:

Status	Meaning
Successful	The node ran and completed successfully.
Failed	The node failed to run.
Branch Not Run	The branch was not selected to run. The node enters a dry-run state — marked as successful, but the task is not executed. Applies only when the upstream node is a branch node.

Prerequisites

Before you begin, ensure that you have:

A workspace on DataWorks Standard Edition or later. Merge nodes are available only in Standard Edition and later. To purchase or upgrade, see Feature details for each DataWorks edition.

Create a merge node

Go to the Data Studio page. Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and O&M > Data Development. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.
Move the pointer over the icon and choose Create Node > General > Merge Node.
In the Create Node dialog box, set the Name and Path for the node.
Click Confirm.

Define the merge logic

After creating the merge node, open its configuration tab.

Step 1: Add upstream branch nodes

In the Add Merged Branch section, find each upstream branch node by name, ID, or output. Click the icon to add it as a parent node. Repeat for each branch node whose outcomes you want to merge.

Step 2: Configure merge conditions

In the Merge Condition Settings section, specify which run statuses each upstream branch node must reach for the merge condition to be satisfied.

Merge logic operators:

Operator	Behavior
AND	All upstream branch nodes must have finished running, and every node's actual status must match one of its specified statuses.
OR	All upstream branch nodes must have finished running, and at least one node's actual status must match one of its specified statuses.

Available run statuses per branch node:

Status	Meaning
Successful	The node ran successfully.
Failed	The node failed to run.
Branch Not Run	The branch was not selected to run. The node enters a dry-run state — marked as successful, but the task is not executed. Applies only when the upstream node is a branch node.

Example (based on the figure above):

Node A: required statuses are Successful, Branch Not Run, or Failed — any completed state satisfies the condition.
Node B: required statuses are Successful or Branch Not Run — a failure does not satisfy the condition.
Merge logic: AND.

Result: the merge node outputs Successful only when Node A has finished running in any state, and Node B has finished running without failing.
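
The AND and OR operators described above can be illustrated with a small Python sketch. It is only a model of the documented behavior, not the DataWorks implementation: the function `merge_condition_met` and its arguments are hypothetical, and a status of `None` is used here to stand for a node that has not finished running.

```python
def merge_condition_met(actual, required, operator="AND"):
    """Return True when the merge condition is satisfied.

    actual:   node name -> final status string, or None if still running
    required: node name -> set of accepted status strings
    """
    # Both operators require every upstream branch node to have finished.
    if any(actual.get(name) is None for name in required):
        return False
    matches = [actual[name] in accepted for name, accepted in required.items()]
    if operator == "AND":
        return all(matches)
    if operator == "OR":
        return any(matches)
    raise ValueError(f"unknown operator: {operator}")


# Conditions from the example: Node A accepts any completed state,
# Node B accepts anything except Failed.
required = {
    "Node A": {"Successful", "Branch Not Run", "Failed"},
    "Node B": {"Successful", "Branch Not Run"},
}

print(merge_condition_met({"Node A": "Failed", "Node B": "Successful"}, required))        # True
print(merge_condition_met({"Node A": "Successful", "Node B": "Failed"}, required))        # False
print(merge_condition_met({"Node A": "Successful", "Node B": "Failed"}, required, "OR"))  # True
print(merge_condition_met({"Node A": "Successful", "Node B": None}, required, "OR"))      # False
```

The last call shows that even with OR, the condition is not met while any upstream branch node is still running.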

Step 3: Set the execution result

In the Execution Result Settings section, set the run status for the merge node itself.

Note

Currently, the only available run status for a merge node is Successful.

Configure scheduling properties

Click Schedule on the right side of the node configuration page to set scheduling properties for the merge node. For details, see Configure basic properties.

Example: branch paths with a merge node

To route execution across different branch paths, add a branch node as an upstream dependency for multiple descendant nodes and assign each descendant a different branch output. The following example shows two descendant nodes of the same branch node.

Branch 1 depends on the output named autotest.fenzhi121902_1.

Branch 2 depends on the output named autotest.fenzhi121902_2.
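
As a rough illustration of how these two dependencies play out, the following Python sketch maps each descendant to its branch output and derives its status from the output the branch node selects. The output names come from the example above; the function `descendant_statuses` is hypothetical, and the sketch assumes the selected descendant completes without error.

```python
# Each descendant depends on exactly one output of the branch node.
dependencies = {
    "Branch 1": "autotest.fenzhi121902_1",
    "Branch 2": "autotest.fenzhi121902_2",
}


def descendant_statuses(selected_output):
    """Return the status of each descendant for the selected branch output."""
    return {
        # Assumption: the descendant on the selected path runs successfully.
        node: "Successful" if output == selected_output else "Branch Not Run"
        for node, output in dependencies.items()
    }


statuses = descendant_statuses("autotest.fenzhi121902_1")
print(statuses)
```

A merge node placed after Branch 1 and Branch 2 would then see one Successful and one Branch Not Run status, whichever output is selected.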

View run results

In the Runtime Log, check the run details for the branch that meets the merge condition. The log shows that the descendant node for the branch that does not meet the condition is skipped, while the descendant node of the merge node runs normally.

What's next

Configure basic properties — set scheduling dependencies, rerun policies, and other node properties.