Cron expressions cannot target relative dates such as the last day of a month. Branch nodes solve this by evaluating Python conditions at runtime and routing execution to the correct downstream path using a switch-case model — letting you implement scheduling patterns that cron alone cannot express.
How it works
Three control node types work together to implement time-based conditional scheduling:
| Node type | What it does |
|---|---|
| Assignment node | Runs a script (SQL, SHELL, or Python) and passes its output to downstream nodes via the node context feature |
| Branch node | Evaluates Python conditions against the assignment node's output and routes execution to one or more branches; nodes on unselected branches are dry run |
| Merge node | Runs regardless of whether its ancestor nodes are dry run; stops the dry run state from propagating further downstream |
How dry run propagates: When a branch is not selected, DataWorks marks all instances on that branch as dry run. This dry run state passes to every downstream node until a merge node is encountered. The merge node always succeeds, and its downstream nodes are not dry run.
The figure below shows a dependency tree with two branch nodes (X and Y) and a merge node (JOIN):
- ASN: assignment node that provides output to branch nodes X and Y
- Nodes X and Y: branch nodes; X selects the left branch, Y selects the left two branches (green lines)
- Node A: runs as expected (selected by X)
- Node C: runs as expected (selected by Y)
- Node B: dry run (selected by Y but not by X)
- Node E: dry run (not selected by Y, even though it has the common node Z as an ancestor)
- Node G: dry run (its ancestor E is dry run, even though C and F run normally)
- Node D: runs as expected (it is a descendant of the JOIN merge node, which stops dry run from propagating)
A merge node stops the dry run state at its boundary. Descendant nodes of a merge node always run, regardless of how many of their ancestors were dry run.
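The propagation rules above can be sketched as a small simulation. This is an illustrative model only, not DataWorks code: the `propagate_dry_run` helper is hypothetical, and the edge list is an assumed reconstruction of the figure based on the bullet list (for example, F is treated as a root node), so the real wiring may differ.

```python
from collections import defaultdict


def propagate_dry_run(edges, selected, merge_nodes, order):
    """Return {node: "run" | "dry run"}.

    edges       -- list of (parent, child) pairs
    selected    -- {branch_node: set of children on its selected branches}
    merge_nodes -- nodes that always run and stop dry run propagation
    order       -- all nodes in topological order (parents before children)
    """
    parents = defaultdict(list)
    for parent, child in edges:
        parents[child].append(parent)

    state = {}
    for node in order:
        if node in merge_nodes:
            # A merge node runs regardless of its ancestors' state.
            state[node] = "run"
            continue
        dry = False
        for parent in parents[node]:
            parent_dry = state[parent] == "dry run"
            # A branch node parent that did not select this child.
            not_selected = parent in selected and node not in selected[parent]
            if parent_dry or not_selected:
                dry = True
        state[node] = "dry run" if dry else "run"
    return state


# Assumed wiring, reconstructed from the bullet list; the figure may differ.
edges = [
    ("ASN", "X"), ("ASN", "Y"),
    ("X", "A"), ("X", "B"),
    ("Y", "B"), ("Y", "C"), ("Y", "E"),
    ("Z", "E"),
    ("C", "G"), ("E", "G"), ("F", "G"),
    ("A", "JOIN"), ("B", "JOIN"), ("G", "JOIN"),
    ("JOIN", "D"),
]
selected = {"X": {"A"}, "Y": {"B", "C"}}
order = ["ASN", "Z", "F", "X", "Y", "A", "B", "C", "E", "G", "JOIN", "D"]

state = propagate_dry_run(edges, selected, {"JOIN"}, order)
for node in order:
    print(node, state[node])
```

In this model a node is dry run as soon as any one of its parents is dry run or is a branch node that did not select it, which is why B and G are dry run even though some of their parents run normally.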
Common nodes vs. branch nodes: For common nodes, the output is only a globally unique string. To configure a descendant node of a common node, you specify that globally unique string as the input of the descendant node. For branch nodes, each output is associated with a condition — descendant nodes select the condition-associated output, and only those whose condition is met run as expected; the rest are dry run.
Schedule a task for the last day of each month
Cron expressions cannot target the last day of a month directly. The following steps show how to combine an assignment node, a branch node, and Shell nodes to implement this pattern.
Prerequisites
Before you begin, ensure that you have:
- A DataWorks workspace with DataStudio access
- Permissions to create and deploy nodes
Step 1: Define node dependencies
Set up the following node structure:
- Assignment node (root): uses the `SKYNET_CYCTIME` parameter to determine whether the current day is the last day of the month. Returns `1` if it is, `0` otherwise. DataWorks captures this output and passes it to the branch node.
- Branch node: defines two branches based on the assignment node's output.
- Shell nodes (two): one for last-day logic, one for all other days; each is a descendant of the branch node on a different branch.
Step 2: Configure the assignment node
In this example, the assignment node code is written in Python. DataWorks captures the last line of standard output as the value of the outputs parameter, which the branch node then reads.
The capture rules vary by language:
| Language | Captured output |
|---|---|
| SQL | Result of the last SELECT statement |
| SHELL | Last line of standard output |
| Python | Last line of standard output |
Configure the scheduling properties and paste your Python code:
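The original code is not reproduced here, so the following is a minimal sketch of what the assignment node script could look like. It assumes that `SKYNET_CYCTIME` is available as an environment variable holding the scheduling time in `yyyymmddhhmmss` format; both the access method and the format are assumptions to verify in your environment. The `is_last_day_of_month` helper is a hypothetical name.

```python
import calendar
import os
from datetime import datetime


def is_last_day_of_month(cyctime):
    """Return 1 if the timestamp falls on the last day of its month, else 0."""
    # Only the date part (first 8 characters, yyyymmdd) is needed.
    day = datetime.strptime(cyctime[:8], "%Y%m%d")
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(day.year, day.month)[1]
    return 1 if day.day == last_day else 0


# Assumption: SKYNET_CYCTIME is exposed as an environment variable.
# The fallback to the current time is only for running the sketch locally.
cyctime = os.environ.get("SKYNET_CYCTIME", datetime.now().strftime("%Y%m%d%H%M%S"))

# The value must be the last line written to standard output,
# because that line is what DataWorks captures.
print(is_last_day_of_month(cyctime))
```

Using `calendar.monthrange` avoids hard-coding month lengths and handles leap years.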

Step 3: Define branch conditions
Branch conditions are Python expressions. Each condition is linked to a specific output of the branch node. When a condition is met, the descendant nodes that depend on that output run as expected; all other descendant nodes are dry run.
Configure the scheduling properties and branch conditions.
The branch node generates an output associated with each condition.
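As a rough illustration of the switch-case model, the sketch below evaluates each condition as a Python expression against the assignment node's value and reports which outputs are selected. The output names (`branch_last_day`, `branch_other_days`), the variable name `is_last_day`, and the `select_outputs` helper are all hypothetical. DataWorks performs this evaluation internally, and the condition syntax in the branch node editor may differ.

```python
def select_outputs(conditions, value):
    """Return the names of the outputs whose condition holds for `value`.

    conditions -- {output_name: Python expression that uses `is_last_day`}
    value      -- value passed from the assignment node (text from stdout)
    """
    # Assumption: the captured output arrives as text and is compared as an integer.
    namespace = {"is_last_day": int(value)}
    selected = []
    for output_name, expression in conditions.items():
        # eval with empty builtins is for illustration only; it is not a sandbox.
        if eval(expression, {"__builtins__": {}}, namespace):
            selected.append(output_name)
    return selected


# Hypothetical output names, one per condition.
conditions = {
    "branch_last_day": "is_last_day == 1",
    "branch_other_days": "is_last_day == 0",
}

print(select_outputs(conditions, "1"))  # ['branch_last_day']
print(select_outputs(conditions, "0"))  # ['branch_other_days']
```

Descendant nodes that depend on a selected output run as expected; those that depend on any other output are dry run.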
Step 4: Associate Shell nodes with specific branches
The branch node has three outputs. Select the appropriate output as the dependency input for each Shell node. Each output is tied to a condition, so choose carefully.
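- Dependency for the Shell node that runs on the last day of each month: select the branch output tied to the last-day condition.
- Dependency for the Shell node that runs on all other days: select the branch output tied to the non-last-day condition.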
Step 5: Deploy and verify branch logic
Commit and deploy all nodes. Then use data backfill to test the branch logic. Data backfill lets you simulate any scheduling time — including past dates — so you can verify that each branch fires correctly without waiting for a live scheduling cycle.
Set the data timestamps to December 30, 2018 and December 31, 2018. These correspond to scheduling times of December 31, 2018, and January 1, 2019, respectively.
Test case 1 — last-day logic (data timestamp: December 30, 2018 → scheduling time: December 31, 2018)
The branch node selects the last-day branch:
- The Shell node for the last day runs as expected.
- The Shell node for other days is dry run.
Test case 2 — non-last-day logic (data timestamp: December 31, 2018 → scheduling time: January 1, 2019)
The branch node selects the non-last-day branch:
- The Shell node for the last day is dry run.
- The Shell node for other days runs as expected.
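The two test cases can be checked by hand with a short script. It is a sketch that assumes, as in the examples above, that the scheduling time is one day after the data timestamp. The `expected_branch` helper and the branch labels are hypothetical names.

```python
import calendar
from datetime import date, timedelta


def expected_branch(data_timestamp):
    """Return (scheduling_date, branch) for a backfill data timestamp."""
    # Assumption: scheduling time = data timestamp + 1 day.
    scheduling_date = data_timestamp + timedelta(days=1)
    last_day = calendar.monthrange(scheduling_date.year, scheduling_date.month)[1]
    branch = "last-day" if scheduling_date.day == last_day else "other-days"
    return scheduling_date, branch


for ts in (date(2018, 12, 30), date(2018, 12, 31)):
    scheduling_date, branch = expected_branch(ts)
    print(ts.isoformat(), "->", scheduling_date.isoformat(), branch)
# 2018-12-30 -> 2018-12-31 last-day
# 2018-12-31 -> 2019-01-01 other-days
```

If the backfill results disagree with this expectation, check which branch output each Shell node depends on before changing the assignment node logic.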
Key rules
| Rule | Details |
|---|---|
| Assignment node output | DataWorks captures the result of the last SELECT statement (SQL) or the last line of standard output (SHELL or Python). This is the value downstream nodes reference. |
| Branch output selection | Each output of a branch node is associated with a condition. Before setting a node as a descendant of a branch node, confirm which condition is tied to the output you are selecting. |
| Dry run propagation | If a branch is not selected, all descendant nodes on that branch are dry run. The dry run state propagates downstream until a merge node is encountered. |