DataWorks: Logic of do-while nodes

Last Updated: Mar 25, 2026

A do-while node is DataWorks's implementation of the do-while control flow pattern from programming. It executes an inner workflow repeatedly until the end node returns False. Use do-while nodes to run loops, iterate over result sets from an assignment node, or implement conditional retry logic in your scheduling workflows.

How it works

When a do-while node runs, it executes its inner workflow in sequential loops. Parallel execution is not supported, so each loop must complete before the next one starts. At the end of every loop, the end node evaluates a condition and returns either True (continue looping) or False (exit). A do-while node runs at most 1,024 loops; when that limit is reached, the end node automatically returns False.
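The looping behavior described above can be sketched as a plain Python simulation. This is not DataWorks code: `run_do_while`, `inner_workflow`, and `end_condition` are hypothetical names, and the sketch assumes that the cap means exactly 1,024 loops run before the forced exit.

```python
MAX_LOOPS = 1024  # loop cap stated in this document


def run_do_while(inner_workflow, end_condition, max_loops=MAX_LOOPS):
    """Simulate a do-while node and return the number of loops executed.

    inner_workflow and end_condition are hypothetical callables that both
    receive (loop_times, offset); end_condition returns True to continue.
    """
    loop_times = 0
    while True:
        loop_times += 1          # 1 on the first loop
        offset = loop_times - 1  # zero-based index of the current loop
        inner_workflow(loop_times, offset)  # loops run one at a time
        # Assumption: once the cap is reached, the end node is forced to False.
        if loop_times >= max_loops or not end_condition(loop_times, offset):
            return loop_times


if __name__ == "__main__":
    seen = []
    total = run_do_while(lambda lt, off: seen.append((lt, off)),
                         lambda lt, off: lt < 3)
    print(total, seen)
```

Because the body always runs before the condition is checked, the inner workflow executes at least once even when the end condition is False from the start.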

The inner workflow always starts with the start node and ends with the end node. Between them, you can place any task nodes your workflow requires. DataWorks provides built-in variables so your task and end nodes can read loop state (current loop count, offset) and data passed in by an ancestor assignment node.

Limitations

| Limitation | Details |
| --- | --- |
| Edition | Standard Edition or higher. See Differences among DataWorks editions. |
| Max loops | 1,024. If exceeded, the end node returns False automatically. |
| Parallel execution | Not supported. Each loop must complete before the next starts. |
| Testing | In a standard mode workspace, you cannot test a do-while node directly in DataStudio. Commit and deploy the workflow to Operation Center in the development environment, then run the task from there. |
| End node code | Comments are not supported in end node code. |

Node composition

When you create a do-while node, DataWorks automatically creates three inner nodes:

| Node | Role | Deletable? |
| --- | --- | --- |
| start | Marks the beginning of each loop. Task nodes depend on it and it does no processing itself. | No |
| shell (default task node) | A placeholder Shell task node. Delete it and replace it with the task nodes your workflow requires. | Yes |
| end | Evaluates the exit condition and returns True or False. It is essentially an assignment node and must be a descendant of all task nodes. | No |

The inner workflow must always start with start and end with end. You can delete and reconfigure the dependencies between inner nodes to build any workflow structure in between.

If the inner workflow uses a branch node for conditional logic or result traversal, a merge node is also required. See Configure a merge node.
[Figure: Loop node]

Built-in variables

Do-while nodes provide built-in variables in the ${dag.<variable>} format. Use these in task node code and end node code to read loop state or data from an ancestor assignment node.

Loop state variables

These variables are always available, regardless of whether you use an assignment node.

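| Variable | Description | Loop 1 | Loop 2 | Loop n |
| --- | --- | --- | --- | --- |
| ${dag.loopTimes} | Number of the current loop, starting at 1 | 1 | 2 | n |
| ${dag.offset} | Zero-based index of the current loop (${dag.loopTimes} minus 1) | 0 | 1 | n-1 |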
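As a rough illustration of how the loop state variables change from loop to loop, the sketch below substitutes them into a line of node code as plain text. It assumes the placeholders are replaced textually before the code runs; that mechanism is an assumption, not something this document specifies, and `render` is a hypothetical helper.

```python
import re

# Matches simple placeholders such as ${dag.loopTimes} or ${dag.offset}.
_PLACEHOLDER = re.compile(r"\$\{dag\.([A-Za-z_]\w*)\}")


def render(template, loop_times):
    """Replace loop state placeholders with their values for one loop.

    Hypothetical helper: only loopTimes and offset are handled; any other
    placeholder is left unchanged.
    """
    values = {"loopTimes": loop_times, "offset": loop_times - 1}
    return _PLACEHOLDER.sub(
        lambda m: str(values.get(m.group(1), m.group(0))), template
    )


if __name__ == "__main__":
    line = "if [ ${dag.loopTimes} -lt 5 ]; then"
    for n in (1, 2, 3):
        print(render(line, n))
```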

Assignment node variables

If a do-while node depends on an assignment node, you can pass the assignment node's output parameters to the do-while node's Input Parameters. Input parameters follow the ${dag.<parameter-name>} format, where <parameter-name> matches the name you set in Input Parameters for the do-while node.

In the table below, input is a placeholder for the actual parameter name you configure.

| Variable | Description |
| --- | --- |
| ${dag.input} | The full dataset passed by the ancestor assignment node |
| ${dag.input[${dag.offset}]} | The data entry for the current loop |
| ${dag.input.length} | The number of entries in the dataset |

Add the assignment node's output parameters to Output Parameters for the assignment node, then add them to Input Parameters for the inner shell node of the do-while node (not the do-while node itself). The inner shell node reads data using these variables in its code.

Variable values by assignment node type

The data format depends on the language used by the assignment node.

| Assignment node type | Data format | Example of ${dag.input[${dag.offset}]} |
| --- | --- | --- |
| Shell | One-dimensional array (comma-separated values) | 2021-03-28 (loop 1), 2021-03-29 (loop 2) |
| ODPS SQL | Two-dimensional array (rows and columns) | 0016359810821, Hubei Province, 30 to 40 years old, Cancer (loop 1) |

For more details on the output format, see Output format of the outputs parameter.

Example 1: Shell assignment node

The assignment node outputs: 2021-03-28,2021-03-29,2021-03-30,2021-03-31,2021-04-01

| Variable | Loop 1 | Loop 2 |
| --- | --- | --- |
| ${dag.input} | 2021-03-28,2021-03-29,2021-03-30,2021-03-31,2021-04-01 | (same) |
| ${dag.input[${dag.offset}]} | 2021-03-28 | 2021-03-29 |
| ${dag.input.length} | 5 | (same) |
| ${dag.loopTimes} | 1 | 2 |
| ${dag.offset} | 0 | 1 |
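The values in Example 1 can be reproduced with a short Python sketch that treats the Shell output as a comma-separated, one-dimensional array. `resolve_shell_variables` and its dictionary keys are hypothetical names chosen for illustration; they are not part of DataWorks.

```python
SHELL_OUTPUT = "2021-03-28,2021-03-29,2021-03-30,2021-03-31,2021-04-01"


def resolve_shell_variables(raw_output, loop_times):
    """Return the values the built-in variables would take in one loop.

    Hypothetical helper: models the Shell assignment output as a
    comma-separated, one-dimensional array.
    """
    entries = raw_output.split(",")
    offset = loop_times - 1  # zero-based index of the current loop
    return {
        "input": raw_output,              # full dataset
        "input[offset]": entries[offset],  # entry for the current loop
        "input.length": len(entries),     # number of entries
        "loopTimes": loop_times,
        "offset": offset,
    }


if __name__ == "__main__":
    for n in (1, 2):
        print(resolve_shell_variables(SHELL_OUTPUT, n))
```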

Example 2: ODPS SQL assignment node

The assignment node's last SELECT returns:

+----------------+----------------+--------------------+--------+
| uid            | region         | age_range          | zodiac |
+----------------+----------------+--------------------+--------+
| 0016359810821  | Hubei Province | 30 to 40 years old | Cancer |
| 0016359814159  | Unknown        | 30 to 40 years old | Cancer |
+----------------+----------------+--------------------+--------+

| Variable | Loop 1 | Loop 2 |
| --- | --- | --- |
| ${dag.input} | The full table above | (same) |
| ${dag.input[${dag.offset}]} | 0016359810821, Hubei Province, 30 to 40 years old, Cancer | 0016359814159, Unknown, 30 to 40 years old, Cancer |
| ${dag.input.length} | 2 (number of rows) | (same) |
| ${dag.input[0][1]} | Hubei Province (row 0, column 1) | (same) |
| ${dag.loopTimes} | 1 | 2 |
| ${dag.offset} | 0 | 1 |
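The two-dimensional indexing in Example 2 can be illustrated by modeling the result set as a list of rows in Python. This is only a sketch: `ROWS` and `resolve_sql_variables` are hypothetical names, and how DataWorks renders a whole row as text is not modeled here.

```python
# Rows of the example result set, modeled as a list of lists (row, column).
ROWS = [
    ["0016359810821", "Hubei Province", "30 to 40 years old", "Cancer"],
    ["0016359814159", "Unknown", "30 to 40 years old", "Cancer"],
]


def resolve_sql_variables(rows, loop_times):
    """Return the values the built-in variables would take in one loop.

    Hypothetical helper: the first index selects a row, the second a column.
    """
    offset = loop_times - 1  # zero-based index of the current loop
    return {
        "input[offset]": rows[offset],  # row for the current loop
        "input.length": len(rows),      # number of rows
        "input[0][1]": rows[0][1],      # fixed cell: row 0, column 1
        "loopTimes": loop_times,
        "offset": offset,
    }


if __name__ == "__main__":
    for n in (1, 2):
        print(resolve_sql_variables(ROWS, n))
```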

End node code

The end node must return True to continue looping or False to exit. Write end node code in ODPS SQL, Shell, or Python 2. Comments are not supported in end node code.

ODPS SQL

Exit after 10 loops or when a table is empty:

SELECT CASE
    WHEN COUNT(1) > 0 AND ${dag.offset} < 9
    THEN true
    ELSE false
  END
FROM xc_dpe_e2.xc_rpt_user_info_d
WHERE dt = '20200101';

The query compares the row count and the current offset against fixed values. ${dag.offset} is 9 during the tenth loop, so the end node returns false at the end of the tenth loop, or earlier if the table returns no rows.

Shell

Exit after 5 loops:

if [ ${dag.loopTimes} -lt 5 ]; then
    echo "True"
else
    echo "False"
fi

${dag.loopTimes} is 1 on the first loop and increments by 1 each iteration. When the fifth loop finishes, ${dag.loopTimes} equals 5, the condition is no longer true, and the end node outputs False.

Python 2

Exit when all rows in the assignment node's result set have been processed:

if ${dag.loopTimes} < ${dag.input.length}:
    print True
else:
    print False

${dag.input.length} is the number of entries in the dataset. The loop continues until ${dag.loopTimes} reaches the dataset length, at which point every entry has been processed.
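To check the boundary behavior of the Shell and Python 2 conditions above, the following sketch counts how many loops run for a given continue condition. `count_loops` is a hypothetical helper, and the dataset is the one from Example 1.

```python
def count_loops(end_condition, cap=1024):
    """Return how many loops run when end_condition(loop_times) is checked
    at the end of each loop. Hypothetical helper; cap mirrors the loop limit.
    """
    for loop_times in range(1, cap + 1):
        if not end_condition(loop_times):
            return loop_times
    return cap


DATASET = ["2021-03-28", "2021-03-29", "2021-03-30", "2021-03-31", "2021-04-01"]

# Shell example: continue while loopTimes is less than 5.
shell_loops = count_loops(lambda loop_times: loop_times < 5)

# Python 2 example: continue while loopTimes is less than the dataset length.
python_loops = count_loops(lambda loop_times: loop_times < len(DATASET))

# Each loop reads the entry at offset = loopTimes - 1.
processed = [DATASET[n - 1] for n in range(1, python_loops + 1)]

# Using <= instead of < would run one extra loop with an out-of-range offset.
too_many = count_loops(lambda loop_times: loop_times <= len(DATASET))

if __name__ == "__main__":
    print(shell_loops, python_loops, processed == DATASET, too_many)
```

The strict comparison matters because ${dag.loopTimes} starts at 1 while ${dag.offset} starts at 0: when ${dag.loopTimes} equals the dataset length, the last entry has just been processed.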

Use cases

Iterate over a result set with an assignment node

Use an assignment node to query or compute a dataset, then use the do-while node to process each entry in a loop.

Setup:

  1. Create an assignment node (for example, assign_node) that outputs the dataset.

  2. Create a do-while node that depends on assign_node.

    The dependency must be on the do-while node itself, not on the inner shell node.

  3. Add the output parameters of assign_node to Output Parameters for assign_node.

  4. Add those same parameters to Input Parameters for the inner task node (shell node) of the do-while node.

    Configure input parameters on the inner shell node, not on the do-while node.

  5. In the end node code, use ${dag.loopTimes} and ${dag.input.length} to exit when all entries are processed.

To verify that assign_node passes its output correctly, use the data backfill feature in Operation Center to backfill data for both nodes together. Running the do-while node alone skips the assignment node's output.

[Figure: Typical application]

Combine with a branch node and merge node

When the inner workflow needs conditional logic or result traversal, add a branch node and merge node inside the do-while node.

Setup:

  • Inside the do-while node, connect the branch node (branch_node) and merge node (merge_node) as part of the inner workflow.

  • The branch node and merge node must always be used together inside a do-while node.

View logs

To view the run logs of a do-while node in Operation Center:

  1. Find the do-while node and open its directed acyclic graph (DAG).

  2. Right-click the node name and select View Internal Nodes.

What's next