DataWorks provides do-while nodes. You can rearrange the workflow inside a do-while
node, write the logic that you want to execute in a loop in the node, and then configure
an end node to determine whether to exit from looping. You can also use a do-while
node together with an assignment node to traverse the output of the assignment node
in loops. This topic provides examples on how to configure a do-while node in simple
and complex scenarios.
Prerequisites
DataWorks Standard Edition or a more advanced edition is activated.
Background information
In DataWorks, do-while nodes are a special type of node that contains inner nodes.
After you create a do-while node, the following three inner nodes are created: start, sql, and end. The start node marks the start of a loop. The sql node runs the loop task. The end node
marks the end of a loop and determines whether the next loop starts. The three inner nodes
are organized as a workflow to traverse data in loops.
You can customize the sql node and use the built-in variables provided by the do-while
node to write the code of the end node. For more information about logic principles,
see Logic principles. You can plan the workflow inside your do-while node based on your business requirements.
For more information about how to configure a do-while node, see the procedures described
in the following sections.
Limits and usage notes
- Support for do-while nodes
- You can use do-while nodes only in DataWorks Standard Edition or a more advanced edition.
- A do-while node can run a maximum of 128 loops. If the number of loops determined
by the end node exceeds 128, an error is returned.
- Internal nodes
- When you customize a do-while node, you can delete the dependencies between the internal
nodes and rearrange the internal workflow of the do-while node. However, you must
use the start node and the end node as the start and end nodes of the internal workflow of the do-while node.
- When the internal nodes of a do-while node use a branch node to perform logical judgments
or traverse results, a merge node also needs to be used.
- You cannot add comments when you develop the code of the end node of a do-while node.
- Test and running
- If the workspace is in standard mode, you cannot directly test and run a do-while
node in DataStudio.
To test the do-while node and view the result, you must commit the do-while node to
Operation Center and run the do-while node in Operation Center. If you use the value
passed by an assignment node in the do-while node, run both the assignment node and
do-while node during the test in Operation Center.
- When you view the operational logs of a do-while node in Operation Center, right-click
the do-while node and select View Internal Nodes to view the operational logs of the internal nodes.
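Because the inner nodes always run before the end node is evaluated, a do-while node runs at least one loop and at most 128. The following Python sketch is a hypothetical pre-check, not a DataWorks API: loops_needed and fits_loop_limit are illustrative names, and the sketch assumes an end condition of the form loopTimes < number of rows, such as the one used in the complex example later in this topic.

```python
MAX_LOOPS = 128  # maximum number of loops stated in the limits above


def loops_needed(row_count):
    """Loops run when the end node continues while loopTimes < row_count.

    The loop body runs before the first evaluation, so even an empty
    dataset results in one loop.
    """
    return max(1, row_count)


def fits_loop_limit(row_count, max_loops=MAX_LOOPS):
    """Return True if traversing row_count rows stays within the limit."""
    return loops_needed(row_count) <= max_loops


for rows in (0, 2, 128, 129):
    print(rows, loops_needed(rows), fits_loop_limit(rows))
```

A check like this could be run against the expected size of the assignment node output before you rely on the do-while node to traverse it.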
Procedure

- Configure node dependencies.
Configure an assignment node as an ancestor node of a do-while node.
- Configure inputs for the do-while node.
In the Input and Output Parameters section of the Properties tab for the do-while node, add the outputs parameter of the assignment node to Input Parameters.
- Configure the inner nodes of the do-while node.
Customize the workflow inside the do-while node based on your business requirements.
Then, configure built-in variables for the inner nodes of the do-while node to obtain
and traverse the output of the assignment node in loops.
Create a do-while node
- Go to the DataStudio page.
- Log on to the DataWorks console.
- In the left-side navigation pane, click Workspaces.
- In the top navigation bar, select the region where the required workspace resides,
find the workspace, and then click Data Analytics.
- In the Scheduled Workflow pane of the DataStudio page, move the pointer over the
icon and choose . Alternatively, you can click the name of the workflow in which you want to create
a do-while node, right-click General, and then choose .
- In the Create Node dialog box, set the Node Name and Location parameters.
Note: The node name must be 1 to 128 characters in length and can contain letters, digits,
underscores (_), and periods (.).
- Click Commit.
Simple example of using a do-while node
This section describes how to use a do-while node to traverse the output of an assignment
node in five loops and display the current number of loops each time a loop is run.
- Double-click the name of the do-while node. The configuration tab of the node appears.
By default, the do-while node consists of the start, sql, and end nodes.
- The start node marks the start of a loop and does not run business code.
- The sql node is a sample business processing node provided by DataWorks. You can replace
the sql node based on your business requirements. For example, you can replace this
node with a Shell node named Display loop count.
- The end node marks the end of a loop and determines whether to start the next loop. The end
node defines the condition for exiting from looping for the do-while node.
- Delete the sql node.
- Right-click the sql node and select Delete Node.
- In the Delete message, click OK.
- Create and configure a loop task node. In this example, a Shell node is used.
- Choose and drag Shell to the canvas on the right.
- In the Create Node dialog box, enter a name in the Node Name field.
Notice: The node name must be 1 to 128 characters in length and can contain letters, digits,
underscores (_), and periods (.).
- Click Commit.
- On the canvas of the do-while node, drag lines to configure the Shell node as the
descendant node of the start node and the ancestor node of the end node.
- Double-click the Shell node. The configuration tab of the Shell node appears.
- Enter the following code in the code editor:
echo ${dag.loopTimes}  # Display the current number of loops.
- The ${dag.loopTimes} variable is a reserved variable of the system. This variable specifies the current
number of loops, and the value of this variable starts from 1. All inner nodes of
the do-while node can reference this variable. For more information about built-in
variables, see Built-in variables and Examples of variable values.
- After you modify the code of the Shell node, save the modification. No message
reminds you to save the modification when you commit the node. If you do not save
the modification, the committed code is not updated to the latest version.
- Configure the end node to control the number of loops that can be run.
- Double-click the end node. The configuration tab of the node appears.
- Select Python from the Language drop-down list.
- Enter the following code to define the condition for exiting from looping for the
do-while node:
if ${dag.loopTimes}<5:
    print True;
else:
    print False;
- In the code, the value of the ${dag.loopTimes} variable is compared with 5 to limit the
number of loops that can be run. The value of the variable is 1 for the first loop and
increases by 1 with each loop, so it is 2 for the second loop and 5 for the fifth loop.
In the fifth loop, the result of ${dag.loopTimes}<5 is False, and the do-while node
exits from looping.
- On the configuration tab of the do-while node, click the Properties tab in the right-side navigation pane to configure scheduling properties for the
node. For more information, see Configure basic properties.
- Click the
icon in the top toolbar.
- Commit the do-while node.
Notice: You can commit the do-while node only after you configure the Rerun and Parent Nodes parameters.
- Click the
icon in the top toolbar.
- In the Commit dialog box, select the nodes that you want to commit and enter your comments in the
Description field.
- Click Commit.
If the workspace that you use is in standard mode, you must click Deploy in the upper-right
corner to deploy the do-while node after you commit it. For more information, see Deploy nodes.
- Test the node and view the result.
Note: If the workspace that you use is in standard mode, you cannot directly test and
run a do-while node in DataStudio.
To perform a test to run the do-while node and view the result, you must commit the
do-while node to Operation Center and run the do-while node in Operation Center. If
you use the value passed by an assignment node in the do-while node, run both the
assignment node and do-while node during the test in Operation Center.
- On the node configuration tab, click Operation Center in the upper-right corner to go to Operation Center.
- In the left-side navigation pane of the Operation Center page, choose .
- On the Cycle Task page, find the do-while node and click DAG in the Actions column
to open the directed acyclic graph (DAG) of the do-while node. In the DAG of the do-while
node, right-click the assignment node and choose . In the Patch Data dialog box, configure the parameters and click OK.
- Refresh the Patch Data page. After the data backfill instance is run, click DAG in the Actions column of the instance.
- Right-click the do-while node and select View Internal Nodes.
The internal workflow of the do-while node is divided into three parts:
- The left pane of the view displays the rerun history of the do-while node. A record
is generated each time a do-while node instance is run.
- The middle pane of the view displays a loop record list that shows all existing loops
of the do-while node and the status of each loop.
- The right pane of the view displays the details about each loop. You can click a record
in the loop record list to view the details of each instance in the loop.
- On the inner node page, click Loop 3 on the left, right-click the Shell node, and then select View Runtime Log.
The preceding example shows that a do-while node works based on the following application
logic:
- The system starts a loop from the start node.
- Other nodes inside the do-while node run in sequence based on the dependencies configured
for them.
- The system executes the conditional statement defined in the code of the end node
for exiting from looping.
- The system records the number of loops that are run, and the next loop starts if the
conditional statement returns True in the logs of the end node.
- The entire looping process ends if the conditional statement returns False in the
logs of the end node.
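The application logic above can be mimicked outside DataWorks with a short Python sketch. It is only an illustration: run_do_while, body, and end_condition are hypothetical names, and the loop counter stands in for ${dag.loopTimes}. The sketch does not model scheduling, logging, or the 128-loop limit.

```python
def run_do_while(body, end_condition):
    """Run body once per loop, then ask end_condition whether to continue.

    loop_times plays the role of ${dag.loopTimes} and starts from 1.
    """
    loop_times = 1
    results = []
    while True:
        # The inner nodes run first ...
        results.append(body(loop_times))
        # ... and only then is the end node evaluated.
        if not end_condition(loop_times):
            return results
        loop_times += 1


# Same exit condition as the end node in the simple example.
printed = run_do_while(
    body=lambda n: "loop %d" % n,
    end_condition=lambda n: n < 5,
)
print(printed)
```

With this condition the body runs for loop counts 1 through 5: the condition is True after each of the first four loops and False after the fifth.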
Complex example of using a do-while node
In addition to the preceding simple scenario, you may encounter complex scenarios
in which each data entry is processed in sequence by using a loop. You can use a do-while
node to process data in these scenarios. Before you use a do-while node to process
data in these scenarios, make sure that the following conditions are met:
- Another node is deployed and configured as the ancestor node of the do-while node.
The node can pass its output to the do-while node. You can use an assignment node
as the ancestor node.
- The output of the assignment node is configured as the input of the do-while node.
This way, the do-while node can obtain the output of the assignment node.
- The inner nodes of the do-while node can reference each data entry. The built-in variable
${dag.offset} is used to reference the input parameters configured for the do-while node.
The following example shows how to configure a do-while node in a complex scenario.
The preceding figure shows the following information:
- The output of the assignment node is a two-dimensional array. The two-dimensional
array is passed to the do-while node.
Sample values of the two-dimensional array:
+---------------+----------------+--------------------+--------+
| uid           | region         | age_range          | zodiac |
+---------------+----------------+--------------------+--------+
| 0016359810821 | Hubei Province | 30 to 40 years old | Cancer |
| 0016359814159 | Unknown        | 30 to 40 years old | Cancer |
+---------------+----------------+--------------------+--------+
- The inner nodes of the do-while node use variables to obtain and print the loop parameters,
offsets, and parameter values of the input from the ancestor assignment node.
- Create and configure an assignment node.
Key points:
- Value assignment code and input and output parameters: Select the language of the
assignment node and write the code of the assignment parameter. The system generates
output parameters for the output of the assignment node based on specific rules.
Note: The output of the assignment node is used as the input of the do-while node.

- Node dependencies: You can create an assignment node in the workflow and drag a line
to configure the assignment node as the ancestor node of the do-while node.
For more information, see Configure an assignment node.
- Configure the output of the assignment node as the input of the do-while node.
On the configuration tab of the do-while node, click the Properties tab in the right-side
navigation pane. In the Input and Output Parameters section, click Create. Set the
Parameter Name parameter to input and the Value Source parameter to the output parameter
of the ancestor assignment node.
Note: The input and output parameters are configured for the assignment node and the do-while
node, not for the inner nodes of the do-while node.

- Configure the inner loop task node of the do-while node.
Double-click the name of the do-while node. The node configuration tab appears. Then,
define the workflow inside the do-while node.
By default, the do-while node consists of three nodes: start, sql, and end. In this example,
you must delete the sql node, create a Shell node, and then write code for the Shell node
to print the loop parameters. Take note of the following key points:
- Node dependencies: After you delete the sql node and create a Shell node, you must drag lines to establish dependencies between the inner nodes.

- Loop task code: When you write code for the Shell node, you can use built-in variables
to print various loop parameters. For more information about the built-in variables
available for a do-while node, see Built-in variables. You can refer to the following sample code to write code for the Shell node:
echo '${dag.input}';
echo 'Obtain the row data of the current loop:'${dag.input[${dag.offset}]};
echo 'Obtain the offset:'${dag.offset};
echo 'Obtain the number of loops that are run:'${dag.loopTimes};
echo 'Obtain the length of the dataset passed by the ancestor assignment node _odpssql:'${dag.input.length};
echo 'If you want to select data in a specific row and a specific column in the output of the assignment node, select the value based on a two-dimensional array:'${dag.input[0][1]};
- Define the loop exit condition for the end node.
You can use the built-in variables supported by the do-while node for loop control.
In this example, the values of the dag.loopTimes and dag.input.length variables are
compared. The dag.loopTimes variable specifies the number of loops that are run, and the
dag.input.length variable specifies the length of the dataset passed by the ancestor
assignment node. If the value of the dag.loopTimes variable is less than the value of the
dag.input.length variable, the end node returns True and the next loop starts. Otherwise,
the end node returns False, and the entire looping process ends. In this example, the
following code is used:
if ${dag.loopTimes}<${dag.input.length}:
    print True;
else:
    print False;
- Run the do-while node and view the result.
Go to Operation Center, find the do-while node, and then open the DAG of the node.
In the DAG, right-click the node name and choose . In the Nodes section of the Patch Data dialog box, select the assignment node and
the do-while node. After the data backfill instances are run, you can view the result
in the logs.
Note
- If you use the value passed by an assignment node in the do-while node, run both the
assignment node and do-while node during the test in Operation Center.
- To view the operational logs of a do-while node in Operation Center, perform the following
steps: find the do-while node and open the DAG of the node. In the DAG, right-click
the node name and select View Internal Nodes to view the operational logs of the inner nodes.
- View the output of the assignment node.

- View the result that is returned after the end node is run for the first time
- View the result that is returned after the end node is run for the second time
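To see how the built-in variables line up in this example, the traversal can be traced in plain Python. This is a sketch under assumptions: the list rows stands in for ${dag.input} and holds the two sample rows shown earlier (assuming that the header row is not part of the array), loop_times stands in for ${dag.loopTimes}, and offset stands in for ${dag.offset}. It does not reproduce the exact text that DataWorks writes to the logs.

```python
# Stand-in for ${dag.input}: the sample output of the assignment node.
rows = [
    ["0016359810821", "Hubei Province", "30 to 40 years old", "Cancer"],
    ["0016359814159", "Unknown", "30 to 40 years old", "Cancer"],
]

trace = []
loop_times = 1  # ${dag.loopTimes} starts from 1
while True:
    offset = loop_times - 1  # ${dag.offset} starts from 0
    current_row = rows[offset]  # ${dag.input[${dag.offset}]}
    trace.append((loop_times, offset, current_row[0]))
    print("loop", loop_times, "offset", offset, "row", current_row)

    # End node: continue only while loopTimes < input.length.
    if not loop_times < len(rows):
        break
    loop_times += 1

# A single cell is addressed by row and column, like ${dag.input[0][1]}.
print(rows[0][1])
```

With two rows, the exit condition is True after the first loop and False after the second, so the loop body runs exactly twice.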
Summary
- Comparison between a do-while node and the while, For Each, and do-while loop statements:
- A do-while node runs the loop body before it evaluates the exit condition, which is
the same behavior as the do-while statement. Combined with the built-in variable
${dag.offset} and input and output parameters, a do-while node can also traverse a
dataset in the same way as the For Each statement.
- A do-while node cannot behave like the while statement because a do-while node
always runs a loop before the evaluation.
- Work procedure of a do-while node:
- The system runs a loop from the start node and runs other nodes based on the dependencies
configured for them.
- After the system runs the code that is defined for the end node in a loop, one of
the following situations occurs:
- The next loop starts if the end node returns True.
- The entire looping process ends if the end node returns False.
- Input and output parameters: The inner nodes of the do-while node use a variable in the ${dag.<parameter name>} format to reference the input and output parameters configured for the do-while node. For example, ${dag.input} references the parameter named input.
- Built-in variables: DataWorks provides the following built-in variables for the inner
nodes of the do-while node:
- dag.loopTimes: the number of loops that are run. The value of this variable starts from 1.
- dag.offset: the offset of the current loop relative to the first loop, which equals
dag.loopTimes minus 1. The value of this variable starts from 0.
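The difference between a do-while node and the while statement, noted in the comparison above, shows up when the condition is false from the start. The following Python sketch is illustrative only: do_while, plain_while, and never are hypothetical helpers, and counter plays the role of dag.loopTimes.

```python
def do_while(condition):
    """Run the body first, then evaluate the condition (do-while node)."""
    runs = 0
    counter = 1  # like dag.loopTimes, starts from 1
    while True:
        runs += 1  # loop body
        if not condition(counter):
            return runs
        counter += 1


def plain_while(condition):
    """Evaluate the condition first, then run the body (while statement)."""
    runs = 0
    counter = 1
    while condition(counter):
        runs += 1  # loop body
        counter += 1
    return runs


def never(counter):
    """A condition that is false from the start."""
    return False


print(do_while(never), plain_while(never))
```

The do-while form still runs its body once, whereas the while form does not run it at all. This is why a do-while node cannot stand in for the while statement.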