DataWorks provides the do-while node, which lets you repeatedly execute part of your workflow. You can define the loop's business logic inside the node and configure the end node to control the exit condition. You can also use a do-while node with an assignment node to iterate over a result set passed by the assignment node. This topic uses examples of simple and complex scenarios to demonstrate how to configure a do-while node.
Prerequisites
-
You are familiar with customizing the workflow within a do-while node. For more information, see Node components and workflow orchestration.
-
You are familiar with using built-in variables to obtain loop-related parameters within the node. For more information, see Built-in variables.
-
You understand that within a do-while node, the Start node marks the beginning of a loop, and the End node defines the loop's exit logic. For more information, see Exit loop example: End node code samples.
-
You are familiar with the considerations for testing a do-while node and viewing its logs. For more information, see Usage notes.
Limitations
-
The do-while node is available only in DataWorks Standard Edition and later editions. For more information, see Features of DataWorks editions.
-
The maximum number of loops for a do-while node is 1,024.
-
Concurrent execution is not supported. A new loop can start only after the previous one is complete.
Create a do-while node
Log on to the DataWorks console. In the target region, click in the left-side navigation pane. Select a workspace from the drop-down list and click Go to Data Development.
-
Create a do-while node.
-
On the Data Studio page, move the pointer over the icon and choose . Alternatively, you can open a workflow, right-click General, and select .
-
In the Create Node dialog box, specify the node name and path.
-
Click OK.
-
Examples
This section describes how to use a do-while node to loop 5 times and print the current loop count in each iteration. The following is an end-to-end walkthrough.
Edit the node code
A do-while node contains three inner nodes by default: Start, Shell, and End.
-
The Start node marks the beginning of a loop and serves no business purpose. It cannot be deleted.
-
The Shell node is an example business processing node provided by DataWorks.
-
The End node marks the end of a loop and determines whether to start the next iteration. It is used to define the exit condition of the do-while node and cannot be deleted.
You can also customize the workflow inside the do-while node based on your business requirements by replacing the Shell node with other nodes.
-
Double-click the Shell node to open its editing page.
-
Enter the following code.
echo ${dag.loopTimes}  # Print the loop count.
-
The ${dag.loopTimes} variable is a system reserved variable that represents the current loop count, starting from 1. Internal nodes of a do-while node can directly reference this variable. For more information about built-in variables, see Built-in variables and Context-dependent variables.
-
Make sure to save the code after you modify it in the Shell node. No prompt is displayed when you submit the node. If you do not save the code, the latest changes are not applied.
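The way ${dag.loopTimes} reaches the Shell node can be pictured as plain text replacement that happens before the node code runs. This matches the runtime log later in this topic, where the End node condition appears as if 5 < 5. The following Python sketch is illustrative only: render_node_code and its regular expression are hypothetical names and are not part of DataWorks.

```python
import re

# Hypothetical pattern: matches placeholders such as ${dag.loopTimes}.
_PLACEHOLDER = re.compile(r"\$\{(dag\.\w+)\}")


def render_node_code(code, variables):
    """Replace each ${dag.name} placeholder with its value from `variables`."""
    def substitute(match):
        # match.group(1) is the name inside the braces, e.g. "dag.loopTimes".
        return str(variables[match.group(1)])
    return _PLACEHOLDER.sub(substitute, code)


shell_code = "echo ${dag.loopTimes}"

# Render the Shell node code for iterations 1 through 5.
rendered = [render_node_code(shell_code, {"dag.loopTimes": n}) for n in range(1, 6)]
for line in rendered:
    print(line)
```

Under this assumption, each iteration runs a different concrete command (echo 1, echo 2, and so on), which is why the printed value changes from one iteration to the next.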
-
Define the loop exit condition
Define the exit condition so that the loop exits on the 5th iteration.
-
Double-click the End node to open its editing page.
-
In the Select Language drop-down list, select Python.
-
Enter the following code to define the exit condition of the do-while node.
if ${dag.loopTimes}<5:
    print True
else:
    print False
-
The ${dag.loopTimes} variable is a system reserved variable that represents the current loop count, starting from 1. Internal nodes of a do-while node can directly reference this variable. For more information about built-in variables, see Built-in variables and Context-dependent variables.
-
The code compares dag.loopTimes with 5 to limit the total number of loop iterations. In the first iteration, dag.loopTimes is 1, in the second it is 2, and so on until the fifth iteration, when it is 5. At that point, the expression ${dag.loopTimes}<5 evaluates to False, and the loop exits.
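The iteration-by-iteration evaluation described above can be traced with a short standalone Python sketch. It is not the End node code itself: end_node_output is a hypothetical function that stands in for the condition after ${dag.loopTimes} has been replaced with a concrete number.

```python
def end_node_output(loop_times, limit=5):
    """Mirror the End node condition: True continues the loop, False exits."""
    return loop_times < limit


# Evaluate the condition for loop counts 1 through 5.
trace = [(n, end_node_output(n)) for n in range(1, 6)]
for loop_times, output in trace:
    print(loop_times, output)
```

The first four evaluations yield True and the fifth yields False, so the loop body runs exactly five times.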
-
Submit the do-while node
-
Click the icon in the toolbar to save the node.
-
Click the icon in the toolbar to submit the node. When you submit the node, enter a Change Description in the Submission dialog box and specify whether to initiate a code review after the node is submitted.
Important
-
You must configure the Rerun attribute and Parent Nodes before you can submit the node.
-
Code review helps ensure the quality of task code and prevents errors caused by deploying unreviewed code directly to the production environment. If code review is enabled, the submitted node code must be approved by reviewers before it can be deployed. For more information, see Code review.
If you use a workspace in standard mode, you must click Deploy in the upper-right corner of the node editing page to deploy the task to the production environment after it is submitted. For more information, see Deploy nodes.
-
Test and view execution logs
The submission and deployment process for a do-while node is the same as that for a regular node. The online execution process is also the same, but testing from the Data Studio interface is not supported.
When DataWorks is in standard mode, you cannot directly test a do-while node from the Data Studio interface.
To test and verify the running results of a do-while node, you must submit and deploy the task that contains the do-while node to Operation Center, and then run the do-while node task from the Operation Center page. If you use values passed by an assignment node inside the do-while node, run both the assignment node and the do-while node together during testing in Operation Center.
-
Click O&M Personnel in the upper-right corner of the page to go to Operation Center.
-
In the left-side navigation pane, click .
-
Select the desired node. In the DAG on the right side, right-click the assignment node and select .
-
Refresh the Data Backfill Instance page. After the backfill instance runs successfully, click DAG next to the instance.
-
View the execution logs of the do-while node.
-
Right-click the do-while node and select View Inner Nodes.
For composite nodes such as do-while nodes, you must view the inner nodes to see the detailed execution logs. In this example, the loop ran 5 times and all iterations completed successfully.
The inner nodes view of a do-while node consists of the following three parts:
-
The left side of the view shows the rerun history list of the do-while node. Each time the do-while instance runs as a whole, a corresponding record is added to the history list.
-
The middle section shows the loop record list, which displays the total number of loop iterations and the status of each iteration.
-
The right side of the view shows the details of each iteration. Click an iteration in the loop record list to display the running status of each instance in that iteration.
-
-
On the inner nodes page, click an iteration number on the left, right-click the desired node, and select View Runtime Log.
-
View the detailed execution logs of the Nth iteration.
On the inner nodes page, click Iteration 5 on the left to view the logs of the Shell node in the 5th iteration.
The runtime log shows that in the 5th iteration, the End node executed the evaluation code if 5 < 5: print True else: print False, and the output was False (Output Result: False). The loop terminated and the task completed normally.
As shown in this example, the workflow of a do-while node is as follows:
-
The execution starts from the Start node.
-
Tasks are executed sequentially based on the defined dependency relationships.
-
The exit condition is defined in the End node.
-
After a set of tasks completes, the exit condition statement in the End node is executed.
-
If the evaluation statement in the End node prints True in the logs, the loop continues from step 1.
-
If the evaluation statement in the End node prints False in the logs, the entire loop exits and the do-while node completes.
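The workflow above can be sketched as a small Python simulation. This is a minimal model under stated assumptions, not how DataWorks schedules nodes: run_do_while, the tasks and dependencies dictionaries, and the use of the standard-library graphlib module (Python 3.9 or later) to order inner nodes are all illustrative choices.

```python
from graphlib import TopologicalSorter


def run_do_while(tasks, dependencies, end_condition):
    """Simulate a do-while node.

    tasks:         inner node name -> callable taking the loop count
    dependencies:  inner node name -> set of upstream node names
    end_condition: callable taking the loop count; True means "loop again"
    """
    # Upstream nodes come first in the resulting order.
    order = list(TopologicalSorter(dependencies).static_order())
    log = []
    loop_times = 1
    while True:
        for name in order:               # run the loop body in dependency order
            tasks[name](loop_times)
            log.append((loop_times, name))
        if not end_condition(loop_times):  # End node output False: exit
            break
        loop_times += 1                  # End node output True: next iteration
    return log


# Hypothetical inner nodes: Start does nothing, Shell prints the loop count.
tasks = {"start": lambda n: None, "shell": lambda n: print(n)}
dependencies = {"shell": {"start"}}
log = run_do_while(tasks, dependencies, lambda n: n < 3)
```

In this sketch the exit condition is checked only after every inner node of the iteration has run, which is what distinguishes the do-while pattern from a loop that tests its condition first.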
-
-
Summary
-
The do-while node relates to the while, foreach, and do...while loop patterns as follows:
-
The do-while node supports the do...while loop pattern, which executes the loop body first and then evaluates the condition. It can also indirectly implement the foreach pattern by using the system variable dag.offset in combination with node context.
-
The do-while node does not support the while loop pattern, which evaluates the condition before executing the loop body.
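The practical difference between the two patterns shows up when the condition is false from the start. The following Python sketch contrasts them. Both functions are hypothetical illustrations. Python has no native do...while statement, so that pattern is emulated with while True and break.

```python
def do_while_runs(condition):
    """Run the body first, then test the condition (do...while pattern)."""
    runs = 0
    while True:
        runs += 1                  # loop body executes before any check
        if not condition(runs):
            break
    return runs


def while_runs(condition):
    """Test the condition before each run of the body (while pattern)."""
    runs = 0
    while condition(runs + 1):     # check happens before the body
        runs += 1
    return runs


def never(loop_times):
    """A condition that is false from the very first evaluation."""
    return False


print(do_while_runs(never), while_runs(never))
```

With a condition that is always false, the do...while version still runs its body once, while the while version never runs it. This is why the inner nodes of a do-while node always execute at least once.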
-
-
Execution flow of a do-while node:
-
Starting from the Start node, tasks in the loop body are executed sequentially based on the defined dependency relationships.
-
The code defined by the user in the End node is executed.
-
If the End node outputs True, the next iteration begins.
-
If the End node outputs False, the loop terminates.
-
-
-
Using context dependencies: Internal nodes of a do-while node can reference the node context defined for the do-while node by using ${dag.<context variable name>}, where <context variable name> is the name of the context parameter.
-
System parameters: DataWorks automatically provides two system variables for internal nodes of a do-while node.
-
dag.loopTimes: The loop count starting from 1.
-
dag.offset: The offset of the current iteration relative to the first iteration, starting from 0.
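From these definitions, the offset is always one less than the loop count, so it can serve as a zero-based index into a result set. The following Python sketch shows how the foreach pattern could be emulated this way. It is an illustration only: iterate_result_set and the rows list are hypothetical stand-ins for a result set that an assignment node passes through node context.

```python
def iterate_result_set(rows):
    """Visit each row once, do-while style.

    Assumes `rows` has at least one element, because the loop body
    always runs before the exit condition is checked.
    """
    visited = []
    loop_times = 1                        # dag.loopTimes starts from 1
    while True:
        offset = loop_times - 1           # dag.offset starts from 0
        visited.append((loop_times, offset, rows[offset]))
        if not loop_times < len(rows):    # exit once the last row is processed
            break
        loop_times += 1
    return visited


rows = ["a", "b", "c"]                    # hypothetical result set
visited = iterate_result_set(rows)
for entry in visited:
    print(entry)
```

The exit condition compares the loop count with the size of the result set, so the number of iterations adapts to however many rows are passed in.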
-