DataWorks:Configure a do-while node

Last Updated:Jan 19, 2024

DataWorks provides do-while nodes. You can rearrange the workflow inside a do-while node, write the logic to be executed in a loop in the node, and then configure an end node to determine whether to exit from looping. You can use a do-while node alone, or use a do-while node together with an assignment node to loop through the result set passed by the assignment node. This topic provides an example of how to configure a do-while node.

Prerequisites

  • You have understood that you can configure an inner workflow for a do-while node based on your business requirements. For more information, see Composition and workflow orchestration of a do-while node.

  • You have understood that the built-in variables provided by a do-while node can be used to obtain the related values in each loop. For more information about the built-in variables, see Built-in variables.

  • You have understood that the inner workflow in a do-while node must start with the start node and end with the end node. You have also understood that the start node marks the start of a loop and the end node is used to control whether to exit from looping. For information about the end node, see Sample code for the end node.

  • You are familiar with the method to test a do-while node and the method to view run logs of a do-while node. For more information, see Precautions.

Limits

  • Only DataWorks Standard Edition and more advanced editions support do-while nodes. For more information, see Differences among DataWorks editions.

  • The maximum number of loops for a do-while node is 128.

  • Parallel execution is not supported. A loop can start only if the previous loop ends.

Create a do-while node

  1. Go to the DataStudio page.

    Log on to the DataWorks console. In the left-side navigation pane, choose Data Modeling and Development > DataStudio. On the page that appears, select the desired workspace from the drop-down list and click Go to DataStudio.

  2. Create a do-while node.

    1. On the DataStudio page, move the pointer over the Create icon and choose Create Node > General > do-while.

      Alternatively, you can find the workflow in which you want to create a do-while node, click the workflow name, right-click General, and then choose Create Node > do-while.

    2. In the Create Node dialog box, configure the Name and Path parameters.

    3. Click Confirm.

Example of using a do-while node

This section describes how to use a do-while node to run five loops and display the current number of loops each time a loop is run.

Customize the inner workflow in the do-while node

You can configure an inner workflow in the do-while node based on your business requirements. In this example, the default sql node is replaced with a Shell node.

  1. Double-click the name of the do-while node. The configuration tab of the node appears.

    By default, the do-while node consists of the start, sql, and end nodes.

    • The start node marks the start of a loop and is not used to process a loop task.

    • The sql node is a sample business processing node provided by DataWorks. You can replace the sql node based on your business requirements. For example, you can replace this node with a Shell node named Display loop count.

    • The end node marks the end of a loop and determines whether to start the next loop. The end node defines the condition for exiting from looping for the do-while node.

  2. Delete the sql node.

    1. Right-click the sql node and select Delete Node.

    2. In the Delete message, click OK.

  3. Create and configure a task node. In this example, a Shell node is used as the task node.

    1. Choose General > Shell and drag Shell to the canvas on the right.

    2. In the Create Node dialog box, configure the Node Name parameter.

    3. Click Create.

    4. On the canvas of the do-while node, drag lines to configure the Shell node as the descendant node of the start node and the ancestor node of the end node.

Edit code for the Shell node

  1. Double-click the Shell node. The configuration tab of the Shell node appears.

  2. Enter the following code in the code editor:

    echo ${dag.loopTimes}  # Display the current number of loops.

    • The ${dag.loopTimes} variable is a reserved variable of the system. This variable specifies the current number of loops, and the value of this variable starts from 1. All inner nodes of the do-while node can reference this variable. For more information about the built-in variables, see Built-in variables and Examples of variable values.

    • After you modify the code of the Shell node, save the modification. DataWorks does not remind you to save the modification when you commit the node. If you do not save the modification, the committed code is not the latest version.
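How the ${dag.loopTimes} placeholder behaves can be pictured with a small Python sketch. It assumes that the placeholder is replaced textually with the current loop count before the node code runs. The render helper is hypothetical and is not a DataWorks API.

```python
def render(code: str, loop_times: int) -> str:
    """Replace the ${dag.loopTimes} placeholder with the current loop count.

    Hypothetical helper for illustration only; not a DataWorks API.
    """
    return code.replace("${dag.loopTimes}", str(loop_times))


shell_code = "echo ${dag.loopTimes}"

# Show the command that each of the first three loops would effectively run.
for loop_times in range(1, 4):
    print(render(shell_code, loop_times))
```

Under this assumption, the Shell node effectively runs echo 1 in the first loop, echo 2 in the second loop, and so on.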

Configure the end node

Define the condition for exiting from looping in the end node.

  1. Double-click the end node. The configuration tab of the node appears.

  2. Select Python from the Language drop-down list.

  3. Enter the following code to define the condition for exiting from looping for the do-while node:

    if ${dag.loopTimes} < 5:
        print(True)
    else:
        print(False)

    • The ${dag.loopTimes} variable is a reserved variable of the system. This variable specifies the current number of loops, and the value of this variable starts from 1. All inner nodes of the do-while node can reference this variable. For more information about the built-in variables, see Built-in variables and Examples of variable values.

    • In the code, the value of the dag.loopTimes variable is compared with 5 to limit the number of loops that can be run. The value of the dag.loopTimes variable is 1 for the first loop and increases by 1 each time. In this case, the value of the ${dag.loopTimes} variable is 2 for the second loop and 5 for the fifth loop. The end node returns True for the first four loops and False for the fifth loop, so the do-while node exits from looping after the fifth loop.
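The exit condition can be checked outside DataWorks with a short Python sketch. It only mirrors the comparison used in the end node. The end_node_result function and the LOOP_LIMIT constant are names introduced here for illustration.

```python
LOOP_LIMIT = 5  # the threshold used in the end node code in this example


def end_node_result(loop_times: int) -> bool:
    """Mirror the end node condition: True means that another loop starts."""
    return loop_times < LOOP_LIMIT


# Print the loop count and the end node result for loops 1 through 5.
for loop_times in range(1, 6):
    print(loop_times, end_node_result(loop_times))
```

The result is True for loops 1 through 4 and False for loop 5, which is why exactly five loops are run.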

Commit the do-while node

Important

You can commit the do-while node only after you configure the rerun attribute and ancestor nodes for the do-while node.

  1. Click the Commit icon in the top toolbar.

  2. In the Commit dialog box, select the nodes that you want to commit.

  3. Click OK.

    If the workspace that you use is in standard mode, you must click Deploy in the top toolbar to deploy the do-while node after you commit it. For more information about how to deploy a node, see Deploy nodes.

Test the do-while node and view run logs of the do-while node

The procedure for committing, deploying, and running a do-while node is the same as that for committing, deploying, and running a common node. However, you cannot test a do-while node in DataStudio.

Note

If the workspace that you use is in standard mode, you cannot directly perform a test to run a do-while node in DataStudio.

To perform a test to run the do-while node and view the result, you must commit and deploy the workflow that contains the do-while node to Operation Center and run the do-while node in Operation Center. If you use the value passed by an assignment node in the do-while node, run both the assignment node and do-while node during the test in Operation Center.

  1. Go to the Cycle Task page to perform the data backfill operation.

    1. On the configuration tab of the do-while node, click Operation Center in the top toolbar to go to Operation Center.

    2. In the left-side navigation pane of the Operation Center page, choose Cycle Task Maintenance > Cycle Task.

    3. On the Cycle Task page, find the do-while node and click DAG in the Actions column to open the directed acyclic graph (DAG) of the do-while node. In the DAG of the do-while node, right-click the assignment node and choose Run > Current and Descendent Nodes Retroactively. In the Backfill Data dialog box, configure the parameters and click OK.

    4. Refresh the Patch Data page. After the data backfill instances are successfully run, click DAG in the Actions column of the data backfill instance generated for the do-while node.

  2. View run logs of the do-while node.

    1. Right-click the do-while node and select View Internal Nodes.

      You can view run logs of a do-while node only if you view the inner nodes of the do-while node.

      The inner workflow of the do-while node is divided into three parts:

      • The left pane of the view displays the rerun history of the do-while node. A record is generated each time a do-while node instance is run.

      • The middle pane of the view displays a loop record list that shows all existing loops of the do-while node and the status of each loop.

      • The right pane of the view displays the details about each loop. You can click a record in the loop record list to view the details of each instance in the loop.

    2. On the inner node page, click a loop that is finished in the middle pane, right-click the desired node in the right pane, and then select View Runtime Log.

  3. View run logs for the nth loop.

    On the inner node page, click Loop 3 in the middle pane to view run logs of the Shell node in the third loop.

    Similarly, click Loop 5 in the middle pane to view run logs of the end node in the fifth loop.

    The preceding example shows that a do-while node works based on the following application logic:

    1. The system starts a loop from the start node.

    2. Other nodes inside the do-while node run in sequence based on the dependencies configured for them.

    3. The system executes the conditional statement defined in the code of the end node for exiting from looping.

    4. The system records the number of loops that are run, and the next loop starts if the conditional statement returns True in the run logs of the end node.

    5. The entire looping process ends if the conditional statement returns False in the run logs of the end node.
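The application logic above can be modeled in plain Python. This is a simplified sketch, not how DataWorks is implemented: run_do_while, body, and end_condition are hypothetical names, and stopping quietly at the 128-loop limit is an assumption, because the behavior at the limit is not described in this topic.

```python
from typing import Callable, List

MAX_LOOPS = 128  # maximum number of loops stated in the Limits section


def run_do_while(body: Callable[[int], None],
                 end_condition: Callable[[int], bool]) -> int:
    """Run body at least once, then repeat while end_condition returns True.

    Returns the number of loops that were run. Hypothetical model only.
    """
    loop_times = 0
    while True:
        loop_times += 1                 # the start node marks a new loop
        body(loop_times)                # inner nodes run in dependency order
        if not end_condition(loop_times):
            break                       # the end node returned False
        if loop_times >= MAX_LOOPS:
            break                       # assumed behavior at the loop limit
    return loop_times


logged: List[int] = []
total = run_do_while(logged.append, lambda n: n < 5)
print(total, logged)
```

Loops run strictly one after another in this model, which matches the limit that parallel execution is not supported. The body always runs before the condition is evaluated, so at least one loop is run even if the condition is False from the start.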

Summary

  • Comparison between a do-while node and the while, For Each, and do-while loop statements:

    • A do-while node runs based on a workflow that starts a loop before evaluation. This node functions the same way as the do-while statement. A do-while node can use the built-in variable ${dag.offset} and input and output parameters to achieve the feature of the For Each statement.

    • A do-while node cannot achieve the feature of the while statement because a do-while node runs a loop before evaluation.

  • Work procedure of a do-while node:

    1. The system runs a loop from the start node and runs other nodes based on the dependencies configured for them.

    2. After the system runs the code that is defined for the end node in a loop, one of the following situations occurs:

      • The next loop starts if the end node returns True.

      • The entire looping process ends if the end node returns False.

  • Input and output parameters: The inner nodes of the do-while node use a variable in the ${dag.Name of the input or output parameter} format to reference the input and output parameters configured for the do-while node.

  • Built-in variables: DataWorks provides the following built-in variables for the inner nodes of the do-while node:

    • dag.loopTimes: the number of loops that are run. The value of this variable starts from 1.

    • dag.offset: the offset of the current number of loops from 1, which equals dag.loopTimes minus 1. The value of this variable starts from 0.
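The relationship between the two built-in variables, and the way an offset that starts from 0 can emulate a For Each statement, can be sketched in Python. The for_each_via_do_while function and the items list are hypothetical and stand in for a result set such as the one passed by an assignment node. No DataWorks API is used.

```python
from typing import List


def for_each_via_do_while(items: List[str]) -> List[str]:
    """Visit every item once, using an offset that starts from 0."""
    visited: List[str] = []
    loop_times = 0
    while True:
        loop_times += 1                  # plays the role of dag.loopTimes
        offset = loop_times - 1          # plays the role of dag.offset
        visited.append(items[offset])    # the loop body handles one item
        if not loop_times < len(items):  # the end node condition
            break
    return visited


print(for_each_via_do_while(["a", "b", "c"]))
```

Because the body runs before the condition is evaluated, this sketch fails on an empty list. This reflects the point made in the summary that a do-while node cannot achieve the feature of the while statement.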