DataWorks provides for-each nodes. You can use a for-each node to loop through the output of an assignment node, and you can customize the workflow that runs inside the for-each node. This topic provides an example of how to configure and use a for-each node. In this example, the for-each node traverses the output of an assignment node in two loops, and the current loop count is displayed in Operation Center for each loop.
Prerequisites
Before you configure a for-each node, you must be familiar with the logic of a for-each node. This prevents errors during the configuration of the node. For information about the logic of a for-each node, see Composition and application logic.
Procedure
In most cases, a for-each node is used with an assignment node. This section describes the procedure for using a for-each node.
- Configure dependencies for a for-each node.
A for-each node must depend on an assignment node. For information about how to configure dependencies for a for-each node, see Create and configure a workflow.
- Configure inputs for the for-each node.
In the Input Parameters and Output Parameters section of the Properties tab for the for-each node, add the outputs parameter, which is the built-in output parameter of the assignment node, to Input Parameters. For information about how to configure an assignment node, see Configure an assignment node.
- Configure the inner nodes of the for-each node to obtain input parameters of the for-each node.
You can configure an inner workflow for the for-each node based on your business requirements, configure built-in variables for the inner nodes in the workflow to obtain the desired values for the input parameters, and then run the for-each node. For information about the built-in variables, see Built-in variables. For information about how to configure a for-each node, see Configure a for-each node.
- Test the for-each node.
You cannot test for-each nodes in DataStudio. To test a for-each node, go to Operation Center, find the desired inner node, and then click the name of the node to view the details of the node. For more information, see Test the for-each node and view test results.
Note: If you want to check whether an assignment node passes its output to a for-each node in Operation Center, you can use the data backfill feature and select both the assignment and for-each nodes. You cannot obtain the output of the assignment node if you run only the for-each node.
Create and configure a workflow
To create a workflow that contains an assignment node as the ancestor node and a for-each node as the descendant node, perform the following steps:
- Go to the DataStudio page.
- Create a workflow.
- Create a for-each node.
- Create an assignment node.
- Drag a directed line to configure the assignment node as the ancestor node of the for-each node.
Configure an assignment node
- On the configuration tab of the created workflow, double-click the name of the assignment node that you created. The configuration tab of the assignment node appears.
- Select SHELL from the Language drop-down list.
- Enter the following statement in the code editor:
echo 'this is name,ok';
- In the right-side navigation pane, click the Properties tab. In the Parameters section, view the information about the outputs parameter below Output Parameters. The outputs parameter is the default output parameter of the assignment node.
- Click the icon in the top toolbar to save the assignment node.
- Commit the assignment node. Important: You must specify Rerun and Parent Nodes on the Properties tab before you commit the assignment node.
- Click the icon in the top toolbar.
- In the Commit Node dialog box, specify Change description.
- Click OK.
If the workspace that you use is in standard mode, you must click Deploy in the upper-right corner to deploy the node after you commit the node. For more information, see Deploy nodes.
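The statement in this example prints a single comma-separated string, and the for-each node in this topic traverses it in two loops. The following local sketch shows how such a string breaks into two elements, assuming that the output of the assignment node is split on commas. It runs in a plain POSIX shell and does not call DataWorks. The variable names outputs, element, and element_count are illustrative only.

```shell
# Local simulation only; nothing here talks to DataWorks.
# Same statement as the one entered in the assignment node.
outputs=$(echo 'this is name,ok')

# Assumption: the output is split on commas into separate elements.
# Commas are turned into newlines so that each element is read as one line.
element_count=0
while IFS= read -r element; do
  element_count=$((element_count + 1))
  echo "element $element_count: $element"
done <<EOF
$(printf '%s\n' "$outputs" | tr ',' '\n')
EOF

echo "element count: $element_count"
```

Under that assumption, the string yields the elements "this is name" and "ok", which is consistent with the two loops described in this topic.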
Configure a for-each node
- Double-click the for-each node that you created. By default, the start, sql, and end nodes are displayed on the configuration tab of the for-each node.
- Delete the sql node if you do not need it. The workflow of the for-each node is not limited to SQL nodes.
- If you want to use an ODPS SQL node, keep the default sql node and skip this step.
- If you want to use a node other than an SQL node, delete the sql node first. In this example, a Shell node is used.
- Create and configure a Shell node. This step guides you through Shell node creation. You can use the same method to create other types of nodes. If you need to use the default sql node, skip this step.
- Configure the scheduling properties of the for-each node.
- Click the icon in the top toolbar to save the for-each node.
- Commit the for-each node. Important: You must specify Rerun and Parent Nodes on the Properties tab before you commit the node.
If the workspace that you use is in standard mode, you must click Deploy in the upper-right corner to deploy the for-each node after you commit the for-each node. For more information, see Deploy nodes.
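In this example, the inner Shell node reports the current loop count for each loop. In DataWorks, the inner node would obtain these values from the built-in variables described in Built-in variables. The following sketch only simulates that behavior locally with ordinary shell variables. The names inner_shell_node, loop_data, current_item, loop_number, and loop_total are hypothetical stand-ins and are not DataWorks identifiers. The comma-based splitting is also an assumption.

```shell
# Local simulation only; nothing here talks to DataWorks.

# Hypothetical stand-in for the body of the inner Shell node.
# $1: current element, $2: current loop number (1-based), $3: total loops
inner_shell_node() {
  echo "loop $2 of $3: current value is '$1'"
}

# Stand-in for the output passed in from the assignment node.
loop_data='this is name,ok'

# Assumption: one loop per comma-separated element,
# so the total is the number of commas plus one.
commas=$(printf '%s' "$loop_data" | tr -cd ',')
loop_total=$(( ${#commas} + 1 ))

loop_number=0
rest=$loop_data
while [ "$loop_number" -lt "$loop_total" ]; do
  loop_number=$((loop_number + 1))
  current_item=${rest%%,*}   # text before the first comma
  rest=${rest#*,}            # text after the first comma
  inner_shell_node "$current_item" "$loop_number" "$loop_total"
done
```

Each call to inner_shell_node corresponds to one loop of the for-each node, so the simulated output has one line per loop, similar to viewing the logs of the Shell node loop by loop in Operation Center.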
Test the for-each node and view test results
- On the node configuration tab, click Operation Center in the upper-right corner to go to Operation Center.
- In the left-side navigation pane of the Operation Center page, choose .
- On the Cycle Task page, find the for-each node and click DAG in the Actions column to open the directed acyclic graph (DAG) of the for-each node. In the DAG of the for-each node, right-click the assignment node and choose . In the Patch Data dialog box, configure the parameters and click OK.
- Refresh the Patch Data page. After the data backfill instance is run, click DAG in the Actions column of the instance.
- In the DAG that appears, right-click the assignment node and select View Runtime Log to view its operational logs.
- On the Patch Data page, right-click the for-each node in the DAG and select View Internal Nodes.
- On the page that appears, click Loop 1 in the middle pane, right-click the Shell node in the DAG, and then select View Runtime Log. On the page that appears, view the operational logs of the Shell node in the first loop.
- Use the same method to view the operational logs of the Shell node in the second loop.