If you want a node to use the data of its ancestor node, you can use an assignment node to pass the data. Assignment nodes support the Shell, ODPS SQL, and Python languages and automatically add the outputs parameter based on value assignment rules. This helps nodes reference the data of their ancestor nodes. This topic describes how to use an assignment node.

Prerequisites

DataWorks of the Standard Edition or a more advanced edition is activated.

Background information

When you use an assignment node to transparently pass the data that is assigned to the outputs parameter, take note of the following items:
  • Dependencies between the assignment node and its ancestor and descendant nodes
    Figure: Assignment node - Dependencies
    In the preceding figure, three assignment nodes are created: fuzhi_python, fuzhi_sql, and fuzhi_shell. Before you use these assignment nodes, you must perform the following operations:
    • Configure the start node as the ancestor node of the assignment nodes and the down_compare node as the descendant node of the assignment nodes to establish dependencies among all these nodes. The down_compare node references the outputs parameters in the assignment nodes. The assignment nodes serve as the parent nodes of the down_compare node.
    • Before you configure the assignment nodes as the ancestor nodes of down_compare, commit the assignment nodes. This ensures that the outputs parameters in the assignment nodes can be parsed when you configure the down_compare node.
  • Data passing relationships between the assignment node and its ancestor and descendant nodes
    Figure: Data passing
    After you configure Output Parameters for the assignment nodes and Input Parameters for down_compare, data passing and parameter reference relationships are established, as shown in the preceding figure.
    • The outputs parameter that needs to be referenced by down_compare must be added to Output Parameters in the Parameters section of the Properties tab for each assignment node.
    • The outputs parameter that needs to be referenced by down_compare must be added to Input Parameters in the Parameters section of the Properties tab for down_compare.
    Note
    • For some nodes created in DataStudio, you do not need assignment nodes to transparently pass data. You can manually add the outputs parameter to Output Parameters or Input Parameters for these nodes, and the parameter then functions in the same way as an assignment node. Manual configuration is supported only for the following node types: EMR Hive, EMR Spark SQL, ODPS Script, Hologres SQL, AnalyticDB for PostgreSQL, and MySQL. For more information about how to add the outputs parameter, see Configure context-based parameters.
    • If you want to transparently pass data between nodes for which you cannot manually add the outputs parameter to Output Parameters or Input Parameters, you must use assignment nodes.
    • Assignment nodes can transparently pass data only to their Level-1 child nodes.
  • Data passing verification
    If the descendant nodes of the assignment node need to reference the data that is passed by the assignment node, and the descendant nodes and the assignment node need to be run at the same time, run all these nodes on their configuration tabs or in Operation Center to check whether the assignment node passes the data to the descendant nodes.
  • Relationships between the parameter output format of the assignment node and the way the descendant nodes of the assignment node reference the outputs parameter
    The following table describes the value assignment rules of the outputs parameter in the assignment nodes that use different languages.
    Language | Value of outputs | Output format of outputs | Size limit on the value of outputs
    ODPS SQL | The data in the output of the SELECT statement in the last row is used as the value of the outputs parameter. This outputs parameter is added to Output Parameters for the assignment node. | The data is passed to the descendant nodes of the assignment node as a two-dimensional array. | The value size of the outputs parameter cannot exceed 2 MB. If the value size exceeds 2 MB, the assignment node fails to run.
    Shell | The data in the output of the ECHO statement in the last row is used as the value of the outputs parameter. This outputs parameter is added to Output Parameters for the assignment node. | The data is passed to the descendant nodes of the assignment node as a one-dimensional array whose elements are separated by commas (,). |
    Python | The data in the output of the PRINT statement in the last row is used as the value of the outputs parameter. This outputs parameter is added to Output Parameters for the assignment node. | The data is passed to the descendant nodes of the assignment node as a one-dimensional array whose elements are separated by commas (,). |
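
The value assignment rules above can be mimicked locally to see what shape the outputs parameter takes. The following is a minimal sketch and not DataWorks code: the function names last_line_to_array, rows_to_array, and check_size are hypothetical, and the sketch assumes that the Shell and Python rules amount to splitting the last non-empty output line on commas and that the 2 MB limit is measured on the UTF-8 encoded value.

```python
# Local model of the value assignment rules; not part of DataWorks.
# Assumption: the 2 MB limit is counted in bytes of UTF-8 text.
MAX_OUTPUTS_BYTES = 2 * 1024 * 1024


def last_line_to_array(stdout_text):
    """Shell/Python rule: use only the last output line and split it on commas."""
    lines = [line for line in stdout_text.splitlines() if line.strip()]
    if not lines:
        return []
    return lines[-1].split(",")


def rows_to_array(rows):
    """ODPS SQL rule: keep the rows of the last SELECT as a two-dimensional array."""
    return [list(row) for row in rows]


def check_size(serialized_value):
    """Raise an error when the value is over the limit, as a failed run would."""
    size = len(serialized_value.encode("utf-8"))
    if size > MAX_OUTPUTS_BYTES:
        raise ValueError("outputs value is %d bytes, over the 2 MB limit" % size)
    return size


# Earlier output lines are ignored; only "hello,world" is used.
shell_outputs = last_line_to_array("starting\nhello,world\n")

# Made-up rows standing in for a SELECT result.
sql_outputs = rows_to_array([("u1", "alice", 30), ("u2", "bob", 25)])
```

In this model, shell_outputs[0] corresponds to a reference to the first element of a one-dimensional value, and sql_outputs[1][2] corresponds to the third field of the second row of a two-dimensional value.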

Limits

Only DataWorks of the Standard Edition or a more advanced edition supports assignment nodes.

Procedure

This section describes how to use assignment nodes in the Python, ODPS SQL, and Shell languages to pass data to down_compare. In this example, the data in the output of the last row of the code for each assignment node is passed to the descendant node of the assignment node by configuring Input Parameters and Output Parameters. To pass the data, perform the following steps:
  1. Create assignment nodes and other nodes
  2. Configure dependencies for the created nodes
  3. Configure Output Parameters for fuzhi_sql and Input Parameters for down_compare
  4. Configure Output Parameters for fuzhi_python and Input Parameters for down_compare
  5. Configure Output Parameters for fuzhi_shell and Input Parameters for down_compare

Create assignment nodes and other nodes

In this example, three assignment nodes that use different languages (Python, ODPS SQL, and Shell) are created: fuzhi_python, fuzhi_sql, and fuzhi_shell.

  1. Log on to the DataWorks console.
  2. In the left-side navigation pane, click Workspaces.
  3. After you select the region in which the workspace that you want to manage resides, find the workspace and click Data Analytics in the Actions column.
  4. Move the pointer over the Create icon and choose General > Assignment Node.
    Alternatively, find the desired workflow in the Scheduled Workflow pane, click the workflow name, right-click General, and then choose Create > Assignment Node.
  5. In the Create Node dialog box, specify Node Name and Location.
    Notice: The node name must be 1 to 128 characters in length and can contain letters, digits, underscores (_), and periods (.).
  6. Click Commit.
Repeat steps 4 to 6 for each of the three assignment nodes. After you create the assignment nodes, create the other nodes that your workflow requires. In this example, create a node named start and a node named down_compare. start is a zero-load node, and down_compare is a Shell node. For more information about how to create these two nodes, see Create a zero-load node and Create a Shell node.
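
The naming rule in step 5 can be checked before you type names into the console. The following is a minimal sketch, assuming that "letters" means ASCII letters only; the console may accept a wider character set. The helper name is_valid_node_name is hypothetical.

```python
import re

# Assumption: "letters" means ASCII letters; the console may accept more characters.
NODE_NAME_PATTERN = re.compile(r"[A-Za-z0-9_.]{1,128}")


def is_valid_node_name(name):
    """Return True if the name is 1 to 128 characters of letters, digits, '_' or '.'."""
    return NODE_NAME_PATTERN.fullmatch(name) is not None


# Node names used in this topic.
for candidate in ["fuzhi_python", "fuzhi_sql", "fuzhi_shell", "start", "down_compare"]:
    print(candidate, is_valid_node_name(candidate))
```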

Configure dependencies for the created nodes

After you create the assignment nodes and other nodes, you must configure dependencies for these nodes based on your business requirements.
Figure: Assignment node - Dependencies
In this example, draw lines to connect start to all three assignment nodes so that start is the ancestor node of the assignment nodes. Similarly, draw lines to connect all three assignment nodes to down_compare so that down_compare is the descendant node of the assignment nodes. For more information, see Configuration by drawing lines to connect nodes.

In addition, you can configure basic properties, time properties, and resource properties for all the nodes based on your business requirements. For more information, see Basic properties, Configure time properties, and Configure the resource group.
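
The dependencies in this example form a small graph, and the rule that assignment nodes pass data only to their Level-1 child nodes can be checked against it. The following is a minimal sketch of that check on a hypothetical in-memory model; EDGES, direct_children, and can_receive_outputs are illustrative names and not part of DataWorks.

```python
# Hypothetical in-memory model of the example workflow; node names come from this topic.
EDGES = {
    "start": ["fuzhi_python", "fuzhi_sql", "fuzhi_shell"],
    "fuzhi_python": ["down_compare"],
    "fuzhi_sql": ["down_compare"],
    "fuzhi_shell": ["down_compare"],
    "down_compare": [],
}


def direct_children(node):
    """Return the Level-1 child nodes of a node."""
    return list(EDGES.get(node, []))


def can_receive_outputs(assignment_node, consumer):
    """An assignment node passes data only to its Level-1 child nodes."""
    return consumer in direct_children(assignment_node)


assignment_nodes = direct_children("start")

# True only if down_compare is a direct child of every assignment node.
ready = all(can_receive_outputs(node, "down_compare") for node in assignment_nodes)
print(ready)
```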

Configure Output Parameters for fuzhi_sql and Input Parameters for down_compare

This section describes how to configure Output Parameters for fuzhi_sql and Input Parameters for down_compare.

  1. Configure fuzhi_sql.
    1. In the desired workflow, find fuzhi_sql and double-click its name.
    2. On the configuration tab of fuzhi_sql, select ODPS SQL for Language and write value assignment code.
      Sample code:
      select * from xc_dpe_e2.xc_rpt_user_info_d where dt='20191008' limit 10;
    3. Click the Properties tab in the right-side navigation pane. Then, specify Output Parameters in the Parameters section of the Properties tab.
      fuzhi_sql assigns the data in the output of the code to the outputs parameter.
  2. Configure down_compare.
    1. In the desired workflow, find down_compare and double-click its name.
    2. On the configuration tab of down_compare, write code.
      Sample code:
      echo '${sql_inputs}';
      echo 'Use the data in the first row in the output of fuzhi_sql as the input'${sql_inputs[0]};
      echo 'Use the data in the second row in the output of fuzhi_sql as the input'${sql_inputs[1]};
      echo 'Use the value of the second field in the first row in the output of fuzhi_sql as the input'${sql_inputs[0][1]};
      echo 'Use the value of the third field in the second row in the output of fuzhi_sql as the input'${sql_inputs[1][2]};
    3. Click the Properties tab in the right-side navigation pane. Then, specify Input Parameters in the Parameters section of the Properties tab.
      Rename the outputs parameter of fuzhi_sql to sql_inputs and add sql_inputs to Input Parameters for down_compare.
  3. Run the code and view the reference result.
    1. Click the Run icon in the top toolbar.
    2. In the Warning message, click Continue to Run.
    3. View the reference result.
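
To reason about which value a reference such as ${sql_inputs[0][1]} selects, the indexing can be reproduced locally. The following is a minimal sketch, not the DataWorks implementation: the resolve function and the sample rows are hypothetical, and the sketch handles only indexed references. How an unindexed reference or a whole row is rendered is not modeled.

```python
import re

# Matches ${name[i]} and ${name[i][j]}; unindexed ${name} is left untouched on purpose.
PLACEHOLDER = re.compile(r"\$\{(\w+)((?:\[\d+\])+)\}")


def resolve(template, context):
    """Replace indexed placeholders with the matching element of context[name]."""
    def substitute(match):
        value = context[match.group(1)]
        # Apply each [index] in turn: first the row, then the field.
        for index in re.findall(r"\[(\d+)\]", match.group(2)):
            value = value[int(index)]
        return str(value)
    return PLACEHOLDER.sub(substitute, template)


# Made-up rows standing in for the result of the SELECT statement in fuzhi_sql.
context = {"sql_inputs": [["u1", "alice", "30"], ["u2", "bob", "25"]]}

line = resolve("second field of first row: ${sql_inputs[0][1]}", context)
print(line)
```

The first index selects the row and the second index selects the field within that row, both counted from 0.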

Configure Output Parameters for fuzhi_python and Input Parameters for down_compare

This section describes how to configure Output Parameters for fuzhi_python and Input Parameters for down_compare.

  1. Configure fuzhi_python.
    1. In the desired workflow, find fuzhi_python and double-click its name.
    2. On the configuration tab of fuzhi_python, select Python for Language and write value assignment code.
      Sample code:
      print "a,b,c";
    3. Click the Properties tab in the right-side navigation pane. Then, specify Output Parameters in the Parameters section of the Properties tab.
      fuzhi_python assigns the data in the output of the code to the outputs parameter. In this example, the data that is assigned is a,b,c.

      The data a,b,c is assigned to the outputs parameter of fuzhi_python as a one-dimensional array.

  2. Configure down_compare.
    1. In the desired workflow, find down_compare and double-click its name.
    2. On the configuration tab of down_compare, write code.
      Sample code:
      echo 'The output of fuzhi_python'${python_inputs};
      echo 'Use the first value in the output of fuzhi_python as the input'${python_inputs[0]};
      echo 'Use the second value in the output of fuzhi_python as the input'${python_inputs[1]};
    3. Click the Properties tab in the right-side navigation pane. Then, specify Input Parameters in the Parameters section of the Properties tab.
      Rename the outputs parameter of fuzhi_python to python_inputs and add python_inputs to Input Parameters for down_compare.
  3. Run the code and view the reference result.
    1. Click the Run icon in the top toolbar.
    2. In the Warning message, click Continue to Run.
    3. View the reference result.
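
When the values are computed rather than hard-coded, the assignment code can build the comma-separated line itself. The following is a minimal sketch with a hypothetical values list. It assumes that a comma inside a value would be read as a separator, so such values are rejected. The print call with a single parenthesized argument behaves the same way in Python 2 and Python 3.

```python
# Hypothetical values; in a real node they would come from your own logic.
values = ["a", "b", "c"]

# Assumption: a comma inside a value would be read as a separator, so reject it early.
for value in values:
    if "," in value:
        raise ValueError("value contains a comma: %r" % value)

last_line = ",".join(values)

# Keep this as the final print so that it is the last output of the code.
print(last_line)
```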

Configure Output Parameters for fuzhi_shell and Input Parameters for down_compare

This section describes how to configure Output Parameters for fuzhi_shell and Input Parameters for down_compare.

  1. Configure fuzhi_shell.
    1. In the desired workflow, find fuzhi_shell and double-click its name.
    2. On the configuration tab of fuzhi_shell, select Shell for Language and write value assignment code.
      Sample code:
      echo "hello,world";
    3. Click the Properties tab in the right-side navigation pane. Then, specify Output Parameters in the Parameters section of the Properties tab.
      fuzhi_shell assigns the data in the output of the code to the outputs parameter. In this example, the data that is assigned is hello,world.

      The data hello,world is assigned to the outputs parameter of fuzhi_shell as a one-dimensional array.

  2. Configure down_compare.
    1. In the desired workflow, find down_compare and double-click its name.
    2. On the configuration tab of down_compare, write code.
      Sample code:
      echo 'The output of fuzhi_shell'${shell_inputs};
      echo 'Use the first value in the output of fuzhi_shell as the input'${shell_inputs[0]};
      echo 'Use the second value in the output of fuzhi_shell as the input'${shell_inputs[1]};
    3. Click the Properties tab in the right-side navigation pane. Then, specify Input Parameters in the Parameters section of the Properties tab.
      Rename the outputs parameter of fuzhi_shell to shell_inputs and add shell_inputs to Input Parameters for down_compare.
  3. Run the code and view the reference result.
    1. Click the Run icon in the top toolbar.
    2. In the Warning message, click Continue to Run.
    3. View the reference result.
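
After all three configurations are complete, down_compare has three input parameters, each renamed from the outputs parameter of one assignment node. The following is a minimal sketch of that mapping on a local model. The dictionaries node_outputs and input_bindings are hypothetical, and the SQL rows are made up because the content of the sample table is not shown in this topic.

```python
# Outputs of the three assignment nodes, using the sample values from this topic.
# The SQL rows are made up, because the content of the sample table is not shown.
node_outputs = {
    "fuzhi_sql": [["u1", "alice"], ["u2", "bob"]],
    "fuzhi_python": "a,b,c".split(","),
    "fuzhi_shell": "hello,world".split(","),
}

# Input Parameters of down_compare: renamed parameter -> source assignment node.
input_bindings = {
    "sql_inputs": "fuzhi_sql",
    "python_inputs": "fuzhi_python",
    "shell_inputs": "fuzhi_shell",
}

# Build the values that down_compare sees under the renamed parameters.
down_compare_inputs = {
    name: node_outputs[source] for name, source in input_bindings.items()
}

print(down_compare_inputs["shell_inputs"][0])
print(down_compare_inputs["python_inputs"][1])
```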