An assignment node runs a short script — ODPS SQL, Python 2, or Shell — and passes its output directly as a parameter to downstream nodes. Use it to share dynamic values, such as query results or computed strings, between nodes in a workflow without an intermediate table.
Key constraints: The outputs parameter cannot exceed 2 MB. Parameters are passed to immediate downstream nodes only.
Prerequisites
Before you begin, ensure that you have:
- A DataWorks workspace with Standard Edition or higher.
- A RAM user added to the workspace and granted developer permissions.
How it works
The assignment node transfers data through a system-generated output parameter named `outputs`.
- Assignment node (producer): DataWorks captures the last output of the script and assigns it to `outputs`.
- Downstream node (consumer): You create an input parameter on the downstream node, map it to `outputs`, and reference it in your script using the `${param}` syntax.
The output format of `outputs` depends on the script language:

| Language | Source | Output format |
|---|---|---|
| ODPS SQL | Last SELECT statement | Two-dimensional array |
| Python 2 | Last print statement | One-dimensional array (split by `,`) |
| Shell | Last echo statement | One-dimensional array (split by `,`) |
Configure an assignment node and pass the output to a Shell node
The following example shows the complete workflow: configure an ODPS SQL assignment node, connect it to a Shell downstream node, and reference the passed value in the Shell script.
Step 1: Configure the assignment node
- Log on to the DataWorks console. In the left navigation pane, click Data Development & O&M > Data Development.
- In your workflow, create and open an assignment node. On the configuration page, select ODPS SQL for the language and write your query.

  ```sql
  select * from xc_dpe_e2.xc_rpt_user_info_d where dt='20191008' limit 10;
  ```

- (Optional) Click Properties in the right pane and select the Input and Output Parameters tab to confirm that DataWorks has automatically created an output parameter named `outputs`.
Step 2: Configure the Shell node
- Create a Shell node. In the workflow canvas, drag a connection from the assignment node to the Shell node to set it as a downstream node.
- On the Shell node's configuration page, click Properties in the right pane and select the Input and Output Parameters tab.
- In the Input Parameters section, click Create.
- In the dialog box, select the `outputs` parameter from the assignment node and set a name for the current node's input parameter, such as `param`. This step creates a dependency between the Shell node and the assignment node.
- Reference the value in the Shell node's script using the `${param}` syntax. Note that `${param}`, `${param[0]}`, and `${param[0][1]}` are DataWorks variable syntax, not standard Shell. DataWorks statically replaces these placeholders with actual values before submitting the script for execution.

  ```shell
  echo "Full result set: ${param}"
  echo "First row: ${param[0]}"
  echo "Second field of the first row: ${param[0][1]}"
  ```
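The static replacement described above can be pictured with a small sketch. The `replace_placeholders` helper and the value table below are hypothetical stand-ins for what DataWorks does internally before it submits the script:

```python
# A minimal sketch of static placeholder replacement: before the script
# runs, every ${...} placeholder is swapped for its resolved text.
# replace_placeholders and the values table are hypothetical; DataWorks
# performs this substitution on its side.
import re

def replace_placeholders(script, values):
    # Replace each ${...} with its value; leave unknown placeholders as-is.
    return re.sub(r"\$\{([^}]+)\}",
                  lambda m: values.get(m.group(1), m.group(0)),
                  script)

values = {
    "param": "value1,value2\nvalue3,value4",
    "param[0]": "value1,value2",
    "param[0][1]": "value2",
}
script = "echo 'Second field of the first row: '${param[0][1]};"
print(replace_placeholders(script, values))
# -> echo 'Second field of the first row: 'value2;
```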
Step 3: Run and verify
Open the business flow and click Run in the toolbar to run the workflow. The expected output is similar to the following:
```
Full result set: value1,value2
value3,value4
First row: value1,value2
Second field of the first row: value2
```
To test in a scheduled environment, commit the node to the development environment, go to Operation Center, and use data backfill to validate the execution.
Examples by language
The following examples show the assignment node code, the downstream Shell node reference script, and the expected output for each supported language.
Pass an ODPS SQL result
A SQL SELECT result is passed as a two-dimensional array.
Assignment node

```sql
select * from xc_dpe_e2.xc_rpt_user_info_d where dt='20191008' limit 2;
```

Shell node (input parameter `param` mapped to `outputs`)

```shell
# Full 2D array
echo "Full result set: ${param}"
# First row (1D array)
echo "First row: ${param[0]}"
# Second field of the first row
echo "Second field of the first row: ${param[0][1]}"
```

Expected output

```
Full result set: value1,value2
value3,value4
First row: value1,value2
Second field of the first row: value2
```
Pass a Python 2 output
The last print statement's output is split by commas and passed as a one-dimensional array.
Assignment node

```python
print "hello,dataworks";
```

Shell node (input parameter `param` mapped to `outputs`)

```shell
# Full 1D array
echo "Full result set: ${param}"
# Elements by index
echo "First element: ${param[0]}"
echo "Second element: ${param[1]}"
```

Expected output

```
Full result set: "hello","dataworks"
First element: hello
Second element: dataworks
```
Pass a Shell output
The last echo statement's output is split by commas and passed as a one-dimensional array.
Assignment node

```shell
echo "hello,dataworks";
```

Shell node (input parameter `param` mapped to `outputs`)

```shell
# Full 1D array
echo "Full result set: ${param}"
# Elements by index
echo "First element: ${param[0]}"
echo "Second element: ${param[1]}"
```

Expected output

```
Full result set: "hello","dataworks"
First element: hello
Second element: dataworks
```
Limitations
| Constraint | Detail |
|---|---|
| Scope | Parameters are passed to immediate downstream nodes only; they cannot skip nodes in the dependency chain. |
| Size | The `outputs` parameter cannot exceed 2 MB. If it does, the assignment node fails. |
| No comments | Do not include comments in the assignment node's code. Comments interfere with output parsing and can cause the node to fail or produce incorrect values. |
| No `WITH` clause | The `WITH` clause is not supported in ODPS SQL mode. |
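Because an oversized `outputs` value fails the node, one defensive pattern is to validate the payload before the final print in a Python assignment node. This is a hypothetical sketch (the guard and variable names are illustrative), and per the no-comments limitation above, strip the comments before pasting it into an actual assignment node:

```python
# Hypothetical guard: fail fast if the value the final print would emit
# exceeds the 2 MB outputs limit. Strip these comments before using this
# in a real assignment node, since comments break output parsing there.
MAX_OUTPUTS_BYTES = 2 * 1024 * 1024

result = "hello,dataworks"
if len(result.encode("utf-8")) > MAX_OUTPUTS_BYTES:
    raise ValueError("outputs would exceed the 2 MB limit")

print(result)
```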
What's next
- Loop nodes: To pass an assignment node's output to a `for-each` or `do-while` node, see Configure a for-each node and Configure a do-while node.
- Built-in parameter assignment: Several node types support built-in parameter assignment and can achieve the same result without a separate assignment node. These include EMR Hive, Hologres SQL, EMR Spark SQL, AnalyticDB for PostgreSQL, ClickHouse SQL, and MySQL.