Use an assignment node to pass the query result or output from an upstream node to a downstream node. An assignment node supports MaxCompute SQL, Python 2, and Shell. It automatically assigns the result of the last query or output statement to an output parameter named outputs. Downstream nodes can then reference this parameter to retrieve the result.
Prerequisites
-
Edition requirements: This feature is available only in DataWorks Standard Edition and later.
-
Permission requirements: Your RAM account must be added to the corresponding workspace and granted the Development or Workspace Manager role. For more information, see Add members to a workspace.
Key concepts: Parameter passing and referencing
The core function of an assignment node is parameter passing, which passes data generated by an upstream node to a downstream node.
-
Upstream assignment node: Generates data. It automatically assigns the last output or query result to an output parameter named
outputs. -
Downstream business node: Receives and uses the data. To use the data, add an input parameter, such as
param, to the node configuration and configure it to reference the upstream node'soutputsparameter. You can then use this parameter in your code to access the data.
Parameter format
The following table describes the format of the value passed by an assignment node.
|
Language |
Value |
Format |
|
MaxCompute SQL |
The output of the last |
The node passes the output to the downstream node as a two-dimensional array. |
|
Python 2 |
The output of the last |
The system splits the output string by commas ( For example, the last line of output from the assignment node is Important
If the output content contains a comma, you must escape it. For example, if you output |
|
Shell |
The output of the last |
Procedure
The following example shows how to pass the result of an assignment node to a Shell node. In practice, you can use any node type as the downstream node.
-
Configure the upstream assignment node
In your workflow, create and edit an assignment node. Select MaxCompute SQL, Python 2, or Shell, and then write the code that generates the result to pass to the downstream node.
print '10,20,30,40' -
Configure the downstream Shell node
Create a Shell node and reference the result from the upstream node in the Shell node editor.
-
In the Scheduling Settings panel on the right, select the Node Context Parameters tab.
-
In the Input Parameters section, click Add parameters.
-
In the dialog box that appears, select the
outputsparameter of the upstream assignment node and specify a custom Parameter Name for the current node's input parameter, such asparam.NoteAfter you complete the configuration, a dependency is automatically created between the downstream node and the upstream assignment node.
-
After configuring the parameter, you can use the format
${param}in the code of the downstream Shell node to reference the value from the upstream node.
-
-
Run and verify
-
Return to the workflow canvas, click Deploy in the toolbar, and select Full Deployment.
-
Go to and run a smoke test on the workflow.
-
In the test instance, verify that the result is as expected.
-
Limitations
-
Passing level: An assignment node can pass parameters only to its direct downstream nodes. It does not support passing parameters across multiple levels of nodes.
-
Size limit: The passed value can be a maximum of 2 MB. If the output of the assignment statement exceeds this limit, the assignment node fails to run.
-
Syntax restrictions:
-
Do not add comments to the code of an assignment node, as this can cause unexpected results.
-
MaxCompute SQL mode does not support the WITH clause.
-
Examples by language
The data format of the outputs parameter and the way downstream nodes reference it vary slightly depending on the language you use. The following examples use a Shell node as the downstream node to illustrate these differences.
Example 1: Pass a MaxCompute SQL query result
The assignment node passes the SQL query result to the downstream node as a two-dimensional array.
-
Upstream node (SQL assignment node) configuration
Assume the following SQL code returns a result set of two rows and two columns:
SELECT 'beijing', '1001' UNION ALL SELECT 'hangzhou', '1002'; -
Downstream node (Shell node) configuration and output
Add an input parameter named
regionto the Shell node and reference theoutputsof the upstream SQL node.Write the following code to read the data:
echo "Full result set: ${region}" echo "First row: ${region[0]}" echo "Second field of the first row: ${region[0][1]}"The code returns the following output:
Full result set: beijing,1001 hangzhou,1002 First row: beijing,1001 Second field of the first row: 1001
Example 2: Pass a Python 2 output result
The system splits the output of a Python 2 print statement by commas (,) and passes it to the downstream node as a one-dimensional array.
-
Upstream node (Python 2 assignment node) configuration
The Python 2 code is as follows:
print 'Electronics, Clothing, Books'; -
Downstream node (Shell node) configuration and output
Add an input parameter named
typesto the Shell node and reference theoutputsof the upstream assignment node.Write the following code to read the data:
# Directly output the entire one-dimensional array. echo "Full result set: ${types}" # Output an element from the array by its index. echo "Second element: ${types[1]}"The code returns the following output:
Full result set: Electronics,Clothing,Books Second element: Clothing
The processing logic for a Shell node is similar to that for a Python 2 node and is therefore not repeated here.
Use case: Batch processing for multiple business lines
This example shows how to use an assignment node and a for-each node to batch process user behavior data for multiple lines of business. This enables automated data processing where a single set of logic serves multiple product lines.
Background
Assume you are a data engineer at a large company responsible for processing data from three core lines of business: e-commerce (ecom), finance (finance), and logistics (logistics), with the possibility of adding more in the future. You need to apply the same aggregation logic to the user behavior logs of these lines of business every day to calculate the daily page views (PV) for each user and store the results in a unified summary table.
-
Upstream source tables (DWD layer):
-
dwd_user_behavior_ecom_d: E-commerce user behavior table. -
dwd_user_behavior_finance_d: Finance user behavior table. -
dwd_user_behavior_logistics_d: Logistics user behavior table. -
dwd_user_behavior_${line_of_business}_d: User behavior tables for other potential lines of business. -
These tables have the same schema and are partitioned by day (
dt).
-
-
Downstream destination table (DWS layer):
-
dws_user_summary_d: User summary table. -
This table is partitioned by line of business (
biz_line) and day (dt) to store the aggregated results from all lines of business.
-
Using a for-each node, you only need to maintain one set of processing logic, and the system automatically iterates through all lines of business to complete the calculations.
Data preparation
First, create the example tables and insert test data. This example uses the business date 20251010.
-
Associate a MaxCompute compute resource with the workspace.
-
Go to Data Studio and create a MaxCompute SQL node.
-
Create the source tables (DWD layer). In the MaxCompute SQL node, add the following code, select it, and run it.
-- E-commerce user behavior table CREATE TABLE IF NOT EXISTS dwd_user_behavior_ecom_d ( user_id STRING COMMENT 'User ID', action_type STRING COMMENT 'Behavior type', event_time BIGINT COMMENT 'Event time as a millisecond-level Unix timestamp' ) COMMENT 'E-commerce user behavior log detail table' PARTITIONED BY (dt STRING COMMENT 'Date partition in yyyymmdd format'); INSERT OVERWRITE TABLE dwd_user_behavior_ecom_d PARTITION (dt='20251010') VALUES ('user001', 'click', 1760004060000), -- 2025-10-10 10:01:00.000 ('user002', 'browse', 1760004150000), -- 2025-10-10 10:02:30.000 ('user001', 'add_to_cart', 1760004300000); -- 2025-10-10 10:05:00.000 -- Verify that the e-commerce user behavior table is created. SELECT * FROM dwd_user_behavior_ecom_d where dt='20251010'; -- Finance user behavior table CREATE TABLE IF NOT EXISTS dwd_user_behavior_finance_d ( user_id STRING COMMENT 'User ID', action_type STRING COMMENT 'Behavior type', event_time BIGINT COMMENT 'Event time as a millisecond-level Unix timestamp' ) COMMENT 'Finance user behavior log detail table' PARTITIONED BY (dt STRING COMMENT 'Date partition in yyyymmdd format'); INSERT OVERWRITE TABLE dwd_user_behavior_finance_d PARTITION (dt='20251010') VALUES ('user003', 'open_app', 1760020200000), -- 2025-10-10 14:30:00.000 ('user003', 'transfer', 1760020215000), -- 2025-10-10 14:30:15.000 ('user003', 'check_balance', 1760020245000), -- 2025-10-10 14:30:45.000 ('user004', 'open_app', 1760020300000); -- 2025-10-10 14:31:40.000 -- Verify that the finance user behavior table is created. SELECT * FROM dwd_user_behavior_finance_d where dt='20251010'; -- Logistics user behavior table CREATE TABLE IF NOT EXISTS dwd_user_behavior_logistics_d ( user_id STRING COMMENT 'User ID', action_type STRING COMMENT 'Behavior type', event_time BIGINT COMMENT 'Event time as a millisecond-level Unix timestamp' ) COMMENT 'Logistics user behavior log detail table' PARTITIONED BY (dt STRING COMMENT 'Date partition in yyyymmdd format'); INSERT OVERWRITE TABLE dwd_user_behavior_logistics_d PARTITION (dt='20251010') VALUES ('user001', 'check_status', 1760032800000), -- 2025-10-10 18:00:00.000 ('user005', 'schedule_pickup', 1760032920000); -- 2025-10-10 18:02:00.000 -- Verify that the logistics user behavior table is created. SELECT * FROM dwd_user_behavior_logistics_d where dt='20251010'; -
Create the destination table (DWS layer). In the MaxCompute SQL node, add the following code, select it, and run it.
CREATE TABLE IF NOT EXISTS dws_user_summary_d ( user_id STRING COMMENT 'User ID', pv BIGINT COMMENT 'Daily page views' ) COMMENT 'User daily page views summary table' PARTITIONED BY ( dt STRING COMMENT 'Date partition in yyyymmdd format', biz_line STRING COMMENT 'Line of business partition, such as ecom, finance, or logistics' );ImportantIf your workspace uses the standard development environment, you must deploy this node to the production environment and perform a data backfill.
Workflow implementation
-
Create a workflow. In the Scheduling Parameters panel on the right, set the scheduling parameter bizdate to the previous day:
$[yyyymmdd-1]. -
In the workflow, create an assignment node named get_biz_list. Use the MaxCompute SQL language to write the following code. This node outputs a list of the lines of business to be processed.
-- Output all lines of business to be processed. SELECT 'ecom' AS biz_line UNION ALL SELECT 'finance' AS biz_line UNION ALL SELECT 'logistics' AS biz_line; -
Configure the for-each node
-
Return to the workflow canvas and create a downstream for-each node for the get_biz_list assignment node.
-
Go to the for-each node settings panel. In the Scheduling panel on the right, under the tab, bind the loopDataArray parameter to the outputs of the get_biz_list node.
-
In the for-each node loop body, click Create Internal Node to create a MaxCompute SQL node, and then write the processing logic for the loop body.
Note-
This script is driven by the for-each node and runs once for each line of business.
-
At runtime, the system dynamically replaces the built-in variable ${dag.foreach.current} with the current line of business name. The expected iteration values are 'ecom', 'finance', and 'logistics'.
SET odps.sql.allow.dynamic.partition=true; INSERT OVERWRITE TABLE dws_user_summary_d PARTITION (dt='${bizdate}', biz_line) SELECT user_id, COUNT(*) AS pv, '${dag.foreach.current}' AS biz_line FROM dwd_user_behavior_${dag.foreach.current}_d WHERE dt = '${bizdate}' GROUP BY user_id; -
-
-
Add a verification node
Return to the workflow canvas. For the for-each node, click Create Downstream to create a MaxCompute SQL node, and then add the following code.
SELECT * FROM dws_user_summary_d WHERE dt='20251010' ORDER BY biz_line, user_id;
Deployment and results
Deploy the workflow to the production environment. Go to , find the workflow, and run a smoke test. Select '20251010' as the business date.
After the run completes, the final node produces the following output:
|
user_id |
pv |
dt |
biz_line |
|
user001 |
2 |
20251010 |
ecom |
|
user002 |
1 |
20251010 |
ecom |
|
user003 |
3 |
20251010 |
finance |
|
user004 |
1 |
20251010 |
finance |
|
user001 |
1 |
20251010 |
logistics |
|
user005 |
1 |
20251010 |
logistics |
Benefits
-
High scalability: To add a new line of business, you only need to add one line of SQL to the assignment node without modifying the processing logic.
-
Easy maintenance: All lines of business share the same processing logic. A change in one place applies to all.
FAQ
-
Q: In MaxCompute SQL mode, I get the error "find no select sql in sql assignment!". What should I do?
A: The MaxCompute SQL statement is missing a
SELECTstatement. Add aSELECTstatement.WITHstatements are not supported and also cause this error. -
Q: In Shell or Python mode, I get the error "OutPut Result is null, cannot handle!". What should I do?
A: The script has no output. Check whether your code contains a print statement (
printorecho). -
Q: In Shell or Python mode, how do I handle output elements that contain commas?
A: You need to escape the comma (
,) by replacing it with\,. The following Python code shows an example:categories = ["Electronics", "Clothing, Shoes & Accessories"] # Escape commas in each element by replacing ',' with '\,'. escaped_categories = [cat.replace(",", "\,") for cat in categories] # Join the escaped elements with a comma. output_string = ",".join(escaped_categories) print output_string # The final string passed to the downstream node is: # Electronics,Clothing\, Shoes & Accessories -
Q: Can a downstream node be configured to receive results from multiple upstream assignment nodes?
A: Yes. You only need to assign the results of different nodes to different parameters.

-
Q: Does the assignment node support other languages?
A: The assignment node currently supports only MaxCompute SQL, Python 2, and Shell. Some nodes, such as EMR Hive, Hologres SQL, EMR Spark SQL, AnalyticDB for PostgreSQL, ClickHouse SQL, and MySQL nodes, have a built-in assignment parameter feature that serves the same purpose.
In the Output Parameters section, click + Create Parameter.
Related reading
-
To iterate through results in a downstream node, see for-each node and do-while node.
-
To pass parameters across multiple node levels, see parameter node.
-
For more information about configuring parameters, see node context parameters.