In complex data workflows, it's often necessary to pass dynamic information between nodes. While a common method is to use an intermediate table, this approach is highly inefficient for passing small amounts of data, adding unnecessary I/O and complexity. The assignment node provides a lightweight solution: it executes a short script (MaxCompute SQL, Python 2, or Shell) and passes its output directly as a parameter to the downstream node. This allows you to build flexible pipelines where tasks are dynamically configured based on the results from upstream tasks.
Usage notes
Edition: DataWorks Standard Edition or higher.
Permissions: You need to have the Development or Workspace Manager role in your DataWorks workspace. For more information, see Add members to a workspace.
How it works
The core function of the assignment node is parameter passing: transferring data from an assignment node to a downstream node.
The assignment node produces the data: it automatically assigns its last output or query result to a system-generated output parameter named `outputs`.
The downstream node consumes the data: you add a node input parameter (for example, `param`) and map it to the upstream `outputs`.
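For example, a minimal sketch of both sides, assuming an assignment node whose script is the MaxCompute SQL statement `SELECT 'hello';` and a downstream Shell node whose input parameter `param` is mapped to that node's `outputs`:

```shell
# Downstream Shell node script.
# DataWorks statically replaces ${param} with the upstream `outputs` value.
echo "received: ${param}"
# Prints: received: hello
```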
Parameter format
The value and format of the outputs parameter depend on the script language used:
| Language | Value | Format |
| --- | --- | --- |
| MaxCompute SQL | The output of the last `SELECT` statement. | The result set is passed as a two-dimensional array: fields within a row are separated by commas, and rows are separated by line breaks. |
| Python 2 | The output of the last `print` statement. | The output is treated as a single string, which is then split by commas (`,`) into a one-dimensional array. Example: if the output is `a,b,c`, the downstream node receives the three elements `a`, `b`, and `c`. Important: Escape commas within the output content as `\,`. For example, if the output is `a\,b,c`, it is parsed as the two elements `a,b` and `c`. |
| Shell | The output of the last `echo` statement. | Same as Python 2: the output string is split by commas (`,`) into a one-dimensional array. |
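As a quick reference, the following sketch shows how a downstream Shell node with an input parameter named `param` (a hypothetical name) would address each format; indexing is zero-based:

```shell
# MaxCompute SQL result (two-dimensional array):
echo "${param}"        # the full result set
echo "${param[0]}"     # the first row
echo "${param[0][1]}"  # the second field of the first row

# Python 2 or Shell output (one-dimensional array):
echo "${param[1]}"     # the second element
```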
Procedure
The result of an assignment node can be passed to any type of downstream node. The following example demonstrates this workflow using a Shell node.
Configure the assignment node.
In your workflow, create and edit an assignment node. Select MaxCompute SQL, Python 2, or Shell as needed, and write the code.
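For example, a minimal MaxCompute SQL assignment script (the value is hypothetical); the result of the last query becomes the node's `outputs`:

```sql
-- The result of this SELECT is assigned to the `outputs` parameter.
SELECT 'hello_dataworks';
```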

Configure the Shell node.

Create a Shell node.
In the Scheduling panel on the right, select the Input and Output Parameters tab.
In the Input Parameters section, click Create Parameter.
Assign a name to the input parameter, such as `param`, and set its value to the assignment node's `outputs` parameter.
Note: After this configuration, DataWorks automatically creates a dependency between the Shell node and the assignment node.
After configuring the parameter, you can reference the passed value in your Shell script by using the format `${param}`.
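Continuing the hypothetical assignment script above (`SELECT 'hello_dataworks';`), a minimal Shell script would be:

```shell
# ${param} is statically replaced with the assignment node's `outputs`.
echo "Received from assignment node: ${param}"
# Expected output: Received from assignment node: hello_dataworks
```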
Run and verify.
On the workflow canvas, click Deploy in the top toolbar and select full deployment.
Navigate to .
Perform a smoke test on your target workflow and verify that the results are as expected.
Limitations
Parameters can only be passed to immediate downstream nodes.
Size limit: The `outputs` value cannot exceed 2 MB; otherwise, the assignment node fails.
Syntax limitations:
Do not include comments in the assignment node's code. Comments can disrupt output parsing and cause the node to fail or produce incorrect values.
The `WITH` clause is not supported in MaxCompute SQL mode.
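If existing logic relies on a common table expression, it can usually be inlined as a subquery before it is moved into an assignment node. A minimal sketch, with hypothetical table and column names:

```sql
-- Not supported in an assignment node:
-- WITH t AS (SELECT biz_line FROM dim_biz WHERE is_active = 1)
-- SELECT biz_line FROM t;

-- Equivalent subquery form that the assignment node accepts:
SELECT biz_line
FROM (
    SELECT biz_line FROM dim_biz WHERE is_active = 1
) t;
```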
Examples by language
The outputs data format and referencing method vary by language.
Example 1: Pass a MaxCompute SQL query result
The result of a SQL query is passed to the Shell node as a two-dimensional array.
Assignment node
Assume the SQL code returns two rows and two columns:
```sql
SELECT 'beijing', '1001'
UNION ALL
SELECT 'hangzhou', '1002';
```

Shell node
Add an input parameter named `param` that references the assignment node's `outputs`. Use the following script to read the data:

```shell
echo "Full result set: ${param}"
echo "First row: ${param[0]}"
echo "Second field of the first row: ${param[0][1]}"
```

DataWorks performs static text replacement on these references before the script runs. Output:

```
Full result set: beijing,1001
hangzhou,1002
First row: beijing,1001
Second field of the first row: 1001
```
Example 2: Pass a Python 2 output
The output of a Python 2 print statement is split by commas (,) and passed as a one-dimensional array.
Assignment node
The Python 2 code is as follows:
```python
print 'Electronics,Clothing,Books'
```

Shell node
Add an input parameter named `param` that references the assignment node's `outputs`. Use the following script to read the data:

```shell
# Output the full array
echo "Full result set: ${param}"
# Output a specific element by index
echo "Second element: ${param[1]}"
```

DataWorks performs static text replacement. Output:

```
Full result set: Electronics,Clothing,Books
Second element: Clothing
```
The processing logic for Shell nodes is similar to Python 2 nodes and will not be described again.
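For completeness, a minimal sketch of a Shell assignment node (values are hypothetical); the output of the last `echo` becomes `outputs` and is split by commas downstream:

```shell
#!/bin/bash
# The output of the last echo becomes the node's `outputs`;
# downstream nodes receive the one-dimensional array Electronics,Clothing,Books.
echo "Electronics,Clothing,Books"
```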
Scenario: Batch process data from partitioned tables for multiple lines of business
This example shows how to use an assignment node and a for-each node to batch process user behavioral data from multiple lines of business. This approach automates data processing by allowing you to use a single set of logic for multiple product lines.
Background
Assume that you are a data developer for a large internet company. You are responsible for processing data from three core lines of business: E-commerce (ecom), finance (finance), and logistics (logistics). More lines of business may be added in the future. You need to run the same aggregation logic on the user behavioral logs of these three lines of business every day. The logic calculates the daily popularity (PV) for each user and stores the results in a unified aggregate table.
Upstream source tables (DWD layer):
- `dwd_user_behavior_ecom_d`: E-commerce user behavior table.
- `dwd_user_behavior_finance_d`: Finance user behavior table.
- `dwd_user_behavior_logistics_d`: Logistics user behavior table.
- `dwd_user_behavior_${line-of-business}_d`: User behavior tables for other potential lines of business.

These tables have the same schema and are partitioned by day (`dt`).
Downstream target table (DWS layer):
- `dws_user_summary_d`: User aggregate table. This table is partitioned by both line of business (`biz_line`) and day (`dt`) and stores the aggregated results for all lines of business.
Creating a separate task for each line of business is costly to maintain and prone to errors. If you use a for-each node, you only need to maintain one set of processing logic. The system automatically traverses all lines of business to complete the calculations.
Data preparation
First, create the example tables and insert test data. This example uses the data timestamp 20251010.
Associate a MaxCompute computing resource with the workspace.
Go to DataStudio to perform data development and create a MaxCompute SQL node.
Create the source tables (DWD layer): Add the following code to the MaxCompute SQL node, select it, and run it.
```sql
-- E-commerce user behavior table
CREATE TABLE IF NOT EXISTS dwd_user_behavior_ecom_d (
    user_id STRING COMMENT 'User ID',
    action_type STRING COMMENT 'Behavior type',
    event_time BIGINT COMMENT 'Millisecond-level UNIX timestamp of the event occurrence'
)
COMMENT 'Details of E-commerce user behavioral logs'
PARTITIONED BY (dt STRING COMMENT 'Date partition in yyyymmdd format');

INSERT OVERWRITE TABLE dwd_user_behavior_ecom_d PARTITION (dt='20251010') VALUES
('user001', 'click', 1760004060000),        -- 2025-10-10 10:01:00.000
('user002', 'browse', 1760004150000),       -- 2025-10-10 10:02:30.000
('user001', 'add_to_cart', 1760004300000);  -- 2025-10-10 10:05:00.000

-- Verify that the E-commerce user behavior table is created.
SELECT * FROM dwd_user_behavior_ecom_d WHERE dt='20251010';

-- Finance user behavior table
CREATE TABLE IF NOT EXISTS dwd_user_behavior_finance_d (
    user_id STRING COMMENT 'User ID',
    action_type STRING COMMENT 'Behavior type',
    event_time BIGINT COMMENT 'Millisecond-level UNIX timestamp of the event occurrence'
)
COMMENT 'Details of finance user behavioral logs'
PARTITIONED BY (dt STRING COMMENT 'Date partition in yyyymmdd format');

INSERT OVERWRITE TABLE dwd_user_behavior_finance_d PARTITION (dt='20251010') VALUES
('user003', 'open_app', 1760020200000),       -- 2025-10-10 14:30:00.000
('user003', 'transfer', 1760020215000),       -- 2025-10-10 14:30:15.000
('user003', 'check_balance', 1760020245000),  -- 2025-10-10 14:30:45.000
('user004', 'open_app', 1760020300000);       -- 2025-10-10 14:31:40.000

-- Verify that the finance user behavior table is created.
SELECT * FROM dwd_user_behavior_finance_d WHERE dt='20251010';

-- Logistics user behavior table
CREATE TABLE IF NOT EXISTS dwd_user_behavior_logistics_d (
    user_id STRING COMMENT 'User ID',
    action_type STRING COMMENT 'Behavior type',
    event_time BIGINT COMMENT 'Millisecond-level UNIX timestamp of the event occurrence'
)
COMMENT 'Details of logistics user behavioral logs'
PARTITIONED BY (dt STRING COMMENT 'Date partition in yyyymmdd format');

INSERT OVERWRITE TABLE dwd_user_behavior_logistics_d PARTITION (dt='20251010') VALUES
('user001', 'check_status', 1760032800000),     -- 2025-10-10 18:00:00.000
('user005', 'schedule_pickup', 1760032920000);  -- 2025-10-10 18:02:00.000

-- Verify that the logistics user behavior table is created.
SELECT * FROM dwd_user_behavior_logistics_d WHERE dt='20251010';
```

Create the target table (DWS layer): Add the following code to the MaxCompute SQL node, select it, and run it.
```sql
CREATE TABLE IF NOT EXISTS dws_user_summary_d (
    user_id STRING COMMENT 'User ID',
    pv BIGINT COMMENT 'Daily popularity'
)
COMMENT 'Daily user popularity aggregate table'
PARTITIONED BY (
    dt STRING COMMENT 'Date partition in yyyymmdd format',
    biz_line STRING COMMENT 'Line-of-business partition, such as ecom, finance, logistics'
);
```

Important: If the workspace uses the standard environment, you must publish this node to the production environment and perform a data backfill.
Workflow implementation
Create a workflow. In the Scheduling Parameters pane on the right, set the scheduling parameter `bizdate` to the previous day: `$[yyyymmdd-1]`.
In the workflow, create an assignment node named get_biz_list. Write the following code in MaxCompute SQL. This node outputs a list of lines of business to process.
```sql
-- Output all lines of business to be processed
SELECT 'ecom' AS biz_line
UNION ALL
SELECT 'finance' AS biz_line
UNION ALL
SELECT 'logistics' AS biz_line;
```

Configure the for-each node
Return to the workflow canvas and create a downstream for-each node for the get_biz_list assignment node.
Go to the settings page of the for-each node. On the Schedule tab on the right, under Input and Output Parameters, set the value of the `loopDataArray` parameter to the `outputs` of the get_biz_list node.

In the for-each node loop body, click Create Inner Node. Create a MaxCompute SQL node and write the processing logic for the loop body.
Note: This script is driven by the for-each node and runs once for each line of business.
The built-in variable ${dag.foreach.current} is dynamically replaced with the current line-of-business name at runtime. The expected iteration values are 'ecom', 'finance', and 'logistics'.
```sql
SET odps.sql.allow.dynamic.partition=true;

INSERT OVERWRITE TABLE dws_user_summary_d PARTITION (dt='${bizdate}', biz_line)
SELECT
    user_id,
    COUNT(*) AS pv,
    '${dag.foreach.current}' AS biz_line
FROM dwd_user_behavior_${dag.foreach.current}_d
WHERE dt = '${bizdate}'
GROUP BY user_id;
```
Add a verification node
Return to the workflow canvas. For the for-each node, click Create Downstream Node. Create a MaxCompute SQL node and add the following code.
```sql
SELECT * FROM dws_user_summary_d WHERE dt='20251010' ORDER BY biz_line, user_id;
```
Publish and run
Publish the workflow to the production environment. In Operation Center, navigate to , find the target workflow, run a smoke test, and select '20251010' as the data timestamp.
After the run is complete, view the run log in the test instance. The expected output of the final node is:
| user_id | pv | dt | biz_line |
| --- | --- | --- | --- |
| user001 | 2 | 20251010 | ecom |
| user002 | 1 | 20251010 | ecom |
| user003 | 3 | 20251010 | finance |
| user004 | 1 | 20251010 | finance |
| user001 | 1 | 20251010 | logistics |
| user005 | 1 | 20251010 | logistics |
Advantages of this solution
High extensibility: If a new line of business is added, you only need to add one line of SQL in the assignment node, as shown in the sketch after this list. You do not need to modify the processing logic.
Easy maintenance: All lines of business share the same processing logic. A change in one place takes effect for all of them.
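For example, to onboard a hypothetical new line of business named `newbiz` (assuming a matching `dwd_user_behavior_newbiz_d` source table exists), only the assignment node's query changes:

```sql
SELECT 'ecom' AS biz_line
UNION ALL
SELECT 'finance' AS biz_line
UNION ALL
SELECT 'logistics' AS biz_line
UNION ALL
SELECT 'newbiz' AS biz_line; -- the only change needed
```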
FAQ
Q: In MaxCompute SQL, why do I receive the error "find no select sql in sql assignment!"?
A: This error occurs because the MaxCompute SQL code is missing a `SELECT` statement. Add a `SELECT` statement. The `WITH` syntax is not currently supported, and using it also causes this error.

Q: In Shell or Python, why do I receive the error "OutPut Result is null, cannot handle!"?
A: This error occurs because the node produced no output. Check whether your code contains an output statement, such as `print` (Python) or `echo` (Shell).

Q: In Shell or Python, how do I handle output elements that contain commas?
A: Escape the comma (`,`) by converting it to `\,`. The following Python 2 code provides an example:

```python
categories = ["Electronics", "Clothing, Shoes & Accessories"]

# Escape commas contained in each element: replace ',' with '\,'
escaped_categories = [cat.replace(",", "\,") for cat in categories]

# Join the escaped elements with a comma
output_string = ",".join(escaped_categories)
print output_string

# The final string output to the downstream node is:
# Electronics,Clothing\, Shoes & Accessories
```
A: Yes, it can. You can assign the results from different nodes to different parameters.
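For example, a downstream Shell node can define two input parameters (hypothetically named `paramA` and `paramB`), each mapped to the `outputs` of a different upstream assignment node:

```shell
echo "From assignment node A: ${paramA}"
echo "From assignment node B: ${paramB}"
```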

Q: Does the assignment node support other languages?
A: The assignment node currently supports only MaxCompute SQL, Python 2, and Shell. However, some nodes, such as EMR Hive, Hologres SQL, EMR Spark SQL, AnalyticDB for PostgreSQL, ClickHouse SQL, and MySQL, have a built-in parameter assignment feature that provides the same functionality.

References
For more information about how to perform loop processing in a downstream node, see for-each node and do-while node.
For more information about how to pass parameters across levels, see parameter node.
For more information about parameter passing configurations, see Node context parameters.