Node context parameters are a core mechanism in DataWorks for passing dynamic data between task nodes. An upstream node (producer) can pass its output value to one or more downstream nodes. The downstream nodes can then reference these values in their code to dynamically adjust their behavior based on the upstream output. This greatly enhances the flexibility and automation of your workflows.
How it works
Node context parameters pass values by defining an output parameter in an upstream node (producer) and referencing that parameter in a downstream node (consumer).
- Upstream node (producer): Generates a value and provides it as an output parameter. There are two ways to provide a value:
  - Pass a constant or variable: In the Node Output Parameters section of the upstream node, define a parameter and assign it a value. The value can be a constant, such as 'abc', or a system context variable, such as ${status}.
  - Pass an assignment result: The system captures the last query result of the node's code, such as SELECT 'table_A';, assigns it to a built-in output parameter named outputs, and then passes the value of this parameter to downstream nodes. The parameter value depends on the code's runtime result. Assignment nodes and some SQL nodes support this method.
- Downstream node (consumer): Receives and uses the value provided by the upstream node.
  - Configure input parameters: In the Node Input Parameters section of the downstream node, add an input parameter and set its value source to the output parameter of the upstream node.
  - Establish a scheduling dependency: After you configure the input parameter, the system automatically creates a same-cycle scheduling dependency from the downstream node to the upstream node.
  - Reference in code: In the code of the downstream node, reference the value using the ${InputParameterName} format. For example, if the upstream node passes the value table_A to an input parameter named input, the downstream code SELECT * FROM ${input}; becomes SELECT * FROM table_A; at runtime.
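The substitution step described above can be pictured with a small simulation. This is only a sketch of the idea, not how DataWorks implements it: the render function, the PLACEHOLDER pattern, and the inputs dictionary are hypothetical names introduced for illustration.

```python
import re

# Matches placeholders of the form ${name}; this pattern is an assumption of the sketch.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def render(code: str, inputs: dict) -> str:
    """Replace each ${name} in code with the value bound to that input parameter."""
    def substitute(match):
        name = match.group(1)
        if name not in inputs:
            raise KeyError(f"input parameter {name!r} has no value")
        return inputs[name]

    return PLACEHOLDER.sub(substitute, code)


# The upstream node passes table_A; the downstream node binds it to "input".
rendered = render("SELECT * FROM ${input};", {"input": "table_A"})
print(rendered)  # SELECT * FROM table_A;
```

Raising an error for an unbound placeholder mirrors the point that a downstream node cannot run meaningfully without the upstream value.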
Limits
- Edition: Some nodes support the Add Assignment Parameter feature, which is used to pass query results. This feature requires DataWorks Standard Edition or a later edition.
- Node Type: The node types that support the Add Assignment Parameter feature are: EMR Hive, EMR Spark SQL, ODPS Script, Hologres SQL, AnalyticDB for PostgreSQL, ClickHouse SQL, and database nodes.
Procedure
Step 1: Configure the upstream node to output parameters
- Log on to the DataWorks console. Switch to the destination region. In the navigation pane on the left, click . Select the desired workspace from the drop-down list and click Go to DataStudio.
- In the Data Development pane, double-click the target upstream node to open its editor page.
- On the right side of the canvas, click Scheduling Settings. In the Node Context Parameters section, choose a method to configure Node Output Parameters as needed.
Method 1: Pass a constant or variable
- In the Output Parameters of This Node section, click Add Parameter.
- Configure the parameter information.

| Parameter | Description |
| --- | --- |
| Parameter Name | A custom name for the output parameter, for example, my_param. |
| Parameter Value | The value of the parameter. The following types are supported: a constant, such as hello; a system context variable, such as ${status}; a scheduling parameter, such as $bizdate, or a custom scheduling parameter in the ${...} or $[...] format. |
Method 2: Pass an assignment result
- Use an assignment node
  An assignment node (the upstream node) supports MaxCompute SQL, Python 2, and Shell. It automatically assigns the result of the last query or output to the node's output parameter (outputs). Downstream nodes can then reference this parameter to retrieve the output result of the assignment node. For more information, see Assignment node.
- Use an assignment parameter
  In a node that supports assignment parameters, perform the following steps:
  - In the Node Output Parameters section, click Add Assignment Parameter.
  - The system automatically adds an output parameter named outputs. You do not need to configure this parameter. Its value is the last query result of the node's code.
  - Click Save.
  Note: After you click Add Assignment Parameter, the assignment parameter passes the query result of this node to any downstream node that references it. If the result is empty, the execution of this node is not blocked, but the downstream nodes that reference the parameter may fail.
  For a specific example, see the MaxCompute language example in Assignment node.
- Output parameters can be deleted. Before you delete an output parameter, ensure that no downstream nodes are using it. Otherwise, the execution of downstream tasks is affected.
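Because an empty assignment result does not block the upstream node, a downstream node may want to check the value before using it. The following is a minimal sketch of such a guard, assuming the empty result reaches the downstream code as an empty string or as None. The source does not specify that representation, and require_value is a hypothetical helper, not a DataWorks function.

```python
from typing import Optional


def require_value(name: str, value: Optional[str]) -> str:
    """Return value unchanged, or raise if the upstream result was empty."""
    # Assumption: an empty upstream result shows up as None or a blank string.
    if value is None or value.strip() == "":
        raise ValueError(f"upstream parameter {name!r} is empty")
    return value


# Hypothetical value received from the upstream outputs parameter.
table = require_value("outputs", "table_A")
print(f"SELECT * FROM {table};")
```

Failing early with a clear message is usually easier to diagnose than a malformed query further down the workflow.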
Step 2: Configure the downstream node to use the parameter
- Configure input parameters
  - Open the editor page of the downstream node. Go to the configuration page. In the Node Input Parameters section, click Add Parameter.
  - Configure the input parameter. Select an output parameter from an upstream node as the Value Source for this parameter, and define a Parameter Name for this node.
  - On the toolbar, click Save to save the parameter.
- Establish dependencies
  After you attach the output parameter of the upstream node, the system automatically adds a same-cycle dependency on that node. You do not need to configure it manually.
- Reference parameters
  In the code of the downstream node, reference the parameter using the ${InputParameterName} format.
  The following example shows how to reference the input parameter param in a Shell node:
  echo "The value from upnode is ${param}"
  If the upstream node passes an assignment result to the downstream node, the parameter value is usually a two-dimensional array or a comma-separated one-dimensional array. You can access the values within the array as follows:
  - If the upstream node is an SQL node (two-dimensional array):
    - Row: ${param[i]}.
    - Cell: ${param[i][j]}.
  - If the upstream node is a Python/Shell node (one-dimensional array):
    - Element: ${param[i]}.
  All indexes start from 0.
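The array access rules can be illustrated with a short simulation. It is a sketch under stated assumptions: a two-dimensional result is indexed by row and then by column, a one-dimensional result is a comma-separated string, and indexes start from 0. The resolve function, the REFERENCE pattern, and the sample data are hypothetical and do not come from DataWorks.

```python
import re

# Matches name, name[i], or name[i][j]; this grammar is an assumption of the sketch.
REFERENCE = re.compile(r"^(\w+)((?:\[\d+\])*)$")


def resolve(reference: str, inputs: dict):
    """Look up a reference such as 'param[1][0]' in inputs, using 0-based indexes."""
    match = REFERENCE.match(reference)
    if match is None:
        raise ValueError(f"unsupported reference: {reference}")
    value = inputs[match.group(1)]
    # Apply each [n] index in order: first the row, then the column.
    for index in re.findall(r"\[(\d+)\]", match.group(2)):
        value = value[int(index)]
    return value


# Hypothetical result of an upstream SQL node: two rows, two columns.
sql_rows = [["table_A", "2024"], ["table_B", "2025"]]
# Hypothetical output of an upstream Shell node: comma-separated values.
shell_values = "a,b,c".split(",")

inputs = {"param": sql_rows, "items": shell_values}
print(resolve("param[1]", inputs))     # ['table_B', '2025']
print(resolve("param[1][0]", inputs))  # table_B
print(resolve("items[2]", inputs))     # c
```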
Step 3: Debug and run
Context parameters are passed in scheduling order only in recurring instances that are triggered by a workflow. If you run a downstream node on its own, it cannot retrieve the upstream parameters, and the task fails. When you debug, start from the upstream node and run the nodes in the order of the workflow.
- Return to the workflow. On the toolbar above the workflow, click Run, or right-click the downstream node and select Run to this node.
- In the generated directed acyclic graph (DAG) instance, click a node to view its operational log and check whether the result is as expected.
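The requirement to run upstream nodes before downstream nodes amounts to a topological ordering of the DAG. A minimal sketch using Python's standard graphlib module shows the idea. The node names are hypothetical, and the sketch does not reflect how DataWorks schedules instances internally.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key depends on the nodes listed in its set.
dependencies = {
    "upstream_node": set(),
    "downstream_node": {"upstream_node"},
    "report_node": {"downstream_node"},
}

# static_order() yields every node after all of the nodes it depends on.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)  # ['upstream_node', 'downstream_node', 'report_node']
```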
System context variables
| System Variable | Description |
| --- | --- |
| ${projectId} | Project ID. |
| ${projectName} | MaxCompute project name. |
| ${nodeId} | Node ID. |
| ${gmtdate} | The time 00:00:00 on the day of the instance's scheduled time. The format is yyyy-MM-dd 00:00:00. |
| ${taskId} | Task instance ID. |
| ${seq} | The ordinal number of the task instance, which indicates its sequence among instances of the same node on the same day. |
| ${cyctime} | The scheduled time of the instance. |
| ${status} | The status of the instance: success (SUCCESS) or failure (FAILURE). |
| ${bizdate} | The data timestamp. |
| ${finishTime} | The end time of the instance. |
| ${taskType} | The runtime type of the instance: Normal (NORMAL), Manual (MANUAL), Pause (PAUSE), Dry-run (SKIP), Not Selected (UNCHOOSE), or Skip Cycle (SKIP_CYCLE). |
| ${nodeName} | The node name. |
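As an illustration of the ${gmtdate} description, the following sketch derives a value in the yyyy-MM-dd 00:00:00 format from a scheduled time. The gmtdate_for function and the sample timestamp are hypothetical, and the sketch covers only the formatting rule stated in the table, not how DataWorks computes the variable.

```python
from datetime import datetime


def gmtdate_for(scheduled: datetime) -> str:
    """Format 00:00:00 on the day of the scheduled time as yyyy-MM-dd 00:00:00."""
    midnight = scheduled.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight.strftime("%Y-%m-%d %H:%M:%S")


# Hypothetical instance scheduled for 08:30 on 17 May 2024.
print(gmtdate_for(datetime(2024, 5, 17, 8, 30)))  # 2024-05-17 00:00:00
```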