DataWorks provides the HTTP trigger node to trigger and run tasks across different tenants. This topic describes how to set up an HTTP trigger node in DataWorks to orchestrate cross-tenant workflows.
Background
In many organizations, data processing and analysis workflows operate across multiple tenants or in different regions under the same tenant. DataWorks provides the HTTP trigger node to support these scenarios:
Orchestrating dependent tasks across different tenant environments.
Coordinating workflows between different scheduling systems in multiple regions under the same tenant. For example, a workflow in System B can start only after a node in System A completes its task.
The following diagram shows how the trigger mechanism works in a cross-tenant scenario.
Diagram description: A workflow in the Tenant B environment contains an HTTP trigger node and is deployed to Operation Center. When the workflow runs, its upstream nodes execute sequentially based on their schedules. When the execution reaches the HTTP trigger node, the node and its downstream tasks enter a waiting state. After the HTTP trigger node receives and validates a trigger command from Tenant A, the HTTP trigger node and its downstream tasks start running in sequence.
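The waiting behavior described above can be pictured with a small conceptual model. The following Java sketch is not the DataWorks implementation: the TriggerGate class and its methods are hypothetical names, and it assumes that validating a trigger command means matching the Task ID and scheduled time that the command carries.

```java
import java.util.concurrent.CompletableFuture;

// Conceptual model only; TriggerGate is a hypothetical name, not a DataWorks API.
final class TriggerGate {
    private final long taskId;
    private final long scheduledTimeMillis;
    private final CompletableFuture<Void> opened = new CompletableFuture<>();

    TriggerGate(long taskId, long scheduledTimeMillis) {
        this.taskId = taskId;
        this.scheduledTimeMillis = scheduledTimeMillis;
    }

    // Registers a downstream task; it stays pending until the gate opens.
    CompletableFuture<Void> downstream(Runnable task) {
        return opened.thenRun(task);
    }

    // Assumed validation rule: the command must name this exact instance.
    boolean trigger(long commandTaskId, long commandTimeMillis) {
        if (commandTaskId != taskId || commandTimeMillis != scheduledTimeMillis) {
            return false; // rejected: the node keeps waiting
        }
        return opened.complete(null); // true only for the first accepted command
    }

    boolean isWaiting() {
        return !opened.isDone();
    }
}
```

In this model, a command with a mismatched Task ID or time leaves the downstream task pending, and the first matching command releases it.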
Prerequisites
You have two tenant environments, each with a separate root account.
You have created a workspace in each of the two environments.
You have associated a serverless resource group with the workspace in each environment.
You have configured an Internet NAT Gateway and an EIP for the VPC of the serverless resource group that is associated with the triggering workspace.
Important: When a shell node triggers an HTTP trigger node, the trigger command must be sent over the public internet.
You have associated a MaxCompute compute resource with the workspace that contains the shell node that sends the trigger command. For more information, see Manage compute resources.
Notes
The HTTP trigger node functions only as a trigger. You do not need to configure any code for the node.
Follow the example that matches your workspace type:
Workspaces that have enabled Use Data Studio (New Version): Follow the Data Studio (new version) example.
Workspaces that have not enabled Use Data Studio (New Version): Follow the legacy Data Studio example.
Note: For workspaces that have not enabled Use Data Studio (New Version), only DataWorks Enterprise Edition supports HTTP trigger nodes.
Data Studio (new version) example
If you want to trigger node execution in the Data Studio (new version) environment of Tenant B from Tenant A, refer to the following instructions.
Tenant B environment: Create an HTTP trigger node workflow
Create a workflow in the workspace of Tenant B that will be triggered remotely. The workflow must contain an HTTP trigger node that receives the trigger command and a downstream business node (a shell node is used as an example in this topic) so that you can verify the cross-tenant triggering effect.
Go to Data Studio.
Go to the DataWorks workspace list page, switch to the target region in the top navigation bar, find the target workspace, and then click in the Operation column to go to Data Studio.
Create a workflow.
In the left-side navigation pane, click the icon, click the icon next to Project Directory, and then select Create Workflow to open the Create Workflow dialog.
In the Create Workflow dialog, specify a custom Name for the workflow (HTTP_Workflow in this example), and then click Confirm to go to the workflow editing page.
Create nodes.
In the Create Node panel on the workflow editing page, drag the HTTP trigger node and the General > Shell node to the workflow editing page.
Add the shell node as a downstream node of the HTTP trigger node.
Edit the shell node in the workflow.
Important: The HTTP trigger node functions only as a trigger. You do not need to configure any code for the node.
Hover over the shell node and click Open Node. Enter the following content on the node editing page.
echo "DataWorks";
On the right side of the shell node editing page, in Scheduling Settings, set Resource Group to the serverless resource group associated with your workspace.
Click Save in the toolbar at the top of the shell node editing page.
Deploy the workflow.
Find the workflow HTTP_Workflow that you created in the project directory. On the right side of the workflow editing page, in Scheduling Settings, set Instance generation method to Instant generation after publishing.
Click the deployment button in the toolbar to open the deployment dashboard, and then click Start Release Production. The workflow is deployed after it passes the deployment check process. For more information, see Deploy nodes.
Record the HTTP instance parameter information.
Because the HTTP trigger node generates scheduled instances immediately, you can go to Operation Center to view and record the HTTP instance parameter information.
Log on to the DataWorks console. In the target region, open Operation Center from the left-side navigation pane: select a workspace from the drop-down list and click Go to Operation Center.
In the left-side navigation pane of Operation Center, go to the Cycle Examples page.
Find the HTTP trigger node instance that you created and record the Task ID and Scheduled Time of the instance.
Note: Hover over the name of the HTTP trigger node instance to view the Task ID of the instance.
Local environment: Prepare sample code
Add POM dependencies.
You can go to the TriggerSchedulerTaskInstance debugging page and view the complete SDK installation information on the SDK Example tab.
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>3.3.0</version>
            <configuration>
                <archive>
                    <manifest>
                        <!-- Replace this with your main class -->
                        <mainClass>com.example.demo.CrossTenantTriggerSchedulerTaskInstance</mainClass>
                    </manifest>
                </archive>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <!-- Execute during the package phase -->
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
<dependencies>
    <dependency>
        <groupId>com.aliyun</groupId>
        <artifactId>dataworks_public20240518</artifactId>
        <version>6.2.0</version>
    </dependency>
</dependencies>
Important: After you finish developing the code, replace the mainClass parameter with the main class name of the Java code that you created. The main class name uses the format of the full package name followed by the class name. For example, com.example.demo.CrossTenantTriggerSchedulerTaskInstance.
Develop the code.
package com.example.demo;

import com.aliyun.dataworks_public20240518.Client;
import com.aliyun.dataworks_public20240518.models.TriggerSchedulerTaskInstanceRequest;
import com.aliyun.dataworks_public20240518.models.TriggerSchedulerTaskInstanceResponse;
import com.aliyun.teautil.models.RuntimeOptions;

import java.text.SimpleDateFormat;

public class CrossTenantTriggerSchedulerTaskInstance {

    // Creates an Alibaba Cloud DataWorks client.
    public static Client createClient20240518(String accessId, String accessKey, String endpoint) throws Exception {
        com.aliyun.teaopenapi.models.Config config = new com.aliyun.teaopenapi.models.Config();
        config.setAccessKeyId(accessId);       // AccessKey ID
        config.setAccessKeySecret(accessKey);  // AccessKey Secret
        config.setEndpoint(endpoint);          // Service endpoint
        return new Client(config);
    }

    // Triggers a DataWorks node run.
    public static TriggerSchedulerTaskInstanceResponse runTriggerScheduler(Client client, Long nodeId, String envType, Long triggerTime) throws Exception {
        TriggerSchedulerTaskInstanceRequest request = new TriggerSchedulerTaskInstanceRequest();
        request.setTaskId(nodeId);             // Task ID of the node to trigger
        request.setEnvType(envType);           // Project environment
        request.setTriggerTime(triggerTime);   // Scheduled trigger time (millisecond timestamp)
        RuntimeOptions runtime = new RuntimeOptions();
        return client.triggerSchedulerTaskInstanceWithOptions(request, runtime);
    }

    public static void main(String[] args) throws Exception {
        // Expect three arguments: nodeId, EnvTypeStr, and cycTimeParam.
        if (args.length < 3) {
            System.out.println("Usage: java -jar test-1.0-SNAPSHOT.jar nodeId EnvTypeStr cycTimeParam");
            System.exit(1);
        }
        Long nodeId = Long.valueOf(args[0]);
        String envTypeStr = args[1];
        String cycTimeStr = args[2];

        // Parse the scheduled time into a millisecond timestamp (uses the JVM default time zone).
        SimpleDateFormat sdft = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        Long cycTime = sdft.parse(cycTimeStr).getTime();
        System.out.println("Scheduled timestamp: " + cycTime);

        // Configure Alibaba Cloud service parameters.
        String endpoint = "dataworks.cn-hangzhou.aliyuncs.com"; // Service endpoint
        String accessId = "xxx";                                // AccessKey ID
        String accessKey = "xxx";                               // AccessKey Secret

        // Create the client, send the trigger command, and print the response as JSON.
        Client client = createClient20240518(accessId, accessKey, endpoint);
        TriggerSchedulerTaskInstanceResponse response = runTriggerScheduler(client, nodeId, envTypeStr, cycTime);
        System.out.println(com.aliyun.teautil.Common.toJSONString(com.aliyun.teautil.Common.toMap(response)));
    }
}
Replace the endpoint, accessId, and accessKey parameters in the preceding code with the actual values based on the following parameter descriptions.
Parameter | Description
endpoint | The endpoint of the workspace where the target HTTP trigger node resides. For more information, see Alibaba Cloud OpenAPI Developer Portal endpoints.
accessId | The AccessKey ID of the Alibaba Cloud account that owns the workspace where the target HTTP trigger node resides. Log on to the DataWorks console, hover over the profile icon in the upper-right corner of the top navigation bar, and then go to AccessKey to obtain the AccessKey ID and AccessKey Secret of the RAM user that owns the target HTTP trigger node. Warning: The AccessKey pair (AK) of an Alibaba Cloud root account has full permissions. If the AK is leaked, severe security risks may occur. We recommend that you use the AK of a RAM user that has only the workspace administrator role in the target workspace.
accessKey | The AccessKey Secret of the Alibaba Cloud account that owns the workspace where the target HTTP trigger node resides.
Package the preceding code into a JAR file with the jar-with-dependencies.jar suffix.
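Because a leaked AK is a security risk, you may prefer not to compile the AccessKey pair into the JAR. The following is a minimal sketch that reads the pair from environment variables instead. The AccessKeyEnv class is a hypothetical helper, and the variable names are assumptions; how the variables are set on the machine that runs the JAR is outside this sketch.

```java
import java.util.Map;

// Hypothetical helper; not part of the DataWorks SDK.
final class AccessKeyEnv {
    // Assumed variable names; adjust them to your own conventions.
    static final String ID_VAR = "ALIBABA_CLOUD_ACCESS_KEY_ID";
    static final String SECRET_VAR = "ALIBABA_CLOUD_ACCESS_KEY_SECRET";

    // Returns the trimmed value of a required variable or fails with a clear message.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value.trim();
    }
}
```

In the sample main method, the "xxx" literals could then be replaced with AccessKeyEnv.require(System.getenv(), AccessKeyEnv.ID_VAR) and AccessKeyEnv.require(System.getenv(), AccessKeyEnv.SECRET_VAR) before the values are passed to createClient20240518.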
Tenant A environment: Configure a shell node
Create and configure a shell node in the workspace of Tenant A to send the trigger command.
Go to Data Studio.
Go to the DataWorks workspace list page, switch to the target region in the top navigation bar, find the target workspace, and then click in the Operation column to go to Data Studio.
Upload a JAR resource.
In the left-side navigation pane, click the icon to go to resource management.
On the Resource Management page, click the Create button or the icon, and then select the JAR resource type.
In the New Resource and Function dialog, enter the resource name http_node_work.jar and click Confirm.
On the resource details page, click the Click Upload button to upload the JAR file that you generated in the Local environment: Prepare sample code step, and then set Data Sources to the MaxCompute data source that you associated.
Save and deploy the resource.
After the JAR resource is uploaded, click the deployment button in the toolbar to open the deployment dashboard, and then click Start Release Production. Complete the deployment check process. For more information, see Deploy nodes.
Create a triggering shell node.
In the left-side navigation pane, click the icon, and then click the icon next to Project Directory.
Select General > Shell to open the New Node dialog.
Specify a custom Name for the node and click Confirm to go to the node editing page.
Edit the triggering shell node.
In the left-side navigation pane of the shell node editing page, click the icon and find the JAR resource that you uploaded (http_node_work.jar).
Right-click the JAR resource that you uploaded and select Insert Resource Path.
Complete the trigger code execution parameters in the shell node.
##@resource_reference{"http_node_work.jar"}
java -jar http_node_work.jar nodeId "EnvTypeStr" "cycTimeParam"
Parameter | Description
java -jar | The JAR execution command.
http_node_work.jar | The name of the resource that you referenced.
nodeId | The Task ID of the HTTP trigger node that you recorded in the Tenant B environment: Create an HTTP trigger node workflow step.
EnvTypeStr | The project environment of the target HTTP trigger node. In this topic, the HTTP trigger node is deployed to the production environment. Set this parameter to Prod. To run the HTTP trigger node in the development environment, set this parameter to Dev.
cycTimeParam | The Scheduled Time of the HTTP trigger node task that you recorded in the Tenant B environment: Create an HTTP trigger node workflow step. The time format is yyyy-MM-dd HH:mm:ss.
Configure the triggering shell node.
On the right side of the shell node editing page, in Scheduling Settings, select the serverless resource group associated with the workspace as the Resource Group.
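The sample JAR turns the cycTimeParam string into a millisecond timestamp by using the default time zone of the JVM that runs it. If the machine that runs the shell node uses a different time zone than the one in which the Scheduled Time is displayed, the resulting timestamp shifts. The following sketch makes the time zone explicit. The ScheduledTime class is a hypothetical helper, and the zone you pass (Asia/Shanghai in the usage note) is an assumption that you must check against your own environment.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; not part of the DataWorks SDK.
final class ScheduledTime {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Converts a "yyyy-MM-dd HH:mm:ss" string to epoch milliseconds in the given zone.
    static long toEpochMillis(String text, String zoneId) {
        LocalDateTime local = LocalDateTime.parse(text, FORMAT);
        return local.atZone(ZoneId.of(zoneId)).toInstant().toEpochMilli();
    }
}
```

For example, ScheduledTime.toEpochMillis("2025-01-01 00:00:00", "Asia/Shanghai") returns 1735660800000, while the same string interpreted in UTC yields 1735689600000, a difference of eight hours.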
Run and view results
Run the shell node in Tenant A to trigger the HTTP trigger node and its downstream tasks in Tenant B.
Run the triggering shell node.
Click Run in the toolbar above the shell node that you configured in the Tenant A environment: Configure a shell node step.
View the execution results.
Go to the Tenant B environment and follow the steps below to view the execution results.
Log on to the DataWorks console. In the target region, open Operation Center from the left-side navigation pane: select a workspace from the drop-down list and click Go to Operation Center.
In the left-side navigation pane of Operation Center, go to the scheduled instance page.
Find the HTTP trigger node instance that you want to trigger and view the execution results.
Legacy Data Studio example
If you want to trigger node execution in the legacy Data Studio environment of Tenant B from Tenant A, refer to the following instructions.
Tenant B environment: Create an HTTP trigger node workflow
Create a workflow in the workspace of Tenant B that will be triggered remotely. The workflow must contain an HTTP trigger node that receives the trigger command and a downstream business node (a shell node is used as an example in this topic) so that you can verify the cross-tenant triggering effect.
Create an HTTP trigger node. For more information, see Create an HTTP trigger node.
Create a shell node. For more information, see Create a shell node.
Enter the following content on the node editing page.
echo "DataWorks";
On the right side of the shell node editing page, configure RUN Attribute and Resource Group for Scheduling in the schedule settings. Then click Save to save the node configuration.
Deploy the workflow to Operation Center.
Record the HTTP instance parameter information.
Because the HTTP trigger node supports only T+1 generation of scheduled instances, the instance is available the day after deployment. On that day, click the icon in the upper-right corner of the workflow to go to Operation Center and view the HTTP instance parameter information on the scheduled instance page.
Find the HTTP trigger node instance that you created and record the Task ID, Scheduled Time, and Business Date.
Note: Hover over the name of the HTTP trigger node instance to view the Task ID of the instance.
Local environment: Prepare code
Add POM dependencies.
You can go to the RunTriggerNode debugging page and view the complete SDK installation information on the SDK Example tab.
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>3.3.0</version>
            <configuration>
                <archive>
                    <manifest>
                        <!-- Replace this with your main class -->
                        <mainClass>com.example.demo.CrossTenantTriggerNode</mainClass>
                    </manifest>
                </archive>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
<dependencies>
    <dependency>
        <groupId>com.aliyun</groupId>
        <artifactId>dataworks_public20200518</artifactId>
        <version>7.0.1</version>
    </dependency>
    <dependency>
        <groupId>com.aliyun</groupId>
        <artifactId>tea-openapi</artifactId>
        <version>0.3.8</version>
    </dependency>
    <dependency>
        <groupId>com.aliyun</groupId>
        <artifactId>tea-console</artifactId>
        <version>0.0.1</version>
    </dependency>
    <dependency>
        <groupId>com.aliyun</groupId>
        <artifactId>tea-util</artifactId>
        <version>0.2.23</version>
    </dependency>
    <dependency>
        <groupId>com.aliyun</groupId>
        <artifactId>credentials-java</artifactId>
        <version>1.0.1</version>
    </dependency>
</dependencies>
Important: After you finish developing the code, replace the mainClass parameter with the main class name of the Java code that you created. The main class name uses the format of the full package name followed by the class name. For example, com.example.demo.CrossTenantTriggerNode.
Develop the code.
package com.example.demo;

import com.aliyun.dataworks_public20200518.Client;
import com.aliyun.dataworks_public20200518.models.RunTriggerNodeRequest;
import com.aliyun.dataworks_public20200518.models.RunTriggerNodeResponse;
import com.aliyun.teautil.models.RuntimeOptions;

import java.text.SimpleDateFormat;

public class CrossTenantTriggerNode {

    // Creates an Alibaba Cloud DataWorks client.
    public static Client createClient20200518(String accessId, String accessKey, String endpoint) throws Exception {
        com.aliyun.teaopenapi.models.Config config = new com.aliyun.teaopenapi.models.Config();
        config.setAccessKeyId(accessId);       // AccessKey ID
        config.setAccessKeySecret(accessKey);  // AccessKey Secret
        config.setEndpoint(endpoint);          // Service endpoint
        return new Client(config);
    }

    // Triggers a DataWorks node run.
    public static RunTriggerNodeResponse runTriggerNode(Client client, Long nodeId, Long cycleTime, Long bizDate, Long appId) throws Exception {
        RunTriggerNodeRequest request = new RunTriggerNodeRequest();
        request.setNodeId(nodeId);             // ID of the node to trigger
        request.setCycleTime(cycleTime);       // Scheduled time (millisecond timestamp)
        request.setBizDate(bizDate);           // Business date (millisecond timestamp)
        request.setAppId(appId);               // Workspace ID
        RuntimeOptions runtime = new RuntimeOptions();
        return client.runTriggerNodeWithOptions(request, runtime);
    }

    public static void main(String[] args) throws Exception {
        // Expect three arguments: nodeId, cycTimeParam, and bizTimeParam.
        if (args.length < 3) {
            System.out.println("Usage: java -jar test-1.0-SNAPSHOT.jar nodeId cycTimeParam bizTimeParam");
            System.exit(1);
        }
        Long nodeId = Long.valueOf(args[0]);
        String cycTimeStr = args[1];
        String bizTimeParam = args[2];

        // Parse the scheduled time into a millisecond timestamp (uses the JVM default time zone).
        SimpleDateFormat sdft = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        Long cycTime = sdft.parse(cycTimeStr).getTime();

        // Parse the business date into a millisecond timestamp (only the date part is read).
        SimpleDateFormat sdfti = new SimpleDateFormat("yyyy-MM-dd");
        Long bizTime = sdfti.parse(bizTimeParam).getTime();

        System.out.println("Scheduled timestamp: " + cycTime);
        System.out.println("Business date timestamp: " + bizTime);

        // Configure Alibaba Cloud service parameters.
        String endpoint = "dataworks.cn-hangzhou.aliyuncs.com"; // Service endpoint
        String accessId = "xxx";                                // AccessKey ID
        String accessKey = "xxx";                               // AccessKey Secret
        Long appId = Long.valueOf("xxx");                       // Workspace ID (replace xxx with the numeric value)

        // Create the client, send the trigger command, and print the response as JSON.
        Client client = createClient20200518(accessId, accessKey, endpoint);
        RunTriggerNodeResponse response = runTriggerNode(client, nodeId, cycTime, bizTime, appId);
        System.out.println(com.aliyun.teautil.Common.toJSONString(com.aliyun.teautil.Common.toMap(response)));
    }
}
Replace the endpoint, accessId, accessKey, and appId parameters in the preceding code with the actual values.
Parameter | Description
endpoint | The endpoint of the workspace where the target HTTP trigger node resides. For more information, see Alibaba Cloud OpenAPI Developer Portal endpoints.
accessId | The AccessKey ID of the Alibaba Cloud account that owns the workspace where the target HTTP trigger node resides. Log on to the DataWorks console, hover over the profile icon in the upper-right corner of the top navigation bar, and then go to AccessKey to obtain the AccessKey ID and AccessKey Secret. Warning: The AccessKey pair (AK) of an Alibaba Cloud root account has full permissions. If the AK is leaked, severe security risks may occur. We recommend that you use the AK of a RAM user that has only the workspace administrator role in the target workspace.
accessKey | The AccessKey Secret of the Alibaba Cloud account that owns the workspace where the target HTTP trigger node resides.
appId | The ID of the workspace where the target HTTP trigger node resides. You can view the workspace ID in Administration.
Package the preceding code into a JAR file with the jar-with-dependencies.jar suffix.
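Argument parsing can also live in a small helper that rejects a missing or non-numeric value with a usage message before any API call is made. The following is a sketch under that assumption. The TriggerArgs class is a hypothetical name, and the usage text simply mirrors the three arguments that the sample main method expects.

```java
// Hypothetical helper; not part of the DataWorks SDK.
final class TriggerArgs {
    static final String USAGE =
            "Usage: java -jar test-1.0-SNAPSHOT.jar nodeId cycTimeParam bizTimeParam";

    final long nodeId;
    final String cycleTime;
    final String bizDate;

    private TriggerArgs(long nodeId, String cycleTime, String bizDate) {
        this.nodeId = nodeId;
        this.cycleTime = cycleTime;
        this.bizDate = bizDate;
    }

    // Validates the argument count and the numeric node ID.
    static TriggerArgs parse(String[] args) {
        if (args == null || args.length != 3) {
            throw new IllegalArgumentException(USAGE);
        }
        long nodeId;
        try {
            nodeId = Long.parseLong(args[0].trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("nodeId must be a number. " + USAGE, e);
        }
        return new TriggerArgs(nodeId, args[1], args[2]);
    }
}
```

A main method could call TriggerArgs.parse(args) first and pass the resulting fields on to the timestamp conversion and the client call.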
Tenant A environment: Configure a shell node
Follow these steps to configure a shell node in the workspace of Tenant A to trigger the HTTP trigger node in the workspace of Tenant B.
Create and upload a JAR resource.
Upload the JAR file that you packaged in the Local environment: Prepare code step as a MaxCompute resource. For more information, see Create and use resources.
Develop the triggering shell node.
Create a shell node and reference a MaxCompute resource in the shell node. Complete the trigger code execution parameters in the shell node. The following sample code is provided:
##@resource_reference{"http_node_work.jar"}
java -jar http_node_work.jar nodeId "cycleTime" "bizDate"
Parameter | Description
http_node_work.jar | The name of the resource that you referenced.
nodeId | The Task ID of the HTTP trigger node that you recorded in the Tenant B environment: Create an HTTP trigger node workflow step.
cycleTime | The Scheduled Time of the trigger node task that you recorded in the Tenant B environment: Create an HTTP trigger node workflow step. The time format is yyyy-MM-dd HH:mm:ss.
bizDate | The Business Date of the trigger node task that you recorded in the Tenant B environment: Create an HTTP trigger node workflow step. The time format is yyyy-MM-dd HH:mm:ss. The sample code reads only the yyyy-MM-dd date part of this value.
Configure the triggering shell node.
On the right side of the shell node editing page, in Scheduling Settings, select the serverless resource group associated with the workspace as the resource group.
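The sample code parses bizDate with the yyyy-MM-dd pattern, so any time-of-day part in the argument is ignored. The following sketch makes that behavior explicit and accepts both the date-only form and the full yyyy-MM-dd HH:mm:ss form. The BizDate class is a hypothetical helper, and the time zone argument is an assumption that you must match to your own environment.

```java
import java.time.LocalDate;
import java.time.ZoneId;

// Hypothetical helper; not part of the DataWorks SDK.
final class BizDate {
    // Converts a business date to epoch milliseconds at the start of that day.
    // Accepts "yyyy-MM-dd" or "yyyy-MM-dd HH:mm:ss"; the time-of-day part is dropped.
    static long toEpochMillis(String text, String zoneId) {
        String trimmed = text.trim();
        if (trimmed.length() < 10) {
            throw new IllegalArgumentException("Expected yyyy-MM-dd, got: " + text);
        }
        LocalDate date = LocalDate.parse(trimmed.substring(0, 10));
        return date.atStartOfDay(ZoneId.of(zoneId)).toInstant().toEpochMilli();
    }
}
```

With this helper, "2025-01-01" and "2025-01-01 00:00:00" produce the same timestamp for a given zone.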
Run and view results
Run the shell node in Tenant A to trigger the HTTP trigger node and its downstream tasks in Tenant B.
Run the triggering shell node.
Click the run icon in the toolbar above the shell node that you configured in the Tenant A environment: Configure a shell node step to run the node task.
View the execution results.
Go to the Tenant B environment and follow the steps below to view the execution results.
Log on to the DataWorks console. In the target region, open Operation Center from the left-side navigation pane: select a workspace from the drop-down list and click Go to Operation Center.
In the left-side navigation pane of Operation Center, go to the scheduled instance page.
Find the HTTP trigger node instance that you want to trigger and view the execution results.