DataWorks provides an HTTP trigger node that lets an external environment, such as a local system or another tenant's environment, trigger the node and its downstream nodes in a workflow by calling an OpenAPI operation. Use it to start tasks from your local system or to handle cross-tenant task dependencies.
Product overview
An HTTP trigger node is a special virtual node that lets you use the DataWorks OpenAPI operation TriggerSchedulerTaskInstance to trigger scheduling for this node and its downstream nodes.
Triggering mechanism
An HTTP trigger node triggers its downstream nodes only after its upstream nodes succeed and an external system sends a scheduling command. For a diagram illustrating the use of the HTTP trigger node, see Usage diagram.
Usage diagram
The HTTP trigger node is commonly used for communication between external environments and the DataWorks scheduling system.
Diagram description:
In Data Studio, create a workflow that contains an HTTP trigger node, configure the dependencies for each node, and deploy the workflow to Operation Center.
The system automatically generates scheduled instances based on the scheduling time. You can obtain information about the HTTP trigger node, such as its task ID and trigger time, from these instances.
Use this information in Java or Python code, or on the API debugging page, to call the OpenAPI operation and trigger the node. To trigger a specific instance and its subsequent workflow, set the TriggerTime parameter to a fixed value. To dynamically trigger all instances, set TriggerTime to a dynamic variable. After the HTTP trigger node receives and validates the trigger command, it executes the downstream nodes in sequence.
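The difference between a fixed and a dynamic TriggerTime can be sketched in Python as follows. This is a minimal illustration, not the SDK call itself. It assumes that TriggerTime is passed as an epoch timestamp in milliseconds, that scheduled times are interpreted in UTC+8, and that the request carries a field named TaskId. Confirm all three against the TriggerSchedulerTaskInstance API reference. The helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: scheduled times are interpreted in UTC+8. Adjust for your workspace.
SCHEDULE_TZ = timezone(timedelta(hours=8))


def to_trigger_time_ms(scheduled: datetime) -> int:
    """Convert an instance's scheduled time to epoch milliseconds (assumed format)."""
    if scheduled.tzinfo is None:
        scheduled = scheduled.replace(tzinfo=SCHEDULE_TZ)
    return int(scheduled.timestamp() * 1000)


def fixed_trigger_time(scheduled_text: str) -> int:
    """Fixed value: targets exactly one instance, e.g. '2024-05-01 00:10:00'."""
    parsed = datetime.strptime(scheduled_text, "%Y-%m-%d %H:%M:%S")
    return to_trigger_time_ms(parsed)


def dynamic_trigger_time(run_date: datetime, hour: int, minute: int) -> int:
    """Dynamic value: recomputed on each run from the run date and the node's daily schedule."""
    local = run_date.astimezone(SCHEDULE_TZ)
    scheduled = local.replace(hour=hour, minute=minute, second=0, microsecond=0)
    return to_trigger_time_ms(scheduled)


def build_request(task_id: int, trigger_time_ms: int) -> dict:
    """Assemble the request parameters. The key names are assumptions."""
    return {"TaskId": task_id, "TriggerTime": trigger_time_ms}
```

A script that runs every day would call dynamic_trigger_time with the current date, whereas a one-off trigger would pass the recorded scheduled time to fixed_trigger_time.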
Trigger conditions
An HTTP trigger node can be triggered only if the following conditions are met:
A scheduled instance of the HTTP trigger node must exist. You can find this instance on the Cycle Examples page in Operation Center. Before the instance is successfully triggered by the TriggerSchedulerTaskInstance API, it remains in the Waiting for Trigger state. Its downstream nodes remain blocked until the TriggerSchedulerTaskInstance API is successfully called to trigger the HTTP trigger node, at which point the downstream nodes run in sequence.
All upstream nodes that the HTTP trigger node depends on have run successfully, meaning their instances are in a successful state.
The scheduled time for the scheduled instance of the HTTP trigger node has arrived.
The scheduling resource group used by the HTTP trigger node has sufficient resources at the time of the trigger.
The HTTP trigger node is not in a frozen state.
Only HTTP trigger node instances in the Waiting for Trigger state can be triggered. Instances that have already been successfully triggered cannot be triggered again.
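The conditions above can be expressed as a simple pre-flight check that an external script might run before it sends the trigger command. This is a sketch only. The TriggerNodeInstance class, its fields, and the status label WAITING_FOR_TRIGGER are hypothetical and do not correspond to any DataWorks SDK type.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TriggerNodeInstance:
    """Hypothetical snapshot of an HTTP trigger node instance."""
    status: str                   # assumed label, e.g. "WAITING_FOR_TRIGGER"
    upstream_succeeded: bool      # all upstream instances are in a successful state
    scheduled_time_reached: bool  # the scheduled time has arrived
    resources_available: bool     # the scheduling resource group has capacity
    frozen: bool                  # the node is frozen


def blocking_reasons(instance: Optional[TriggerNodeInstance]) -> List[str]:
    """Return the unmet trigger conditions. An empty list means the instance can be triggered."""
    if instance is None:
        return ["no scheduled instance exists"]
    reasons = []
    if instance.status != "WAITING_FOR_TRIGGER":
        reasons.append("instance is not in the Waiting for Trigger state")
    if not instance.upstream_succeeded:
        reasons.append("upstream nodes have not all succeeded")
    if not instance.scheduled_time_reached:
        reasons.append("scheduled time has not arrived")
    if not instance.resources_available:
        reasons.append("scheduling resource group lacks resources")
    if instance.frozen:
        reasons.append("node is frozen")
    return reasons
```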
Common scenarios
External file-driven scheduling: Use an external script (such as Python) to read date or condition information from files such as Excel or CSV files. When the trigger conditions are met, call the DataWorks OpenAPI operation TriggerSchedulerTaskInstance to trigger the HTTP trigger node and implement automated scheduling based on external data.
Cross-tenant task dependency: When task dependencies exist between DataWorks instances of different tenants, you can use the HTTP trigger node to trigger tasks across tenants. For more information, see Use the HTTP trigger node to trigger node execution across tenants.
If you use DataWorks Professional Edition or a higher edition, you can also use the Check node to implement similar external condition checking.
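As an illustration of the external file-driven scenario, the following sketch reads a CSV file and decides whether to send the trigger. The column names run_date and ready are assumptions made for this example, and the trigger argument is a placeholder for whatever code calls TriggerSchedulerTaskInstance in your environment.

```python
import csv
import io
from datetime import date
from typing import Callable


def should_trigger(csv_text: str, today: date) -> bool:
    """Return True if the file marks today's date as ready (hypothetical columns)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        is_today = row["run_date"].strip() == today.isoformat()
        is_ready = row["ready"].strip().lower() == "yes"
        if is_today and is_ready:
            return True
    return False


def run_once(csv_text: str, today: date, trigger: Callable[[], None]) -> bool:
    """Invoke the trigger callback only when the file-based condition is met."""
    if should_trigger(csv_text, today):
        trigger()  # placeholder for the real OpenAPI call
        return True
    return False
```

Passing the trigger as a callback keeps the file-parsing logic testable without network access or credentials.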
Notes
An HTTP trigger node triggers its downstream nodes only after its upstream nodes have run successfully and the external environment sends a scheduling command.
If the external environment sends a scheduling command before the upstream nodes have finished running, the HTTP trigger node does not trigger the downstream nodes. The system retains the scheduling command from the external environment and triggers the downstream nodes through the HTTP trigger node after the upstream nodes finish running.
Note: The trigger command from the external environment is retained for only 24 hours. If the upstream nodes do not finish running within 24 hours, the command is discarded and no longer takes effect.
After the current HTTP trigger node instance is successfully triggered, it cannot be triggered again.
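The timing rules in these notes can be modeled with a small function. It is a simplified sketch that considers only the trigger command and upstream completion, and ignores the other trigger conditions such as the scheduled time and resource availability. Treating exactly 24 hours as still valid is an assumption, and the function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

# Retention window for an early trigger command, as stated in the notes.
RETENTION = timedelta(hours=24)


def downstream_start_time(
    command_at: Optional[datetime],
    upstream_done_at: Optional[datetime],
) -> Optional[datetime]:
    """Return when downstream nodes would start, or None if they would not start."""
    if command_at is None or upstream_done_at is None:
        return None  # both the command and upstream success are required
    if upstream_done_at <= command_at:
        return command_at  # upstream already finished, so the command takes effect on arrival
    if upstream_done_at - command_at <= RETENTION:
        return upstream_done_at  # the retained command fires once upstream finishes
    return None  # the command expired before upstream finished
```

For example, a command sent 3 hours before the upstream nodes finish still takes effect, whereas one sent 25 hours earlier has expired and must be sent again.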
Prerequisites
The RAM user that you want to use is added to your workspace.
If you want to use a RAM user to develop tasks, you must add the RAM user to your workspace as a member and assign the Develop or Workspace Administrator role to the RAM user. The Workspace Administrator role has more permissions than necessary. Exercise caution when you assign the Workspace Administrator role. For more information about how to add a member and assign roles to the member, see Add members to a workspace.
A serverless resource group is associated with your workspace. For more information, see the topics in the Use serverless resource groups directory.
Before you develop an HTTP trigger node, you must create the node. For more information, see Create an HTTP trigger node.
Limitations
The HTTP trigger node feature is available only in DataWorks Enterprise Edition and higher editions. For more information about DataWorks editions, see DataWorks editions.
The HTTP trigger node is used only to trigger tasks and cannot be used as a compute task. You must set the task nodes that you want to run as downstream nodes of the HTTP trigger node to trigger and execute the tasks.
Create an HTTP trigger node
For information about how to create a node, see Create an HTTP trigger node.
Configure the HTTP trigger node
After you create an HTTP trigger node, configure the following parameters in Scheduling Settings on the right side of the node editing page. For more information about other parameters, see Schedule settings.
Parameter | Description |
Resource Group for Scheduling | Select the serverless resource group that you have associated. |
Instance generation method | You can select T +1 generated next day or Instant generation after publishing. |
The HTTP trigger node is a zero-load node. You do not need to write any node content.
If the HTTP trigger node has no upstream node, the root node of the workflow is used as the default upstream node.
Deploy the HTTP trigger node
After you complete the schedule settings, submit and deploy the HTTP trigger node to the production environment. For more information, see Deploy nodes.
After a task is deployed, it runs periodically based on your schedule settings. You can view the deployed scheduled tasks in Operation Center and perform O&M operations on them. For more information, see Manage scheduled tasks.
Trigger configuration in external scheduling environments
When you configure trigger settings in an external scheduling environment, supply the HTTP trigger node instance parameters (see Obtain HTTP trigger node instance parameters) in Java or Python code or on the API debugging page, and call the TriggerSchedulerTaskInstance operation to trigger the HTTP trigger node.
Obtain HTTP trigger node instance parameters
Based on the instance generation method you selected when you configured the HTTP trigger node, view and record the HTTP trigger node instance parameters in Operation Center.
T+1 next day generation: You must go to Operation Center the next day to view and record the HTTP trigger node instance parameters.
Instant generation after deployment: You can go to Operation Center immediately to view and record the HTTP trigger node instance parameters.
Log on to the DataWorks console. In the target region, click in the left-side navigation pane. Select a workspace from the drop-down list and click Go to Operation Center.
In the left-side navigation pane, click to go to the Cycle Examples page.
In the list, find the HTTP trigger node instance that you created and record the Task ID and Scheduled Time of the instance.
Note: Hover over the name of the HTTP trigger node instance to view the Task ID of the instance.
Reference
DataWorks provides the HTTP trigger node feature to trigger and execute tasks in cross-tenant scenarios. For more information, see Use the HTTP trigger node to trigger node execution across tenants.