To run offline integration tasks periodically, define their scheduling properties, including the scheduling cycle, dependencies, and parameters. This topic describes how to configure these properties and schedule offline tasks.
Notes
The system only supports scheduling configuration for offline integration tasks with a scheduling type of recurring task node.
Dependencies are semantic links between nodes, where the status of an upstream node influences the running status of downstream nodes.
Dependent nodes are scheduled under two rules: a downstream node can be scheduled only after its upstream node has finished running, and whether it then runs is decided by the node's own preset scheduling time.
A scheduling configuration submitted before the preset scheduling time takes effect at that time. Dependencies set after the preset scheduling time generate instances only after a one-day delay.
A task's scheduling configuration is used solely to define its properties during scheduling. The task must be deployed to the production environment before it can be scheduled according to this configuration.
The scheduling time specifies only the intended execution time of the task. However, the actual execution time depends on the execution status of upstream dependencies. For detailed information on task execution conditions, see instance running diagnostics.
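The two scheduling rules in the notes above can be illustrated with a minimal sketch. It is not Dataphin's implementation: the `Node` class, its fields, and the `can_run` function are hypothetical names chosen for illustration, and the sketch assumes a node becomes runnable once every upstream node has finished and its own scheduling time has arrived.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Node:
    """Hypothetical stand-in for a scheduled task node."""
    name: str
    scheduled_time: datetime                      # intended execution time
    upstream: list["Node"] = field(default_factory=list)
    finished: bool = False                        # set once the node has finished running


def can_run(node: Node, now: datetime) -> bool:
    """Return True only if both scheduling rules are satisfied."""
    upstream_done = all(u.finished for u in node.upstream)  # rule 1: upstream finished
    time_reached = now >= node.scheduled_time               # rule 2: scheduling time reached
    return upstream_done and time_reached


# Usage: the downstream node is due at 02:00 but waits for its upstream node.
extract = Node("extract_orders", datetime(2024, 1, 1, 0, 0))
load = Node("load_orders", datetime(2024, 1, 1, 2, 0), upstream=[extract])

before = can_run(load, datetime(2024, 1, 1, 3, 0))  # upstream not finished yet
extract.finished = True
after = can_run(load, datetime(2024, 1, 1, 3, 0))   # both rules now satisfied
early = can_run(load, datetime(2024, 1, 1, 1, 0))   # scheduling time not reached
```

In this sketch the downstream node actually starts at 03:00 rather than at its scheduled 02:00, which mirrors the note that the scheduling time is only the intended execution time.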
Open the offline integration task properties
On the Dataphin home page, select Development > Data Integration from the top menu bar.
In the top menu bar of the Integration page, select Project.
In the left-side navigation pane, select Integration > Batch Pipeline. Then, in the Batch Pipeline list, click the desired task name.
In the task tab, click Attribute on the right to open the Attribute panel.
Configure offline task properties
On the offline task properties page, configure the basic information and scheduling-related properties as outlined in the table below.
Configuration Item | Description |
Basic information | Includes the task name, ID, node type, development owner, operations owner, and description. |
Scheduling properties | Determines how the task is scheduled to recur in the production environment. |
Scheduling dependencies | Defines the task's upstream and downstream dependencies so that nodes are executed in sequence. Descendant nodes run after ancestor nodes complete, which helps valid business data to be generated on time. Dependencies can be set automatically or manually. |
Run configuration | Specifies the task's timeout settings and its rerun policy in case of failure. This prevents resource waste caused by prolonged task execution and improves task reliability. |
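To make the interaction between a timeout setting and a rerun policy concrete, the following minimal sketch models the decision a scheduler might take for one task instance. It is an illustration under stated assumptions, not Dataphin's actual behavior: `RunPolicy`, `next_action`, the field names, and the status strings are all hypothetical, and the sketch assumes that a timed-out run is treated like a failed run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunPolicy:
    """Hypothetical run configuration for a task."""
    timeout_seconds: float  # maximum allowed running time
    max_reruns: int         # automatic reruns allowed after a failure


def next_action(policy: RunPolicy, status: str,
                elapsed_seconds: float, reruns_used: int) -> str:
    """Decide what to do with a task instance.

    status is one of "running", "succeeded", or "failed" (assumed values).
    """
    if status == "succeeded":
        return "done"
    timed_out = elapsed_seconds > policy.timeout_seconds
    if status == "running" and not timed_out:
        return "wait"
    # The run failed or exceeded its timeout (assumed to count as a failure).
    if reruns_used < policy.max_reruns:
        return "rerun"
    return "fail"


# Usage: a one-hour timeout with up to two automatic reruns.
policy = RunPolicy(timeout_seconds=3600, max_reruns=2)
still_running = next_action(policy, "running", 120, reruns_used=0)
over_time = next_action(policy, "running", 4000, reruns_used=0)
exhausted = next_action(policy, "failed", 10, reruns_used=2)
```

Capping both the running time and the number of reruns is what keeps a stuck task from holding resources indefinitely while still allowing transient failures to recover.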