Dataphin uses scheduling dependency configurations to run nodes in a business flow in a specific order. This ensures that business data is produced effectively and on time. This topic describes how to configure scheduling dependencies and the main principles for configuration.
Background information
A scheduling dependency is an upstream or downstream relationship between nodes. In Dataphin, a descendant task node runs only after its ancestor task nodes run successfully. You can configure scheduling dependencies to ensure that scheduled tasks retrieve the correct data at runtime. When an ancestor node runs successfully, Dataphin detects that the latest data is available in the ancestor table. The descendant node can then retrieve the data. This prevents errors caused by a descendant node trying to retrieve data before it is ready.
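The ordering rule above, where a descendant runs only after all of its ancestors succeed, is a topological ordering of the dependency graph. The following is a minimal sketch of that idea and not Dataphin's scheduler. The node names and the `run_order` helper are hypothetical and exist only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical nodes: each key maps to the set of ancestor nodes it depends on.
dependencies = {
    "ods_order": set(),
    "dwd_order": {"ods_order"},
    "dws_order_daily": {"dwd_order"},
    "ads_report": {"dws_order_daily", "dwd_order"},
}

def run_order(deps):
    """Return one valid order in which every ancestor precedes its descendants."""
    return list(TopologicalSorter(deps).static_order())

order = run_order(dependencies)
print(order)
```

Any order the function returns places `ods_order` before `dwd_order`, and `dwd_order` before both of its descendants.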
Procedure
On the Dataphin home page, choose Develop > Data Development from the top menu bar.
On the Develop page, select a Project from the top menu bar. In Dev-Prod mode, you must also select an environment.
In the navigation pane on the left, choose Data Processing > Script Task.
In the list of computing tasks, click the target computing task to open its tab.
Click Property in the sidebar on the right to open the Property panel. In the Schedule Dependency section, configure the following parameters.
Upstream Dependency
Auto Parse
If the node is a SQL task, click Auto Parse. Dataphin automatically parses the task code to obtain the upstream tasks and output tables. After parsing, all detected dependency tables are added to the upstream dependency list. You can view, edit, or delete the parsed dependency tables. For more information, see Auto-parsing process.
Note: If an auto-parsed input table has multiple output tasks, all output tasks are set as upstream dependencies by default.
The dependency cycle for all parsed dependency tables is set to Current Cycle by default.
If the code references a project variable or does not specify a project, the system resolves it to the production project name by default. This ensures the stability of the generated schedule. For example, if the development project name is onedata_dev:
If the code specifies select * from s_order, the schedule parses the dependency as onedata.s_order.
If the code specifies select * from ${onedata}.s_order, the schedule parses the dependency as onedata.s_order.
If the code specifies select * from onedata.s_order, the schedule parses the dependency as onedata.s_order.
If the code specifies select * from onedata_dev.s_order, the schedule parses the dependency as onedata_dev.s_order.
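The four resolution cases above can be sketched as a small function. This is an illustration only, not Dataphin's parser. The function name `resolve_dependency`, the `variables` mapping, and the assumption that the production project name is the development name without the `_dev` suffix are all hypothetical and chosen to match the onedata example.

```python
import re

def resolve_dependency(table_ref: str, dev_project: str, variables: dict) -> str:
    """Resolve a table reference to the project-qualified dependency name."""
    # Assumption: the production project is the dev project without "_dev".
    prod_project = dev_project.removesuffix("_dev")
    # Replace project variables such as ${onedata} with their production value.
    ref = re.sub(r"\$\{(\w+)\}", lambda m: variables[m.group(1)], table_ref)
    if "." not in ref:
        # No project specified: default to the production project.
        return f"{prod_project}.{ref}"
    # An explicitly named project, including the dev project, is kept as written.
    return ref

# Assumption: the variable "onedata" resolves to the production project name.
variables = {"onedata": "onedata"}
for ref in ["s_order", "${onedata}.s_order", "onedata.s_order", "onedata_dev.s_order"]:
    print(ref, "->", resolve_dependency(ref, "onedata_dev", variables))
```

Only the last reference, which names the development project explicitly, stays in onedata_dev.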
Add Root Vertex
If a task has no upstream dependencies, click Add Root Vertex to set the root vertex as the upstream dependency.
Note: Each tenant or enterprise has a virtual root vertex that is created during initialization. Its name starts with virtual_root_node.
Add Previous Cycle of This Node
Makes the node depend on its own successful run in the previous cycle, such as the previous day or the previous N hours.
Add Dependency
If Auto Parse fails to parse scheduling dependencies, or if the upstream dependencies that Auto Parse generates do not match the actual application, click +Add Dependency to add upstream dependencies for the node manually.
Important: When you add a dependency, the Dependency Cycle and Dependency Policy for physical nodes and logical table nodes automatically use the recommended settings. To make changes, click the edit icon for a dependency to modify its Dependency Cycle and Dependency Policy.
Dependency Cycle: The time range for the scheduled runtime of the upstream task instance. Usually, this is the current day, which is the range [00:00–24:00).
Dependency Policy: A dependency cycle may contain multiple instances, which requires you to specify a dependency policy. If there is only one instance, you can set the dependency policy to any option. However, to remain compatible with potential changes to the scheduling settings of an upstream task, only relative path policies are supported.
For the default policy for cross-cycle dependencies, see Appendix 2: Default policy for cross-cycle dependencies.
Add physical node dependency
Click Add Dependency and select Physical Node.

In the Add Dependency - Physical Node dialog box, select one or more nodes. You can filter the target nodes by project, node type, node name, or output table name.
Click OK.
Add logical table node
Click Add Dependency and select Logical Table Node.
In the Add Dependency - Logical Table Node dialog box, select one or more nodes. You can filter the target nodes by logical table type, business category, or logical table name.
(Optional) In the node list, click the icon in the Dependency Fields column of the target node to view the fields of the logical table.
Click OK.
Node Outputs
The system automatically generates an output name for the node. To add more output names, click Auto-generate Output Name.
Important: The system uses output names to build the scheduling dependency graph and generates them automatically. Do not change this setting manually.
Click OK to complete the scheduling dependency configuration.
Preview dependency cycles and policies
Click Property for the target offline computing task. In the Property panel, go to the Schedule Dependency section.
In the Upstream Dependency list in the Schedule Dependency section, click the icon in the Actions column of the target dependency.
In the Edit Dependency dialog box, view information such as the dependent node name, dependency cycle, dependency policy, and node dependency cycle preview.
Dependency Cycle: The time range for the scheduled runtime of the upstream task instance. Usually, this is the current day, which is the range [00:00–24:00).
Dependency Policy: A dependency cycle may contain multiple instances, which requires you to specify a dependency policy. If there is only one instance, you can set the dependency policy to any option. However, to remain compatible with potential changes to the scheduling settings of an upstream task, only relative path policies are supported.
Node Dependency Cycle Preview: View the list of instances for the current node and the selected ancestor node for a specified data timestamp.

Section
Description
① Instance list of the selected ancestor node
Data Timestamp: Determined by the Dependency Cycle and the selected Data Timestamp of the current node.
If the dependency cycle is Current Cycle (the current day), the data timestamp is the same as the data timestamp of the current node.
If the dependency cycle is Previous Cycle (T-1), the data timestamp is the current node's data timestamp minus 1 day.
If the dependency cycle is Previous N Days, the data timestamp is the current node's data timestamp minus N days.
If the dependency cycle is Last 24 Hours and the included instances span two data timestamps, the data timestamp is displayed as {yyyy-MM-dd ~ yyyy-MM-dd}.
Instance List: Shows the total number of instances for the selected ancestor node on the specified data timestamp.
If the total number of instances is 5 or less for the data timestamp, the instance list displays all instances.
If the total number of instances is greater than 5, the list is collapsed and you can click Expand All to view all instances. While the list is collapsed:
If the dependent instance is the first instance or the last instance, the list displays only the first instance and the last instance.
If the dependent instance is neither the first nor the last instance, the list displays the first instance, the last instance, and the dependent instance.
Instances are displayed in the format Instance n ({Instance Scheduled Time}), where n increments starting from 1.
② Instance list of the current node
The total number of instances for the current node on the selected data timestamp.
If the total number of instances for the selected data timestamp is 5 or less, the instance list displays all instances. If the total is greater than 5, the list displays only the first instance and the last instance. You can click Expand All to view all instances. The first instance (Instance 1) is selected by default. Click an instance to select it.
Instances are displayed in the format Instance n ({Instance Scheduled Time}), where n increments starting from 1.
③ Line connecting the selected instance on the right to its dependent instance on the left
If the Dependency Policy is set to First Instance, Last Instance, Nearest Subsequent Instance, or Nearest Preceding Instance, a single line connects the dependent ancestor instance to the selected current instance.
If the Dependency Policy is All Instances, all instances in the ancestor node list are selected. The connecting line indicates that all instances in the ancestor node list are dependencies of the selected instance in the current node list.
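The preview logic described above can be approximated in a few lines. This is a sketch under stated assumptions, not Dataphin's implementation. The function names are hypothetical, and the source does not define the two "Nearest" policies precisely: here, Nearest Preceding Instance is assumed to mean the latest ancestor instance scheduled at or before the current instance, and Nearest Subsequent Instance the earliest one scheduled at or after it.

```python
from datetime import date, datetime, timedelta

def ancestor_data_timestamp(current: date, days_back: int = 0) -> date:
    """Ancestor data timestamp: 0 = Current Cycle, 1 = Previous Cycle (T-1), N = Previous N Days."""
    return current - timedelta(days=days_back)

def select_instances(ancestor_times, current_time, policy):
    """Return the ancestor scheduled times that the current instance depends on."""
    times = sorted(ancestor_times)
    if policy == "All Instances":
        return times
    if policy == "First Instance":
        return times[:1]
    if policy == "Last Instance":
        return times[-1:]
    if policy == "Nearest Preceding Instance":
        # Assumption: latest ancestor instance scheduled at or before the current one.
        return [t for t in times if t <= current_time][-1:]
    if policy == "Nearest Subsequent Instance":
        # Assumption: earliest ancestor instance scheduled at or after the current one.
        return [t for t in times if t >= current_time][:1]
    raise ValueError(f"unknown policy: {policy}")

# Hypothetical ancestor that runs every 6 hours; the current instance runs at 13:00.
ancestor = [datetime(2024, 1, 1, h) for h in (0, 6, 12, 18)]
current = datetime(2024, 1, 1, 13)
for policy in ("First Instance", "Last Instance", "Nearest Preceding Instance",
               "Nearest Subsequent Instance", "All Instances"):
    picked = select_instances(ancestor, current, policy)
    print(policy, "->", [t.strftime("%H:%M") for t in picked])
```

Each single-instance policy returns a list with at most one element, which corresponds to the single connecting line in the preview. All Instances returns every ancestor instance.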
Appendix 1: Initial default dependency cycle and policy
Current node scheduling cycle | Ancestor node scheduling cycle | Ancestor node has self-dependency | Default dependency cycle | Default dependency policy |
Day/Week/Month | Day | Yes/No | Current Cycle (Day) | Last Instance |
Day/Week/Month | Hour/Minute | No | Current Cycle (Day) | All Instances |
Day/Week/Month | Hour/Minute | Yes | Current Cycle (Day) | Last Instance |
Month/Week/Day/Hour/Minute | Month/Week | Yes | Current Cycle (Day) | Last Instance |
Month/Week/Day/Hour/Minute | Month/Week | No | Current Cycle (Day) | Last Instance |
Hour/Minute | Day | Yes/No | Current Cycle (Day) | Last Instance |
Hour/Minute | Hour/Minute | Yes/No | Current Cycle (Day) | Last Instance |
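Read as a rule, the table above has a single exception: the default policy is All Instances only when a Day, Week, or Month node depends on an Hour or Minute ancestor that has no self-dependency. The following sketch encodes the table as a lookup. The function name `default_dependency` is hypothetical and the code is an illustration of the table, not a Dataphin API.

```python
CYCLES = {"Month", "Week", "Day", "Hour", "Minute"}

def default_dependency(current_cycle: str, ancestor_cycle: str,
                       ancestor_self_dependent: bool) -> tuple:
    """Return (default dependency cycle, default dependency policy) per Appendix 1."""
    if current_cycle not in CYCLES or ancestor_cycle not in CYCLES:
        raise ValueError("scheduling cycle must be Month, Week, Day, Hour, or Minute")
    policy = "Last Instance"
    if (current_cycle in {"Day", "Week", "Month"}
            and ancestor_cycle in {"Hour", "Minute"}
            and not ancestor_self_dependent):
        # The only row in the table whose default policy is All Instances.
        policy = "All Instances"
    # Every row in the table uses the same default dependency cycle.
    return ("Current Cycle (Day)", policy)

print(default_dependency("Day", "Hour", False))
print(default_dependency("Day", "Hour", True))
```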
Appendix 2: Default policy for cross-cycle dependencies
In the following table, - indicates Not applicable.
Current node scheduling cycle | Ancestor node | Ancestor node scheduling cycle | Ancestor node has self-dependency | Default dependency cycle |
Month | Current node (self-dependency) | - | - | Previous Cycle (Previous Day) |
Week | Current node (self-dependency) | - | - | Previous Cycle (Previous Day) |
Day | Current node (self-dependency) | - | - | Previous Cycle (Previous Day) |
Hour | Current node (self-dependency) | - | - | Last 24 Hours |
Minute | Current node (self-dependency) | - | - | Last 24 Hours |
Day/Week/Month | Not the current node | Day | - | Current Cycle (Day) |
Day/Week/Month | Not the current node | Hour/Minute | No | Current Cycle (Day) |
Day/Week/Month | Not the current node | Hour/Minute | Yes | Current Cycle (Day) |
Month/Week/Day/Hour/Minute | Not the current node | Month/Week | Yes | Current Cycle (Day) |
Month/Week/Day/Hour/Minute | Not the current node | Month | No | Current Cycle (Day) |
Month/Week/Day/Hour/Minute | Not the current node | Week | No | Current Cycle (Day) |
Hour/Minute | Not the current node | Day | - | Current Cycle (Day) |
Hour/Minute | Not the current node | Hour/Minute | - | Current Cycle (Day) |
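The table above reduces to two questions: whether the dependency is a self-dependency, and if so, whether the node runs on an Hour or Minute cycle. The following sketch expresses that reading. The function name `default_cross_cycle` is hypothetical and the code only mirrors the table.

```python
def default_cross_cycle(current_cycle: str, is_self_dependency: bool) -> str:
    """Return the default dependency cycle per Appendix 2."""
    if current_cycle not in {"Month", "Week", "Day", "Hour", "Minute"}:
        raise ValueError("scheduling cycle must be Month, Week, Day, Hour, or Minute")
    if not is_self_dependency:
        # Every row for an ancestor other than the current node uses the current cycle.
        return "Current Cycle (Day)"
    if current_cycle in {"Hour", "Minute"}:
        return "Last 24 Hours"
    return "Previous Cycle (Previous Day)"

print(default_cross_cycle("Day", True))
print(default_cross_cycle("Hour", True))
print(default_cross_cycle("Hour", False))
```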