Dataphin: Configure scheduling dependencies for offline tasks

Last Updated: Jan 19, 2026

Dataphin uses scheduling dependency configurations to run nodes in a business flow in a specific order. This ensures that business data is produced effectively and on time. This topic describes how to configure scheduling dependencies and the main principles for configuration.

Background information

A scheduling dependency is an upstream or downstream relationship between nodes. In Dataphin, a descendant task node runs only after its ancestor task nodes run successfully. You can configure scheduling dependencies to ensure that scheduled tasks retrieve the correct data at runtime. When an ancestor node runs successfully, Dataphin detects that the latest data is available in the ancestor table. The descendant node can then retrieve the data. This prevents errors caused by a descendant node trying to retrieve data before it is ready.
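The ordering rule above can be illustrated with a minimal Python sketch. It is not Dataphin's scheduler: the node names, the `dependencies` mapping, and the `runnable_nodes` helper are all hypothetical, and the sketch only models the rule that a descendant may run once every ancestor has succeeded.

```python
from graphlib import TopologicalSorter

# Hypothetical node names; each key maps a node to the ancestors it depends on.
dependencies = {
    "ods_orders": set(),
    "dwd_orders": {"ods_orders"},
    "dws_daily_sales": {"dwd_orders"},
    "ads_sales_report": {"dws_daily_sales", "dwd_orders"},
}


def runnable_nodes(dependencies, succeeded):
    """Return nodes that have not run yet and whose ancestors have all succeeded."""
    return sorted(
        node
        for node, ancestors in dependencies.items()
        if node not in succeeded and ancestors <= succeeded
    )


# One execution order that respects every upstream dependency.
run_order = list(TopologicalSorter(dependencies).static_order())
```

Calling `runnable_nodes(dependencies, {"ods_orders"})` returns only `dwd_orders`, because the other two nodes still wait on ancestors that have not succeeded.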

Procedure

  1. On the Dataphin home page, choose Develop > Data Development from the top menu bar.

  2. On the Develop page, select a Project from the top menu bar. In Dev-Prod mode, you must also select an environment.

  3. In the navigation pane on the left, choose Data Processing > Script Task.

  4. In the list of computing tasks, click the target computing task to open its tab.

  5. Click Property in the sidebar on the right to open the Property panel. In the Schedule Dependency section, configure the following parameters.

    1. Upstream Dependency

      • Auto Parse

        • If the node is a SQL task, click Auto Parse. Dataphin automatically parses the task code to obtain the upstream tasks and output tables. After parsing, all detected dependency tables are added to the upstream dependency list. You can view, edit, or delete the parsed dependency tables. For more information, see Auto-parsing process.

        Note
        • If an auto-parsed input table has multiple output tasks, all output tasks are set as upstream dependencies by default.

        • The dependency cycle for all parsed dependency tables is set to Current Cycle by default.

        • If the code references a project variable or does not specify a project, the system resolves it to the production project name by default. This ensures the stability of the generated schedule. For example, if the development project name is onedata_dev:

          • If the code specifies select * from s_order, the schedule parses the dependency as onedata.s_order.

          • If the code specifies select * from ${onedata}.s_order, the schedule parses the dependency as onedata.s_order.

          • If the code specifies select * from onedata.s_order, the schedule parses the dependency as onedata.s_order.

          • If the code specifies select * from onedata_dev.s_order, the schedule parses the dependency as onedata_dev.s_order.

      • Add Root Vertex

        If a task has no upstream dependencies, click Add Root Vertex to set the root vertex as the upstream dependency.

        Note

        Each tenant or enterprise has a virtual root vertex that is created during initialization. Its name starts with virtual_root_node.

      • Add Previous Cycle of This Node

        Adds a self-dependency: each instance of this node runs only after the node completes successfully in the previous cycle, such as the previous day or the previous N hours.

      • Add Dependency

        If Auto Parse fails to parse the scheduling dependencies, or the upstream dependencies that it generates do not match your actual requirements, click +Add Dependency to add upstream dependencies for the node manually.

        Important
        • When you add a dependency, the Dependency Cycle and Dependency Policy for physical nodes and logical table nodes automatically use the recommended settings. To make changes, click the edit icon for a dependency to modify its Dependency Cycle and Dependency Policy.

          • Dependency Cycle: The time range for the scheduled runtime of the upstream task instance. Usually, this is the current day, which is the range [00:00–24:00).

          • Dependency Policy: A dependency cycle may contain multiple instances, which requires you to specify a dependency policy. If there is only one instance, you can set the dependency policy to any option. However, to remain compatible with potential changes to the scheduling settings of an upstream task, only relative path policies are supported.

        • For the default policy for cross-cycle dependencies, see Appendix 2: Default policy for cross-cycle dependencies.

        • Add physical node dependency

          1. Click Add Dependency and select Physical Node.

          2. In the Add Dependency - Physical Node dialog box, select one or more nodes. You can filter the target nodes by project, node type, node name, or output table name.

          3. Click OK.

        • Add logical table node

          1. Click Add Dependency and select Logical Table Node.

          2. In the Add Dependency - Logical Table Node dialog box, select one or more nodes. You can filter the target nodes by logical table type, business category, or logical table name.

          3. (Optional) In the node list, click the icon in the Dependency Fields column of the target node to view the fields of the logical table.

          4. Click OK.

    2. Node Outputs

      The system automatically generates an output name for the node. To add more output names, click Auto-generate Output Name.

      Important

      The system uses output names to build the scheduling dependency graph and generates them automatically. Do not change this setting manually.

  6. Click OK to complete the scheduling dependency configuration.
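The project-name resolution described in the Auto Parse note of step 5 can be sketched in Python as follows. This is an illustration of the four documented examples only, not Dataphin's parser: the `resolve_dependency` function and its regular expression are hypothetical, and the sketch assumes that every project variable resolves to the production project.

```python
import re

# Matches an optional project qualifier followed by a table name, for example
# "s_order", "onedata.s_order", or "${onedata}.s_order".
_TABLE_REF = re.compile(r"^(?:(?P<project>\$\{\w+\}|\w+)\.)?(?P<table>\w+)$")


def resolve_dependency(table_ref, prod_project):
    """Return the project-qualified table name used for the dependency."""
    match = _TABLE_REF.match(table_ref.strip())
    if match is None:
        raise ValueError(f"unsupported table reference: {table_ref!r}")
    project = match.group("project")
    # Assumption: a missing project or a project variable falls back to the
    # production project; an explicit project name is kept unchanged.
    if project is None or project.startswith("${"):
        project = prod_project
    return f"{project}.{match.group('table')}"
```

With `prod_project="onedata"`, the references `s_order`, `${onedata}.s_order`, and `onedata.s_order` all resolve to `onedata.s_order`, while `onedata_dev.s_order` stays `onedata_dev.s_order`.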

Preview dependency cycles and policies

  1. Click Property for the target offline computing task. In the Property panel, go to the Schedule Dependency section.

  2. In the Upstream Dependency list in the Schedule Dependency section, click the icon in the Actions column of the target dependency.

  3. In the Edit Dependency dialog box, view information such as the dependent node name, dependency cycle, dependency policy, and node dependency cycle preview.

    • Dependency Cycle: The time range for the scheduled runtime of the upstream task instance. Usually, this is the current day, which is the range [00:00–24:00).

    • Dependency Policy: A dependency cycle may contain multiple instances, which requires you to specify a dependency policy. If there is only one instance, you can set the dependency policy to any option. However, to remain compatible with potential changes to the scheduling settings of an upstream task, only relative path policies are supported.

    • Node Dependency Cycle Preview: View the list of instances for the current node and the selected ancestor node for a specified data timestamp.

      The preview contains the following sections:

      ① Instance list of the selected ancestor node

      • Data Timestamp: Determined by the Dependency Cycle and the selected Data Timestamp of the current node.

        • If the dependency cycle is Current Cycle (the current day), the data timestamp is the same as the data timestamp of the current node.

        • If the dependency cycle is Previous Cycle (T-1), the data timestamp is the current node's data timestamp minus 1 day.

        • If the dependency cycle is Previous N days, the data timestamp is the current node's data timestamp minus N days.

        • If the dependency cycle is Last 24 Hours and the included instances span two data timestamps, the data timestamp is displayed as {yyyy-MM-dd ~ yyyy-MM-dd}.

      • Instance List: Shows the total number of instances for the selected ancestor node on the specified data timestamp.

        • If the total number of instances for the data timestamp is 5 or fewer, the instance list displays all instances.

        • If the total number of instances is greater than 5, the list is collapsed. Click Expand All to view all instances. While the list is collapsed:

          • If the dependent instance is the first or the last instance in the ancestor node list, the list displays only the first instance and the last instance.

          • If the dependent instance is neither the first nor the last instance, the list displays the first instance, the last instance, and the dependent instance.

        • Instances are displayed in the format Instance n ({Instance Scheduled Time}), where n increments starting from 1.

      ② Instance list of the current node

      The total number of instances for the current node on the selected data timestamp.

      If the total number of instances for the selected data timestamp is 5 or less, the instance list displays all instances. If the total is greater than 5, the list displays only the first instance and the last instance. You can click Expand All to view all instances. The first instance (Instance 1) is selected by default. Click an instance to select it.

      Instances are displayed in the format Instance n ({Instance Scheduled Time}), where n increments starting from 1.

      ③ Line connecting the selected instance on the right to its dependent instance on the left

      • If the Dependency Policy is set to First Instance, Last Instance, Nearest Subsequent Instance, or Nearest Preceding Instance, a single line connects the dependent ancestor instance to the selected current instance.

      • If the Dependency Policy is All Instances, all instances in the ancestor node list are selected. The connecting line indicates that all instances in the ancestor node list are dependencies of the selected instance in the current node list.
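The way a dependency policy narrows the ancestor instances in a dependency cycle down to the ones that a selected instance depends on can be sketched as follows. This is a hedged illustration, not Dataphin's implementation: the `select_dependencies` function is hypothetical, and the exact boundary rules for Nearest Preceding Instance and Nearest Subsequent Instance (at or before, at or after the current scheduled time) are assumptions.

```python
from datetime import datetime


def select_dependencies(ancestor_times, current_time, policy):
    """Pick the ancestor instances that one current-node instance depends on.

    ancestor_times: scheduled times of the ancestor instances that fall
    inside the dependency cycle.
    current_time: scheduled time of the selected current-node instance.
    """
    times = sorted(ancestor_times)
    if not times:
        return []
    if policy == "All Instances":
        return times
    if policy == "First Instance":
        return [times[0]]
    if policy == "Last Instance":
        return [times[-1]]
    if policy == "Nearest Preceding Instance":
        # Assumption: the latest ancestor instance at or before current_time.
        earlier = [t for t in times if t <= current_time]
        return [earlier[-1]] if earlier else []
    if policy == "Nearest Subsequent Instance":
        # Assumption: the earliest ancestor instance at or after current_time.
        later = [t for t in times if t >= current_time]
        return [later[0]] if later else []
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical example: the ancestor runs at 00:00, 06:00, 12:00, and 18:00,
# and the selected current-node instance is scheduled for 08:00.
day = datetime(2026, 1, 19)
ancestors = [day.replace(hour=h) for h in (0, 6, 12, 18)]
current = day.replace(hour=8)
```

In this example, Nearest Preceding Instance selects the 06:00 instance, Nearest Subsequent Instance selects the 12:00 instance, and All Instances selects all four, which corresponds to the single line versus multiple lines drawn in the preview.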

Appendix 1: Initial default dependency cycle and policy

| Current node scheduling cycle | Ancestor node scheduling cycle | Ancestor node has self-dependency | Default dependency cycle | Default dependency policy |
| --- | --- | --- | --- | --- |
| Day/Week/Month | Day | Yes/No | Current Cycle (Day) | Last Instance |
| Day/Week/Month | Hour/Minute | No | Current Cycle (Day) | All Instances |
| Day/Week/Month | Hour/Minute | Yes | Current Cycle (Day) | Last Instance |
| Month/Week/Day/Hour/Minute | Month/Week | Yes | Current Cycle (Day) | Last Instance |
| Month/Week/Day/Hour/Minute | Month/Week | No | Current Cycle (Day) | Last Instance |
| Hour/Minute | Day | Yes/No | Current Cycle (Day) | Last Instance |
| Hour/Minute | Hour/Minute | Yes/No | Current Cycle (Day) | Last Instance |
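The defaults in Appendix 1 reduce to a single exception, which the following Python sketch encodes. The `default_dependency` function is a hypothetical helper written for illustration and is not part of Dataphin; it assumes that scheduling cycles are passed as the strings Month, Week, Day, Hour, or Minute.

```python
VALID_CYCLES = {"Month", "Week", "Day", "Hour", "Minute"}


def default_dependency(current_cycle, ancestor_cycle, ancestor_self_dependent):
    """Return (default dependency cycle, default dependency policy)."""
    if current_cycle not in VALID_CYCLES or ancestor_cycle not in VALID_CYCLES:
        raise ValueError("unknown scheduling cycle")
    policy = "Last Instance"
    # The only combination in the table that defaults to All Instances:
    # a Day/Week/Month node that depends on an Hour/Minute ancestor
    # without a self-dependency.
    if (
        current_cycle in {"Day", "Week", "Month"}
        and ancestor_cycle in {"Hour", "Minute"}
        and not ancestor_self_dependent
    ):
        policy = "All Instances"
    return ("Current Cycle (Day)", policy)
```

For example, `default_dependency("Day", "Hour", False)` returns `("Current Cycle (Day)", "All Instances")`, whereas the same call with `True` returns `("Current Cycle (Day)", "Last Instance")`.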

Appendix 2: Default policy for cross-cycle dependencies

In the following table, - indicates Not applicable.

| Current node scheduling cycle | Ancestor node | Ancestor node scheduling cycle | Ancestor node has self-dependency | Default dependency cycle |
| --- | --- | --- | --- | --- |
| Month | Current node (self-dependency) | - | - | Previous Cycle (Previous Day) |
| Week | Current node (self-dependency) | - | - | Previous Cycle (Previous Day) |
| Day | Current node (self-dependency) | - | - | Previous Cycle (Previous Day) |
| Hour | Current node (self-dependency) | - | - | Last 24 Hours |
| Minute | Current node (self-dependency) | - | - | Last 24 Hours |
| Day/Week/Month | Not the current node | Day | - | Current Cycle (Day) |
| Day/Week/Month | Not the current node | Hour/Minute | No | Current Cycle (Day) |
| Day/Week/Month | Not the current node | Hour/Minute | Yes | Current Cycle (Day) |
| Month/Week/Day/Hour/Minute | Not the current node | Month/Week | Yes | Current Cycle (Day) |
| Month/Week/Day/Hour/Minute | Not the current node | Month | No | Current Cycle (Day) |
| Month/Week/Day/Hour/Minute | Not the current node | Week | No | Current Cycle (Day) |
| Hour/Minute | Not the current node | Day | - | Current Cycle (Day) |
| Hour/Minute | Not the current node | Hour/Minute | - | Current Cycle (Day) |
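Read row by row, Appendix 2 depends only on whether the dependency is a self-dependency and on the scheduling cycle of the current node. The Python sketch below encodes that reading. The `default_cross_cycle` function is hypothetical and exists only to illustrate the table; it assumes that scheduling cycles are passed as the strings Month, Week, Day, Hour, or Minute.

```python
def default_cross_cycle(current_cycle, is_self_dependency):
    """Return the default dependency cycle listed in Appendix 2."""
    if current_cycle not in {"Month", "Week", "Day", "Hour", "Minute"}:
        raise ValueError(f"unknown scheduling cycle: {current_cycle}")
    if not is_self_dependency:
        # Every row whose ancestor is not the current node uses the same default.
        return "Current Cycle (Day)"
    if current_cycle in {"Hour", "Minute"}:
        return "Last 24 Hours"
    return "Previous Cycle (Previous Day)"
```

For example, `default_cross_cycle("Hour", True)` returns `"Last 24 Hours"`, and `default_cross_cycle("Week", True)` returns `"Previous Cycle (Previous Day)"`.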