Dataphin: Configure Scheduling Dependencies for Offline Tasks

Last Updated: Mar 05, 2026

Dataphin runs nodes in your business workflows according to the scheduling dependency configurations that you set for each node. This process ensures that business data is generated correctly and on time. This topic describes how scheduling dependencies work and the key principles for their configuration.

Background information

A scheduling dependency defines an upstream-downstream relationship between nodes. In Dataphin, a downstream task node starts only after its upstream task node completes successfully. By configuring scheduling dependencies, you ensure that downstream tasks run with the correct data. When an upstream node runs successfully, Dataphin detects that the latest data is available in the upstream table. The downstream node can then read that data. This process prevents errors that can occur if a downstream node attempts to read data before the upstream table is populated.
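The ordering rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not Dataphin's scheduler: the function `run_workflow`, the node names, and the status labels are hypothetical, and the sketch assumes a downstream node is simply blocked when any upstream node does not succeed.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(upstreams, run_node):
    """Run nodes in dependency order.

    upstreams: dict mapping a node name to the set of its upstream node names.
    run_node:  callable(name) -> bool, returning True when the node succeeds.
    Returns a dict mapping each node name to "success", "failed", or "blocked".
    """
    status = {}
    # static_order() yields every node after all of its upstream nodes.
    for node in TopologicalSorter(upstreams).static_order():
        deps = upstreams.get(node, ())
        if all(status.get(dep) == "success" for dep in deps):
            status[node] = "success" if run_node(node) else "failed"
        else:
            # At least one upstream node failed or was blocked itself.
            status[node] = "blocked"
    return status


# Hypothetical three-node chain: ods_order -> dwd_order -> ads_report
graph = {"ods_order": set(), "dwd_order": {"ods_order"}, "ads_report": {"dwd_order"}}
print(run_workflow(graph, lambda name: name != "dwd_order"))
```

In this example, `dwd_order` fails, so `ads_report` never runs and cannot read a table that was not populated.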

Procedure

  1. On the Dataphin homepage, in the top menu bar, choose Develop > Data Development.

  2. On the Develop page, select a project from the top menu bar. In Dev-Prod mode, select an environment as well.

  3. In the left navigation pane, select Data Processing > Compute Tasks.

  4. In the list of computing tasks, click the target computing task to open its configuration tab.

  5. In the right sidebar, click Property to open the Property panel. In the Schedule Dependency section, configure the following parameters.

    1. Upstream dependency

      • Auto-parse

        • If the node type is SQL, you can click Auto-parse. Dataphin then parses the code to detect upstream tasks and output tables. Dataphin adds all parsed dependencies to the upstream dependency list. You can view, edit, or delete any parsed dependency table.

        Note
        • If an automatically parsed input table has multiple producing tasks, all of these tasks are treated as upstream dependencies by default.

        • For all parsed dependencies, the default dependency cycle is This cycle.

        • If your code references a project variable or does not specify a project, Dataphin uses the production project name by default to ensure stable scheduling. For example, if your development project name is onedata_dev:

          • If your code contains select * from s_order, Dataphin resolves the dependency as onedata.s_order.

          • If your code contains select * from ${onedata}.s_order, Dataphin resolves the dependency as onedata.s_order.

          • If your code contains select * from onedata.s_order, Dataphin resolves the dependency as onedata.s_order.

          • If your code contains select * from onedata_dev.s_order, Dataphin resolves the dependency as onedata_dev.s_order.

      • Add root node

        If your task has no upstream dependency, you can click Add root node to set the root node as the upstream dependency for the task.

        Note

        Each tenant or enterprise has a virtual root node named virtual_root_node that is created upon initialization.

      • Add previous cycle of this node

        This option makes the task wait for its previous cycle instance, such as the instance from the previous day or n hours earlier, to complete successfully before the current instance runs.

      • Add dependency

        If Auto-parse cannot detect the scheduling dependencies, or if the upstream dependencies that Auto-parse generates do not meet your requirements, you can click + Add Dependency to manually add upstream dependencies for the node.

        Important
        • When you add a dependency, the system automatically applies default settings for Dependency cycle and Dependency policy. To change these settings, click the edit icon next to a dependency in the list to edit its Dependency cycle and Dependency policy.

          • Dependency cycle: The scheduled runtime window (trigger time) for upstream task instances. By default, this is today: [00:00–24:00).

          • Dependency policy: Some dependency cycles may have multiple instances. You must specify a policy to select an instance. If only one instance exists, you can choose any policy. To ensure compatibility with potential future changes to upstream task scheduling, only relative path policies are supported.

        • For information about default policies for cross-cycle dependencies, see Appendix 2: Default policies for cross-cycle dependencies.

        • Add dependency – Physical node

          You can select one or more physical nodes from the node list. You can filter the list by This project, Project, Node type, Node name, or Output table name.

        • Add dependency – Logical table node

          You can select one or more logical table nodes from the node list and search for the nodes by logical table type, business segment, and logical table name.

          To depend on specific fields in a logical table rather than the entire table, click the icon in the Dependency Fields column of the node list to view the available table fields, and then select the fields that you need.

        • Add dependency – Cross-tenant node

          You can select one or more cross-tenant nodes from the node list. You can filter the list by Tenant, Node type, or Node name.

    2. This node output

      Dataphin automatically generates an output name for the node. To add more output names, you can click Auto-generate output name.

      Important

      Dataphin uses output names to build the scheduling dependency graph. These names are auto-generated. Do not manually set them.

  6. Click OK to finish configuring the scheduling dependencies.
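The project-name rules that Auto-parse applies (step 5, Upstream dependency, Auto-parse) can be written as a small function. The sketch below only mirrors the four documented examples. The function name `resolve_dependency` and the `${...}` matching pattern are assumptions made for illustration, and the sketch assumes that any project variable resolves to the production project name.

```python
import re


def resolve_dependency(table_ref, prod_project):
    """Return the 'project.table' name that the documented examples describe.

    table_ref:    table reference as written in the SQL code.
    prod_project: production project name (for example, "onedata").
    """
    if "." not in table_ref:
        # No project specified: fall back to the production project.
        return f"{prod_project}.{table_ref}"
    project, table = table_ref.split(".", 1)
    if re.fullmatch(r"\$\{\w+\}", project):
        # Project variable such as ${onedata}: assumed to resolve to production.
        return f"{prod_project}.{table}"
    # An explicitly named project is kept as written.
    return f"{project}.{table}"


for ref in ["s_order", "${onedata}.s_order", "onedata.s_order", "onedata_dev.s_order"]:
    print(ref, "->", resolve_dependency(ref, "onedata"))
```

Only an explicit reference to the development project (`onedata_dev.s_order`) keeps the development name. Every other form resolves to the production project.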

Preview dependency cycles and policies

  1. For the target offline computing task, click Property. In the Property panel, navigate to the Schedule Dependency section.

  2. In the Schedule Dependency section, find the target dependency in the Upstream Dependency list and click the edit icon in the Actions column.

  3. In the Edit Dependency dialog box, you can view the node name, dependency cycle, dependency policy, and a preview of the node dependency cycle.

    • Dependency cycle: The scheduled runtime window (trigger time) for upstream task instances. By default, this is today: [00:00–24:00).

    • Dependency policy: Some dependency cycles may have multiple instances. You must specify a policy to select an instance. If only one instance exists, you can choose any policy. To ensure compatibility with potential future changes to upstream task scheduling, only relative path policies are supported.

    • Node dependency cycle preview: This section displays the instance list for the current node and the selected upstream node for a given data timestamp.

      Block

      Description

      ① Instance list for the selected upstream node

      • Data timestamp: This is determined by the Dependency cycle and the Data timestamp of the current node.

        • If the dependency cycle is This cycle (today), the data timestamp is the same as the data timestamp of the current node.

        • If the dependency cycle is Previous cycle (previous day), the data timestamp is the current node's data timestamp - 1 day.

        • If the dependency cycle is Previous N days, the data timestamp is the current node's data timestamp - N days.

        • If the dependency cycle is Last 24 hours and the instances span two data timestamps, the data timestamp is displayed as {yyyy-MM-dd ~ yyyy-MM-dd}.

      • Instance list: This shows the total number of instances for the selected upstream node on the data timestamp.

        • If the total number of instances is 5 or less, the list displays all instances.

        • If the total number of instances is greater than 5, the list is collapsed. You can click Expand all to view all instances. When the list is collapsed, it shows the following instances:

          • If the upstream instance (left list) that the selected instance of the current node (right list) depends on is the first or last instance in the left list, the left list displays the first instance and the last instance.

          • If that upstream instance is neither the first nor the last instance in the left list, the left list displays the first instance, the dependency instance, and the last instance.

        • Instances are displayed in order as Instance n ({scheduled trigger time}), where n starts at 1 and increases.

      ② Instance list for this node

      This shows the total number of instances for this node on the selected data timestamp.

      If the total number of instances is 5 or less, the list displays all instances. If the total is greater than 5, the list displays only the first and last instances. You can click Expand all to view all instances. By default, the first instance (Instance 1) is selected. You can click another instance to switch the selection.

      Instances are displayed in order as Instance n ({scheduled trigger time}), where n starts at 1 and increases.

      ③ Line connecting the selected instance on the right to instances on the left

      • If the Dependency policy is First instance, Last instance, Nearest later instance, or Nearest earlier instance, a line connects the selected instance of the current node to a single instance of the upstream node.

      • If the Dependency policy is All instances, all instances of the upstream node are selected. The connecting line indicates that all upstream instances are dependencies for the selected instance of the current node.
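How a dependency policy narrows the upstream instances in one dependency cycle to the instances that the current instance waits for can be sketched as follows. The policy names come from this topic, but the function `select_upstream` is hypothetical. The sketch also assumes that "nearest earlier" and "nearest later" are measured against the scheduled trigger time of the current instance, and that an empty result is returned when no such instance exists. Dataphin's actual behavior in those edge cases is not described here.

```python
from datetime import datetime


def select_upstream(upstream_times, current_time, policy):
    """Pick upstream instances (by scheduled trigger time) within one dependency cycle.

    upstream_times: trigger times of the upstream instances in the cycle.
    current_time:   trigger time of the selected instance of the current node.
    policy:         one of the dependency policy names used in this topic.
    """
    times = sorted(upstream_times)
    if not times:
        return []
    if policy == "All instances":
        return times
    if policy == "First instance":
        return [times[0]]
    if policy == "Last instance":
        return [times[-1]]
    if policy == "Nearest earlier instance":
        # Assumption: latest upstream instance at or before the current trigger time.
        earlier = [t for t in times if t <= current_time]
        return [earlier[-1]] if earlier else []
    if policy == "Nearest later instance":
        # Assumption: earliest upstream instance at or after the current trigger time.
        later = [t for t in times if t >= current_time]
        return [later[0]] if later else []
    raise ValueError(f"unknown dependency policy: {policy}")


# Hypothetical upstream node that runs every 6 hours; current instance at 13:00.
upstream = [datetime(2026, 3, 5, hour) for hour in (0, 6, 12, 18)]
current = datetime(2026, 3, 5, 13)
for name in ("First instance", "Last instance", "Nearest earlier instance", "All instances"):
    print(name, "->", select_upstream(upstream, current, name))
```

Every policy except All instances returns at most one instance, which matches the single connecting line shown in the preview.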

Appendix 1: Default dependency cycles and policies

| Scheduling cycle for this node | Scheduling cycle for upstream node | Does upstream node self-depend? | Default dependency cycle | Default dependency policy |
| --- | --- | --- | --- | --- |
| Day/Week/Month | Day | Yes/No | This cycle (today) | Last instance |
| Day/Week/Month | Hour/Minute | No | This cycle (today) | All instances |
| Day/Week/Month | Hour/Minute | Yes | This cycle (today) | Last instance |
| Month/Week/Day/Hour/Minute | Month/Week | Yes | This cycle (today) | Last instance |
| Month/Week/Day/Hour/Minute | Month/Week | No | This cycle (today) | Last instance |
| Hour/Minute | Day | Yes/No | This cycle (today) | Last instance |
| Hour/Minute | Hour/Minute | Yes/No | This cycle (today) | Last instance |
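The defaults in Appendix 1 reduce to one exception: a Day, Week, or Month node that depends on an Hour or Minute upstream node without self-dependency defaults to All instances. Every other combination defaults to Last instance. The lookup below is a compact restatement of the table. The function name `default_dependency` and the lowercase cycle names are illustrative assumptions.

```python
CYCLES = {"month", "week", "day", "hour", "minute"}


def default_dependency(this_cycle, upstream_cycle, upstream_self_depends):
    """Return (default dependency cycle, default dependency policy) per Appendix 1."""
    if this_cycle not in CYCLES or upstream_cycle not in CYCLES:
        raise ValueError("cycle must be one of: " + ", ".join(sorted(CYCLES)))
    coarse_downstream = this_cycle in {"day", "week", "month"}
    fine_upstream = upstream_cycle in {"hour", "minute"}
    if coarse_downstream and fine_upstream and not upstream_self_depends:
        policy = "All instances"
    else:
        policy = "Last instance"
    # Every row of the table uses the same default dependency cycle.
    return ("This cycle (today)", policy)


print(default_dependency("day", "hour", upstream_self_depends=False))
print(default_dependency("day", "hour", upstream_self_depends=True))
```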

Appendix 2: Default policies for cross-cycle dependencies

In the table below, - means Not applicable.

| Scheduling cycle for this node | Upstream node | Scheduling cycle for upstream node | Does upstream node self-depend? | Default dependency cycle |
| --- | --- | --- | --- | --- |
| Month | This node (self-dependency) | - | - | Previous cycle (1 day ago) |
| Week | This node (self-dependency) | - | - | Previous cycle (1 day ago) |
| Day | This node (self-dependency) | - | - | Previous cycle (1 day ago) |
| Hour | This node (self-dependency) | - | - | Last 24 hours |
| Minute | This node (self-dependency) | - | - | Last 24 hours |
| Day/Week/Month | Not this node | Day | - | This cycle (today) |
| Day/Week/Month | Not this node | Hour/Minute | No | This cycle (today) |
| Day/Week/Month | Not this node | Hour/Minute | Yes | This cycle (today) |
| Month/Week/Day/Hour/Minute | Not this node | Month/Week | Yes | This cycle (today) |
| Month/Week/Day/Hour/Minute | Not this node | Month | No | This cycle (today) |
| Month/Week/Day/Hour/Minute | Not this node | Week | No | This cycle (today) |
| Hour/Minute | Not this node | Day | - | This cycle (today) |
| Hour/Minute | Not this node | Hour/Minute | - | This cycle (today) |
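Read as a rule, Appendix 2 says that the default dependency cycle depends only on whether the dependency is a self-dependency and, if it is, on the scheduling cycle of the node. The sketch below restates that rule. The function name `default_cross_cycle` and the lowercase cycle names are illustrative assumptions.

```python
def default_cross_cycle(this_cycle, is_self_dependency):
    """Return the default dependency cycle listed in Appendix 2."""
    if this_cycle not in {"month", "week", "day", "hour", "minute"}:
        raise ValueError(f"unknown scheduling cycle: {this_cycle}")
    if not is_self_dependency:
        # Every "Not this node" row defaults to the current cycle.
        return "This cycle (today)"
    if this_cycle in {"month", "week", "day"}:
        return "Previous cycle (1 day ago)"
    # Hour and Minute nodes that depend on themselves.
    return "Last 24 hours"


print(default_cross_cycle("day", is_self_dependency=True))
print(default_cross_cycle("hour", is_self_dependency=True))
print(default_cross_cycle("hour", is_self_dependency=False))
```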