Dataphin uses scheduling dependencies to manage the execution order of logical table and logical fact table task nodes, ensuring the timely and efficient processing of business data. This topic explains how to set up these dependencies.
Procedure
On the Dataphin home page, choose Development from the top menu bar, and then select Data Development.
Access the Scan Configuration information page by following the steps below.
Choose Project (Dev-Prod mode requires selecting the environment) -> Click Logical Table -> Select the logical table task requiring scheduling configuration -> Click Scan Configuration.
In the Scheduling Properties section, set up the upstream dependencies for the logical table.
Upstream Dependencies
Auto Parsing
Click Auto Parsing to let Dataphin automatically identify upstream tasks and output tables based on the logical table's computation logic. The system will add all identified dependencies to the upstream dependency list, where you can review details or edit and delete entries.
Note: By default, all output tasks of the automatically parsed input table are set as upstream dependencies.
The default dependency cycle for all identified tables is set to This Cycle.
Add Root Node
If the task has no upstream dependency, click Add Root Node to set the root node as the upstream dependency of the current task.
Note: Each tenant or enterprise has a virtual root node, named virtual_root_node, created during initialization.
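The effect of the root node on execution order can be sketched in a few lines of Python. This is only an illustration of the idea, not Dataphin's implementation: the task names and the attach_root helper are hypothetical, and only the name virtual_root_node comes from the note above. Tasks with no upstream dependency are attached to the root, and a topological sort then yields a valid run order.

```python
from graphlib import TopologicalSorter

ROOT = "virtual_root_node"  # name taken from the note above


def attach_root(upstream):
    """Return a copy of the dependency map in which every task that has
    no upstream dependency depends on the virtual root node.
    (Hypothetical helper, for illustration only.)"""
    patched = {task: set(deps) for task, deps in upstream.items()}
    for deps in patched.values():
        if not deps:
            deps.add(ROOT)
    patched[ROOT] = set()  # the root itself has no upstream
    return patched


# Hypothetical task names: each task maps to its upstream tasks.
upstream = {
    "ods_orders": [],
    "dim_customer": [],
    "fct_order": ["ods_orders", "dim_customer"],
}

graph = attach_root(upstream)
# static_order() lists every task after all of its upstream tasks.
order = list(TopologicalSorter(graph).static_order())
print(order)
```

In this sketch the root always comes first and fct_order last; the relative order of the two middle tasks is not fixed because neither depends on the other.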
Add This Node's Previous Cycle
This option makes the node task depend on the successful execution of this same node in the previous cycle, such as the previous day or a specified number of previous hours.
Add Dependency
If Auto Parsing cannot parse the scheduling dependencies, or the upstream dependency configuration it generates does not match the actual application, click +Add Dependency to add the node's Upstream Dependency manually.
Important: When you add dependencies, the system automatically applies the recommended settings for the Dependency Cycle and Dependency Policy of physical nodes and logical table nodes. To change these settings, click the edit icon in the dependency list to edit the Dependency Cycle and Dependency Policy of an individual dependency.
Dependency Cycle: Defines the range of scheduled run times (start times) within which upstream task instances are depended on, typically the current day, [00:00, 24:00).
Dependency Policy: Specifies the policy for handling multiple instances within a dependency cycle. When only one instance exists, any policy can be set. To accommodate potential scheduling changes in upstream tasks, only the relative path policy is supported.
For the default policy on cross-cycle dependencies, refer to Appendix: Default Policy for Cross-Cycle Dependencies.
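As a rough illustration of how a dependency cycle selects upstream instances, the following Python sketch filters scheduled start times by a half-open daily window. The function names, the offset_days parameter, and the sample times are assumptions made for this example; they are not part of Dataphin. The Dependency Policy would then decide how the selected instances are handled.

```python
from datetime import datetime, timedelta


def day_window(scheduled_at, offset_days=0):
    """Half-open window [00:00, 24:00) of the day that is `offset_days`
    away from the downstream instance's scheduled day
    (0 = current day, -1 = previous day)."""
    midnight = datetime(scheduled_at.year, scheduled_at.month, scheduled_at.day)
    start = midnight + timedelta(days=offset_days)
    return start, start + timedelta(days=1)


def instances_in_cycle(upstream_times, scheduled_at, offset_days=0):
    """Upstream instances whose scheduled start time falls in the window."""
    start, end = day_window(scheduled_at, offset_days)
    return [t for t in upstream_times if start <= t < end]


# Hypothetical scheduled start times of an hourly upstream task.
upstream_times = [
    datetime(2024, 5, 1, 22), datetime(2024, 5, 1, 23),
    datetime(2024, 5, 2, 0), datetime(2024, 5, 2, 1),
    datetime(2024, 5, 3, 0),
]
scheduled_at = datetime(2024, 5, 2, 2)  # downstream instance

this_cycle = instances_in_cycle(upstream_times, scheduled_at)
previous_day = instances_in_cycle(upstream_times, scheduled_at, offset_days=-1)
print(this_cycle)
print(previous_day)
```

Because the window is half-open, the instance scheduled at 00:00 on the following day belongs to the next day's cycle, not the current one.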
Add Physical Node Dependency
Area | Description |
① Search and Filter Area | Use the search and filter area to locate the required Physical Table Node based on filter conditions such as This Project, Project, Node Type, and the Node Name or Output Table Name. |
② Node List | The node list displays the Physical Nodes available for dependency, allowing you to select as needed. |
Add Logical Table Node
Area | Description |
① Search and Filter Area | Filter the Logical Table Node in the search and filter area using criteria such as Logical Table Type, Belonging Section, and the Logical Table Name. |
② Node List | The node list displays Logical Table Nodes available for dependency, allowing you to select based on your requirements. |
To depend on specific fields within a logical table rather than the entire table, click the Dependent Fields column in the node list. Then click the icon to display the available fields of the logical table, and select the fields you need.
This Node's Output
The system automatically assigns an output name for the created node. To add multiple output names, click Auto Generate Output Name.
Important: The system automatically generates the output name and uses it to construct the scheduling dependency graph. Manually changing this setting is not recommended.
To finalize the scheduling dependency setup, click OK.
Appendix: Default Policy for Cross-Cycle Dependencies
This Node's Scheduling Cycle | Upstream Node | Upstream Node's Scheduling Cycle | Is Upstream Node Self-Dependent | Default Dependency Cycle |
Month | This Node (Self-Dependent) | - | | Previous Cycle (Previous 1 Day) |
Week | This Node (Self-Dependent) | - | | Previous Cycle (Previous 1 Day) |
Day | This Node (Self-Dependent) | - | | Previous Cycle (Previous 1 Day) |
Hour | This Node (Self-Dependent) | - | | Last 24 Hours |
Minute | This Node (Self-Dependent) | - | | Last 24 Hours |
Day/Week/Month | Non-This Node | Day | | This Cycle (Today) |
Day/Week/Month | Non-This Node | Hour/Minute | No | This Cycle (Today) |
Day/Week/Month | Non-This Node | Hour/Minute | Yes | This Cycle (Today) |
Month/Week/Day/Hour/Minute | Non-This Node | Month/Week | Yes | This Cycle (Today) |
Month/Week/Day/Hour/Minute | Non-This Node | Month | No | This Cycle (Today) |
Month/Week/Day/Hour/Minute | Non-This Node | Week | No | This Cycle (Today) |
Hour/Minute | Non-This Node | Day | | This Cycle (Today) |
Hour/Minute | Non-This Node | Hour/Minute | | This Cycle (Today) |
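The table above can be read as a simple lookup. The following Python sketch transcribes the rows exactly as they are listed here; the function name and its parameters are hypothetical and do not correspond to any Dataphin API. In the rows as listed, only self-dependencies get a default other than This Cycle (Today).

```python
# Defaults for self-dependencies, keyed by this node's scheduling cycle.
# Values are copied from the "This Node (Self-Dependent)" rows of the table.
SELF_DEPENDENT_DEFAULTS = {
    "Month": "Previous Cycle (Previous 1 Day)",
    "Week": "Previous Cycle (Previous 1 Day)",
    "Day": "Previous Cycle (Previous 1 Day)",
    "Hour": "Last 24 Hours",
    "Minute": "Last 24 Hours",
}

# Every "Non-This Node" row in the table lists the same default.
OTHER_NODE_DEFAULT = "This Cycle (Today)"


def default_dependency_cycle(node_cycle, upstream_is_this_node):
    """Look up the default dependency cycle for one upstream dependency.
    (Hypothetical helper that mirrors the table, for illustration only.)"""
    if node_cycle not in SELF_DEPENDENT_DEFAULTS:
        raise ValueError(f"unknown scheduling cycle: {node_cycle!r}")
    if upstream_is_this_node:
        return SELF_DEPENDENT_DEFAULTS[node_cycle]
    return OTHER_NODE_DEFAULT


print(default_dependency_cycle("Day", upstream_is_this_node=True))
print(default_dependency_cycle("Hour", upstream_is_this_node=True))
print(default_dependency_cycle("Week", upstream_is_this_node=False))
```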