A cross-cycle dependency is the dependency of a node on its own last-cycle instance, on the last-cycle instances of its child nodes, or on the last-cycle instances of specified nodes. After you configure a cross-cycle dependency for a node, the node is run in the current cycle only if the relevant last-cycle instances are properly run.

DataWorks supports the following types of cross-cycle dependencies:
  • Dependency on last-cycle instances of child nodes
    • Node dependency: A node depends on the last-cycle instances of its child nodes. For example, node A has three child nodes B, C, and D. If you configure this type of dependency for node A, node A is run in the current cycle only if all the last-cycle instances of nodes B, C, and D are properly run.
    • Business scenario: A node is run in the current cycle only if its child nodes properly cleanse the data in the output tables of the node in the last cycle. To check whether the output data of the node is cleansed by the child nodes as expected, you can configure Data Quality to check the result tables that are generated by the child nodes.
  • Dependency on the last-cycle instance of the current node
    • Node dependency: A node depends on its last-cycle instance. The node is run in the current cycle only if the last-cycle instance of the node is properly run.
    • Business scenario: Whether a node is run in the current cycle depends on the business data that was generated by the last-cycle instance of the node. To check whether the output data of the node is cleansed as expected, you can configure Data Quality to check the result tables that are generated by the node.
  • Dependency on last-cycle instances of specified nodes: To configure this type of dependency, you must manually enter the IDs of the nodes that you want a node to depend on. You can specify multiple nodes and separate their IDs with commas (,). For example, 12345,23456.
    • Node dependency: A node depends on the last-cycle instances of specified nodes. The node is run in the current cycle only if the last-cycle instances of the specified nodes are properly run.
    • Business scenario: In the business logic, a node depends on the business data that was generated by other nodes but not processed by the node itself.
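The gating rule shared by all three dependency types can be sketched in a few lines of Python. This is an illustrative model only, not DataWorks code: the `Status` enum and the `parse_node_ids` and `can_run` functions are hypothetical names introduced here. The only detail taken from this topic is the comma-separated ID format (for example, 12345,23456).

```python
from enum import Enum


class Status(Enum):
    """Hypothetical instance states used only for this sketch."""
    SUCCESS = "success"
    FAILED = "failed"
    NOT_RUN = "not_run"


def parse_node_ids(raw: str) -> list[int]:
    """Parse a comma-separated list of node IDs such as "12345,23456"."""
    return [int(part) for part in raw.split(",") if part.strip()]


def can_run(depends_on: list[int], last_cycle: dict[int, Status]) -> bool:
    """Return True only if every listed node succeeded in the last cycle.

    A node that has no recorded last-cycle instance is treated as not
    properly run (an assumption of this sketch).
    """
    return all(last_cycle.get(node_id) is Status.SUCCESS for node_id in depends_on)


ids = parse_node_ids("12345,23456")
print(can_run(ids, {12345: Status.SUCCESS, 23456: Status.SUCCESS}))  # True
print(can_run(ids, {12345: Status.SUCCESS, 23456: Status.FAILED}))   # False
```

For a dependency on child nodes, `depends_on` would hold the IDs of the child nodes. For a dependency on the current node, it would hold only the ID of the node itself.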

In Operation Center, cross-cycle dependencies appear as dotted lines, which distinguishes them from same-cycle dependencies.

Before you undeploy a node, you must delete the dependencies that are configured for the node, including the cross-cycle dependency and same-cycle dependencies. The following figure shows the Scheduling configuration pane of a node. You must delete the cross-cycle dependency for the node in the section marked as ① and same-cycle dependencies in the section marked as ②.

A node can depend on the instance of its parent node in either the last cycle or the current cycle. You can configure a cross-cycle or same-cycle dependency based on your actual needs. Generally, you do not need to configure both a same-cycle dependency and a cross-cycle dependency on the same parent node. If you enable the auto parsing feature for a node, the scheduling system automatically configures a same-cycle dependency on the parent node for the node. If this does not meet your requirements, you can delete the default same-cycle dependency and configure a cross-cycle dependency for the node. For more information, see Logic of scheduling dependencies.

The following figure shows the dependency between two nodes in a workflow. The two nodes xc_create and xc_select in the workflow are for reference only.
The following figure shows how the dependency appears in Operation Center.

The xc_create node creates the xc_1 and xc_2 tables if they do not already exist. Then, the node inserts data into the two tables. The xc_1 and xc_2 tables are the output of the xc_create node.

As shown in the preceding figure, the xc_select node queries the xc_1 and xc_2 tables. Because the auto parsing feature is enabled, a same-cycle dependency on the xc_create node is automatically configured for the xc_select node.

Dependency on last-cycle instances of child nodes

Node dependency: A node depends on the last-cycle instances of its child nodes. For example, node A has three child nodes B, C, and D. If you configure this type of dependency for node A, node A is run in the current cycle only if all the last-cycle instances of nodes B, C, and D are properly run.

Business scenario: A node is run only if its child nodes properly cleanse the data in the output tables of the node in the last cycle. Otherwise, the node is not run in the current cycle.

In this example, when you configure the xc_create node, select Rely on previous cycle and set the Dependencies parameter to Level-1 child node.

Dependency on the last-cycle instance of the current node

Node dependency: A node depends on its last-cycle instance. The node is run in the current cycle only if the last-cycle instance of the node is properly run.

Business scenario: Whether a node is run in the current cycle depends on the business data that was generated by the last-cycle instance of the node. In this example, set the node to be scheduled by week. This way, you can conveniently view the dependencies of the node in Operation Center.

To view the dependencies of the node, go to Operation Center. In the left-side navigation pane, choose Cycle Task Maintenance > Cycle Instance. Search for the node to view its dependencies.
Note: Assume that a node is scheduled by hour and depends on its last-cycle instance. If the instance that is generated for a specific hour is not properly run, the instance that is generated for the next hour is not run.

For example, if the first instance that is generated on a day is not run or fails, none of the remaining instances that are generated for that day are run.
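The blocking effect described in the note can be illustrated with a small simulation. This is a simplified sketch, not the behavior of the actual scheduler: the `run_hourly_chain` function and the state names are hypothetical, and the sketch assumes that the first instance in the list has no failed predecessor.

```python
def run_hourly_chain(outcomes: list[bool]) -> list[str]:
    """Simulate hourly instances of a node that depends on its last-cycle instance.

    outcomes[i] says whether instance i would succeed if it were started.
    An instance is started only if the previous instance succeeded.
    """
    states: list[str] = []
    previous_ok = True  # assumption: nothing blocks the first instance
    for would_succeed in outcomes:
        if not previous_ok:
            states.append("blocked")   # last-cycle instance was not properly run
        elif would_succeed:
            states.append("success")
        else:
            states.append("failed")
        previous_ok = states[-1] == "success"
    return states


print(run_hourly_chain([True, False, True, True]))
# ['success', 'failed', 'blocked', 'blocked']
```

In this model, one failed instance blocks every later instance until the failed instance is rerun successfully or otherwise handled.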

Dependency on last-cycle instances of specified nodes

Node dependency: The output tables of the xc_information node are not referenced in the code of the xc_create node. However, the xc_create node depends on the data output of the xc_information node in the last cycle, as configured in the business logic. Logically, the xc_create node depends on the last-cycle instance of the xc_information node.

Business scenario: Based on the business logic, a node depends on the business data that is generated by another node. However, the business data is not referenced in the code of the current node. That is, the current node does not perform operations on the business data.

In this example, when you configure the xc_create node, select Rely on previous cycle, set the Dependencies parameter to Customize, and then enter 1000374815, which is the ID of the xc_information node.

To view the dependencies of the node, go to Operation Center. In the left-side navigation pane, choose Cycle Task Maintenance > Cycle Instance. Search for the node to view its dependencies.

Advanced configuration

A branch node has two child nodes. Generally, only one of the child nodes is actually run. The scheduling system generates and runs an instance for one child node. For the other child node, the scheduling system generates an instance and directly returns a successful response without running the instance. In other words, the instance of this child node is dry run. The scheduling system also performs a dry run for the child node of the dry-run node. If this does not meet your requirements, you can select The upstream node does not conduct cross-cycle transmission when you configure this child node of the branch node.

If a child node of a branch node depends on its own last-cycle instance and the last-cycle instance is dry run, the child node is also dry run in the current cycle. As a result, the child node becomes a dry-run node forever.

In this case, when the scheduling system performs a dry run for a child node of the branch node, it also performs a dry run for the node that depends on that instance, which is the next-cycle instance of the same child node.

Your business may require that whether a child node of a branch node is run depends only on the result of the branch node in the current cycle, and that the child node is not affected by its dry run in the last cycle. To meet this requirement, perform the following steps:
  1. On the configuration tab of this child node of the branch node, click the Scheduling configuration tab in the right-side navigation pane.
  2. In the Time attribute section of the Scheduling configuration pane, select Rely on previous cycle.
  3. Click Advanced configuration.
  4. In the pop-up dialog box, select The upstream node does not conduct cross-cycle transmission.
Note: This advanced configuration applies only to child nodes of branch nodes. For other nodes, a dry run in the last cycle does not affect the running in the current cycle.
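The following Python sketch models how a dry run can carry over from one cycle to the next, and how the advanced option stops it. It is a simplified illustration of the behavior described above, not the scheduler's implementation. The `simulate` function and its parameters `branch_selects`, `self_dependency`, and `block_transmission` are hypothetical names. `block_transmission` stands for the option The upstream node does not conduct cross-cycle transmission.

```python
def simulate(branch_selects: list[bool],
             self_dependency: bool,
             block_transmission: bool) -> list[str]:
    """Return "run" or "dry_run" per cycle for one child node of a branch node.

    branch_selects[i] is True when the branch node selects this child in cycle i.
    """
    results: list[str] = []
    for selected in branch_selects:
        # A dry run in the last cycle is inherited only when the child depends
        # on its own last-cycle instance and transmission is not blocked.
        inherited_dry_run = (
            self_dependency
            and not block_transmission
            and bool(results)
            and results[-1] == "dry_run"
        )
        if selected and not inherited_dry_run:
            results.append("run")
        else:
            results.append("dry_run")
    return results


# The branch node skips this child in cycle 1 and selects it again afterwards.
print(simulate([True, False, True, True], True, False))
# ['run', 'dry_run', 'dry_run', 'dry_run']
print(simulate([True, False, True, True], True, True))
# ['run', 'dry_run', 'run', 'run']
```

With the option cleared, a single dry run keeps the child node in the dry-run state in every later cycle. With the option selected, each cycle depends only on the result of the branch node in that cycle.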

Typical scenarios of cross-cycle dependencies

  • Scenario 1
    • Scenario description: Node A is scheduled by day. Node B is scheduled by hour. Node A depends on node B. By default, node A is run only after node B has run 24 times, that is, at the end of each day. However, you want node A to be run at 12:00 every day.
    • Solution: When you configure node B, select Rely on previous cycle and set the Dependencies parameter to This node. When you configure node A, select Timing scheduling and set the Specific time parameter to 12:00. Do not configure cross-cycle dependencies for node A.

      This way, after an instance is generated and run for node B at 12:00, the scheduling system runs node A.

  • Scenario 2
    • Scenario description: Node A is scheduled by day. Node B is scheduled by hour. Node A depends on the data that is generated by node B on the previous day.
    • Solution: When you configure node A, select Rely on previous cycle, set the Dependencies parameter to Customize, and then enter the ID of node B.
  • Scenario 3
    • Scenario description: Node A is scheduled by hour. Node B is scheduled by day. Node A depends on node B. By the time node B finishes running on a day, node A has gone through 24 cycles, so the scheduling system starts to generate and run 24 instances of node A at the same time.
    • Solution: When you configure node A, select Rely on previous cycle and set the Dependencies parameter to This node.
  • Scenario 4
    • Scenario description: A node depends on the data that is generated by the node in the last cycle.
    • Solution: When you configure the node, select Rely on previous cycle and set the Dependencies parameter to This node.
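Scenario 3 can be made concrete with a rough timing model. The sketch below is an assumption-laden illustration, not a description of how the scheduler actually allocates instances. The `start_times` function, the fixed run time per instance, and the rule that a self-dependent instance starts only after the previous instance ends are all assumptions introduced here.

```python
def start_times(b_done: int, run_minutes: int, self_dependency: bool) -> list[int]:
    """Start minute of each of node A's 24 hourly instances on one day.

    Times are minutes after midnight. Instance h is scheduled at h * 60 and
    cannot start before node B finishes at minute b_done.
    """
    starts: list[int] = []
    previous_end = 0
    for hour in range(24):
        start = max(hour * 60, b_done)
        if self_dependency:
            # Dependencies = This node: wait for the last-cycle instance.
            start = max(start, previous_end)
        starts.append(start)
        previous_end = start + run_minutes
    return starts


# Node B finishes at the end of the day (minute 1440); each run takes 5 minutes.
print(start_times(1440, 5, False)[:3])  # [1440, 1440, 1440]
print(start_times(1440, 5, True)[:3])   # [1440, 1445, 1450]
```

Without the dependency on the last-cycle instance, all 24 instances become runnable at the same moment. With it, the instances run one after another in this model.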