DataWorks allows you to configure cross-cycle scheduling dependencies for nodes. With a cross-cycle scheduling dependency, the instance generated for a node in the current cycle depends on the instances generated for one or more specific nodes in the previous cycle, and can start to run only after those instances are successfully run. Cross-cycle scheduling dependencies are typically used in two cases: the instance generated for a node in the current cycle needs the data of an instance generated for another node on the previous day, or the instance generated for a node scheduled by hour or minute needs to depend on the instance generated for the same node in the previous cycle. This topic describes how to configure cross-cycle scheduling dependencies for a node and the types of cross-cycle dependencies.

Precautions

When you configure cross-cycle scheduling dependencies, take note of the following items.
  • Display of cross-cycle scheduling dependencies
    Description: Cross-cycle scheduling dependencies are presented as dashed lines in the directed acyclic graph (DAG) of a node.
    References: Appendix: Use the features provided in a DAG
  • Whether same-cycle scheduling dependencies are still required after cross-cycle scheduling dependencies are configured
    Description: After you configure scheduling dependencies for a node, the node can start to run only after all its ancestor nodes are successfully run. By default, the automatic parsing feature for same-cycle scheduling dependencies is enabled. If cross-cycle scheduling dependencies are configured for a node, you must check whether the node also requires same-cycle scheduling dependencies. If it does not, you must delete the automatically generated same-cycle scheduling dependencies to prevent them from affecting the running of the node.
    References: Delete scheduling dependencies
  • Complex scenarios in which cross-cycle scheduling dependencies are required
    Description: In some complex scenarios, same-cycle scheduling dependencies may not be able to meet your business requirements. In this case, you can configure cross-cycle scheduling dependencies. For example, if a node scheduled by day depends on a node scheduled by hour, the instance generated for the node scheduled by day depends on all instances generated for the node scheduled by hour on the current day by default. You can configure the self-dependency for the node scheduled by hour. This way, the instance generated for the node scheduled by day depends on the instance that is generated for the node scheduled by hour in a specific scheduling cycle.
    References: Principles and samples of scheduling configurations in complex dependency scenarios
  • Preview of scheduling dependencies of a node
    Description: To prevent an auto triggered node in the production environment from being delayed by scheduling dependencies that do not meet expectations, we recommend that you preview the scheduling dependencies of the node before you deploy the node to the production environment. This ensures that the instances generated for the auto triggered node can run as expected.
    References: Preview scheduling dependencies of a node
  • Node deployment
    Description: After you configure cross-cycle scheduling dependencies for a node, you must deploy the node and its ancestor nodes to the production environment. After the deployment is complete, you can view the cross-cycle scheduling dependencies in Operation Center in the production environment.
    References: Deploy nodes

Entry point for configuring cross-cycle scheduling dependencies

Go to the configuration tab of the desired node in DataStudio. Click the Properties tab in the right-side navigation pane. In the Dependencies section of the Properties tab, select Previous Cycle and configure scheduling dependencies for the node.
Figure: Scheduling dependencies

Types of cross-cycle scheduling dependencies

The following list describes the types of cross-cycle scheduling dependencies.
  • Dependency on the instance generated for the current node in the previous cycle
    Description: The instance generated for a node in the current cycle can start to run only after the instance generated for the same node in the previous cycle is successfully run.
    Scenario: The instance generated for a node in the current cycle depends on the latest business data of the instance generated for the same node in the previous cycle.
  • Dependency on the instances generated for the level-1 descendant nodes of a node in the previous cycle
    Description: The instance generated for a node in the current cycle can start to run only after the instances generated for the descendant nodes of the current node in the previous cycle are successfully run.
    Scenario: The instance generated for a node in the current cycle depends on whether the output table data of the current node in the previous cycle is cleansed by the instances generated for the descendant nodes of the current node in the previous cycle.
  • Dependency on the instances generated for one or more specified nodes in the previous cycle
    Description: The instance generated for a node in the current cycle can start to run only after the instances generated for one or more specified nodes in the previous cycle are successfully run.
    Scenario: The instance generated for a node in the current cycle depends on the output table data of the instances generated for one or more other nodes in the previous cycle in the business logic but does not use the data in the code.
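The three dependency types differ only in which nodes of the previous cycle the current instance waits for. The following Python snippet is a minimal sketch of that idea, not DataWorks code: the enum, the function name, the integer cycle index, and the sample workflow are all hypothetical names introduced for illustration.

```python
from enum import Enum


class CrossCycleType(Enum):
    SELF = "current node"                       # same node, previous cycle
    LEVEL1_DESCENDANTS = "level-1 descendants"  # direct children, previous cycle
    SPECIFIED_NODES = "specified nodes"         # explicitly listed nodes, previous cycle


def previous_cycle_upstreams(node, cycle, dep_type, children=None, specified=()):
    """Return the (node, cycle) instances that must succeed before `node` runs in `cycle`."""
    children = children or {}
    if dep_type is CrossCycleType.SELF:
        targets = [node]
    elif dep_type is CrossCycleType.LEVEL1_DESCENDANTS:
        targets = children.get(node, [])
    else:
        targets = list(specified)
    # Every cross-cycle dependency points at the previous cycle (cycle - 1).
    return {(target, cycle - 1) for target in targets}


# Hypothetical workflow: node C is the parent of nodes A and B.
children = {"C": ["A", "B"]}
print(previous_cycle_upstreams("C", 5, CrossCycleType.SELF))
print(previous_cycle_upstreams("C", 5, CrossCycleType.LEVEL1_DESCENDANTS, children))
print(previous_cycle_upstreams("B", 5, CrossCycleType.SPECIFIED_NODES, specified=["D"]))
```

In this sketch a cycle is a plain integer, so "previous cycle" is simply `cycle - 1`. In a real workflow the previous cycle is determined by the scheduling frequency of the node.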

Dependency on the instance generated for the current node in the previous cycle

The instance generated for a node in the current cycle depends on the latest business data of the instance generated for the same node in the previous cycle. The following figure shows the configuration of the scheduling dependencies and the dependency relationship between instances.
Figure: Dependency on the instance generated for the current node in the previous cycle
Note: For a node scheduled by hour or by minute, the running results of the instances generated in different scheduling cycles affect each other.
If the node scheduled by day depends on the node scheduled by hour or minute, the time when the instance generated for the node scheduled by day starts to run is affected by whether the node scheduled by hour or minute is configured with the self-dependency.
  • Node scheduled by hour or minute not configured with the self-dependency

    If the node scheduled by hour or minute is not configured with the self-dependency, the instance generated for the node scheduled by day depends on all instances generated for the node scheduled by hour or minute on the current day. In this case, the node scheduled by day aggregates and processes all table data of all instances generated for the node scheduled by hour or minute on the current day.

  • Node scheduled by hour or minute configured with the self-dependency

    If the node scheduled by hour or minute is configured with the self-dependency, the instance generated for the node scheduled by day depends only on one specific instance generated for the node scheduled by hour or minute. The instance is selected based on the principle of scheduling time proximity: it is the instance whose scheduling time is the closest to that of the instance generated for the node scheduled by day.

For more information, see Appendix: Complex dependency scenarios.
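The difference between the two cases above can be sketched in a few lines of Python. This is an illustration only: the function name and the sample schedule are hypothetical, and "closest" is assumed here to mean the smallest absolute difference between scheduling times. The exact selection and tie-breaking rules that DataWorks applies are not described in this topic.

```python
from datetime import datetime, timedelta


def hourly_upstreams(day_run_at, hourly_runs, hourly_self_dependent):
    """Return the hourly instances that the daily instance waits for."""
    same_day = [t for t in hourly_runs if t.date() == day_run_at.date()]
    if not hourly_self_dependent:
        # No self-dependency: depend on every hourly instance of the current day.
        return same_day
    # Self-dependency: depend on one instance only.
    # Assumption: proximity = smallest absolute gap between scheduling times.
    return [min(same_day, key=lambda t: abs(t - day_run_at))]


midnight = datetime(2024, 1, 1)
hourly_runs = [midnight + timedelta(hours=h) for h in range(24)]  # 00:00 ... 23:00
day_run_at = midnight + timedelta(hours=2, minutes=10)            # daily node at 02:10

print(len(hourly_upstreams(day_run_at, hourly_runs, False)))  # all instances of the day
print(hourly_upstreams(day_run_at, hourly_runs, True))        # only the 02:00 instance
```

Because each self-dependent hourly instance waits for its predecessor, waiting for a single hourly instance also implies that the earlier hourly instances in the chain have succeeded.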

Dependency on the instances generated for the level-1 descendant nodes of a node in the previous cycle

If you configure this type of scheduling dependency for a node, the instance generated for the node in the current cycle can start to run only after the instances generated for the level-1 descendant nodes of the node in the previous cycle are successfully run.
For example, Node C has two level-1 descendant nodes, Node A and Node B, and the instance generated for Node C in the current cycle (T) depends on the instances generated for Node A and Node B in the previous cycle (T-1). The instance generated for Node C in cycle T can start to run only after the instances generated for Node A and Node B in cycle T-1 are successfully run.
Figure: Dependency on the instances generated for the level-1 descendant nodes of a node in the previous cycle
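The readiness rule in this example can be written as a small check. The following is a sketch under assumptions: the status strings, the integer cycle index, and the function name are hypothetical and are not DataWorks identifiers.

```python
def can_start(node, cycle, children, status):
    """True if every level-1 descendant of `node` succeeded in the previous cycle."""
    return all(
        status.get((child, cycle - 1)) == "SUCCESS"
        for child in children.get(node, [])
    )


children = {"C": ["A", "B"]}  # Node C has two level-1 descendants: A and B
T = 10                        # hypothetical cycle index; T - 1 is the previous cycle
status = {("A", T - 1): "SUCCESS", ("B", T - 1): "RUNNING"}

print(can_start("C", T, children, status))  # False: B in cycle T-1 is still running
status[("B", T - 1)] = "SUCCESS"
print(can_start("C", T, children, status))  # True: A and B in cycle T-1 both succeeded
```

The check looks only at cycle T-1. The instances that Node A and Node B generate in cycle T play no role in deciding whether Node C can start in cycle T.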

Dependency on the instances generated for one or more specified nodes in the previous cycle

If you configure this type of scheduling dependency for a node, the instance generated for the node in the current cycle can start to run only after the instances generated for one or more specified nodes in the previous cycle are successfully run.
For example, Node C has two descendant nodes, Node A and Node B, and the instance generated for Node B in the current cycle (T) depends on the instance generated for Node D in the previous cycle (T-1). The instance generated for Node B in cycle T can start to run only after the instance generated for Node D in cycle T-1 is successfully run.
Figure: Dependency on the instances generated for one or more specified nodes in the previous cycle
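In this example Node B ends up with two kinds of upstream instances: its same-cycle ancestor Node C and the cross-cycle instance of Node D. The following sketch shows how the two sets combine. The dictionaries and the function name are hypothetical, and the same-cycle dependency on Node C is assumed to be kept rather than deleted.

```python
def upstream_instances(node, cycle, parents, previous_cycle_nodes):
    """Combine same-cycle and cross-cycle upstream instances of `node` in `cycle`."""
    same_cycle = {(parent, cycle) for parent in parents.get(node, [])}
    cross_cycle = {(dep, cycle - 1) for dep in previous_cycle_nodes.get(node, [])}
    return same_cycle | cross_cycle


parents = {"A": ["C"], "B": ["C"]}   # same-cycle dependencies: A and B descend from C
previous_cycle_nodes = {"B": ["D"]}  # cross-cycle dependency: B waits for D in T-1
T = 10                               # hypothetical cycle index

print(upstream_instances("A", T, parents, previous_cycle_nodes))  # only C in cycle T
print(upstream_instances("B", T, parents, previous_cycle_nodes))  # C in cycle T and D in cycle T-1
```

If a same-cycle dependency is not actually required, removing the entry from `parents` models the deletion of an automatically generated same-cycle scheduling dependency.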

Passing of the dry-run attribute of an ancestor node

This setting is required in most scenarios in which you use branch nodes.
  • Entry point
    You can set the Follow the upstream air running attribute parameter to No for a node. This way, all the instances generated for the node and the descendant nodes of the node can run as expected.
    Figure: Pass the dry-run attribute of an ancestor node
  • Scenario

    A node has multiple descendant nodes, and one of the descendant nodes is dry-run when the node is run. If you configure the instance generated for the dry-run descendant node in the current cycle to depend on the instance generated for the same node in the previous cycle, the dry-run attribute is passed on. In this case, all the instances generated for the dry-run node and the descendant nodes of the dry-run node are dry-run. If you do not want the dry-run attribute to be passed, set the Follow the upstream air running attribute parameter to No for the dry-run descendant node in the Dependencies section of the Properties tab.

  • Example
    • Assign_Node is an assignment node. Branch_Node is a branch node. Shell_Node1 and Shell_Node2 are the descendant nodes of Branch_Node. All these nodes are scheduled by day.
    • Shell_Node1 is dry-run, and Shell_Node2 runs as expected.
    • The instance generated for Shell_Node1 in the current cycle is configured to depend on the instance generated for Shell_Node1 in the previous cycle.
    • Shell_Node1 generates an auto triggered instance Shell_Node1' in the current cycle (T).
    • Shell_Node1 generates an auto triggered instance Shell_Node1 in the previous cycle (T-1).
    Figure: Example
    The Shell_Node1' instance depends on the Shell_Node1 instance, so the dry-run attribute of the Shell_Node1 node is passed. Therefore, all the instances generated for the Shell_Node1 node and the descendant nodes of the Shell_Node1 node are dry-run.
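The propagation described in this example can be approximated with a small model. This is a simplified sketch, not the DataWorks scheduling algorithm: the `Instance` class, its fields, and the `simulate` helper are hypothetical, and the sketch assumes that the instance generated in cycle T would run normally if no dry-run attribute were inherited.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    skipped_by_branch: bool  # True if the branch node did not select this instance
    follow_upstream: bool    # models the "Follow the upstream air running attribute" setting
    upstreams: list = field(default_factory=list)

    def is_dry_run(self):
        if self.skipped_by_branch:
            return True
        # Inherit the dry-run attribute only if the setting is enabled.
        return self.follow_upstream and any(u.is_dry_run() for u in self.upstreams)


def simulate(follow):
    """Return the dry-run state of Shell_Node1' and of one descendant in cycle T."""
    previous = Instance("Shell_Node1 (T-1)", skipped_by_branch=True, follow_upstream=True)
    current = Instance("Shell_Node1' (T)", False, follow, upstreams=[previous])
    descendant = Instance("descendant (T)", False, True, upstreams=[current])
    return current.is_dry_run(), descendant.is_dry_run()


print(simulate(True))   # (True, True): the dry-run attribute is passed on
print(simulate(False))  # (False, False): the instances run as expected
```

Setting `follow_upstream` to `False` on the instance in cycle T is what breaks the chain in this model, which mirrors setting the parameter to No for the dry-run descendant node.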

Preview scheduling dependencies

After you configure scheduling dependencies for a node, you can preview the scheduling dependencies. For more information, see Subsequent steps: Check whether the scheduling dependencies meet your expectations.