Special scenarios in periodic scheduling

Two special scenarios can arise in periodic scheduling: node dependency loops and node isolation. Both can disrupt task scheduling. This topic describes the two scenarios and provides solutions.

Scenario 1: Node dependency loop

Cause and solution

If a node not only serves as an ancestor node of some other nodes but also depends on one or more of its descendant nodes, a node dependency loop is formed. In this case, you need to analyze the workflow and remove the dependency that causes the loop at the earliest opportunity.

For example, if an ancestor node depends on the data in the table generated by its descendant node in the same scheduling cycle, and writes data processing results to the table, a node dependency loop is formed. In this case, confirm the business scenario and modify the dependency to enable the ancestor node to depend on the data in the table generated by its descendant node in the previous scheduling cycle.

Sample scenario: Node A queries data in Table C and generates Table A. Node B cleanses data in Table A and writes the obtained data to Table B. Then, Node C cleanses data in Table B and writes the obtained data to Table C. Node A therefore depends on Node C, Node C depends on Node B, and Node B depends on Node A, which forms a node dependency loop. The following figure shows the node dependency loop.
Solution: Analyze the workflow and remove the dependency that causes the loop. If you want an ancestor node to cleanse data generated by its descendant node in the previous scheduling cycle, you can configure cross-cycle scheduling dependencies. In the sample scenario shown in the following figure, you can configure cross-cycle scheduling dependencies between Node A and Node C.
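
The loop in the sample scenario can be reproduced with a small depth-first search. The following is a minimal sketch, not DataWorks code: the function name `find_cycle` and the dictionary layout are assumptions made for illustration. Each node maps to the nodes it depends on within the same scheduling cycle. A cross-cycle dependency is modeled simply by leaving it out of the same-cycle graph.

```python
from typing import Dict, List, Optional

WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency loop as a list of nodes, or None if there is none.

    deps maps each node to the nodes it depends on in the same scheduling cycle.
    (Hypothetical helper for illustration; not part of any DataWorks API.)
    """
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for parent in deps.get(node, []):
            state = color.get(parent, WHITE)
            if state == GRAY:
                # The parent is already on the current path, so a loop is closed.
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in deps:
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Sample scenario: A reads Table C, B reads Table A, C reads Table B.
same_cycle = {"A": ["C"], "B": ["A"], "C": ["B"]}
loop = find_cycle(same_cycle)

# After A's dependency on C is changed to a cross-cycle dependency,
# it no longer appears in the same-cycle graph.
fixed = {"A": [], "B": ["A"], "C": ["B"]}
loop_after_fix = find_cycle(fixed)
```

In this sketch, `loop` lists the nodes along the loop starting and ending at the same node, and `loop_after_fix` is `None` because the remaining same-cycle dependencies form a simple chain.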

Monitoring and alerting

DataWorks provides built-in alert rules to monitor and scan auto triggered tasks on a regular basis. This ensures that auto triggered tasks can be run as scheduled and instances can be generated for auto triggered tasks. If an exception occurs, an alert is triggered. An alert notification is automatically sent if a node dependency loop is formed. We recommend that you handle the alert at the earliest opportunity.

Note

DataWorks scans auto triggered tasks at 09:00, 12:00, 16:00, 20:00, and 22:00 every day. If an exception occurs, DataWorks sends an alert notification. However, if an exception occurs within 10 minutes before a scan starts, the exception is outside the scope of that scan and cannot be detected until the next scan.
Alert rules for node dependency loops are built-in rules provided by DataWorks. After an alert rule is triggered, an alert notification is sent to the node owner by text message or email. You can change the alert contact on the Rule Management page. For more information, see Create a custom alert rule.
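
To make the 10-minute rule concrete, the following sketch computes which scan would be the first to pick up an exception. It only models the timing described in the note and is not DataWorks code. The function name `first_detecting_scan` is hypothetical, and the handling of an exception that occurs exactly 10 minutes before a scan (treated here as in scope) is an assumption, because the note does not specify the boundary.

```python
from datetime import datetime, time, timedelta

# Daily scan times for node dependency loops, as listed in the note.
SCAN_TIMES = [time(9), time(12), time(16), time(20), time(22)]
# Exceptions that occur less than this long before a scan are missed by it.
BLIND_WINDOW = timedelta(minutes=10)


def first_detecting_scan(occurred_at: datetime) -> datetime:
    """Return the start time of the first scan that can detect the exception."""
    day = occurred_at.date()
    while True:
        for scan_time in SCAN_TIMES:
            scan = datetime.combine(day, scan_time)
            # Assumption: exactly 10 minutes before the scan still counts as in scope.
            if occurred_at <= scan - BLIND_WINDOW:
                return scan
        # No scan left today, so continue with the scans of the next day.
        day += timedelta(days=1)


early = first_detecting_scan(datetime(2024, 1, 1, 11, 45))      # caught at 12:00
late = first_detecting_scan(datetime(2024, 1, 1, 11, 55))       # deferred to 16:00
overnight = first_detecting_scan(datetime(2024, 1, 1, 21, 55))  # deferred to 09:00 next day
```

The dates are arbitrary. The third call shows the longest delay under this schedule: an exception that narrowly misses the 22:00 scan is not picked up until 09:00 on the following day.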

Scenario 2: Node isolation

Definition

An isolated node is a node that does not have an ancestor node. When you right-click an isolated node and select Show Ancestor Nodes on the Auto Triggered Nodes or Auto Triggered Instances page, no ancestor node appears. An isolated node cannot be automatically scheduled. If multiple nodes depend on an isolated node, your business may be severely affected. An alert notification is automatically sent if an isolated node is identified. We recommend that you handle the alert at the earliest opportunity.

Note

In DataWorks, except for the root node in a workspace, each auto triggered node that you create must have at least one ancestor node. If you do not configure ancestor nodes for an auto triggered node, the node cannot be scheduled as expected.
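
The definition above can be expressed as a simple check over a dependency map. The following is a minimal sketch under stated assumptions: the node names, the `parents` dictionary, and both helper functions are hypothetical and do not correspond to any DataWorks API. The second helper lists the nodes downstream of an isolated node, which are the ones that may be affected.

```python
from typing import Dict, List, Set


def find_isolated(parents: Dict[str, List[str]], root: str) -> Set[str]:
    """Return every node other than the root that has no ancestor node."""
    return {node for node, ps in parents.items() if node != root and not ps}


def affected_descendants(parents: Dict[str, List[str]], isolated: Set[str]) -> Set[str]:
    """Return the nodes that depend, directly or indirectly, on an isolated node."""
    # Invert the parent map so that descendants can be walked.
    children: Dict[str, List[str]] = {}
    for node, ps in parents.items():
        for p in ps:
            children.setdefault(p, []).append(node)

    seen: Set[str] = set()
    pending = list(isolated)
    while pending:
        current = pending.pop()
        for child in children.get(current, []):
            if child not in seen and child not in isolated:
                seen.add(child)
                pending.append(child)
    return seen


# Hypothetical workspace: dwd_orders was deployed without any ancestor node.
parents = {
    "workspace_root": [],
    "ods_orders": ["workspace_root"],
    "dwd_orders": [],
    "ads_report": ["dwd_orders"],
    "ads_export": ["dwd_orders"],
}
isolated = find_isolated(parents, root="workspace_root")
affected = affected_descendants(parents, isolated)
```

Here `isolated` contains only `dwd_orders`, and `affected` contains the two nodes that depend on it.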

Causes and solutions

Cause	Description	Solution
An auto triggered node and its ancestor node have different instance generation modes.	An auto triggered node is newly created with the Instance Generation Mode parameter set to Immediately After Deployment. This node depends on only another newly created node whose Instance Generation Mode parameter is set to Next Day. In this case, the current auto triggered node becomes an isolated node because no instances are generated for the ancestor node on the current day.	Modify the Instance Generation Mode parameter for the ancestor node and deploy the ancestor node again. For more information, see Scenario 3: Configure different instance generation modes for an auto triggered task and its ancestor task.
The ancestor node of an auto triggered node is out of the specified validity period for scheduling.	In DataWorks, no instances are generated for nodes that are out of the specified validity period for scheduling. If an auto triggered node depends on only one node, and the ancestor node is out of the specified validity period for scheduling, the auto triggered node becomes an isolated node.	Modify the Effective Period parameter in the Scheduling Time section of the Properties tab on the configuration tab of the ancestor node.
The output of the ancestor node of an auto triggered node is changed.	An auto triggered node depends on only one node. If the business of the ancestor node changes, the output of the ancestor node is changed accordingly. As a result, dependencies between the auto triggered node and its ancestor node become invalid, and the auto triggered node becomes an isolated node.	Reconfigure scheduling dependencies for the auto triggered node.
Cross-workspace dependencies are configured for an auto triggered node, but periodic scheduling is not enabled for the workspace in which the ancestor node of the auto triggered node resides.	If an auto triggered node depends on only one node, the auto triggered node and its ancestor node reside in different workspaces, and periodic scheduling is not enabled for the workspace in which the ancestor node resides, the auto triggered node becomes an isolated node.	Contact the owner of the workspace to enable periodic scheduling or remove the cross-workspace dependencies.
The scheduling time of an intermediate task is not within the specified time period for data backfilling.	If the scheduling time of an intermediate task for which upstream and downstream scheduling dependencies are configured does not fall within the time period for data backfilling, a descendant node of the node that runs the intermediate task may become an isolated node. Sample scenario: Task A and Task C are scheduled by hour and run every hour. Task B is scheduled by day and runs at `02:00` every day. Task C depends on Task B, and Task B depends on Task A. Data that is within the time period from `00:00 to 01:00` is backfilled for Task A and Task C. The scheduling time of Task B is not within the time period, so no instance is generated for Task B within the time period. As a result, the node that runs Task C becomes an isolated node because it has no instance dependencies.	Backfill data for the intermediate task within the same specified time period. For more information, see Backfill data and view data backfill instances (new version). In this example, you must backfill data for Task B within the time period from `00:00 to 01:00`.
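
The backfill scenario in the last row of the table can be traced with a short sketch. It models only how the gap arises (which tasks have scheduling times inside the backfill window) and says nothing about how DataWorks generates backfill instances internally. The dictionaries and function names are assumptions made for illustration.

```python
from typing import Dict, List

# Hours of the day at which each task is scheduled, taken from the sample scenario.
SCHEDULE_HOURS: Dict[str, List[int]] = {
    "A": list(range(24)),  # scheduled by hour
    "B": [2],              # scheduled by day, runs at 02:00
    "C": list(range(24)),  # scheduled by hour
}
# Task C depends on Task B, and Task B depends on Task A.
DEPENDS_ON: Dict[str, List[str]] = {"A": [], "B": ["A"], "C": ["B"]}


def instances_in_window(start_hour: int, end_hour: int) -> Dict[str, List[int]]:
    """Return, per task, the scheduled hours that fall inside the window (inclusive)."""
    return {
        task: [h for h in hours if start_hour <= h <= end_hour]
        for task, hours in SCHEDULE_HOURS.items()
    }


def tasks_without_upstream_instances(instances: Dict[str, List[int]]) -> List[str]:
    """Return tasks that have instances in the window while none of their upstream tasks do."""
    result = []
    for task, hours in instances.items():
        upstream = DEPENDS_ON[task]
        if hours and upstream and all(not instances[p] for p in upstream):
            result.append(task)
    return result


window = instances_in_window(0, 1)  # time period from 00:00 to 01:00
orphans = tasks_without_upstream_instances(window)
```

In this model, Task A and Task C each have scheduling times at 00:00 and 01:00, while Task B has none in the window, so Task C is the only task reported in `orphans`.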

Monitoring and alerting

DataWorks generates auto triggered node instances for auto triggered nodes every night. The auto triggered node instances are scheduled to run on the next day. DataWorks provides built-in alert rules to monitor and scan auto triggered nodes on a regular basis. This ensures that auto triggered nodes can be run as scheduled and instances can be generated for auto triggered nodes. If an exception occurs, an alert is triggered. An alert notification is automatically sent if an isolated node is identified. We recommend that you handle the alert at the earliest opportunity.

DataWorks scans auto triggered nodes at 09:00, 12:00, and 16:00 every day. If an exception occurs, DataWorks sends an alert notification. However, if an exception occurs within 10 minutes before a scan starts, the exception is outside the scope of that scan and cannot be detected until the next scan.
Alert rules for isolated nodes are built-in rules provided by DataWorks. After an alert rule is triggered, an alert notification is sent to the node owner by text message or email. You can change the alert contact on the Rule Management page. For more information, see Create a custom alert rule.

View isolated nodes

In Operation Center, you can go to the Focus On section of the Workbench Overview tab on the O&M Dashboard page to view the number and details of isolated nodes.