The task orchestration feature of Data Management (DMS) allows you to configure a dependency check node for a task flow. You can specify whether a task flow depends on a node in the same task flow or on another task flow. This topic describes how to configure a dependency check node for a task flow.

Scenarios

  • Inter-task flow dependency: Task Flow A runs only after Task Flow B succeeds.

    For example, Task Flow A runs at 2:00 every Monday to train a recommendation model. It uses the business data of the previous week (Monday to Sunday), which is generated by Task Flow B. Therefore, Task Flow A depends on Task Flow B.

  • Self-dependency: Task Flow A runs only after its previous cycle is complete.

    For example, Task Flow A performs data cleansing. It can run only after the previous instance of Task Flow A is complete.

Usage notes

  • A new task flow must have at least one node with a completed run record before a task flow can depend on that node. You can generate such a record in one of the following ways:
    • Manually triggered task flows: Select a descendant node of the dependency check node, and click Test Run Current Node or Start Test Run from Current Node. The dependency check node is skipped, and a record of a successful test run is generated.
    • Auto triggered task flows: On the task flow details page, click the drop-down icon and select Dry Run. A successful auto triggering record is generated, but the task flow is not actually run.
  • If a node depends on multiple nodes, you must configure each dependency separately. For example, if Node D depends on Nodes A, B, and C, you must configure the dependency of Node D on each of these three nodes. The following steps show how to configure a dependency.

Procedure

  1. Log on to the DMS console V5.0.
  2. In the top navigation bar, click DTS. In the left-side navigation pane, choose Data Development > Task Orchestration.
  3. Click the name of the task flow you want to manage to go to the details page of the task flow.
    Note If you need to create a task flow, see Add a task flow.
  4. In the Task Type list on the left side of the canvas, drag the Dependency Check node to the blank area on the canvas.
  5. Double-click the Dependency Check node.
  6. On the configuration page of the Dependency Check node, set the following items.
    Item Description
    Task Flow The task flow that the dependency check targets, that is, the task flow on which the current task flow depends. You can search for and select the task flow in the Task Flow field.
    Note
    • If you select the current task flow, the task flow depends on the result of its previous cycle.
    • If you select a different task flow, the current task flow depends on the result of the selected task flow.
    Dependency Object The type of the dependency object. Valid values:
    • Task Flow: The current task flow depends on the selected task flow as a whole.
    • Single Node: The current task flow depends on a single existing node in the selected task flow.
    Dependency Settings The expected time range of the dependency check, which is defined by a start time offset and an end time offset.
    Both offsets are relative to the business time, which is the day before the node runs:
    • For an auto triggered task flow, the business time is calculated based on the scheduled execution time of the task flow.
    • For a manually triggered task flow, the business time is calculated based on the time when the task flow is manually triggered.
    Dependency configuration examples:
    • Example 1: Task Flow A, which is scheduled to start daily at 7:00, depends on the last scheduled cycle of Task Flow B, which is scheduled to start daily at 7:00. In the dependency check configuration, the start time offset of Task Flow A remains at 7:00, and its end time offset increases by 1 day.
      Note The last scheduled cycle ends when the last execution of a task flow is complete. For more information about scheduled execution time, see Configure scheduling properties for the task flow.
    • Example 2: Task Flow A, which is scheduled to start daily at 8:00, depends on the last scheduled cycle of Task Flow B, which is scheduled to start daily at 7:00. In the dependency check configuration, the start time offset of Task Flow A remains at 8:00, and its end time offset increases by 1 day and decreases by 1 hour.
    • Example 3: Task Flow A, which is scheduled to start daily at 8:10, depends on the last scheduled cycle of Task Flow B, which is scheduled to start daily at 7:00. In the dependency check configuration, the start time offset of Task Flow A remains at 8:10, and its end time offset increases by 1 day and decreases by 1 hour and 10 minutes.
    Note Click Preview in the upper part of the current configuration page. In the Time Preview dialog box, check whether the time values for the task flow are configured as expected and whether they ensure a successful run of the node.
    Check Policy for Instances The check policy for the task flow. Valid values:
    • Last Node Successful: The task flow passes the dependency check after the last node that the current task flow depends on succeeds.
    • All Nodes Successful: The task flow passes the dependency check only after the successful run of all the nodes that the current task flow depends on.
    • Specified Node Successful: The task flow passes the dependency check after the successful run of the specified node that the current task flow depends on.
    Note
    • For a manually triggered task flow, only manually triggered nodes are checked.
    • For an auto triggered task flow, only auto triggered nodes are checked.
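    For example, assume that Task Flow A has been manually triggered but has never been automatically triggered. If you configure a dependency check on Task Flow A, a test run of the check succeeds because a manually triggered record exists. However, the check fails when the dependent task flow is automatically triggered, because Task Flow A has no auto triggered record.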
  7. Click Test Run.
    • If the status SUCCEEDED appears in the last line of the execution log, the test run is successful.
    • If the status FAILED appears in the last line of the execution log, the test run has failed.
      Note If the test run fails, you can view the node on which the failure occurs and the reason for the failure in the execution log. Then, you can modify the configuration of the node and try again.
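
The Dependency Settings and Check Policy for Instances items in step 6 can be illustrated with a short Python sketch. This is not DMS code or a DMS API. The Instance class, the function names, and the policy strings are hypothetical. The sketch assumes one reading of the examples above: the business time is the run time minus one day, both offsets are added to the business time, the resulting range includes its end point, and the check passes if any matching instance in the range satisfies the policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Instance:
    """One run of the task flow that is depended on (hypothetical model)."""
    scheduled_at: datetime
    trigger: str       # "auto" or "manual"
    node_status: dict  # node name -> "SUCCEEDED" or "FAILED", in run order


def check_window(run_time, start_offset, end_offset):
    """Return the (start, end) range in which instances are looked up.

    Assumption: business time = run time minus one day, and both
    offsets are added to the business time.
    """
    business_time = run_time - timedelta(days=1)
    return business_time + start_offset, business_time + end_offset


def passes(instance, policy, node=None):
    """Apply one of the three check policies to a single instance."""
    statuses = list(instance.node_status.values())
    if policy == "last_node":
        return bool(statuses) and statuses[-1] == "SUCCEEDED"
    if policy == "all_nodes":
        return bool(statuses) and all(s == "SUCCEEDED" for s in statuses)
    if policy == "specified_node":
        return instance.node_status.get(node) == "SUCCEEDED"
    raise ValueError(f"unknown policy: {policy}")


def dependency_met(instances, run_time, trigger, start_offset, end_offset,
                   policy, node=None):
    """Check only instances with the same trigger type inside the range."""
    start, end = check_window(run_time, start_offset, end_offset)
    candidates = [
        i for i in instances
        if i.trigger == trigger and start <= i.scheduled_at <= end
    ]
    # Assumption: one passing instance in the range is enough.
    return any(passes(i, policy, node) for i in candidates)


# Example 2: Task Flow A runs at 8:00, Task Flow B is scheduled at 7:00.
# The end time offset is plus 1 day, minus 1 hour.
run_time = datetime(2024, 1, 2, 8, 0)
start_offset = timedelta(0)
end_offset = timedelta(days=1, hours=-1)
window = check_window(run_time, start_offset, end_offset)

# Task Flow B has a single auto triggered instance, scheduled at 7:00.
task_flow_b = [
    Instance(datetime(2024, 1, 2, 7, 0), "auto",
             {"extract": "SUCCEEDED", "load": "SUCCEEDED"}),
]
auto_ok = dependency_met(task_flow_b, run_time, "auto",
                         start_offset, end_offset, "all_nodes")
manual_ok = dependency_met(task_flow_b, run_time, "manual",
                           start_offset, end_offset, "all_nodes")
```

Under these assumptions, the range for Example 2 runs from 8:00 on the business day to 7:00 on the run day, so it covers the 7:00 instance of Task Flow B. The auto triggered check finds that instance, whereas the manually triggered check finds none. This mirrors the note that only nodes with the same trigger type are checked. The Preview button on the configuration page remains the authoritative way to verify the actual time values.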