DataWorks: SUB_PROCESS node

Last Updated: Apr 21, 2025

SUB_PROCESS is a special task type that you can use in a workflow to reference another workflow. SUB_PROCESS allows you to split a complex task into multiple subtasks. You can independently define and manage each subtask, which improves the maintainability and reusability of the task.

Background information

You can use a SUB_PROCESS node in a workflow to reference an existing workflow, which implements nested calls between workflows. You do not need to configure scheduling tasks or dependencies for the referenced workflow, because its execution is determined by the referencing workflow. The details are as follows:

  • Whether a referencable workflow is run depends on whether it is referenced by another workflow.

  • The time when a referencable workflow is run depends on the running time of the workflow that references the referencable workflow.

    If multi-layer nested references are involved, the scheduling system starts the execution from the last layer of the references, such as the fourth layer in the left diagram. If a SUB_PROCESS node is detected in a workflow during the execution, the workflow that is referenced by the SUB_PROCESS node is also triggered to run. For example, in the left diagram, a SUB_PROCESS node is detected in Workflow-E and the SUB_PROCESS node references Workflow-D. In this case, when Workflow-E is run, Workflow-D is triggered to run at the same time.

  • The number of times that a referencable workflow is run depends on the number of times that the workflow is referenced.

    The same rule applies when multi-layer nested references contain multiple SUB_PROCESS nodes in each layer: the number of times that a referenced workflow is run still depends on the number of times that it is referenced.

You can build a simple and direct linear reference workflow, as shown in the left diagram, based on your business requirements. You can also build a more complex workflow that contains multiple parallel branches, as shown in the right diagram. Each of the following structures has its own characteristics and applicable scenarios. Select a structure that meets your business requirements:

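[Image: left diagram, a linear reference hierarchy; right diagram, a reference hierarchy with multiple parallel branches]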
  • Left diagram: This diagram shows a linear workflow hierarchy. The workflow at each layer depends on the workflow at the previous layer, and the SUB_PROCESS node in each workflow is used to reference the previous workflow. For example, Workflow-E uses a SUB_PROCESS node to reference Workflow-D. It continues in the same manner until Workflow-B uses a SUB_PROCESS node to reference Workflow-A.

  • Right diagram: This diagram shows a complex workflow hierarchy. A workflow at each layer can be referenced by multiple workflows by using a SUB_PROCESS node. For example, Workflow-A can be referenced by Workflow-B1 and Workflow-B2 by using SUB_PROCESS nodes in Workflow-B1 and Workflow-B2.
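
The relationship between references and run counts in the right diagram can be illustrated with a small model. This is a minimal sketch in plain Python, not a DataWorks API: the `REFERENCES` mapping and the `count_runs` function are hypothetical names, and the sketch assumes that each SUB_PROCESS node triggers its referenced workflow exactly once per run of the referencing workflow.

```python
from collections import Counter

# Hypothetical model: each workflow maps to the workflows that its
# SUB_PROCESS nodes reference. The names mirror the right-diagram example.
REFERENCES = {
    "Workflow-B1": ["Workflow-A"],
    "Workflow-B2": ["Workflow-A"],
    "Workflow-A": [],
}


def count_runs(roots, references):
    """Count how often each workflow runs when the given roots are scheduled."""
    runs = Counter()

    def run(workflow):
        runs[workflow] += 1
        # Assumption: every SUB_PROCESS node triggers its referenced workflow once.
        for referenced in references.get(workflow, []):
            run(referenced)

    for root in roots:
        run(root)
    return runs


runs = count_runs(["Workflow-B1", "Workflow-B2"], REFERENCES)
print(dict(runs))
```

Under these assumptions, Workflow-A is counted twice because it is referenced by both Workflow-B1 and Workflow-B2, while each referencing workflow is counted once.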

Prerequisites

A workspace is created with Participate in Public Preview of Data Studio turned on, and a resource group is associated with the workspace. For more information, see Create a workspace.

Usage notes

  • If you use SUB_PROCESS nodes to implement multi-layer nesting, the number of layers for nested references can be up to 5 (including the root workflow). The total number of workflows cannot exceed 200.

  • The workflow reference relationship formed by using a SUB_PROCESS node is a triggered relationship, rather than an ancestor and descendant dependency relationship. If a SUB_PROCESS node is detected in a workflow, the workflow referenced by the SUB_PROCESS node is triggered to run.

  • For a workflow that is triggered to run due to a reference relationship, the values assigned to the scheduling parameters of the referenced workflow are determined by the scheduling time of the referencing workflow.

  • After you turn on Referencable for a workflow, the workflow and its nodes cannot depend on or be depended on by other tasks. Other tasks include other workflow tasks, tasks of the nodes that do not belong to the current workflow, and the task of the root node of the workspace.
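
The nesting limits in the usage notes can be checked against a planned reference structure before you build it. The following is a minimal sketch in plain Python, not a DataWorks API: `check_limits` and the `LINEAR` mapping are hypothetical, the sketch assumes that the limit of 200 applies to the distinct workflows reachable from the root workflow, and the circular-reference check is a safeguard of the sketch itself rather than a documented rule.

```python
MAX_LAYERS = 5       # includes the root workflow, per the usage notes
MAX_WORKFLOWS = 200  # assumption: counts distinct workflows reachable from the root


def check_limits(root, references):
    """Return (layers, workflow_count), or raise ValueError if a limit is exceeded."""
    seen = set()

    def depth(workflow, path):
        # Safeguard of this sketch: stop if a workflow references itself indirectly.
        if workflow in path:
            raise ValueError(f"circular reference at {workflow}")
        seen.add(workflow)
        children = references.get(workflow, [])
        if not children:
            return 1
        return 1 + max(depth(child, path | {workflow}) for child in children)

    layers = depth(root, frozenset())
    if layers > MAX_LAYERS:
        raise ValueError(f"{layers} layers exceed the limit of {MAX_LAYERS}")
    if len(seen) > MAX_WORKFLOWS:
        raise ValueError(f"{len(seen)} workflows exceed the limit of {MAX_WORKFLOWS}")
    return layers, len(seen)


# Linear hierarchy: Workflow-E references Workflow-D, and so on down to Workflow-A.
LINEAR = {
    "Workflow-E": ["Workflow-D"],
    "Workflow-D": ["Workflow-C"],
    "Workflow-C": ["Workflow-B"],
    "Workflow-B": ["Workflow-A"],
    "Workflow-A": [],
}
print(check_limits("Workflow-E", LINEAR))
```

In this model, the five-workflow linear chain sits exactly at the layer limit. Adding a sixth workflow that references Workflow-E would raise a `ValueError`.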

Configure workflows

The example in this topic describes how to use a SUB_PROCESS node in Workflow2 to reference Workflow1.

Workflow1: Enable Workflow1 to be referencable

To enable a workflow to be referencable, you must remove the dependencies between the nodes in the workflow, and disable the configuration of the scheduling time for the workflow. We recommend that you create a new workflow to try this feature.

Perform the following steps to create a workflow named Workflow1 and turn on Referencable for Workflow1:

  1. Create a workflow named Workflow1.

    1. Go to the Workspaces page in the DataWorks console. In the top navigation bar, select a desired region. Find the desired workspace and choose Shortcuts > Data Studio in the Actions column.

    2. In the left-side navigation pane of the Data Studio page, click the image icon. In the DATASTUDIO pane, find the Workspace Directories section, click the image icon, and then select Create Workflow. In the Create Workflow dialog box, configure the Name parameter and click OK. The configuration tab of the workflow appears.

  2. Turn on Referencable for Workflow1.

    In the right-side navigation pane of the configuration tab of Workflow1, click Property. On the tab that appears, turn on Referencable.

    Note
    • After you turn on the switch, the current workflow can be referenced by another workflow by using a SUB_PROCESS node. You do not need to configure properties such as the scheduling time and scheduling dependencies.

    • After you turn on Referencable for a workflow, the workflow and its nodes cannot depend on or be depended on by other tasks. Other tasks include other workflow tasks, tasks of the nodes that do not belong to the current workflow, and the task of the root node of the workspace.

Workflow2: Reference Workflow1

You can create a SUB_PROCESS node in Workflow2 to reference Workflow1.

  1. Create a workflow named Workflow2. For more information, see Auto triggered workflow.

  2. Create a SUB_PROCESS node.

    Click Workflow2 to go to its configuration tab. In the Logical Node section, drag the SUB_PROCESS node to the canvas.

  3. Configure the reference relationship.

    In the Create Node dialog box, select Select Existing Workflow for the Workflow to Reference parameter and select Workflow1 from the drop-down list. You can use the system-generated name for the Node Name parameter or specify a custom name. Then, click Confirm.

    Note
    • If you want to reference a new workflow, select Create for the Workflow to Reference parameter, configure the Workflow Name and Node Name parameters based on your business requirements, and then click Confirm.

    • Move the pointer over the SUB_PROCESS node and click Open Referenced Workflow, which appears above the rectangular box of the SUB_PROCESS node. In the dialog box that appears, click Save and Open. The configuration tab of the new workflow appears.

    • On the Properties tab, which appears after you click Property in the right-side navigation pane of the configuration tab of the new workflow, Referencable is automatically turned on.

  4. Save the configurations and view the reference details.

    Go to the configuration page of Workflow2 and click Save in the top toolbar. In the Change Review dialog box, click Save.

    Note

    After you turn on Referencable for a workflow, the workflow can be referenced by multiple workflows. To view the number of times that a workflow is referenced, click Property in the right-side navigation pane of the configuration tab of the workflow. Then, check the value of the Referenced Times parameter. You can also click View Details for more information.

What to do next

After you configure references between workflows, you can deploy the workflow task to the scheduling system for periodic scheduling and monitor the running status of the workflows in the workflow task by using the data backfill feature.

  1. Configure the scheduling properties.

    You can periodically schedule the workflow task only after you configure the scheduling properties for the workflow task. For more information, see Configure scheduling properties.

  2. Deploy the workflow task.

    The workflow task is automatically scheduled only after you deploy the workflow task to the production environment. For information about how to deploy a workflow task, see the Deploy a workflow section in the "Workflow" topic.

    Important

    When you deploy Workflow2 to which the SUB_PROCESS node belongs, you must deploy Workflow1, which is referenced by the SUB_PROCESS node, in advance. Otherwise, Workflow2 fails to be deployed.

  3. Run and view the workflow task.

    After you deploy the workflow task, we recommend that you use the data backfill feature to view the running status of the auto triggered workflow task in Operation Center. For more information, see Getting started with Operation Center.

    In Operation Center, find the SUB_PROCESS node that is running, right-click the SUB_PROCESS node, and select View internal tasks that reference a workflow to view the running status of all nodes in the referenced workflow.
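
The deployment-order requirement in step 2 (a referenced workflow must be deployed before the workflow that references it) amounts to a dependency-first ordering of the reference graph. The following is a minimal sketch in plain Python, not a DataWorks API: `deployment_order` and the reference mapping are hypothetical, and the sketch assumes that the references contain no cycles.

```python
def deployment_order(root, references):
    """List workflows so that every referenced workflow precedes its referrer."""
    order = []
    visited = set()

    def visit(workflow):
        if workflow in visited:
            return
        visited.add(workflow)
        for referenced in references.get(workflow, []):
            visit(referenced)
        # A workflow is appended only after everything that it references.
        order.append(workflow)

    visit(root)
    return order


# Workflow2 contains a SUB_PROCESS node that references Workflow1.
order = deployment_order("Workflow2", {"Workflow2": ["Workflow1"], "Workflow1": []})
print(order)
```

For the example in this topic, the resulting list places Workflow1 before Workflow2, which matches the order in which the two workflows must be deployed.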