DataWorks allows you to create manually triggered nodes in DataStudio and manage them in Operation Center in the production environment. This topic describes how to create manually triggered nodes and deploy them to the production environment.

Usage notes

  • If nodes do not need to be deployed to the production environment or used to access compute engine data in the production environment, you can create ad hoc queries. For information about how to create ad hoc queries, see Create an ad hoc query.
  • Manually triggered nodes cannot be automatically scheduled.
  • You can draw lines between manually triggered nodes to specify the sequence in which the nodes are run. However, these lines only determine the run order and do not configure scheduling dependencies for the manually triggered nodes.
  • The page on which you configure a manually triggered workflow is partially different from the page on which you configure a manually triggered node. For more information, see Features on the DataStudio page.

Go to the Manually Triggered Workflows pane

If you want to create a manually triggered node, you must go to the Manually Triggered Workflows pane in DataStudio to create a manually triggered workflow first.

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. In the top navigation bar, select the region in which the workspace that you want to manage resides. Find the workspace and click DataStudio in the Actions column.
  2. In the left-side navigation pane of the DataStudio page, click Manually Triggered Workflows. If the Manually Triggered Workflows module is not displayed in the left-side navigation pane, add this module first. For more information, see Adjust the displayed DataStudio modules.

Create a manually triggered workflow

DataWorks organizes data development processes by using workflows. Each workflow provides a dashboard for each type of node, and you can use the tools on the dashboards to optimize and manage nodes. This facilitates data development and management. You can place nodes of the same type in one workflow based on your business requirements. To create a workflow, perform the following steps:

  1. Create a workflow. You can use one of the following methods to create a workflow:
    • Method 1: Move the pointer over the Create a workflow icon and click Create Workflow.
    • Method 2: Right-click Manually Triggered Workflows in the Manually Triggered Workflows pane and select Create Workflow.
  2. Configure the Workflow Name and Description parameters for the workflow, and click Create.

For information about how to use a workflow, see Create and manage business processes.

Create a manually triggered node

DataWorks allows you to create a manually triggered node in the Manually Triggered Workflows pane or on the configuration tab of a manually triggered workflow.

  1. Create a manually triggered node.
    • Method 1: Create a node in the Manually Triggered Workflows pane.
      1. In the Manually Triggered Workflows pane of the DataStudio page, click Manually Triggered Workflows, find the workflow that you created, and then click the name of the workflow.
      2. Right-click the type of the compute engine that you want to use, move the pointer over Create Node, and then select the required node type.
    • Method 2: Create a node on the configuration tab of a manually triggered workflow.
      1. In the Manually Triggered Workflows pane of the DataStudio page, click Manually Triggered Workflows, find the workflow that you created, and then click the name of the workflow.
      2. Double-click the name of the workflow to go to the configuration tab of the workflow.
      3. In the left-side section of the configuration tab, click the required node type or drag the required node type to the canvas on the right side.
  2. Configure the Engine Instance, Node Type, Path, and Name parameters for the node.
  3. Define the code of the node.
    You can edit the node code based on the type of the compute engine and the syntax that the compute engine supports. To enable parameters to be dynamically passed in the node code, you can define variables in the node code in the ${Variable name} format, and assign built-in parameters to the variables as values when you configure properties for the node. You define variables in the code of a manually triggered node in the same way as you define variables in the code of an auto triggered node.
    Note The format of a scheduling parameter varies based on the type of a node. For example, you can configure scheduling parameters for a Shell node only in the $N format. N indicates an integer that starts from 1. For more information, see Configure scheduling parameters for different types of nodes.
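
The substitution of variables in the ${Variable name} format can be pictured with the following minimal Python sketch. It is an illustration only and not how DataWorks implements value assignment: the variable name data_date, the table name sales, and the render helper are hypothetical, and Python's standard string.Template class is used simply because it accepts the same ${name} placeholder syntax.

```python
from string import Template

# Hypothetical node code that defines one variable in the ${Variable name} format.
node_code = "SELECT * FROM sales WHERE ds = '${data_date}';"


def render(code: str, params: dict) -> str:
    """Replace each ${name} placeholder with the value assigned in params."""
    # substitute() raises KeyError if a variable in the code has no value.
    return Template(code).substitute(params)


# The value would normally be assigned when you configure the node properties.
rendered = render(node_code, {"data_date": "20240101"})
print(rendered)
```

Because substitute() raises an error for a variable that has no value, the sketch also shows why every variable defined in the code needs a value before the node is run.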

(Optional) Specify the sequence in which manually triggered nodes are run

If you want to run the manually triggered nodes in a manually triggered workflow in sequence, you can draw lines between the nodes on the configuration tab of the workflow to specify a sequence. If you do not specify the sequence, the nodes are run at the same time.
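
The effect of drawing lines can be sketched as a small ordering problem. The following Python example is an assumption-laden illustration, not the DataWorks scheduler: the node names and the run_batches helper are hypothetical, and the standard graphlib module (Python 3.9 or later) stands in for whatever ordering logic the service actually uses. Nodes that land in the same batch have no sequence between them and could be run at the same time.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key maps to the nodes that must finish before it.
# A line is drawn from "extract" to "load", and from "load" to "report" and "audit".
lines = {
    "extract": set(),
    "load": {"extract"},
    "report": {"load"},
    "audit": {"load"},
}


def run_batches(graph):
    """Group nodes into batches; nodes in one batch have no order between them."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Nodes whose upstream nodes have all finished.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = run_batches(lines)
print(batches)

# Without any lines, every node falls into a single batch.
print(run_batches({"node_a": set(), "node_b": set()}))
```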

Configure properties for a manually triggered node

If a manually triggered node needs to be deployed to the production environment and used to access compute engine data in the production environment, configure the properties that determine how the node is run in the production environment. To do so, click General in the right-side navigation pane on the configuration tab of the manually triggered node. The properties that you configure for a manually triggered node work in the same way as the properties that you configure for an auto triggered node. The following table describes the properties that you need to configure.

Property | Description
General | In this section, the node name, node ID, node type, and owner of the node are automatically displayed. You do not need to configure additional settings.
  Note
  • By default, the owner of the node is the current user. You can modify the owner of the node based on your business requirements. You can select only a member in the current workspace as the owner of the node.
  • An ID is automatically generated after the node is committed.
Parameters | The parameters that you define to run the node.
  Note DataWorks provides scheduling parameters that can be classified into custom parameters and built-in variables based on their value assignment methods. Scheduling parameters support dynamic parameter settings for node scheduling. If a variable is defined during the development of the node code, you can assign a value to the variable in the Parameters section.
Resource Group | The resource group that is used to issue the node after the node is deployed to the production environment. The resource groups for scheduling that are available in the current workspace are displayed in the Resource Group drop-down list.
  Note If a large number of nodes need to be run in parallel, exclusive computing resources are required to ensure that the nodes can be run as scheduled. In this case, we recommend that you use an exclusive resource group for scheduling. For more information about exclusive resource groups for scheduling, see Exclusive resource groups for scheduling.

Debug the manually triggered node

You can debug a manually triggered node by clicking Debugging in the top toolbar on the configuration tab of the node. You can also debug the manually triggered workflow to which the manually triggered node belongs by clicking Debugging in the top toolbar on the configuration tab of the workflow.

Note In most cases, debugging is performed by using the personal account that you configured to access the compute engine associated with the workspace in the development environment. To view information about the compute engine in the development environment, go to the Compute Engine Information section on the Workspace Management page. For more information, see Go to the Workspace Management page.

(Optional) Configure parameters for the manually triggered workflow

If multiple nodes in the manually triggered workflow define variables that have the same name and you want to assign values to these variables in a unified manner, you can configure parameters for the workflow on the configuration tab of the workflow. For more information, see Use workflow parameters. After you configure the parameters for the workflow, run the workflow so that values are assigned to the parameters. You can then view the status of the workflow based on the value assignment result.

You can use the default values of the parameters. You can also specify only names for the parameters. Each time you run the workflow in the production environment, you can separately assign a value for each parameter.
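
The idea of default values, name-only parameters, and per-run value assignment can be sketched as follows. This is a hedged illustration rather than the DataWorks implementation: the parameter names target_env and data_date, the node names, and the resolve and render_workflow helpers are all hypothetical.

```python
from string import Template

# Hypothetical workflow parameters: a default value, or None for "name only".
workflow_params = {"target_env": "dev", "data_date": None}

# Hypothetical code of two nodes that use variables with the same names.
node_codes = {
    "node_a": "echo ${target_env} ${data_date}",
    "node_b": "SELECT '${data_date}' AS ds;",
}


def resolve(defaults: dict, run_values: dict) -> dict:
    """Merge the values given for this run over the defaults."""
    merged = {**defaults, **run_values}
    missing = [name for name, value in merged.items() if value is None]
    if missing:
        raise ValueError(f"no value assigned for: {missing}")
    return merged


def render_workflow(codes: dict, values: dict) -> dict:
    """Apply one set of values to every node that uses the shared variables."""
    return {name: Template(code).substitute(values) for name, code in codes.items()}


# For this run, only the name-only parameter needs a value; the default is kept.
values = resolve(workflow_params, {"data_date": "20240101"})
rendered = render_workflow(node_codes, values)
for name, code in rendered.items():
    print(name, "->", code)
```

In this sketch a name-only parameter must receive a value at run time, whereas a parameter with a default can be left unchanged or overridden for a single run.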

Commit and deploy the manually triggered node

To run the manually triggered node in the production environment, you must save the node configurations, and commit and deploy the node to Operation Center in the production environment. For more information about how to commit and deploy a node, see Deploy nodes. Deployment can fail, so confirm the final status of the node after you deploy it.

To view the status of the node that is deployed to the production environment, go to the Manual Task page in Operation Center.

Run the manually triggered node in the production environment

Manually triggered nodes cannot be automatically scheduled. To run the manually triggered node, go to the Manual Task page in Operation Center, find the desired node, and then run the node. You can run the entire workflow to which the node belongs or run some nodes in the workflow. You can also specify the time at which the nodes are run.

  • Value assignment for workflow parameters: If you configured parameters for the workflow, assign a value to each workflow parameter each time you run the workflow. The values are then applied in a unified manner to the variables that have the same name in the code of the nodes in the workflow. If you did not configure parameters for the workflow, you do not need to assign values.
  • Sequence: The nodes are run based on the sequence that you specify. For more information, see the (Optional) Specify the sequence in which manually triggered nodes are run section.
    • If you do not specify the sequence, the nodes are run at the same time.
    • If you specify the sequence in which nodes in a workflow are run, the nodes are run in sequence.

An instance is generated for a node in the production environment each time you run the node. Therefore, a manually triggered instance is generated for a manually triggered node each time the node is run. To view the running results of a manually triggered node, go to the Manual Instance page under Manual Task in Operation Center. For more information, see View and manage manually triggered node instances.