This topic describes how to deploy nodes in a workspace in standard mode and how to use the cross-workspace cloning feature to clone and deploy nodes in a workspace in basic mode.

Background information

In a rigorous data development process, developers develop and debug code and configure dependencies and scheduling properties for nodes in the development environment. Then, developers commit the nodes to run them in the production environment.

DataWorks workspaces in standard mode provide both the development environment and the production environment within a single workspace. We recommend that you use workspaces in standard mode for data development and production. For more information, see Basic mode and standard mode.

In a workspace in standard mode, committed nodes are automatically added to the Create Deploy Task page. This page displays created, updated, and deleted nodes, resources, and functions.

After you deploy a node on the Create Deploy Task page, a deployment task is generated for the node. You can view the deployment record and status of the node on the Deploy Tasks page.

Create Deploy Task page

To make the nodes, resources, and functions that you create, update, or delete on the DataStudio page take effect in the production environment, you must deploy them to the production environment on the Create Deploy Task page. On the Create Deploy Task page, you can add one or more nodes to the list of nodes to be deployed and deploy all nodes in the list in a single operation.

On the Create Deploy Task page, you can modify the number of items that can be displayed on each page.

Find the node that you want to view and click View in the Actions column. You can view the changes that are made in the current version to the code and scheduling properties of the node. The following table describes the parameters related to the configurations of scheduling properties.
Parameter Description
appId The ID of the DataWorks workspace to which the node belongs. You can go to the Workspace Management page to view the ID. For more information, see Configure a workspace.
createUser The ID of the user that created the node.
createTime The time when the node was created.
lastModifyUser The ID of the user that last modified the node.
lastModifyTime The time when the node was last modified.
owner The ID of the owner of the node. You can view the owner ID in the General section of the Properties panel. For more information, see Configure basic properties.
startRightNow The mode in which auto triggered node instances are generated for the node. Valid values:
  • 0: Auto triggered node instances are generated on the next day after the node is deployed.
  • 1: Auto triggered node instances are immediately generated after the node is deployed.
For more information, see Configure immediate instance generation for a node.
taskRerunTime The number of times the node is rerun.
taskRerunInterval The interval between two consecutive automatic reruns. Unit: milliseconds.
reRunAble Indicates whether the node can be rerun. Valid values:
  • 0: The node can be rerun only after it fails to run.
  • 1: The node can be rerun regardless of whether it is run as expected or fails to run.
  • 2: The node cannot be rerun regardless of whether it is run as expected or fails to run.
startEffectDate The start date and time of the period during which scheduling takes effect.
endEffectDate The end date and time of the period during which scheduling takes effect.
cycleType The scheduling type of the node. Valid values:
  • 0: The node is scheduled by day, week, month, or year.
  • 1, 2, or 3: The node is scheduled by minute or hour.
cronExpress The CRON expression used for configuring periodic scheduling.
extConfig The additional configurations of the node. The value must be in JSON format and contain the following fields:
  • ignoreBranchConditionSkip: specifies whether the dry-run property in the previous cycle is passed to the current cycle of the node.
    • true: The dry-run property in the previous cycle is passed to the current cycle of the node.
    • false: The dry-run property in the previous cycle is not passed to the current cycle of the node.
    For more information, see Pass the dry-run attribute of an ancestor node.
  • alisaTaskKillTimeout: the timeout period. Unit: hours.
resgroupId The ID of the resource group for scheduling used to run the node. For more information, see Configure the resource group.
isAutoParse Specifies whether to enable the automatic parsing feature for the node. Valid values:
  • 1: enables the automatic parsing feature for the node.
  • 0: disables the automatic parsing feature for the node.
For more information, see Configure same-cycle scheduling dependencies.
input, inputList, output, outputList The input and output configurations of the node. The parameter values contain the settings of the following fields:
  • str: the input or output value.
  • refTableName: the output table.
  • parseType: the way in which the scheduling dependencies of the node are configured. Valid values:
    • 0: The scheduling dependencies are automatically configured for the node based on the automatic parsing feature.
    • 1: The scheduling dependencies are manually configured for the node.
    • 2: The scheduling dependencies for the node are automatically configured based on the connections between nodes.
For more information, see Logic of same-cycle scheduling dependencies.
dependentTypeList The previous-cycle scheduling dependencies of the node. Valid values:
  • 0: no specified node.
  • 1: one or more specified nodes.
  • 2: child nodes.
  • 3: the current node.
For more information, see Configure previous-cycle scheduling dependencies.
dependentDataNode The IDs of the one or more nodes that are specified as the previous-cycle scheduling dependencies of the node. This parameter is valid only if the dependentTypeList parameter is set to 1.
inputContextList, outputContextList The context-based input and output parameters of the node. For more information, see Configure context-based parameters.
tags, tagList, fileId, isStop, dependentType Reserved parameters.
For more information about the time properties, see Configure time properties.
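
The value codes in the table above can be turned into readable text with a small lookup. The following Python sketch is only an illustration: it assumes that the scheduling properties are available as a dictionary keyed by the parameter names in the table, that coded values are integers or numeric strings, and that extConfig arrives as a JSON string. The function names and the sample values are hypothetical and are not part of DataWorks.

```python
import json

# Value meanings copied from the parameter table above.
ENUM_FIELDS = {
    "startRightNow": {
        0: "instances are generated on the next day after deployment",
        1: "instances are generated immediately after deployment",
    },
    "reRunAble": {
        0: "rerun allowed only after a failed run",
        1: "rerun allowed after a successful or failed run",
        2: "rerun not allowed",
    },
    "isAutoParse": {
        0: "automatic parsing disabled",
        1: "automatic parsing enabled",
    },
}


def describe_cycle_type(value):
    """Map a cycleType code to the scheduling granularity in the table."""
    if value == 0:
        return "scheduled by day, week, month, or year"
    if value in (1, 2, 3):
        return "scheduled by minute or hour"
    return "unknown value"


def describe_properties(props):
    """Return one readable line per recognized field present in props."""
    lines = []
    for key, meanings in ENUM_FIELDS.items():
        if key in props:
            meaning = meanings.get(int(props[key]), "unknown value")
            lines.append(f"{key}={props[key]}: {meaning}")
    if "cycleType" in props:
        meaning = describe_cycle_type(int(props["cycleType"]))
        lines.append(f"cycleType={props['cycleType']}: {meaning}")
    if "taskRerunInterval" in props:
        # The table states the unit as milliseconds.
        seconds = int(props["taskRerunInterval"]) / 1000
        lines.append(
            f"taskRerunInterval={props['taskRerunInterval']}: "
            f"{seconds:g} s between automatic reruns"
        )
    ext = props.get("extConfig")
    if isinstance(ext, str):
        ext = json.loads(ext)  # assumption: extConfig is a JSON string
    if ext:
        if "ignoreBranchConditionSkip" in ext:
            passed = "passed" if ext["ignoreBranchConditionSkip"] else "not passed"
            lines.append(
                f"extConfig: previous-cycle dry-run property is {passed} to the current cycle"
            )
        if "alisaTaskKillTimeout" in ext:
            lines.append(
                f"extConfig: timeout period is {ext['alisaTaskKillTimeout']} hours"
            )
    return lines


# Hypothetical sample values, for illustration only.
sample = {
    "startRightNow": 1,
    "reRunAble": 0,
    "cycleType": 0,
    "isAutoParse": 1,
    "taskRerunInterval": 120000,
    "extConfig": '{"ignoreBranchConditionSkip": false, "alisaTaskKillTimeout": 4}',
}
for line in describe_properties(sample):
    print(line)
```

Fields that are not listed in the lookup are ignored, so the helper can be applied to a partial set of properties.
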
The time when instances are generated varies based on the instance generation mode.
  • Nodes for which the Instance Generation Mode parameter is set to Next Day: If you update and deploy an auto triggered node before 23:30, instances are generated for the updated node the next day.
  • Nodes for which the Instance Generation Mode parameter is set to Immediately After Deployment: If you create and deploy a node, instances whose scheduled time is 10 minutes later than the time when the node is deployed are generated as expected. If you update and deploy a node, instances whose scheduled time is 10 minutes later than the time when the node is deployed are generated again based on the latest scheduling configuration. These instances replace those that are generated before the update. For more information, see Configure immediate instance generation for a node.
  • If you create or update a node and deploy the node after 23:30, instances are generated for the new or updated node on the third day.
  • If you deploy a node after 23:30 and the Instance Generation Mode parameter is set to Immediately After Deployment, instances are not immediately generated for the node.
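
The timing rules above can be sketched as a single function. This is a simplified model rather than the scheduler's actual logic, and it rests on several assumptions that the text does not confirm: a deployment at exactly 23:30 is treated as "after 23:30", "the third day" is read as the deployment date plus two days, and a node in Immediately After Deployment mode that is deployed after 23:30 is assumed to follow the third-day rule. The function and constant names are hypothetical.

```python
from datetime import datetime, time, timedelta

CUTOFF = time(23, 30)
NEXT_DAY = "Next Day"
IMMEDIATE = "Immediately After Deployment"


def earliest_instance_time(deployed_at, mode):
    """Return the earliest scheduled time for which instances are generated."""
    midnight = datetime.combine(deployed_at.date(), time.min)
    if deployed_at.time() >= CUTOFF:
        # Assumption: "the third day" counts the deployment day as day one,
        # and this rule applies in both instance generation modes.
        return midnight + timedelta(days=2)
    if mode == IMMEDIATE:
        # Instances scheduled at least 10 minutes after deployment are generated.
        return deployed_at + timedelta(minutes=10)
    if mode == NEXT_DAY:
        return midnight + timedelta(days=1)
    raise ValueError(f"unknown instance generation mode: {mode!r}")


print(earliest_instance_time(datetime(2024, 1, 1, 10, 0), NEXT_DAY))
print(earliest_instance_time(datetime(2024, 1, 1, 10, 0), IMMEDIATE))
print(earliest_instance_time(datetime(2024, 1, 1, 23, 45), IMMEDIATE))
```
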

Deploy nodes in a workspace in standard mode

Each DataWorks workspace in standard mode is associated with two MaxCompute projects, one as the development environment and the other as the production environment. You can directly commit and deploy nodes from the development environment to the production environment.

  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. Find the workspace that you want to use and click Data Development in the Actions column.
  2. Commit the node.
    In a workspace in standard mode, only members that are assigned the developer role are allowed to commit nodes.
    1. Double-click a configured workflow. On the tab that appears, click the Submit icon in the top toolbar.
    2. In the Commit dialog box, select the nodes to be committed, set the Change description parameter, and then select Ignore I/O Inconsistency Alerts.
      Note If all nodes of a workflow are committed and you modify only the workflow or node properties, you can enter the description and commit the workflow, without the need to select the nodes. The changes are automatically committed.

      If a node has been committed and the node code remains unchanged, you cannot select the node again.

    3. Click Commit.
  3. After you commit the nodes, click Deploy in the upper-right corner.
    In a workspace in standard mode, only members that are assigned the administration expert, deployment expert, or administrator role can deploy nodes.
  4. On the Create Deploy Task page, select the nodes that you want to deploy together and click Add to List.
    You can filter and search for nodes by setting parameters such as Committed By, Node ID/Name, and Change Type. You can click Deploy to immediately deploy the selected nodes to the production environment.
  5. Click View List. Check whether the node information in the list is correct and click Create Task for All. All nodes in the list are deployed to the production environment.
    Note Workspaces in basic mode do not allow you to directly perform operations on table data in the production environment. Workspaces in standard mode ensure a stable, secure, and reliable production environment. Therefore, we recommend that you deploy and run nodes in a workspace in standard mode.
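
The role requirements stated in the steps above can be summarized as a simple lookup. The following Python sketch only restates those two rules. The role strings and the can_perform helper are hypothetical labels chosen for illustration and do not correspond to any DataWorks API.

```python
# Roles allowed for each action in a workspace in standard mode,
# as stated in the steps above. The strings are illustrative labels.
ALLOWED_ROLES = {
    "commit": {"developer"},
    "deploy": {"administration expert", "deployment expert", "administrator"},
}


def can_perform(action, member_roles):
    """Return True if any of the member's roles permits the action."""
    if action not in ALLOWED_ROLES:
        raise ValueError(f"unknown action: {action!r}")
    return bool(ALLOWED_ROLES[action] & set(member_roles))


print(can_perform("commit", ["developer"]))
print(can_perform("deploy", ["developer"]))
print(can_perform("deploy", ["developer", "deployment expert"]))
```
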

Deploy nodes in a workspace in basic mode

If you want to isolate the development environment from the production environment when you use workspaces in basic mode, create two workspaces, one for development and the other for production. You can clone nodes from the development workspace to the production workspace.

For example, two workspaces in basic mode are created: Workspace A for development and Workspace B for production. You can use the cross-workspace cloning feature to clone nodes from Workspace A to Workspace B, and then commit the cloned nodes to the scheduling engine in Workspace B.
Note
  • Permission requirement: Only workspace administrators and RAM users who are assigned the administration expert role can clone nodes. The administration expert role has permissions to create clone tasks and deploy cloned nodes.
  • Supported workspace type: You can clone nodes only from workspaces in basic mode to other workspaces.
  • Prerequisites: The source workspace in basic mode and the destination workspace are created.
  1. Go to the DataStudio page.
    1. Log on to the DataWorks console.
    2. In the left-side navigation pane, click Workspaces.
    3. Find the workspace that you want to use and click Data Development in the Actions column.
  2. Commit the node.
    1. Double-click a configured workflow. On the tab that appears, click the Submit icon in the top toolbar.
    2. In the Commit dialog box, select the nodes to be committed, set the Change description parameter, and then select Ignore I/O Inconsistency Alerts.
    3. Click Commit.
  3. Click Cross-project cloning in the upper-right corner.
  4. On the Create Clone Task page, select the nodes to be cloned and set the Target Workspace parameter.
  5. Click Set Compute Engine Mapping. Configure the mapping between the compute engines of the current workspace and the destination workspace.
    If the destination workspace has multiple compute engines, you must configure the mapping between the compute engines of the current workspace and the destination workspace before you clone nodes. If no mapping is configured, the nodes are cloned to the default compute engine of the destination workspace.
    Note
    • If the type of the compute engine to which the nodes to be cloned belong does not exist in the destination workspace, a message appears in the Set Compute Engine Mapping dialog box. You can choose to skip the nodes that cannot be cloned. Otherwise, an error is reported during the cloning process.
    • The Set Compute Engine Mapping button is displayed only if an engine type in the source or destination workspace has more than two engine instances.
  6. Click Add to List. The selected nodes are added to the list of nodes to be cloned.
  7. Click To-Be-Cloned Node List in the upper-right corner. Click Clone All.
  8. After the engine mapping is prechecked, confirm the information and click Clone.
  9. After the nodes are cloned, go to the destination workspace and view the cloned nodes. In most cases, the overall directory structure of the workflow is cloned.
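
The compute engine mapping rules described in step 5 can be sketched as follows. This is a simplified model under stated assumptions, not the actual cloning logic: nodes are represented as dictionaries with hypothetical name, engine_type, and engine keys, the mapping is a dictionary from source engine instance to destination engine instance, and any engine without a mapping entry is assumed to fall back to the default compute engine of the destination workspace. All engine and node names are invented for illustration.

```python
def plan_clone(nodes, mapping, dest_engine_types, dest_default, skip_unsupported=False):
    """Decide the destination compute engine for each node to be cloned.

    Returns (plan, skipped): plan maps node name to destination engine,
    skipped lists nodes whose engine type is missing in the destination.
    """
    plan, skipped = {}, []
    for node in nodes:
        if node["engine_type"] not in dest_engine_types:
            if skip_unsupported:
                # The user chose to skip nodes that cannot be cloned.
                skipped.append(node["name"])
                continue
            # Otherwise an error is reported during the cloning process.
            raise ValueError(
                f"engine type {node['engine_type']!r} of node {node['name']!r} "
                "does not exist in the destination workspace"
            )
        # Assumption: without a mapping entry, the default engine is used.
        plan[node["name"]] = mapping.get(node["engine"], dest_default)
    return plan, skipped


# Hypothetical nodes and engines, for illustration only.
nodes = [
    {"name": "ods_load", "engine_type": "MaxCompute", "engine": "mc_dev"},
    {"name": "emr_job", "engine_type": "EMR", "engine": "emr_dev"},
]
plan, skipped = plan_clone(
    nodes,
    mapping={"mc_dev": "mc_prod_2"},
    dest_engine_types={"MaxCompute"},
    dest_default="mc_prod_1",
    skip_unsupported=True,
)
print(plan)
print(skipped)
```

With skip_unsupported left at False, the same call raises an error for the node whose engine type is missing, which mirrors the error reported when such nodes are not skipped.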