The Deploy Center in DataWorks is an enhancement of the task publishing feature in Data Development. It is used to publish objects, such as nodes, functions, resources, and widgets, across multiple environments. You can use this feature to sync objects from a source workspace to a target workspace with a single click. This topic describes the common scenarios, logic, and publishing process of the Deploy Center.
Function introduction
In the Deploy Center, a publishing operation treats objects such as nodes, functions, resources, and widgets as the minimum execution units. Associated business flows and node dependencies are deployed to the target workspace at the same time. For more information, see Publishing logic.
Based on different publishing environments, the Deploy Center in DataWorks supports same-workspace publishing, cross-workspace publishing, and cross-cloud publishing.
Same-workspace publishing is available only for workspaces in standard mode that use the new Data Studio. It lets you batch publish objects such as nodes, functions, resources, and widgets from the development environment to the production environment within the same workspace.
Cross-workspace deployment is mainly used to deploy objects such as nodes, functions, resources, and components from one basic-mode workspace to another within the same Alibaba Cloud account and region. This feature also allows you to implement development and production environment isolation for basic-mode workspaces. For more information, see Achieve environment isolation in basic mode.
Cross-cloud publishing supports the deployment of objects, such as nodes, functions, resources, and components, across accounts, regions, or cloud platforms. It is available only for workspaces that use the legacy DataStudio. In essence, it migrates and deploys nodes from a source workspace to a target workspace that resides in a different region, account, or cloud platform.
Publishing change logic
If a node has dependencies, its ancestor nodes must be published to the target workspace before the descendant node can be published. The changes to a published node are as follows:
When you publish a node, the system replaces the source workspace name prefix with the target workspace name in the inputs and outputs of all related nodes. If the node has cross-workspace dependencies, you can also configure the Dependency Mappings parameter for the publishing environment. The ancestor dependencies, descendant dependencies, and input and output names of the node change after publishing based on your configuration. For more information, see the three dependency scenarios later in this topic.
When you publish a task that uses a MaxCompute engine, the system modifies the task code. It replaces any mention of the source workspace name with the target workspace name. For more information, see Code changes for tasks that use a MaxCompute engine.
To configure dependency mappings, see Configure a publishing environment. To configure scheduling dependencies, see Configure scheduling dependencies.
This topic uses output names in the Workspace Name.Node Name format as examples. The actual output names may vary.
No cross-workspace dependencies
The nodes in project1 have no cross-workspace dependencies. All nodes from project1 are published to project2.
After publishing, the project1 prefix in all node input and output names is changed to project2. For example:
The input name of task_A changes from project1_root to project2_root.
The output name of task_A changes from project1.task_A to project2.task_A.
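The prefix change in this scenario can be sketched as a small string rewrite. This is only an illustration of the rule described above, not the Deploy Center implementation. The function name retarget_name is hypothetical, and the sketch assumes that names are either the workspace root (Workspace Name_root) or begin with the workspace name followed by a period.

```python
def retarget_name(name: str, source_ws: str, target_ws: str) -> str:
    """Return `name` with a leading source workspace name swapped for the target.

    Hypothetical helper: it only models the two name shapes used in this topic.
    """
    # Root node of the workspace, for example project1_root.
    if name == f"{source_ws}_root":
        return f"{target_ws}_root"
    # Regular input or output name, for example project1.task_A.
    prefix = f"{source_ws}."
    if name.startswith(prefix):
        return f"{target_ws}.{name[len(prefix):]}"
    # Names that belong to other workspaces are left unchanged.
    return name


if __name__ == "__main__":
    print(retarget_name("project1_root", "project1", "project2"))
    print(retarget_name("project1.task_A", "project1", "project2"))
```

Matching on the full prefix, including the period, keeps a workspace such as project10 from being rewritten when the source workspace is project1.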
Cross-workspace dependencies exist, but no cross-workspace dependency mapping is set
project1.task_A has a cross-workspace dependency on project2.task_A. All nodes from project1 are published to project3.
After publishing, the node changes are as follows:
Node inputs and outputs: The project1 prefix in all node input and output names is changed to project3.
Node cross-workspace dependencies: project1.task_A originally had a cross-workspace dependency on project2.task_A. After publishing, project3.task_A still has a cross-workspace dependency on project2.task_A.
Cross-workspace dependencies exist, and cross-workspace dependency mapping is set
project1.task_A has a cross-workspace dependency on project2.task_A. All nodes from project1 are published to project4, and a dependency mapping is set from project2 to project3.
After publishing, the node changes are as follows:
Node inputs and outputs: The project1 prefix in all node input and output names is changed to project4.
Node cross-workspace dependencies: project1.task_A originally had a cross-workspace dependency on project2.task_A. After publishing, the cross-workspace dependency of project4.task_A is changed to project3.task_A.
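The two cross-workspace scenarios can be compared side by side with a short sketch. It assumes that a dependency name has the form Workspace Name.Node Name and that the dependency mapping behaves like a lookup table from one workspace name to another. The function publish_dependency and its parameters are hypothetical and do not correspond to a DataWorks API.

```python
from typing import Dict, Optional


def publish_dependency(
    dep: str,
    source_ws: str,
    target_ws: str,
    mappings: Optional[Dict[str, str]] = None,
) -> str:
    """Model how one dependency name changes when a node is published.

    Hypothetical helper that follows the rules described in this topic.
    """
    mappings = mappings or {}
    workspace, sep, rest = dep.partition(".")
    if not sep:
        # Not in the Workspace Name.Node Name form, so leave it alone.
        return dep
    if workspace == source_ws:
        # Dependencies inside the source workspace follow the node to the target.
        return f"{target_ws}.{rest}"
    # Cross-workspace dependency: apply the mapping if one is set,
    # otherwise keep pointing at the original workspace.
    return f"{mappings.get(workspace, workspace)}.{rest}"


if __name__ == "__main__":
    # No mapping: publishing project1 to project3 keeps the project2 dependency.
    print(publish_dependency("project2.task_A", "project1", "project3"))
    # Mapping project2 to project3: publishing project1 to project4.
    print(publish_dependency("project2.task_A", "project1", "project4",
                             {"project2": "project3"}))
```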
If a node in the source workspace uses a system output name for a cross-workspace dependency, the publish operation may fail. This occurs if the list in the node's configuration includes a system output name from another workspace. To resolve this, modify the dependency to use a non-system output name.
Do not reference system-generated output names:
In workspaces where the Use Data Studio (New Version) option is not enabled, the system output name is in the format Workspace Name.File ID_out, such as shanghai_simple02.504822000_out.
Reference output names in the following formats:
Workspace Name.Output Table Name (Recommended)
Workspace Name.Node Name
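Before publishing, it can help to scan a node's dependency names for system output names from other workspaces. The sketch below is a rough check under two assumptions: the file ID is numeric, as in the shanghai_simple02.504822000_out example, and the dependency names are already available as a list of strings. The helper names are hypothetical.

```python
import re
from typing import List

# Assumed shape of a system output name: Workspace Name.File ID_out,
# with a numeric file ID.
SYSTEM_OUTPUT = re.compile(r"^[^.]+\.\d+_out$")


def is_system_output_name(name: str) -> bool:
    """Return True if `name` looks like a system-generated output name."""
    return bool(SYSTEM_OUTPUT.match(name))


def find_risky_dependencies(dependencies: List[str], source_ws: str) -> List[str]:
    """List system output names that belong to a workspace other than `source_ws`."""
    own_prefix = f"{source_ws}."
    return [
        name
        for name in dependencies
        if is_system_output_name(name) and not name.startswith(own_prefix)
    ]


if __name__ == "__main__":
    deps = [
        "shanghai_simple02.504822000_out",  # system name from another workspace
        "project1.123_out",                 # system name in the source workspace
        "project2.table_a",                 # output table name, recommended form
    ]
    print(find_risky_dependencies(deps, "project1"))
```

Any name that this check reports is a candidate for replacement with a Workspace Name.Output Table Name or Workspace Name.Node Name reference.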
Code changes for tasks that use a MaxCompute engine
When you publish a task that uses a MaxCompute engine, such as an ODPS SQL or ODPS Spark task, to the target workspace, the system replaces the source workspace name in the task code with the target workspace name during execution.
For example, task_A is an ODPS SQL or MaxCompute SQL node. In project1, the code to query tableA is SELECT * FROM project1.tableA. All nodes from project1 are then published to project2.
After the node is published to project2, the code to query tableA is changed to SELECT * FROM project2.tableA.
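The effect on the task code can be approximated with a regular expression that rewrites the workspace qualifier in front of a table name. This is a simplified model, not the actual replacement logic used by DataWorks: it assumes the workspace name appears as a Workspace Name.Table Name qualifier, and it does not distinguish between SQL identifiers, string literals, and comments. The function name retarget_sql is hypothetical.

```python
import re


def retarget_sql(code: str, source_ws: str, target_ws: str) -> str:
    """Replace `source_ws.` qualifiers in `code` with `target_ws.`.

    Simplified sketch: it also rewrites matches inside literals and comments.
    """
    # \b keeps names such as myproject1 from matching, and requiring the
    # trailing period keeps names such as project10 from matching.
    pattern = re.compile(rf"\b{re.escape(source_ws)}\.")
    # A function replacement avoids backslash handling in the target name.
    return pattern.sub(lambda _match: f"{target_ws}.", code)


if __name__ == "__main__":
    print(retarget_sql("SELECT * FROM project1.tableA", "project1", "project2"))
```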