DataWorks: Create and use PAI Designer nodes

Last Updated: Jun 09, 2025

Machine Learning Designer is a visual modeling tool provided by Platform for AI (PAI) for end-to-end machine learning development. DataWorks provides PAI Designer nodes that load Machine Learning Designer pipelines, so that pipeline tasks can be scheduled periodically based on the scheduling configurations of the nodes.

Prerequisites

  • DataWorks is authorized to access PAI.

    You can perform one-click authorization on the authorization page. For more information about the policy, see Role 1: AliyunServiceRoleForDataworksEngine. Only an Alibaba Cloud account or a RAM user to which the AliyunDataWorksFullAccess policy is attached can perform one-click authorization.

  • A workflow is created.

    In DataStudio, development operations are performed on different development engines based on workflows. Therefore, you must create a workflow before you can create a node. For more information, see Create a workflow.

Step 1: Create a PAI Designer node

  1. Go to the DataStudio page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and O&M > Data Development. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Development.

  2. Right-click the desired workflow and choose Create Node > Algorithm > PAI Designer.

  3. In the Create Node dialog box, configure the Name and Path parameters and click Confirm. You can develop and configure the related pipeline task on the configuration tab of the node later.

Step 2: Develop tasks on the PAI Designer node

Develop a task: Write a Machine Learning Designer pipeline

To load an existing pipeline when you edit the PAI Designer node, you must first create the pipeline in PAI. You can then search for the pipeline by name and load it. On the configuration tab of the PAI Designer node, you can create a pipeline by using one of the following methods:

  • Create a blank pipeline.

    You can create a blank pipeline, add components, and perform drag-and-drop operations on the components to build a model based on your business requirements. For more information, see Create a blank pipeline.

  • Create a pipeline from a preset template.

    Machine Learning Designer provides preset templates for you to quickly create pipelines that are similar to the templates. You can modify components in a preset template or the configurations of components to build a model. For more information, see Create a pipeline from a preset template.

  • Create a pipeline from a custom template.

    You can save a stable pipeline as a custom template for other members in your workspace to use and edit. For more information, see Create a pipeline from a custom template.

Develop SQL code: Use scheduling parameters

DataWorks provides scheduling parameters. In periodic scheduling scenarios, the values of these parameters are dynamically substituted into the node code based on the scheduling parameter configurations. Define variables in the node code in the ${Variable} format, and assign values to the variables in the Scheduling Parameter section of the Properties tab. For information about the supported formats of scheduling parameters, see Supported formats of scheduling parameters.

Sample code of scheduling parameters:

--command='echo '\''${Variable}'\'';' \ --Scheduling parameters are supported.
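The substitution described above can be illustrated with a minimal Python sketch. It is not how DataWorks implements scheduling parameters; it only mimics the visible effect of replacing ${Variable} placeholders with assigned values. The function name render_node_code and the sample value 20250609 are hypothetical and are used for illustration only.

```python
import re

# Matches placeholders written in the ${Variable} format.
_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def render_node_code(code: str, params: dict) -> str:
    """Replace each ${name} in `code` with params[name].

    Placeholders without an assigned value are left unchanged.
    This is an illustrative stand-in, not the DataWorks implementation.
    """
    def _substitute(match):
        name = match.group(1)
        return params.get(name, match.group(0))

    return _PLACEHOLDER.sub(_substitute, code)


# The sample command from this section, written as a Python string.
template = "--command='echo '\\''${Variable}'\\'';'"

# Hypothetical value assigned to the variable for one scheduled run.
rendered = render_node_code(template, {"Variable": "20250609"})
print(rendered)
```

In this sketch, a placeholder that has no assigned value stays in the output as-is, which makes a missing assignment easy to spot when you inspect the rendered code.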

Step 3: Configure task scheduling properties

If you want to periodically run tasks on the created node, click Properties in the right-side navigation pane of the node configuration tab to configure the scheduling information of the node based on your business requirements. For more information, see Overview.

Note

You must configure the Rerun and Parent Nodes parameters on the Properties tab before you commit the node.

Step 4: Debug the node code

You can perform the following operations to check whether the node is configured as expected based on your business requirements:

  1. Optional. Select a resource group and assign custom parameters to variables.

  2. Save and execute SQL statements.

    In the top toolbar, click the Save icon to save the SQL statements. Then, click the Run icon to execute the SQL statements.

  3. Optional. Perform smoke testing.

    When or after you commit the node, you can perform smoke testing on the node in the development environment to check whether the node runs as expected. For more information, see Perform smoke testing.

Step 5: Commit the node

After the node is configured, you must commit and deploy the node. The system then periodically runs tasks on the node based on the scheduling configurations.

  1. Click the Save icon in the top toolbar to save the code.

  2. Click the Commit icon in the top toolbar to commit the node.

    In the Submit dialog box, configure the Change description parameter. Then, decide whether the node code must be reviewed after you commit the node, based on your business requirements.

    Note
    • You must configure the Rerun and Parent Nodes parameters on the Properties tab before you commit the node.

    • You can use the code review feature to ensure the code quality of nodes and prevent execution errors caused by invalid node code. If you enable the code review feature, the node code that is committed can be deployed only after the node code passes the code review. For more information, see Code review.

If you use a workspace in standard mode, you must deploy the node to the production environment after you commit the node. To deploy a node, click Deploy in the upper-right corner of the configuration tab of the node. For more information, see Deploy nodes.
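The commit, code review, and deployment rules in this step can be summarized in a short Python sketch. It models only the conditions stated above: a node must be committed before it is deployed, and if the code review feature is enabled, the committed code can be deployed only after it passes the review. The NodeState class and the can_deploy function are hypothetical names used for illustration and are not part of any DataWorks API.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical snapshot of a node's commit and review status."""
    committed: bool
    code_review_enabled: bool = False
    code_review_passed: bool = False


def can_deploy(state: NodeState) -> bool:
    """Return True if the node meets the deployment conditions described above."""
    # A node that has not been committed cannot be deployed.
    if not state.committed:
        return False
    # With code review enabled, deployment waits until the review passes.
    if state.code_review_enabled and not state.code_review_passed:
        return False
    return True


# A committed node whose review is still pending cannot be deployed yet.
pending = NodeState(committed=True, code_review_enabled=True)
print(can_deploy(pending))
```

The sketch does not check the Rerun and Parent Nodes parameters, which must already be configured before the commit itself.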

What to do next

After you commit and deploy the node, tasks are periodically run on the node based on the node configurations. You can click Operation Center in the upper-right corner of the configuration tab of the node to go to Operation Center and view the scheduling status of the node. For more information, see View and manage auto triggered tasks.