DataWorks task scheduling is widely used in machine learning scenarios. It allows you to periodically run DataWorks tasks to update your model, which helps you create a model training pipeline. Machine Learning Platform for AI (PAI) can work with DataWorks to periodically schedule machine learning experiments.

Background information

When all nodes in an experiment are up and running, you can deploy the experiment to DataWorks and configure DataWorks to run the experiment periodically.
Note: Before you schedule nodes offline, make sure that all nodes in the experiment are up and running and that DataWorks is activated. For more information, see Create a workspace.

Procedure

  1. Go to the Experiments page of the PAI console.
    1. Log on to the PAI console.
    2. In the left-side navigation pane, choose Model Training > Studio-Modeling Visualization to go to the PAI Visualization Modeling page.
      When you create a project, we recommend that you select By usage for the project. PAI-TensorFlow tasks can run only on GPU resources.
    3. Find the required project and click Machine Learning in the Operation column.
    4. In the left-side navigation pane, click Experiments. On the page that appears, find the required experiment and double-click it. The Heart Disease Prediction experiment is used in this example.
  2. On the tab that appears, choose Deploy > DataWorks Offline Schedule to go to the Data Analytics page of DataWorks.
  3. Create a PAI node.
    1. In the Create Node dialog box, configure Node Name and Location.
      If you do not create a node in the dialog box that appears, move the pointer over the Create icon on the Data Analytics page and choose Machine Learning > PAI Experiment.
      Note: The node name can be a maximum of 128 characters in length and can contain letters, digits, underscores (_), and periods (.).
    2. Click Commit.
  4. On the tab that appears, select the PAI experiment that you created from the Experiment drop-down list.
    To edit the loaded PAI experiment, click Edit in PAI Console.
  5. In the right-side navigation pane, click the Properties tab. In the panel that appears, configure the properties for the node. For more information, see Basic properties.
    Configure task scheduling parameters, including the recurrence, input parameters, and output parameters.
  6. Save and commit the node.
    Notice: You must configure Rerun and Parent Nodes before you commit the node.
    1. Click the Save icon in the toolbar to save the node.
    2. Click the Commit icon in the toolbar.
    3. In the Commit Node dialog box, enter a description in the Change description field.
    4. Click OK.
    In a workspace in standard mode, you must click Publish in the upper-right corner after you commit the node. For more information, see Deploy a node.
  7. In the upper-right corner, click Operation to view the running status and system logs of the PAI task.
    You can also perform other operations, such as generating retroactive data and testing the experiment. For more information, see View auto triggered nodes.
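
The naming rule from the note in step 3 can be checked locally before you type a name into the Create Node dialog box. The following is a minimal sketch, not part of DataWorks or PAI: the function name is_valid_node_name is hypothetical, and the sketch assumes that "letters" means ASCII letters and that an empty name is not allowed. Adjust the pattern if your workspace accepts other characters.

```python
import re

# Assumption: "letters" means ASCII letters only, and the name must not be empty.
# The rule quoted in step 3: at most 128 characters; letters, digits,
# underscores (_), and periods (.) are allowed.
_NODE_NAME_PATTERN = re.compile(r"[A-Za-z0-9_.]{1,128}")


def is_valid_node_name(name: str) -> bool:
    """Return True if the name satisfies the naming rule described in step 3."""
    return _NODE_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical candidate names, used only for illustration.
    for candidate in ("heart_disease.v1", "heart disease", "x" * 129):
        print(repr(candidate[:20]), is_valid_node_name(candidate))
```

A check like this only mirrors the documented rule. The Create Node dialog box remains the authority on which names are accepted.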