To improve pipeline execution efficiency, Machine Learning Designer of Machine Learning Platform for AI (PAI) allows you to group multiple Alink nodes on the canvas and run them together. Machine Learning Designer also provides the Alink intelligent aggregation feature, which automatically identifies the Alink nodes that can be grouped.

Background information

Alink is a new-generation machine learning algorithm framework and component library developed by the Alibaba Cloud PAI team based on Realtime Compute for Apache Flink. Machine Learning Designer provides the stream processing and batch processing components of Alink. These components can help you streamline machine learning workflows based on Flink. The workflows include data preprocessing, feature engineering, model training, and prediction.

In the component library on the left side of Machine Learning Designer, the components that are marked with a purple dot are Alink components, as shown in the following figure. Machine Learning Designer allows you to manually group Alink nodes on the canvas and run them together, which improves execution efficiency and resource utilization. For more information, see Group multiple Alink nodes. Machine Learning Designer can also automatically identify the Alink nodes on the canvas that can be grouped. For more information, see Alink intelligent aggregation.

Group multiple Alink nodes

Alink components can be used in the same way as the components of other frameworks. In addition, Machine Learning Designer allows you to run all Alink nodes in a group together. This feature is built on Flink, a high-performance in-memory data processing engine.

You can organize Alink nodes on the canvas in groups by performing the following steps:
  • Select multiple Alink nodes on the canvas.

    Hold Shift and click each Alink node that you want to select. Alternatively, click the Box selection tool icon in the upper-left corner of the canvas and draw a box around the Alink nodes.

  • Right-click a blank area on the canvas and select Select Nodes into Alink from the shortcut menu that appears.
    On the canvas, Alink nodes that belong to one group are displayed in a dashed and rounded rectangle, as shown in the following figure.

You can click the Settings icon in the upper-right corner of the rectangle and set the Workers and Memory per Worker Node parameters for the Alink group. The settings of the Alink group take priority over the settings of the individual Alink nodes in the group. The Alink nodes in the group are executed together, and the system does not write the intermediate data that is generated during execution to disk. This improves execution efficiency and resource utilization.
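
The priority rule for group and node settings can be illustrated with a short Python sketch. This is not Machine Learning Designer code: the ResourceSettings class, the effective_settings function, and the megabyte unit are hypothetical names chosen for illustration, and the fallback to the node-level value when a group-level value is unset is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResourceSettings:
    """Hypothetical container for the Workers and Memory per Worker Node parameters."""
    workers: Optional[int] = None
    memory_per_worker_mb: Optional[int] = None  # unit is an assumption


def effective_settings(node: ResourceSettings,
                       group: Optional[ResourceSettings]) -> ResourceSettings:
    """Return the settings a node would run with.

    Group-level values take priority over node-level values.
    Assumption: a group value left unset falls back to the node value.
    """
    if group is None:
        # The node is not part of any Alink group.
        return node
    return ResourceSettings(
        workers=group.workers if group.workers is not None else node.workers,
        memory_per_worker_mb=(
            group.memory_per_worker_mb
            if group.memory_per_worker_mb is not None
            else node.memory_per_worker_mb
        ),
    )


node_settings = ResourceSettings(workers=2, memory_per_worker_mb=4096)
group_settings = ResourceSettings(workers=8)
resolved = effective_settings(node_settings, group_settings)
print(resolved)
```

In this sketch the node asks for 2 workers, but the group value of 8 wins. The memory value is kept from the node only because the group leaves it unset, which is the assumed fallback.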

Alink intelligent aggregation

The Alink intelligent aggregation feature automatically identifies the Alink nodes on the canvas that can be grouped and groups them to reduce the overhead of transmitting intermediate data. This improves resource utilization and the execution efficiency of pipelines.

To enable the Alink intelligent aggregation feature for a pipeline in Machine Learning Designer, click the Alink intelligent aggregation icon in the upper-left corner of the canvas.
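
The idea of identifying groupable nodes can be sketched in Python. The criteria that Machine Learning Designer actually applies are not described here, so the following is only a simplified heuristic under stated assumptions: a pipeline is modeled as a set of named nodes with a flag that marks Alink nodes, and Alink nodes are treated as groupable when they are linked by direct Alink-to-Alink edges. The function candidate_alink_groups and the node names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def candidate_alink_groups(is_alink: Dict[str, bool],
                           edges: List[Tuple[str, str]]) -> List[List[str]]:
    """Return groups of Alink nodes connected by direct Alink-to-Alink edges.

    Simplified heuristic, not the product's algorithm: it computes connected
    components with union-find and keeps components of two or more nodes.
    """
    # Only Alink nodes take part in grouping.
    parent = {name: name for name, flag in is_alink.items() if flag}

    def find(name: str) -> str:
        # Follow parent links to the root, shortening the path on the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for src, dst in edges:
        # Merge two nodes only when both ends of the edge are Alink nodes.
        if src in parent and dst in parent:
            parent[find(src)] = find(dst)

    groups = defaultdict(list)
    for name in parent:
        groups[find(name)].append(name)
    # A single node gains nothing from grouping, so drop singletons.
    return sorted(sorted(members) for members in groups.values() if len(members) > 1)


# Hypothetical pipeline: two adjacent Alink nodes, then a non-Alink node,
# then another Alink node.
pipeline_nodes = {
    "read_table": False,
    "normalize": True,
    "one_hot": True,
    "sql_filter": False,
    "kmeans": True,
}
pipeline_edges = [
    ("read_table", "normalize"),
    ("normalize", "one_hot"),
    ("one_hot", "sql_filter"),
    ("sql_filter", "kmeans"),
]
groups = candidate_alink_groups(pipeline_nodes, pipeline_edges)
print(groups)
```

In this example only normalize and one_hot are grouped, because kmeans is separated from them by a non-Alink node. The sketch does not check whether a group would create a circular dependency through a non-Alink node, which a real implementation would have to rule out.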