Machine Learning Designer is a one-stop visual modeling tool for machine learning. It is built on the cloud-native PAIFlow engine and is the upgraded version of Machine Learning Studio. Machine Learning Designer provides multiple proven built-in machine learning algorithms and supports large-scale distributed computing on computing resources such as MaxCompute, Flink, and general training resources. It can meet your business requirements in scenarios such as product recommendation, financial risk management, and advertising prediction.
You can log on to Machine Learning Designer by using an Alibaba Cloud account or as a Resource Access Management (RAM) user. If you want to log on as a RAM user, you must grant the required permissions to the user by using your Alibaba Cloud account. For more information, see Grant the permissions that are required to use Machine Learning Designer.
Machine Learning Designer allows you to create a pipeline from a template or build one manually. If you create a pipeline from a template, you can deploy the models that the pipeline generates directly after the pipeline runs successfully. For more information about how to create and manage pipelines, see Pipeline overview.
Machine Learning Designer provides hundreds of components that encapsulate algorithms used in AI development and supports multiple data sources such as MaxCompute tables and Object Storage Service (OSS) data. In Machine Learning Designer, you can create models to implement best practices by using in-house algorithms in the Platform for AI (PAI) console, and then deploy the models to Elastic Algorithm Service (EAS).
Machine Learning Designer allows you to manage pipeline tasks and versions, and roll back a pipeline version. For more information, see Model training.
In Machine Learning Designer, you can use the visualized dashboard feature during model training to analyze the generated data, model information, and evaluation metrics to obtain the best model.
In Machine Learning Designer, you can share pipelines with members of the current workspace, deploy pipelines to DataWorks to schedule them as periodic tasks, and publish pipelines as custom templates.
In Machine Learning Designer, you can register trained and tested models with the model management module, and then deploy them as EAS services with a few clicks or package them as compound models for deployment. For more information, see Overview of model prediction.
Machine Learning Designer provides hundreds of components to meet your requirements in a variety of scenarios. For more information, see Component reference: Overview of all components.
These components can be classified into the following categories based on their use scenarios.
Traditional machine learning components
These components are used in data preprocessing, feature engineering, statistical analysis, outlier detection, recommendation, time series processing, or network analysis.
Components in deep learning frameworks
These components provide visual, audio, or natural language processing algorithms in the PAI-Easy frameworks and other deep learning frameworks such as TensorFlow or PyTorch.
Custom algorithm components
These components include SQL Script, Python Script, and PyAlink Script. You can use these custom algorithm-based components to create custom pipelines based on your business requirements.
These components can be classified into Alink and PAI command-based components based on their implementation frameworks and supported compute engines. Each category provides specific features.
Alink components (marked with a purple dot) support MaxCompute, Realtime Compute for Apache Flink, and general training resources.
PAI command-based components support only the MaxCompute compute engine.
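The engine support described above can be sketched as a small lookup. This is only an illustration of the compatibility rule, not part of any PAI SDK: the `Engine` enum, the `SUPPORTED_ENGINES` mapping, the family keys `"alink"` and `"pai_command"`, and the `can_run_on` helper are all hypothetical names introduced here.

```python
from enum import Enum


class Engine(Enum):
    """Compute engines named in this topic (hypothetical enum for illustration)."""
    MAXCOMPUTE = "MaxCompute"
    FLINK = "Realtime Compute for Apache Flink"
    GENERAL_TRAINING = "general training resources"


# Engines supported by each component family, as stated in this topic.
SUPPORTED_ENGINES = {
    "alink": {Engine.MAXCOMPUTE, Engine.FLINK, Engine.GENERAL_TRAINING},
    "pai_command": {Engine.MAXCOMPUTE},
}


def can_run_on(family: str, engine: Engine) -> bool:
    """Return True if components of the given family support the engine."""
    return engine in SUPPORTED_ENGINES[family]


print(can_run_on("alink", Engine.FLINK))        # True
print(can_run_on("pai_command", Engine.FLINK))  # False
```

A check of this kind is useful when you plan a pipeline that mixes both families: any branch that contains a PAI command-based component has to run on MaxCompute.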
You can configure PAI command-based components by specifying component parameters or by running PAI commands. To run PAI commands, you can use the SQL Script component, DataWorks DataStudio, or the MaxCompute client.
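To make the relationship between component parameters and a PAI command more concrete, the following sketch assembles a command string from a parameter dictionary. It assumes the general `PAI -name <component> -project <project> -D<parameter>="<value>";` shape. The helper `build_pai_command`, the component name `my_component`, the default project `algo_public`, and the parameter names and table names are placeholders chosen for illustration. Check the reference page of the specific component for its actual name and parameters.

```python
def build_pai_command(name: str, params: dict[str, str],
                      project: str = "algo_public") -> str:
    """Assemble a PAI command string (assumed syntax, for illustration only)."""
    parts = [f"PAI -name {name}", f"-project {project}"]
    # Each component parameter becomes one -Dkey="value" argument.
    parts += [f'-D{key}="{value}"' for key, value in params.items()]
    return " ".join(parts) + ";"


command = build_pai_command(
    "my_component",  # hypothetical component name
    {
        "inputTableName": "train_data",    # placeholder table name
        "outputTableName": "predictions",  # placeholder table name
    },
)
print(command)
```

The resulting string is the kind of statement that you would paste into the SQL Script component, DataWorks DataStudio, or the MaxCompute client.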
Use Machine Learning Designer
The following figure shows the process of using Machine Learning Designer.
Before you can use Machine Learning Designer to train a model, you must create a pipeline. A pipeline can be created by using multiple methods. You can choose one based on your business requirements.
On the pipeline configuration tab of Machine Learning Designer, drag components onto the canvas, configure them to use MaxCompute, Realtime Compute for Apache Flink, or general training resources, and then connect the components to create a pipeline. Then, run the pipeline to train and tune the model. After the pipeline is run, you can schedule it as a periodic task so that the model it generates is updated automatically.
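Conceptually, a pipeline built on the canvas is a directed acyclic graph: each component is a node, and each connection makes the downstream component wait for its upstream components. The sketch below shows that idea with the Python standard library only. It is not how PAIFlow is implemented, and the component names (`read_table`, `split`, `train`, `predict`, `evaluate`) and the `execution_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Each key is a component; its value is the set of upstream components
# that it is connected to. All names are hypothetical.
pipeline = {
    "read_table": set(),
    "split": {"read_table"},
    "train": {"split"},
    "predict": {"train", "split"},
    "evaluate": {"predict"},
}


def execution_order(dag: dict[str, set[str]]) -> list[str]:
    """Return an order in which every component runs after its upstream components."""
    return list(TopologicalSorter(dag).static_order())


order = execution_order(pipeline)
print(order)
```

For this graph only one valid order exists, so the components run as `read_table`, `split`, `train`, `predict`, `evaluate`. A connection that introduces a cycle would make `TopologicalSorter` raise `graphlib.CycleError`, which mirrors why a pipeline cannot loop back on itself.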
(Optional) View visualized reports.
After the model is trained, you can view the analysis reports on the visualized dashboard to check if the model meets your expectations.
After the model is trained, you can deploy the model in the production environment to generate model-based predictions on new data.
Pipeline scheduling engine: PAIFlow
PAIFlow is the pipeline scheduling engine of Machine Learning Designer. You can schedule a pipeline by submitting the pipeline task to PAIFlow.
The Pipeline Tasks page of PAIFlow displays all pipeline tasks, including tasks that are manually run in Machine Learning Designer and tasks that are periodically scheduled by using DataWorks. For more information, see Manage pipeline tasks.