This topic describes how to use a preset template in Machine Learning Designer to build a model to predict the output power of a power plant.
Background information
In the following sample pipeline, power generation data from a combined cycle power plant is used to build a model that predicts the output power of the plant. This example also shows how machine learning applies to industrial production. The output power determines how much electrical energy a generator can produce. If you can accurately predict the output power, you can evaluate and implement the power production plan with less resource waste.
Prerequisites
A workspace is created. For more information, see Create a workspace.
MaxCompute resources are associated with the workspace. For more information, see Manage workspaces.
Datasets
The sample pipeline uses the Combined Cycle Power Plant dataset from the University of California, Irvine (UCI) Machine Learning Repository. For more information, see Combined Cycle Power Plant Data Set. The dataset contains 9,568 data entries. Each data entry includes the AT, V, AP, RH, and PE fields, which record the ambient temperature, exhaust vacuum, ambient pressure, relative humidity, and net electrical output power, respectively. The following figure shows the sample data that is used in the pipeline.
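As a rough illustration of the record layout described above, the following sketch parses CSV text with the five fields and checks the header. The field descriptions and units follow the UCI dataset documentation. The sample rows, the FIELDS mapping, and the read_rows helper are hypothetical names and made-up values for this example, not part of the pipeline.

```python
import csv
import io

# Field meanings and units as documented for the UCI Combined Cycle Power Plant dataset.
FIELDS = {
    "AT": "ambient temperature (degrees Celsius)",
    "V": "exhaust vacuum (cm Hg)",
    "AP": "ambient pressure (millibar)",
    "RH": "relative humidity (%)",
    "PE": "net hourly electrical energy output (MW)",
}

# Illustrative rows only: these numbers are made up and are not dataset entries.
SAMPLE_CSV = """AT,V,AP,RH,PE
15.0,42.0,1024.0,73.0,463.0
25.0,63.0,1020.0,59.0,444.0
"""


def read_rows(text):
    """Parse CSV text into a list of dicts with float values, checking the header."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != list(FIELDS):
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    return [{name: float(value) for name, value in row.items()} for row in reader]


rows = read_rows(SAMPLE_CSV)
for row in rows:
    print(row)
```

PE is the prediction target in this topic, and the other four fields are the features.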
Procedure
Go to the Machine Learning Designer page.
Log on to the PAI console.
In the left-side navigation pane, click Workspaces. On the Workspaces page, click the name of the workspace that you want to manage.
In the left-side navigation pane of the workspace page, go to the Machine Learning Designer page.
Create a pipeline.
On the Visualized Modeling (Designer) page, click the Preset Templates tab.
Find the Power Plant Output Forecast template and click Create.
In the Create Pipeline dialog box, configure the required parameters. You can use the default values.
The Pipeline Data Path parameter specifies the Object Storage Service (OSS) bucket path that stores the temporary data and models generated while the pipeline runs.
Click OK.
It takes about 10 seconds to create the pipeline.
On the Pipelines tab, double-click the created Power Plant Output Forecast pipeline to open the pipeline.
View the components of the pipeline on the canvas as shown in the following figure. The system automatically creates the pipeline based on the preset template.
Area ①: The Corrcoef component measures the impact of each feature on the output power. After you run the pipeline, you can right-click the Corrcoef component on the canvas and select Visual Analysis to view the impact of each feature on the output power.
Area ②: The Split component divides the dataset into a training dataset and a prediction dataset at a ratio of 8:2.
Area ③: The Linear Regression component performs regression modeling.
Area ④: The Prediction component predicts the output power based on the prediction dataset. The Evaluation component evaluates the prediction accuracy of the model.
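The split, modeling, and prediction stages described above can be sketched outside Designer as follows. This is a minimal sketch under stated assumptions, not the implementation that the Designer components use: it fits a single feature (AT) with ordinary least squares for brevity, whereas a model over this dataset would typically use all four features. The data is synthetic, and the split_rows and fit_line helpers and the coefficients 500 and -2 are hypothetical choices for illustration.

```python
import random


def split_rows(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of rows and split it into (train, test) by train_fraction."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_line(pairs):
    """Ordinary least squares for y = slope * x + intercept over (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic (AT, PE) pairs: PE falls linearly with AT, plus small Gaussian noise.
# The coefficients are arbitrary and are not fitted to the real dataset.
rng = random.Random(42)
data = []
for _ in range(100):
    at = rng.uniform(2.0, 37.0)
    data.append((at, 500.0 - 2.0 * at + rng.gauss(0.0, 1.0)))

train, test = split_rows(data)             # 8:2 split, as in area 2
slope, intercept = fit_line(train)         # regression modeling, as in area 3
predictions = [slope * at + intercept for at, _ in test]  # prediction, as in area 4
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

Shuffling before the split keeps the two datasets comparable when the input rows are ordered, and a fixed seed makes the split reproducible.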
Run the pipeline and view the results.
In the upper-left corner of the canvas, click the Run icon.
After you run the pipeline, right-click the Corrcoef component on the canvas and select Visual Analysis.
In the Corrcoef section, view the impact of each feature on the output power.
The preceding figure shows that the temperature (AT) has the greatest impact on the output power, followed by the exhaust vacuum (V), the ambient pressure (AP), and the relative humidity (RH).
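A ranking of this kind can be reproduced with the Pearson correlation coefficient. The sketch below assumes that the impact of a feature is judged by the absolute value of its correlation with PE. The source does not state how the Corrcoef component computes its output, so treat this as an illustration only. The column values and the pearson helper are made up for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Illustrative columns only: these numbers are made up, not dataset entries.
features = {
    "AT": [5.0, 10.0, 15.0, 20.0, 25.0],
    "RH": [90.0, 60.0, 80.0, 70.0, 75.0],
}
pe = [490.0, 481.0, 469.0, 461.0, 450.0]

# Rank features by the strength of the linear relationship, ignoring the sign.
ranked = sorted(
    ((name, pearson(values, pe)) for name, values in features.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranked:
    print(f"{name}: r = {r:+.3f}")
```

The sign of the coefficient still matters for interpretation: a negative value means that the output power tends to fall as the feature rises.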
Right-click the Linear Regression component on the canvas and use the shortcut menu to view the model evaluation results.
Right-click the Evaluation component on the canvas and use the shortcut menu to view the results that indicate the model performance.
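To help read the evaluation results, the following sketch computes three metrics that are commonly used for regression models: mean absolute error (MAE), root mean square error (RMSE), and the coefficient of determination (R²). Which metrics the Evaluation component reports is not stated in this topic, so the selection here is an assumption. The regression_metrics helper and the sample values are hypothetical.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, RMSE, and R-squared for paired actual and predicted values."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(r * r for r in residuals)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,
    }


# Illustrative values only: made-up output power readings and predictions, in MW.
actual = [460.0, 470.0, 480.0, 490.0]
predicted = [462.0, 468.0, 481.0, 489.0]

metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MAE and RMSE are expressed in the unit of the target, so they are easy to compare against the range of the output power. R² is unitless, and values closer to 1 indicate a better fit.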