The Regression Model Evaluation component evaluates regression models by comparing their prediction results with the original values. It generates evaluation metrics and a histogram of residuals.

Regression Model Evaluation

You can use one of the following methods to configure the Regression Model Evaluation component.

Method 1: Configure the component on the pipeline page

You can configure the parameters of the Regression Model Evaluation component on the pipeline page of Machine Learning Designer in Machine Learning Platform for AI (PAI). Machine Learning Designer was formerly known as Machine Learning Studio. The following table describes the parameters.
Tab | Parameter | Description
Fields Setting | Original Regression Value | The columns of numeric data types are supported.
Fields Setting | Predicted Regression Value | The columns of numeric data types are supported.
Tuning | Worker number | The number of cores. Valid values: 1 to 9999. This parameter must be used with the Memory Size per Node parameter.
Tuning | Memory Size per Node | The memory size of each core. Valid values: 1024 to 64 × 1024. Unit: MB.

Method 2: Use PAI commands

Configure the component parameters by using PAI commands. You can use the SQL Script component to call PAI commands. For more information, see SQL Script.
PAI -name regression_evaluation -project algo_public
    -DinputTableName=input_table
    -DyColName=y_col
    -DpredictionColName=prediction_col
    -DindexOutputTableName=index_output_table
    -DresidualOutputTableName=residual_output_table;
Parameter | Required | Description | Default value
inputTableName | Yes | The name of the input table. | N/A
inputTablePartitions | No | The partitions that are selected from the input table for computing. | Full table
yColName | Yes | The name of the column that contains original dependent variables in the input table. The columns of numeric data types are supported. | N/A
predictionColName | Yes | The name of the column that contains dependent variables in the prediction result. The columns of numeric data types are supported. | N/A
indexOutputTableName | Yes | The name of the output table of regression metrics. | N/A
residualOutputTableName | Yes | The name of the output table of the histogram of residuals. | N/A
intervalNum | No | The number of intervals of the histogram. | 100
lifecycle | No | The lifecycle of the output table. The value of this parameter must be a positive integer. | N/A
coreNum | No | The number of cores. Valid values: 1 to 9999. | Determined by the system
memSizePerCore | No | The memory size of each core. Valid values: 1024 to 64 × 1024. Unit: MB. | Determined by the system
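
The following Python sketch shows one way to assemble the command above from the parameters in the table, including the optional ones. The helper name build_regression_evaluation_command and its argument names are hypothetical and not part of PAI. The range checks only mirror the valid values listed in the table, and the partition value is passed through unchanged because its format is not described here.

```python
def build_regression_evaluation_command(
    input_table,
    y_col,
    prediction_col,
    index_output_table,
    residual_output_table,
    *,
    partitions=None,
    interval_num=None,
    lifecycle=None,
    core_num=None,
    mem_size_per_core=None,
):
    """Return a PAI regression_evaluation command as a string (hypothetical helper)."""
    # Checks that mirror the valid values documented in the parameter table.
    if lifecycle is not None and (not isinstance(lifecycle, int) or lifecycle <= 0):
        raise ValueError("lifecycle must be a positive integer")
    if core_num is not None and not 1 <= core_num <= 9999:
        raise ValueError("coreNum must be between 1 and 9999")
    if mem_size_per_core is not None and not 1024 <= mem_size_per_core <= 64 * 1024:
        raise ValueError("memSizePerCore must be between 1024 and 65536 (MB)")

    params = [
        ("inputTableName", input_table),
        ("inputTablePartitions", partitions),
        ("yColName", y_col),
        ("predictionColName", prediction_col),
        ("indexOutputTableName", index_output_table),
        ("residualOutputTableName", residual_output_table),
        ("intervalNum", interval_num),
        ("lifecycle", lifecycle),
        ("coreNum", core_num),
        ("memSizePerCore", mem_size_per_core),
    ]
    lines = ["PAI -name regression_evaluation -project algo_public"]
    # Optional parameters left as None are omitted so that the defaults apply.
    lines += [f"    -D{name}={value}" for name, value in params if value is not None]
    return "\n".join(lines) + ";"


if __name__ == "__main__":
    print(build_regression_evaluation_command(
        "input_table", "y_col", "prediction_col",
        "index_output_table", "residual_output_table",
        interval_num=50, core_num=4, mem_size_per_core=2048,
    ))
```

The resulting string can be pasted into the SQL Script component. Leaving coreNum and memSizePerCore unset lets the system determine them, as stated in the table.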

Output

The output table of regression metrics is in JSON format and contains the following fields.
Field | Description
SST | The total sum of squares.
SSE | The sum of squared errors.
SSR | The sum of squares due to regression.
R2 | The coefficient of determination.
R | The multiple correlation coefficient.
MSE | The mean squared error.
RMSE | The root mean squared error.
MAE | The mean absolute error.
MAD | The mean error.
MAPE | The mean absolute percentage error.
count | The number of rows.
yMean | The mean of original dependent variables.
predictionMean | The mean of prediction results.
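
To make the metrics concrete, the following Python sketch computes most of them from two lists of values by using the standard textbook definitions. It is an illustration only: the exact formulas used by the component are not stated here, so the divisor n for MSE, the fractional (not percentage) scale of MAPE, the residual sign y - prediction, and the equal-width histogram intervals are all assumptions. R and MAD are left out because their exact definitions are not given. The function names are hypothetical.

```python
import math


def regression_metrics(y, prediction):
    """Compute common regression metrics (textbook definitions, assumed)."""
    if not y or len(y) != len(prediction):
        raise ValueError("y and prediction must be non-empty and of equal length")
    n = len(y)
    y_mean = sum(y) / n
    sst = sum((a - y_mean) ** 2 for a in y)                # total sum of squares
    sse = sum((a - p) ** 2 for a, p in zip(y, prediction))  # sum of squared errors
    ssr = sum((p - y_mean) ** 2 for p in prediction)        # sum of squares due to regression
    mse = sse / n                                            # assumption: divide by n
    # MAPE is undefined when an original value is 0, so NaN is returned in that case.
    mape = (
        sum(abs(a - p) / abs(a) for a, p in zip(y, prediction)) / n
        if all(a != 0 for a in y) else math.nan
    )
    return {
        "SST": sst,
        "SSE": sse,
        "SSR": ssr,
        "R2": 1 - sse / sst if sst else math.nan,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(a - p) for a, p in zip(y, prediction)) / n,
        "MAPE": mape,
        "count": n,
        "yMean": y_mean,
        "predictionMean": sum(prediction) / n,
    }


def residual_histogram(y, prediction, interval_num=100):
    """Count residuals (y - prediction) in equal-width intervals (assumed layout)."""
    residuals = [a - p for a, p in zip(y, prediction)]
    low, high = min(residuals), max(residuals)
    width = (high - low) / interval_num
    counts = [0] * interval_num
    for r in residuals:
        # The largest residual goes into the last interval; if all residuals
        # are equal (width 0), everything goes into the first interval.
        index = min(int((r - low) / width), interval_num - 1) if width else 0
        counts[index] += 1
    return low, width, counts


if __name__ == "__main__":
    y = [1.0, 2.0, 3.0, 4.0]
    prediction = [1.5, 2.0, 2.5, 4.0]
    print(regression_metrics(y, prediction))
    print(residual_histogram(y, prediction, interval_num=2))
```

Comparing values computed this way with the component output on a small table is a quick way to check which conventions the component actually follows.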