Ridge Regression Prediction

The Ridge Regression Prediction component supports sparse and dense data. You can use this component to estimate values of numeric variables, such as housing prices, sales volumes, and temperatures. This topic describes how to configure the Ridge Regression Prediction component.

Limits

The supported computing engines are MaxCompute, Apache Flink, and DLC.

How ridge regression works

Ridge regression, also known as Tikhonov regularization, is a biased estimation regression method designed for data with collinear features. It is essentially an improved least squares method that adds an L2 penalty on the regression coefficients. By giving up the unbiasedness of the least squares method, ridge regression yields more stable and reliable regression coefficients and fits ill-conditioned data better than ordinary least squares. The trade-off is partial information loss and reduced accuracy.
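
The effect of the penalty can be sketched in a few lines of plain Python. This is an illustration of the textbook closed-form solution w = (XᵀX + λI)⁻¹Xᵀy under simplifying assumptions (no intercept, tiny hand-made data), not the implementation used by the component. The helper names solve_linear and ridge_fit and the sample values are hypothetical.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def ridge_fit(x, y, lam):
    """Return w minimizing ||Xw - y||^2 + lam * ||w||^2 (no intercept)."""
    p = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * t for row, t in zip(x, y)) for i in range(p)]
    for i in range(p):
        xtx[i][i] += lam  # the ridge term: add lam to the diagonal of X^T X
    return solve_linear(xtx, xty)


# Two nearly collinear feature columns (assumed sample data).
x = [[1.0, 1.001], [2.0, 1.999], [3.0, 3.002], [4.0, 3.998]]
y = [2.0, 4.1, 5.9, 8.0]

w_ols = ridge_fit(x, y, 0.0)    # lam = 0 reduces to ordinary least squares
w_ridge = ridge_fit(x, y, 1.0)  # lam > 0 shrinks the coefficients
print("OLS weights:  ", w_ols)
print("Ridge weights:", w_ridge)
```

With features this close to collinear, the least squares weights tend to be large and unstable, while the penalized weights stay small and close to each other. The squared norm of the ridge solution never exceeds that of the least squares solution.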

Configure the component in the PAI console

Input ports

Input port (left-to-right)	Data type	Recommended upstream component	Required
Input model of the prediction	None	Ridge Regression Training	Yes
Input data of the prediction	None	Read Table, Feature Engineering, Data preprocessing	Yes

Component parameters

Tab	Parameter	Description
Field Setting	reservedCols	The columns to be reserved by the algorithm.
Field Setting	vectorCol	The name of the vector column.
Parameter Setting	predictionCol	The name of the prediction column.
Parameter Setting	numThreads	The number of threads of the component. Default value: 1.
Execution Tuning	Number of Workers	The number of workers. This parameter must be used together with the "Memory per worker, unit MB" parameter. The value must be a positive integer. Valid values: [1,9999].
Execution Tuning	Memory per worker, unit MB	The memory size of each worker. Valid values: 1024 to 64 × 1024. Unit: MB.
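
The documented ranges for the two Execution Tuning parameters can be checked before a job is submitted. The following is a minimal sketch; check_execution_tuning is a hypothetical helper and is not part of PAI or PyAlink. The returned total is plain arithmetic added for illustration.

```python
def check_execution_tuning(num_workers, memory_per_worker_mb):
    """Raise ValueError if either value is outside the documented range."""
    if not isinstance(num_workers, int) or not 1 <= num_workers <= 9999:
        raise ValueError("Number of Workers must be an integer in [1, 9999]")
    if not 1024 <= memory_per_worker_mb <= 64 * 1024:
        raise ValueError("Memory per worker must be in [1024, 65536] MB")
    # Total memory requested across all workers, in MB.
    return num_workers * memory_per_worker_mb


total_mb = check_execution_tuning(4, 8192)
print(total_mb)
```

The upper bound 64 × 1024 MB equals 65536 MB, or 64 GB per worker.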

Configure the component by coding

You can copy the following code to the code editor of the PyAlink Script component. This allows the PyAlink Script component to function like the Ridge Regression Prediction component.

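from pyalink.alink import *

def main(sources, sinks, parameter):
    # sources[0]: model from Ridge Regression Training; sources[1]: data to predict on.
    model = sources[0]
    batchData = sources[1]

    # Write the predicted values to a column named "pred".
    predictor = RidgeRegPredictBatchOp()\
        .setPredictionCol("pred")
    result = predictor.linkFrom(model, batchData)
    # Send the result to the first output port, then run the batch job.
    result.link(sinks[0])
    BatchOperator.execute()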
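
Conceptually, prediction with a trained linear model is a weighted sum of the feature values plus an intercept, and that sum is the same whether a row is stored densely or sparsely. The sketch below illustrates this in plain Python. It does not use Alink's internal code or vector format: the predict helper, the weights, and the dictionary representation of a sparse row are all assumptions made for the example.

```python
def predict(weights, intercept, features):
    """Linear prediction for a dense list or a sparse {index: value} dict."""
    if isinstance(features, dict):
        # Sparse row: only the listed indices contribute to the sum.
        dot = sum(weights[i] * v for i, v in features.items())
    else:
        # Dense row: every position contributes, zeros included.
        dot = sum(w * v for w, v in zip(weights, features))
    return intercept + dot


weights = [0.5, -1.0, 2.0]     # hypothetical trained coefficients
intercept = 1.0                # hypothetical intercept

dense_row = [2.0, 0.0, 3.0]    # all three feature values stored
sparse_row = {0: 2.0, 2: 3.0}  # same row, zero entry omitted

dense_pred = predict(weights, intercept, dense_row)
sparse_pred = predict(weights, intercept, sparse_row)
print(dense_pred, sparse_pred)
```

Both calls return the same value because the omitted entry is zero and adds nothing to the sum.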