The Lasso Regression Prediction component supports both sparse and dense data formats and predicts numerical variables, such as loan limits and temperatures. This topic describes how to configure the component.
Limits
Supported computing engines: MaxCompute, Flink, and DLC.
Algorithm principle
Lasso (least absolute shrinkage and selection operator) regression refines a linear model by adding a penalty on the regression coefficients: the sum of their absolute values is constrained to be less than a fixed value. This constraint shrinks some coefficients and sets others exactly to zero. The method therefore retains the benefits of subset selection and shrinkage, and it provides a biased estimator suited to multicollinear data.
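The shrink-and-zero behavior described above can be illustrated with a minimal pure-Python sketch. It uses the equivalent penalized form of the objective, (1/2n)·||y - Xw||² + λ·||w||₁, and minimizes it by cyclic coordinate descent with soft-thresholding. This is an illustration only and is not claimed to be the solver the component uses; the function names, the toy data, and the value of λ are assumptions made for the example.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; anything in [-gamma, gamma] becomes exactly 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/2n)*||y - Xw||^2 + lam*||w||_1 (no intercept) by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            # Closed-form update for one coordinate: soft-threshold, then rescale.
            w[j] = soft_threshold(rho / n, lam) / (norm / n)
    return w


# Toy data (hypothetical): y = 2*x1 + 0.1*x2, so x2 carries very little signal.
X = [[1.0, 1.0], [2.0, -1.0], [3.0, 1.0], [4.0, -1.0]]
y = [2.1, 3.9, 6.1, 7.9]

w_ols = lasso_coordinate_descent(X, y, lam=0.0)    # no penalty: ordinary least squares
w_lasso = lasso_coordinate_descent(X, y, lam=0.5)  # L1 penalty switched on

print("no penalty:", w_ols)
print("lasso     :", w_lasso)
```

With the penalty switched on, the coefficient of the weak feature becomes exactly zero and the coefficient of the strong feature is shrunk slightly below its least-squares value.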
Configure the component in the GUI
- Input ports
| Input port (from left to right) | Data type | Recommended upstream component | Required |
| --- | --- | --- | --- |
| Prediction input model | None | | Yes |
| Prediction input data | None | | Yes |
- Component parameters
| Tab | Parameter | Description |
| --- | --- | --- |
| Field Settings | Reserved Algorithm Column Names | Select the columns that the algorithm retains in the output. |
| Field Settings | Vector column | The name of the vector column. |
| Parameter Settings | Prediction result column | The name of the prediction result column. |
| Parameter Settings | Number of threads | The number of threads for the component. The default value is 1. |
| Execution Tuning | Number of workers | Used together with the Memory per worker (MB) parameter. The value must be an integer from 1 to 9999. |
| Execution Tuning | Memory per worker (MB) | The value must be from 1024 MB to 65536 MB (64 × 1024 MB). |
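To make the Vector column, Prediction result column, and reserved-column parameters concrete, the following pure-Python sketch mimics what a linear prediction step computes for each row. It is not the component's implementation. The vector string formats (dense `"1.0 0.0 2.0"`, sparse `"$3$0:1.0 2:2.0"`) are assumed to follow Alink-style conventions, keeping all columns when no reserved columns are given is also an assumption, and the helper names, column names, weights, and intercept are hypothetical.

```python
def parse_vector(text):
    """Parse a vector string into {index: value}.

    Assumed formats: dense "1.0 2.0 3.0" (space- or comma-separated values),
    sparse "$4$0:1.0 2:3.0" (optional $size$ prefix, then index:value pairs).
    """
    text = text.strip()
    if text.startswith("$"):
        # Drop the "$size$" prefix; only the index:value pairs are needed here.
        text = text.split("$", 2)[2].strip()
    tokens = text.replace(",", " ").split()
    if any(":" in t for t in tokens):
        return {int(i): float(v) for i, v in (t.split(":") for t in tokens)}
    return {i: float(v) for i, v in enumerate(tokens)}


def predict_rows(rows, weights, intercept, vector_col="vec",
                 prediction_col="pred", reserved_cols=None):
    """Return new rows holding the reserved columns plus the prediction column."""
    out = []
    for row in rows:
        features = parse_vector(row[vector_col])
        # Linear model: intercept + dot(weights, features).
        score = intercept + sum(weights[i] * v for i, v in features.items())
        keep = list(row.keys()) if reserved_cols is None else reserved_cols
        scored = {c: row[c] for c in keep}
        scored[prediction_col] = score
        out.append(scored)
    return out


rows = [
    {"id": 1, "vec": "1.0 0.0 2.0"},     # dense
    {"id": 2, "vec": "$3$0:1.0 2:2.0"},  # sparse, same values as row 1
]
weights = [0.5, 0.0, 1.5]  # hypothetical Lasso coefficients (one is zero)
result = predict_rows(rows, weights, intercept=1.0, reserved_cols=["id"])
print(result)
```

Both rows encode the same feature values, so they receive the same prediction; only the `id` column is carried over because it is the only reserved column.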
Configure the component using code
Copy the following code to a PyAlink Script component to achieve the same functionality.
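from pyalink.alink import *

def main(sources, sinks, parameter):
    # sources[0]: prediction input model; sources[1]: prediction input data
    model = sources[0]
    batchData = sources[1]
    # Write the predictions to a column named "pred"
    predictor = LassoRegPredictBatchOp()\
        .setPredictionCol("pred")
    result = predictor.linkFrom(model, batchData)
    result.link(sinks[0])
    # Trigger execution of the batch job
    BatchOperator.execute()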