This topic describes the linear regression (LR) algorithm.

Overview

LR is a regression analysis method that uses the least squares method to fit a linear equation that models the relationship between one or more independent variables and a dependent variable.
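As a point of reference, the least squares fit described above can be written in standard textbook notation. The symbols are introduced here only for illustration and do not appear elsewhere in this topic: n is the number of rows, p is the number of independent variables, x_{ij} is the value of the j-th independent variable in row i, y_i is the dependent variable, and the \beta terms are the coefficients to be estimated.

```latex
\hat{y}_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij},
\qquad
\min_{\beta_0, \ldots, \beta_p} \; \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```

The first expression is the linear model and the second is the sum of squared residuals that the fit minimizes. This topic does not state which solver is used internally to perform the minimization.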

Scenarios

LR is a regression model that is primarily used to fit continuous values. The model is simple and easy to interpret.

LR is suitable for fitting trend lines. A trend line represents the long-term trend of time series data. It indicates whether a set of data (such as stock prices, GMV, or sales volume) has increased or decreased over a period of time. Although a trend line can be drawn by visually inspecting the data points in a coordinate system, it is more accurate to use LR to calculate the intercept and slope of the trend line.
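The following is a minimal sketch of how a trend line can be calculated with least squares rather than drawn by eye. It is plain Python and independent of the CREATE MODEL statement. The function name fit_trend_line and the sales figures are made up for illustration, and the time axis is assumed to be the evenly spaced index 0, 1, 2, and so on.

```python
def fit_trend_line(ys):
    """Fit y = intercept + slope * t by ordinary least squares, with t = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    ts = range(n)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    # Closed-form solution for one variable: slope = cov(t, y) / var(t).
    cov = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    var = sum((t - t_mean) ** 2 for t in ts)
    slope = cov / var
    intercept = y_mean - slope * t_mean
    return slope, intercept


# Hypothetical monthly sales figures, made up for illustration only.
sales = [10.0, 12.0, 13.0, 15.0, 18.0]
slope, intercept = fit_trend_line(sales)
direction = "upward" if slope > 0 else "downward or flat"
print(f"slope={slope:.2f}, intercept={intercept:.2f}, trend is {direction}")
```

The sign of the slope indicates whether the series has increased or decreased over the period, and the intercept fixes the position of the line.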

Parameters

The following table describes the parameters that you can specify in model_parameter in the CREATE MODEL statement when you create a model. Set the values based on your needs.

Parameter | Description
epoch | The number of iterations. Valid values: a positive integer, or -1. Default value: -1. Note: If this parameter is set to -1, training iterates until the model converges.
normalize | Specifies whether to normalize the data before model training. Default value: False. Valid values: False (data is not normalized before model training) and True (data is normalized before model training).
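To make the meaning of the two parameters concrete, here is a minimal sketch in plain Python. It is not the implementation behind model_class = 'linearreg', which this topic does not describe. It assumes a gradient descent trainer on a single feature, min-max scaling as the normalization, and a fixed iteration count for a positive epoch. The names train_linear, lr, tol, and max_iter are hypothetical and exist only in this example.

```python
def train_linear(xs, ys, epoch=-1, normalize=False, lr=0.1, tol=1e-12, max_iter=100_000):
    """Gradient descent for y = w * x + b on a single feature (illustrative only).

    epoch > 0  : run exactly that many iterations.
    epoch = -1 : iterate until the loss stops improving (capped by max_iter).
    lr, tol, and max_iter are hypothetical knobs of this sketch.
    """
    if epoch != -1 and epoch < 1:
        raise ValueError("epoch must be a positive integer or -1")
    if normalize:
        # Assumption: min-max scaling to [0, 1]; other schemes are possible.
        lo, hi = min(xs), max(xs)
        span = (hi - lo) or 1.0
        xs = [(x - lo) / span for x in xs]
    n = len(xs)
    w, b = 0.0, 0.0
    prev_loss = float("inf")
    limit = epoch if epoch > 0 else max_iter
    steps = 0
    for _ in range(limit):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / n  # mean squared error
        # Convergence is checked only when epoch is -1.
        if epoch == -1 and abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
        steps += 1
    return w, b, steps


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w3, b3, steps3 = train_linear(xs, ys, epoch=3)
w_conv, b_conv, steps_conv = train_linear(xs, ys, epoch=-1)
print(steps3, round(w3, 3), round(b3, 3))
print(steps_conv, round(w_conv, 3), round(b_conv, 3))
```

In this sketch, a small positive epoch stops training before the coefficients have settled, whereas -1 continues until the loss no longer changes. When normalize is True, the learned weight applies to the scaled feature rather than to the original one.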

Examples

Create a model and an offline training task.
/*polar4ai*/
CREATE MODEL linearreg1 WITH (
  model_class = 'linearreg',
  x_cols = 'dx1,dx2',
  y_cols = 'y',
  model_parameter = (epoch = 3)
) AS (SELECT * FROM db4ai.testdata1)
Use the model for prediction.
/*polar4ai*/
SELECT dx1, dx2 FROM
PREDICT(MODEL linearreg1, SELECT * FROM db4ai.testdata1 LIMIT 10)
WITH (x_cols = 'dx1,dx2', y_cols = '')
Note: The columns specified in x_cols and y_cols must be of a floating-point or integer data type.