Linear regression (LR) fits a linear relationship between one or more input variables and a continuous target variable using the least squares method. LR is simple and highly interpretable, making it well-suited for predicting numerical outcomes—such as revenue, spending, or inventory demand—or for measuring the direction and slope of a trend over time.
Use cases
LR works best when the relationship between inputs and the target value is approximately linear:
Trend analysis: Calculate the slope and position of a trend line across time series data—for example, determining whether monthly sales volume, GMV, or stock prices have risen or fallen over a period.
Value fitting: Fit continuous outcomes where the relationship between the input variables and the target is linear.
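For the trend-analysis case, the slope and position of a trend line can be computed directly with the closed-form least squares formulas. The following is a minimal sketch in plain Python, not the polar4ai implementation; the function name fit_trend_line and the sample figures are hypothetical, and time is assumed to be the period index 0, 1, 2, and so on.

```python
def fit_trend_line(values):
    """Fit y = slope * t + intercept by ordinary least squares, with t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    times = range(n)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    # Closed-form OLS for one input: slope = cov(t, y) / var(t)
    cov_ty = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    var_t = sum((t - mean_t) ** 2 for t in times)
    slope = cov_ty / var_t
    intercept = mean_y - slope * mean_t
    return slope, intercept


# Hypothetical monthly sales volumes, one value per month
monthly_sales = [100.0, 110.0, 120.0, 130.0]
slope, intercept = fit_trend_line(monthly_sales)
```

A positive slope indicates a rising trend over the period and a negative slope a falling one; the intercept gives the fitted value at the first period.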
Data type requirement
x_cols and y_cols must be floating-point or integer columns.
Parameters
Set these parameters inside the model_parameter option of your CREATE MODEL statement.
| Parameter | Description | Default |
|---|---|---|
| epoch | Number of training iterations. Set to a positive integer to cap iterations, or -1 to run until convergence. | -1 |
| normalize | Controls data normalization before training. False normalizes the data before model creation; True skips normalization. | False |
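
One way to picture how an iteration cap and normalization interact is a plain gradient-descent fit. The sketch below is illustrative only: this page does not state which solver or which normalization polar4ai applies, so the gradient-descent loop, the z-score rescaling, and every name in it (standardize, fit_gradient_descent, learning_rate, tol, safety_cap) are assumptions made for the example.

```python
def standardize(column):
    """Rescale a column to zero mean and unit variance (z-score)."""
    n = len(column)
    mean = sum(column) / n
    std = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
    if std == 0:
        return [0.0] * n  # a constant column carries no signal
    return [(v - mean) / std for v in column]


def fit_gradient_descent(xs, ys, epoch=-1, learning_rate=0.1,
                         tol=1e-12, safety_cap=100_000):
    """Fit y = w * x + b by batch gradient descent on mean squared error.

    epoch > 0 caps the number of update steps; epoch == -1 keeps going
    until the loss improves by less than tol (assumed stopping rule).
    """
    w, b = 0.0, 0.0
    n = len(xs)
    prev_loss = float("inf")
    max_steps = epoch if epoch > 0 else safety_cap
    steps = 0
    while steps < max_steps:
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / n
        if epoch == -1 and prev_loss - loss < tol:
            break  # converged
        prev_loss = loss
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        steps += 1
    return w, b, steps


# Hypothetical data following y = 3x + 5 exactly
raw_x = [10.0, 20.0, 30.0, 40.0, 50.0]
ys = [3.0 * x + 5.0 for x in raw_x]
xs = standardize(raw_x)

w_capped, b_capped, steps_capped = fit_gradient_descent(xs, ys, epoch=3)
w_full, b_full, steps_full = fit_gradient_descent(xs, ys, epoch=-1)
```

With the cap at three steps the coefficients are still far from the least squares solution, while the uncapped run continues until the loss stops improving. Rescaling the input first keeps a single learning rate usable regardless of the column's original units.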
Create, evaluate, and predict
The following steps walk through a complete LR workflow using the db4ai.testdata1 dataset, where dx1 and dx2 are the input features and y is the target variable. All three statements use the /*polar4ai*/ hint to route execution through the polar4ai engine.
Step 1: Create the model
Train an LR model named linearreg1 on the full dataset, running for three epochs:
/*polar4ai*/CREATE MODEL linearreg1 WITH
( model_class = 'linearreg', x_cols = 'dx1,dx2', y_cols='y',
model_parameter=(epoch=3)) AS (SELECT * FROM db4ai.testdata1);
Step 2: Evaluate the model
Measure prediction accuracy using the R² score (r2_score):
/*polar4ai*/SELECT dx1,dx2 FROM EVALUATE(MODEL linearreg1,
SELECT * FROM db4ai.testdata1 LIMIT 10) WITH
(x_cols = 'dx1,dx2',y_cols='y',metrics='r2_score');
Step 3: Run predictions
Apply the trained model to generate predictions. This example reuses 10 rows of db4ai.testdata1 as input; in practice, pass the new data you want to score:
/*polar4ai*/SELECT dx1,dx2 FROM
PREDICT(MODEL linearreg1, SELECT * FROM db4ai.testdata1 LIMIT 10)
WITH (x_cols = 'dx1,dx2');
The data types of x_cols and y_cols must be floating-point or integer.
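The r2_score metric requested in Step 2 is the coefficient of determination: one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below shows the standard formula in plain Python so the returned value is easier to interpret; it is not the polar4ai implementation, and the sample values are hypothetical.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Squared error left over after applying the model
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean of the target
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the target is constant")
    return 1 - ss_res / ss_tot


# Hypothetical actual and predicted target values
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.5, 8.5]
score = r2_score(actual, predicted)
```

A score of 1 means the predictions match the target exactly, 0 means the model does no better than predicting the mean of the target, and negative values mean it does worse.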