GBRT Algorithm Principles & Key Parameters - PolarDB

The gradient boosting regression tree (GBRT) algorithm is a boosting ensemble method that builds a sequence of regression trees, where each tree corrects the residuals of the previous ones. PolarDB for MySQL supports GBRT as an in-database regression model through the polar4ai extension, letting you train, evaluate, and run predictions directly in SQL.

How it works

GBRT belongs to the boosting family, and its weak learners are restricted to CART regression trees. It combines two components:

Regression tree (RT): A decision tree variant that predicts continuous values. GBRT chains multiple regression trees and accumulates their outputs to produce the final prediction.
Gradient boosting (GB): An iterative strategy in which each new tree fits the residuals left by the preceding trees. GBRT uses the forward stagewise algorithm, so each step adds one tree chosen to reduce the loss function while the earlier trees stay fixed.
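
The residual-fitting loop described above can be sketched in a few lines of Python. This is a minimal illustration, not PolarDB's implementation: it assumes the least-squares loss, a single input feature, depth-1 trees (stumps), and a `learning_rate` shrinkage factor, none of which come from the PolarDB parameter list. The names `fit_stump` and `fit_gbrt` are hypothetical.

```python
# Illustrative gradient boosting with least-squares loss and depth-1 trees.
# Not PolarDB's implementation; every name below is an assumption.

def fit_stump(xs, residuals):
    """Return a one-split regression tree that minimizes squared error."""
    overall = sum(residuals) / len(residuals)
    best = (float("inf"), None, overall, overall)
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    if threshold is None:  # all x values identical: fall back to a constant
        return lambda x: overall
    return lambda x: left_mean if x <= threshold else right_mean


def fit_gbrt(xs, ys, n_estimators=50, learning_rate=0.5):
    """Fit trees one at a time, each on the residuals of the current model."""
    base = sum(ys) / len(ys)  # initial constant prediction
    trees = []
    preds = [base] * len(xs)
    for _ in range(n_estimators):
        # For least-squares loss, the negative gradient is the plain residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(t(x) for t in trees)


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.0]
model = fit_gbrt(xs, ys)

# Training error of the boosted model versus a constant (mean) predictor.
mse = sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
baseline_mse = sum((sum(ys) / len(ys) - y) ** 2 for y in ys) / len(ys)
```

On this toy data, `mse` ends up below `baseline_mse`, because every added stump removes part of the remaining residual. The `n_estimators` argument plays the same role as the parameter of that name in the table below.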

Use cases

GBRT suits regression problems where the target is a continuous numeric value. For example, in epidemiological modeling you can predict mortality or morbidity (the y_cols target) from socioeconomic variables such as income and education level (the x_cols features).

Parameters

The following parameters map to the model_parameter argument in the CREATE MODEL statement.

Note

x_cols and y_cols must be floating-point or integer columns.

Parameter	Type	Default	Description
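`n_estimators`	Integer (usually positive)	100	Number of boosting iterations, that is, the number of trees built. Higher values generally fit the training data more closely but increase training time and the risk of overfitting.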
`objective`	String	`ls`	Loss function. Valid values: `ls` (least-squares), `lad` (least absolute deviation), `huber` (combines least-squares and least absolute deviation).
`max_depth`	Integer	7	Maximum depth of each regression tree. Set to `-1` to remove the depth limit; use with caution to avoid overfitting.
`random_state`	Integer (usually positive)	1	Random seed. Keep the same value across runs to get reproducible results.
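
To make the three `objective` values concrete, the sketch below computes the quantity each new tree would be fit to (the negative gradient of the loss, often called the pseudo-residual) under the textbook definition of each loss. The `delta` threshold for the Huber loss is an assumed illustrative value, since the source does not say how PolarDB chooses it, and `pseudo_residual` is a hypothetical helper.

```python
def pseudo_residual(y, pred, objective="ls", delta=1.0):
    """Negative gradient of the loss with respect to the prediction.

    Textbook definitions only; `delta` is an assumed Huber threshold.
    """
    r = y - pred
    sign = (r > 0) - (r < 0)  # -1, 0, or 1
    if objective == "ls":      # loss = 0.5 * r**2
        return r
    if objective == "lad":     # loss = |r|
        return sign
    if objective == "huber":   # quadratic when |r| <= delta, linear beyond
        return r if abs(r) <= delta else delta * sign
    raise ValueError(f"unknown objective: {objective}")


# An outlier with residual 8.0 dominates under ls but is capped otherwise.
outlier_ls = pseudo_residual(10.0, 2.0, "ls")        # 8.0
outlier_lad = pseudo_residual(10.0, 2.0, "lad")      # 1
outlier_huber = pseudo_residual(10.0, 2.0, "huber")  # 1.0
```

This is why `lad` and `huber` are commonly preferred when the target column contains outliers: a single extreme row cannot pull the next tree as strongly as it does under `ls`.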

Examples

The following examples walk through the complete workflow: train a model, evaluate its accuracy, and run predictions. Each statement starts with the /*polar4ai*/ hint required by the polar4ai extension.

Step 1: Create the model

/*polar4ai*/CREATE MODEL gbrt1 WITH
( model_class = 'gbrt', x_cols = 'dx1,dx2', y_cols='y',
 model_parameter=(objective='ls')) AS (SELECT * FROM db4ai.testdata1);

Step 2: Evaluate the model

/*polar4ai*/SELECT dx1,dx2 FROM EVALUATE(MODEL gbrt1,
SELECT * FROM db4ai.testdata1 LIMIT 10) WITH
(x_cols = 'dx1,dx2',y_cols='y',metrics='r2_score');
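
The `r2_score` metric name matches the common coefficient of determination. Assuming PolarDB follows that standard definition (the source does not spell it out), the value can be reproduced by hand as in this sketch, where `r2` is a hypothetical helper:

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


perfect = r2([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])    # predictions match exactly
mean_only = r2([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])  # always predicts the mean
```

Under this definition a perfect model scores 1.0 and a model that always predicts the mean of y scores 0.0, so values closer to 1 indicate a better fit.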

Step 3: Run predictions

/*polar4ai*/SELECT dx1,dx2 FROM
PREDICT(MODEL gbrt1, SELECT * FROM db4ai.testdata1 LIMIT 10)
WITH (x_cols = 'dx1,dx2');