PolarDB:BST algorithm

Last Updated:Mar 28, 2026

The Behavior Sequence Transformer (BST) algorithm uses the Transformer framework to model user behavior sequences and extract implicit features for prediction tasks. BST excels at capturing long-term time series patterns in sequential data, making it well-suited for recommendation systems and user lifecycle value mining.

Use cases

BST handles both classification and regression tasks. The input is a behavior sequence stored as a LONGTEXT column — an ordered list of integer behavior IDs sorted by timestamp. The output is an integer or floating-point prediction, such as a payment amount, a churn probability, or a payment confirmation flag.
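
As a rough illustration of the input format described above, the following Python sketch turns timestamped behavior records into an event_list string. The function name build_event_list and the (timestamp, behavior_id) tuple layout are assumptions made for this example; they are not part of PolarDB for AI.

```python
from typing import Iterable, Tuple


def build_event_list(events: Iterable[Tuple[int, int]]) -> str:
    """Format (timestamp, behavior_id) pairs as an event_list string.

    Hypothetical helper: the tuple layout is an assumption for this sketch.
    """
    # sorted() is stable, so events with equal timestamps keep their input order.
    ordered = sorted(events, key=lambda event: event[0])
    return "[" + ", ".join(str(behavior_id) for _, behavior_id in ordered) + "]"


raw_events = [(1700000300, 153), (1700000100, 183), (1700000400, 152), (1700000200, 238)]
event_list = build_event_list(raw_events)
print(event_list)  # [183, 238, 153, 152]
```

The resulting string can be stored in the LONGTEXT column that holds the behavior sequence.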

Classification example

In a gaming operations scenario, build the BST input sequence from each player's in-game behaviors over the past 14 days. The model then predicts which paying users are likely to churn in the next 14 days. A user is considered churned if they do not log in for 14 consecutive days.
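
The churn rule above can be sketched as a labeling function. This is a minimal example that assumes login dates are available per user and that the 14-day window starts the day after a chosen cutoff date; the name is_churned and its arguments are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable


def is_churned(login_dates: Iterable[date], cutoff: date, horizon_days: int = 14) -> int:
    """Return 1 if the user has no login in the horizon_days after cutoff, else 0.

    Assumption: the window covers the days cutoff+1 through cutoff+horizon_days.
    """
    window_end = cutoff + timedelta(days=horizon_days)
    has_login = any(cutoff < day <= window_end for day in login_dates)
    return 0 if has_login else 1


cutoff = date(2026, 3, 1)
churned_label = is_churned([date(2026, 2, 27), date(2026, 3, 20)], cutoff)
active_label = is_churned([date(2026, 3, 5)], cutoff)
print(churned_label, active_label)  # 1 0
```

The returned value can serve as the target column for a classification model.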

Regression example

In the same gaming context, use the first 24 hours of new user behaviors as the input sequence. The model predicts each user's total spending over the following 7 days.
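
A possible way to prepare this regression sample is sketched below. It assumes that the 7-day spending window begins when the first 24 hours end, and that events and payments arrive as (timestamp, value) tuples; both helper names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def first_day_sequence(events: Iterable[Tuple[datetime, int]], registered_at: datetime) -> List[int]:
    """Behavior IDs from the first 24 hours after registration, ordered by time."""
    cutoff = registered_at + timedelta(hours=24)
    in_window = sorted((ts, bid) for ts, bid in events if registered_at <= ts < cutoff)
    return [bid for _, bid in in_window]


def spend_target(payments: Iterable[Tuple[datetime, float]], registered_at: datetime, days: int = 7) -> float:
    """Total spending in the `days` days that follow the first 24 hours (assumed window)."""
    start = registered_at + timedelta(hours=24)
    end = start + timedelta(days=days)
    return sum(amount for ts, amount in payments if start <= ts < end)


registered = datetime(2026, 3, 1, 12, 0)
events = [
    (datetime(2026, 3, 1, 13, 0), 183),
    (datetime(2026, 3, 1, 12, 30), 238),
    (datetime(2026, 3, 3, 9, 0), 152),  # outside the first 24 hours, ignored
]
payments = [
    (datetime(2026, 3, 2, 18, 0), 5.0),
    (datetime(2026, 3, 6, 8, 0), 10.5),
    (datetime(2026, 3, 20, 8, 0), 99.0),  # outside the 7-day window, ignored
]
sequence = first_day_sequence(events, registered)
target = spend_target(payments, registered)
print(sequence, target)  # [238, 183] 15.5
```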

Limitations

Class imbalance

BST works best when classes are roughly balanced. If the majority class has more than 20 times the samples of any minority class, preprocess the imbalanced classes using the K-means clustering algorithm in PolarDB for AI to restore a balanced class distribution before training.
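
To decide whether the 20:1 threshold above is exceeded, the label counts can be compared before training. The sketch below is illustrative only; imbalance_ratio is a hypothetical helper, not a PolarDB for AI function.

```python
from collections import Counter
from typing import Hashable, Iterable


def imbalance_ratio(labels: Iterable[Hashable]) -> float:
    """Ratio of the largest class count to the smallest class count."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    return max(counts.values()) / min(counts.values())


labels = [0] * 2100 + [1] * 100
ratio = imbalance_ratio(labels)
needs_rebalancing = ratio > 20
print(ratio, needs_rebalancing)  # 21.0 True
```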

Sequence and window size constraints

  • sequence_length must not exceed 3,000.

  • window_size must be greater than or equal to the maximum behavior ID value plus 1. If window_size exceeds 900, keep sequence_length well below the maximum to avoid memory issues.

  • When auto_heads=1, the value of int(sqrt(window_size)) + int(sqrt(sequence_length)) + 2 must not be a prime number. If it is, set auto_heads=0 and specify num_heads manually.

  • A small batch_size increases overfitting risk. The default is 16; use a larger value for more stable training.
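
The size constraints listed above can be checked before a model is created. The following sketch mirrors the documented formula with int(math.sqrt(...)); the function check_bst_sizes and its message texts are assumptions for illustration.

```python
import math
from typing import List


def is_prime(n: int) -> bool:
    """Trial-division primality test, sufficient for small values."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def check_bst_sizes(max_behavior_id: int, window_size: int,
                    sequence_length: int, auto_heads: int = 1) -> List[str]:
    """Return a list of violated constraints (empty when all checks pass)."""
    problems = []
    if sequence_length > 3000:
        problems.append("sequence_length must not exceed 3000")
    if window_size < max_behavior_id + 1:
        problems.append("window_size must be >= maximum behavior ID + 1")
    if auto_heads == 1:
        value = int(math.sqrt(window_size)) + int(math.sqrt(sequence_length)) + 2
        if is_prime(value):
            problems.append(f"{value} is prime: set auto_heads=0 and specify num_heads")
    return problems


print(check_bst_sizes(899, 900, 3000))  # [] because 30 + 54 + 2 = 86 is not prime
print(check_bst_sizes(99, 100, 25))     # one problem because 10 + 5 + 2 = 17 is prime
```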

Data format

Model creation table

| Column | Required | Type | Description | Example |
| --- | --- | --- | --- | --- |
| uid | Required | VARCHAR | ID of each data entry (user ID or product ID) | 253460731706911258 |
| event_list | Required | LONGTEXT | Behavior sequence for training. Comma-separated integer behavior IDs, sorted in ascending order by timestamp. | "[183, 238, 153, 152]" |
| target | Required | INT, FLOAT, DOUBLE | Sample label used to measure model metrics | 0 |
| val_row | Optional | INT | Row-level flag for validation split. 0 = training data; 1 = validation data. Takes effect only when version=1 and val_flag=1. When val_flag=0, only rows with val_row=0 are used. | 1 |
| other_feature | Optional | INT, FLOAT, DOUBLE, LONGTEXT | Additional features. LONGTEXT supports JSON, list, or comma-separated format. Multiple columns are allowed (e.g., other_feature1, other_feature2). | 2 |
| val_x_cols | Optional | LONGTEXT | Validation behavior sequence for parameter tuning. Takes effect only when version=0. | "[183, 238, 153, 152]" |
| val_y_cols | Optional | INT, FLOAT, DOUBLE | Validation label for parameter tuning. Takes effect only when version=0. | 1 |
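
One way to populate val_row is a deterministic split keyed on uid, so that the same user always lands in the same partition. This is only a sketch under that assumption; assign_val_row and the 20% validation share are hypothetical choices, not requirements of PolarDB for AI.

```python
import hashlib


def assign_val_row(uid: str, val_fraction: float = 0.2) -> int:
    """Return 1 (validation) or 0 (training) based on a hash of uid."""
    digest = hashlib.sha256(uid.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2 ** 64
    return 1 if bucket < val_fraction else 0


uids = ["253460731706911258", "123213", "987654"]
val_rows = {uid: assign_val_row(uid) for uid in uids}
print(val_rows)
```

Because the assignment depends only on uid, rerunning the split yields the same val_row values.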

Model evaluation table

| Column | Required | Type | Description | Example |
| --- | --- | --- | --- | --- |
| uid | Required | VARCHAR(255) | ID of each data entry | 123213 |
| event_list | Required | LONGTEXT | Behavior sequence. Same format as the training table. | "[183, 238, 153, 152]" |
| target | Required | INT, FLOAT, DOUBLE | Sample label used to calculate model errors | 0 |
| other_feature | Optional | INT, FLOAT, DOUBLE, LONGTEXT | Additional features, consistent with those used during model creation | 2 |

Model prediction table

| Column | Required | Type | Description | Example |
| --- | --- | --- | --- | --- |
| uid | Required | VARCHAR(255) | ID of each data entry | 123213 |
| event_list | Required | LONGTEXT | Behavior sequence. Same format as the training table. | "[183, 238, 153, 152]" |
| other_feature | Optional | INT, FLOAT, DOUBLE, LONGTEXT | Additional features, consistent with those used during model creation | 2 |

Model parameters

The following parameters are values of model_parameter in the CREATE MODEL statement.

| Parameter | Default | Description |
| --- | --- | --- |
| version | 0 | Model version. 0 = old version; 1 = new version (recommended). The old version supports val_x_cols and val_y_cols but not val_row, multiclass classification, or stacking. |
| model_task_type | classification | Task type. Valid values: classification, regression, multi_classification. |
| num_classes | 2 | Number of prediction categories for multiclass classification. Sample labels must be integers that start at 0, and every label must be less than this value. For example, when num_classes=3, valid labels are {0, 1, 2}. |
| batch_size | 16 | Batch size. A smaller value increases overfitting risk. |
| window_size | - | Size of the embedding space for behavior IDs. Must be greater than or equal to the maximum behavior ID value plus 1. Otherwise, a parsing error occurs. |
| sequence_length | - | Number of behavior events included in model calculations. Must not exceed 3,000. |
| success_id | - | The behavior ID that the model predicts. |
| max_epoch | 1 | Maximum number of training iterations. |
| learning_rate | 0.0002 | Learning rate. |
| loss | CrossEntropyLoss | Loss function. CrossEntropyLoss for binary classification; mse, mae, or msle for regression. |
| val_flag | 0 | Specifies whether to validate after each epoch. 0 = no validation (saves the last-epoch model); 1 = validate each epoch (saves the best-metric model; requires val_metric and val_row). |
| val_metric | loss | Metric used for epoch-level validation. See the table below. |
| auto_data_statics | off | Specifies whether to count ID occurrences in the sequence and generate statistical features. on = count; off = skip. |
| auto_heads | 1 | Specifies whether to set the number of multi-head attention heads automatically. 1 = automatic; 0 = manual (specify num_heads). When set to 1, the model may run out of GPU memory. Also verify that int(sqrt(window_size)) + int(sqrt(sequence_length)) + 2 is not a prime number. |
| num_heads | 4 | Number of multi-head attention heads. Used only when auto_heads=0. |
| x_value_cols | - | Column names to use as numeric discrete features. Cannot be empty. Values must be integers or floating-point numbers. Example: 'num_events, max_level, max_viplevel'. |
| x_statics_cols | - | Column names to use as statistical features. Cannot be empty. Each column must be LONGTEXT with fixed-length rows. Supports JSON, list, or comma-separated format. Example: 'stats_item_list, stats_event_list'. |
| x_seq_cols | - | Column names to use as sequence features. Each column must be LONGTEXT in list or comma-separated format. Example: 'event_list'. |
| data_normalization | 0 | Specifies whether to normalize columns specified by x_value_cols. 0 = off; 1 = on. |
| remove_seq_adjacent_duplicates | off | Specifies whether to remove adjacent duplicate values from columns specified by x_seq_cols. off = keep duplicates; on = remove. |
| stacking | off | Specifies whether to enhance the BST algorithm through model fusion. Valid only when model_task_type='classification'. off = no fusion; on = model fusion and deduplication. |
| stacking_model | 'gbdt,svc,rt' | Models to include in ensemble fusion. Valid only when stacking='on'. Valid values: bst, gbdt, svc, rt. Cannot be empty. |
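
To make the effect of remove_seq_adjacent_duplicates concrete, the sketch below collapses runs of identical consecutive behavior IDs into a single ID. It illustrates the idea only and assumes that is what 'on' does; the internal implementation in PolarDB for AI is not documented here.

```python
from itertools import groupby
from typing import List


def remove_adjacent_duplicates(sequence: List[int]) -> List[int]:
    """Keep one ID per run of consecutive identical IDs."""
    # groupby() groups consecutive equal items, so non-adjacent repeats survive.
    return [behavior_id for behavior_id, _ in groupby(sequence)]


deduplicated = remove_adjacent_duplicates([183, 183, 238, 153, 153, 153, 183])
print(deduplicated)  # [183, 238, 153, 183]
```

Note that the final 183 is kept because it is not adjacent to the first run of 183 values.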

Validation metrics (val_metric)

| Value | What it measures | Task type |
| --- | --- | --- |
| loss | Same loss function used during training | Classification, regression |
| f1score | Harmonic mean of precision and recall; useful when class distribution is uneven | Classification, multiclass classification |
| r2_score | Coefficient of determination: how well predictions fit the actual values | Regression |
| mse | Mean squared error: average squared difference between predictions and actual values | Regression |
| mape | Mean absolute percentage error: average percentage deviation from actual values | Regression |
| mape_plus | Variant of MAPE that measures error only on positive labels | Regression |
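
The difference between mape and mape_plus can be sketched as follows. The example assumes that "positive labels" means actual values greater than 0; the exact formula used by PolarDB for AI is not stated in this document, so treat both functions as illustrative.

```python
from typing import Sequence


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error; undefined when an actual value is 0."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape_plus(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """MAPE restricted to samples whose actual value is positive (assumed definition)."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t > 0]
    if not pairs:
        raise ValueError("no positive labels to evaluate")
    return sum(abs(t - p) / t for t, p in pairs) / len(pairs)


y_true = [0.0, 10.0, 20.0]   # the first user spent nothing
y_pred = [1.0, 8.0, 25.0]
score = mape_plus(y_true, y_pred)
print(round(score, 3))  # 0.225
```

Plain mape would divide by zero on the first sample, which is why a positive-only variant is convenient for spending predictions with many zero labels.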

Evaluation metrics

The following are valid values of the metrics parameter in the EVALUATE statement.

| Value | What it measures | Task type |
| --- | --- | --- |
| acc | Accuracy: proportion of correct predictions | Classification, multiclass classification |
| auc | Area under the ROC curve: model's ability to separate positive and negative classes | Classification, multiclass classification |
| Fscore | F1 score: harmonic mean of precision and recall, useful when class distribution is uneven | Classification, multiclass classification |
| r2_score | Coefficient of determination | Regression |
| mse | Mean squared error | Regression |
| mape | Mean absolute percentage error | Regression |
| mape_plus | Variant of MAPE for positive labels only | Regression |
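
For reference, accuracy and the F1 score can be computed from labels and predictions as shown in this sketch. It covers the binary case only; how Fscore is averaged for multiclass tasks is not described in this document, and the function names are hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Proportion of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0 or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
acc_value = accuracy(y_true, y_pred)
f1_value = f1_binary(y_true, y_pred)
print(round(acc_value, 3), round(f1_value, 3))  # 0.667 0.667
```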

Examples

The following examples use classification tasks. For other task types, adjust model_task_type and the corresponding loss and metrics parameters.

Create a model

/*polar4ai*/CREATE MODEL sequential_bst WITH (
  model_class = 'bst',
  x_cols = 'event_list,other_feature1',
  y_cols = 'target',
  model_parameter = (
    batch_size = 128,
    window_size = 900,
    sequence_length = 3000,
    success_id = 900,
    max_epoch = 2,
    learning_rate = 0.0008,
    val_flag = 1,
    x_seq_cols = 'event_list',
    x_value_cols = 'other_feature1',
    val_metric = 'f1score',
    auto_data_statics = 'on',
    data_normalization = 1,
    remove_seq_adjacent_duplicates = 'on',
    version = 1
  )
) AS (SELECT * FROM seqential_train);

seqential_train is the model creation data table.

Evaluate a model

/*polar4ai*/SELECT uid, target FROM evaluate(
  MODEL sequential_bst,
  SELECT * FROM seqential_eval
) WITH (
  x_cols = 'event_list,other_feature1',
  y_cols = 'target',
  metrics = 'Fscore'
);

seqential_eval is the model evaluation data table.

Run predictions

/*polar4ai*/SELECT uid, target FROM PREDICT(
  MODEL sequential_bst,
  SELECT * FROM seqential_test
) WITH (
  x_cols = 'event_list,other_feature1',
  mode = 'async'
);

seqential_test is the model prediction data table.