The Behavior Sequence Transformer (BST) algorithm uses the Transformer architecture to capture long-term time series information from user behavior sequences. It extracts implicit features from the behavior sequences and generates predictions from them. The BST algorithm is well suited to business scenarios built on behavior sequences, such as recommendation systems and user lifecycle value mining.
Scenarios
The BST algorithm supports various prediction tasks, including classification and regression:
The input is typically a behavior sequence with time series characteristics, such as user click behaviors in the past seven days. It is stored in the database as the LONGTEXT type.
The BST algorithm outputs predictions, which are integers or floating-point numbers, such as the expected user payment amounts, user churn occurrences, or payment confirmations.
Sample classification scenarios:
Predict the number of new paying users and potential churns of regular-paying and high-paying users in gaming scenarios. For example, the in-game behaviors of paying users over the previous 14 days in a gaming operation scenario are constructed into the behavior sequence input of the BST algorithm. The algorithm extracts relevant features from these behavior sequences to predict potential churns in the following 14 days. A user is considered to have churned if they do not log on for 14 consecutive days.
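The churn definition above can be turned into a label for the target column. The following is a minimal sketch, assuming that login dates are available per user and that the 14-day prediction window starts on the day after the observation period ends. The function name churn_label and the sample dates are hypothetical and are not part of PolarDB for AI.

```python
from datetime import date, timedelta


def churn_label(login_dates, observation_end, horizon_days=14):
    """Return 1 if the user never logs on during the horizon_days that
    follow observation_end, otherwise 0 (hypothetical helper)."""
    window_start = observation_end + timedelta(days=1)
    window_end = observation_end + timedelta(days=horizon_days)
    logged_on = any(window_start <= d <= window_end for d in login_dates)
    return 0 if logged_on else 1


# Assumed sample data: the observation period ends on 2024-01-14.
observation_end = date(2024, 1, 14)
active_user = [date(2024, 1, 3), date(2024, 1, 20)]   # logs on again in the window
churned_user = [date(2024, 1, 3), date(2024, 1, 14)]  # no logon after the period
labels = [
    churn_label(active_user, observation_end),
    churn_label(churned_user, observation_end),
]
print(labels)
```

The resulting 0 or 1 values would go into the target column, and the behaviors from the 14-day observation period would go into the behavior sequence.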
Sample regression scenarios:
Predict the total spending of new users in a gaming scenario. For example, the in-game behaviors of new users within the first 24 hours in a gaming operation scenario are constructed into the behavior sequence input of the BST algorithm. The algorithm extracts relevant features from these behavior sequences to predict the total spending of new users in the following seven days.
Limits
The BST algorithm works effectively when the input data is balanced in terms of class distribution. If the input data is imbalanced, such as when a majority class has more than 20 times the samples of the minority classes, we recommend that you use the K-means clustering algorithm provided in PolarDB for AI to preprocess the imbalanced classes, such as the non-paying group, and provide a balanced overall data distribution across classes. For more information, see K-means clustering algorithm (K-Means).
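Before you create a model, you can check whether the labels cross the 20:1 threshold mentioned above. The following is a minimal sketch that runs on labels exported from the target column. The helper names imbalance_ratio and needs_rebalancing are hypothetical, and the sketch only measures the ratio; it does not perform the K-means preprocessing.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Ratio between the largest and the smallest class counts."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("at least two classes are required")
    return max(counts.values()) / min(counts.values())


def needs_rebalancing(labels, threshold=20.0):
    """True when the majority class has more than `threshold` times
    the samples of the smallest class."""
    return imbalance_ratio(labels) > threshold


# Assumed sample: 2,100 non-paying users (0) and 100 paying users (1).
sample_labels = [0] * 2100 + [1] * 100
ratio = imbalance_ratio(sample_labels)
print(ratio, needs_rebalancing(sample_labels))
```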
Format of the data table for model creation
Column name | Required/Optional | Column type | Column description | Example |
uid | Required | VARCHAR | The ID of each data entry, such as the user ID or product ID. | 253460731706911258 |
event_list | Required | LONGTEXT | The behavior sequence used to create the model in the input table. The data in the sequence is separated by commas (,). Each behavior in the sequence is represented by a unique integer ID. The behaviors in the sequence are sorted in ascending order based on their timestamps. | "[183, 238, 153, 152]" |
target | Required | INT, FLOAT, DOUBLE | The label of the sample that is used to measure the algorithm model metrics. | 0 |
val_row | Optional | INT | To prevent the model from overfitting, you can specify a validation set. Valid values:
Note This parameter is typically used in conjunction with the version and val_flag parameters in the model creation parameter configuration. The following rules apply:
| 1 |
other_feature | Optional | INT, FLOAT, DOUBLE, LONGTEXT | Other features of the model. When using this parameter, you need to include the required feature column names in the x_value_cols and x_statics_cols configurations of the model creation parameters. Note
| 2 |
val_x_cols | Optional | LONGTEXT | A sequence of behaviors for model validation and parameter tuning. Each behavior in the sequence is represented by a unique integer ID. The behaviors in the sequence are separated by commas (,) and sorted in ascending order based on their timestamps. Note This parameter takes effect only when | "[183, 238, 153, 152]" |
val_y_cols | Optional | INT, FLOAT, DOUBLE | The label of the sample for the behavior sequence used for parameter tuning. Note This parameter takes effect only when | 1 |
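The event_list format described in the table can be produced from raw event logs. The following is a minimal sketch, assuming that each raw event is a (timestamp, behavior ID) pair and that the stored text is the bracketed, comma-separated form shown in the Example column. The function name build_event_list and the sample events are hypothetical.

```python
def build_event_list(events):
    """Sort (timestamp, behavior_id) pairs by timestamp in ascending order
    and render the behavior IDs as a bracketed, comma-separated string."""
    ordered = sorted(events, key=lambda event: event[0])
    ids = [int(behavior_id) for _, behavior_id in ordered]
    return "[" + ", ".join(str(i) for i in ids) + "]"


# Assumed raw events for one user, deliberately out of order.
raw_events = [
    (1700000300, 153),
    (1700000100, 183),
    (1700000200, 238),
    (1700000400, 152),
]
event_list = build_event_list(raw_events)
print(event_list)
```

Because Python's sort is stable, events that share a timestamp keep their original relative order.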
The parameters in the following table are values of the model_parameter parameter in the CREATE MODEL syntax for creating an algorithm model. You can select the appropriate parameters based on your current requirements.
Parameter name | Parameter description |
version | Specifies the model version. We recommend that you use the new version. Valid values:
Note In the old version (version=0), the val_x_cols and val_y_cols parameters in the model creation data table take effect, but the val_row parameter does not take effect. The old version does not support multiclass classification tasks or the stacking model enhancement feature. |
model_task_type | The task type. Valid values:
|
num_classes | The number of prediction categories. Default value: 2. This parameter applies to multiclass classification tasks. When you use this parameter, make sure that the sample labels in the target column are numbered from zero and that every label value is less than the value of this parameter. For example, when num_classes=3, the sample labels in the dataset can only be taken from {0, 1, 2}. |
batch_size | The batch size. A small batch size can increase the risk of overfitting in a model. Default value: 16. |
window_size | Used for embedded encoding of behavior IDs. The value must be greater than or equal to the maximum behavior ID value plus one. Otherwise, a parsing error occurs. |
sequence_length | The length of the behavior sequence involved in algorithm model calculations. The value must not exceed 3000. If the window_size parameter is greater than 900, do not set the sequence_length parameter to a value that is excessively large. |
success_id | The ID of the behavior for which the model makes a prediction. |
max_epoch | The maximum number of iterations. Default value: 1. |
learning_rate | The learning rate. Default value: 0.0002. |
loss | The loss function. Valid values:
|
val_flag | Specifies whether to perform validation after each iteration during model creation. Valid values:
|
val_metric | The metric used for validation. Valid values:
|
auto_data_statics | Specifies whether to automatically generate statistical features. Valid values:
|
auto_heads | Specifies whether to automatically set the number of attention heads used by multi-head attention. Valid values:
Note
|
num_heads | The number of attention heads. You must specify this parameter if you set the auto_heads parameter to 0. Default value: 4. |
x_value_cols | Specifies the columns to use as numeric discrete features. This parameter cannot be empty. Note
|
x_statics_cols | Specifies the columns to use as statistical features. This parameter cannot be empty, and the data in every row of the specified columns must have the same length (fixed-length). Note
|
x_seq_cols | Specifies the columns to use as sequence features. Note
|
data_normalization | Specifies whether to normalize data in the columns specified by the x_value_cols parameter. Valid values:
|
remove_seq_adjacent_duplicates | Specifies whether to remove adjacent duplicate values from the columns specified by the x_seq_cols parameter. Valid values:
|
stacking | Specifies whether to enhance the BST algorithm through model fusion. This parameter is valid only when model_task_type='classification'. Valid values:
|
stacking_model | Specifies the models to be fused for model fusion enhancement. This parameter is valid only when stacking='on'. The valid set is {'bst', 'gbdt', 'svc', 'rt'}, and this parameter cannot be empty. Default value: 'gbdt,svc,rt'. |
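Several constraints in the preceding table can be checked against the data before you run CREATE MODEL. The following is a minimal sketch, assuming that the event_list values have been exported as the bracketed strings shown earlier. All helper names are hypothetical, and drop_adjacent_duplicates only illustrates what removing adjacent duplicate values presumably means; it is not the PolarDB for AI implementation.

```python
import json

MAX_SEQUENCE_LENGTH = 3000  # upper limit stated for sequence_length


def parse_event_list(text):
    """Parse a string such as "[183, 238]" into a list of integer IDs."""
    return [int(value) for value in json.loads(text)]


def min_window_size(sequences):
    """Smallest valid window_size: the maximum behavior ID plus one."""
    return max(max(seq) for seq in sequences if seq) + 1


def sequence_length_ok(sequence_length):
    """Check that sequence_length is positive and within the stated limit."""
    return 1 <= sequence_length <= MAX_SEQUENCE_LENGTH


def labels_fit_num_classes(labels, num_classes):
    """Labels must be integers in the range 0 .. num_classes - 1."""
    return all(isinstance(l, int) and 0 <= l < num_classes for l in labels)


def drop_adjacent_duplicates(seq):
    """Collapse runs of identical neighboring IDs into a single ID."""
    result = []
    for value in seq:
        if not result or result[-1] != value:
            result.append(value)
    return result


# Assumed sample rows from an event_list column.
rows = ["[183, 238, 153, 152]", "[7, 7, 899, 12, 12, 12]"]
sequences = [parse_event_list(row) for row in rows]
window_size = min_window_size(sequences)
deduplicated = drop_adjacent_duplicates(sequences[1])
print(window_size, deduplicated)
```

In this sample the largest behavior ID is 899, so the smallest window_size that avoids a parsing error is 900.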
Format of the data table for algorithm model evaluation
Column name | Required/Optional | Column type | Column description | Example |
uid | Required | VARCHAR(255) | The ID of each data entry, such as the user ID or product ID. | 123213 |
event_list | Required | LONGTEXT | The behavior sequence in the input table, in the same format as the sequence used to create the model. The data in the sequence is separated by commas (,). Each behavior in the sequence is represented by a unique integer ID. The behaviors in the sequence are sorted in ascending order based on their timestamps. | "[183, 238, 153, 152]" |
target | Required | INT, FLOAT, DOUBLE | The label of the sample used to calculate the errors of the algorithm model. | 0 |
other_feature | Optional | INT, FLOAT, DOUBLE, LONGTEXT | Other features of the model, which are consistent with those for model construction. When using this parameter, you need to include the required feature column names in the x_value_cols and x_statics_cols configurations of the model creation parameters. Note
| 2 |
The parameters in the following table are values of the metrics parameter in the EVALUATE syntax for algorithm model evaluation. You can select the appropriate evaluation metric parameters based on your current requirements.
Parameter name | Parameter description |
metrics | The metric used for validation. Valid values:
|
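The evaluation example later in this topic passes metrics='Fscore'. As a reminder of what an F-score measures, the following sketch computes the standard binary F1 score from true and predicted labels. How PolarDB for AI computes the metric internally, including the positive class and any averaging, is not stated in this topic, so treat this only as the textbook definition. The function name f1_score and the sample labels are hypothetical.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Assumed sample: target labels and model predictions for six users.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
score = f1_score(y_true, y_pred)
print(round(score, 4))
```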
Format of the data table for algorithm model prediction
Column name | Required/Optional | Column type | Column description | Example |
uid | Required | VARCHAR(255) | The ID of each data entry, such as the user ID or product ID. | 123213 |
event_list | Required | LONGTEXT | The behavior sequence in the input table, in the same format as the sequence used to create the model. The data in the sequence is separated by commas (,). Each behavior in the sequence is represented by a unique integer ID. The behaviors in the sequence are sorted in ascending order based on their timestamps. | "[183, 238, 153, 152]" |
other_feature | Optional | INT, FLOAT, DOUBLE, LONGTEXT | Other features of the model, which are consistent with those for model construction. When using this parameter, you need to include the required feature column names in the x_value_cols and x_statics_cols configurations of the model creation parameters. Note
| 2 |
Example
Classification tasks are used in the following examples. For more task types, see model task type.
Model creation and offline learning
/*polar4ai*/CREATE MODEL sequential_bst WITH (
model_class = 'bst',
x_cols = 'event_list,other_feature1',
y_cols='target',
model_parameter=(
batch_size=128,
window_size=900,
sequence_length=3000,
success_id=900,
max_epoch=2,
learning_rate=0.0008,
val_flag=1,
x_seq_cols='event_list',
x_value_cols='other_feature1',
val_metric='f1score',
auto_data_statics='on',
data_normalization=1,
remove_seq_adjacent_duplicates='on',
version=1)) AS (SELECT * FROM seqential_train);
In this example, seqential_train is the model creation data table.
Model evaluation
/*polar4ai*/SELECT uid,target FROM evaluate(MODEL sequential_bst,
SELECT * FROM seqential_eval) WITH
(x_cols = 'event_list,other_feature1', y_cols='target', metrics='Fscore');
In this example, seqential_eval is the model evaluation data table.
Model prediction
/*polar4ai*/SELECT uid,target FROM PREDICT(MODEL sequential_bst, SELECT * FROM seqential_test) WITH
(x_cols= 'event_list,other_feature1',mode='async');
In this example, seqential_test is the model prediction data table.