Run a complete machine learning pipeline — data preparation, model training, evaluation, and prediction — directly in AnalyticDB for MySQL using SQL, without leaving your database environment.
This guide walks through deploying a behavior sequence transformer (BST) model to classify user behavior sequences. The BST model accepts a sequence of behavior event IDs as input and returns a binary classification result (0 or 1).
How it works
All ML jobs in AnalyticDB for MySQL use two types of resource groups:
AI resource group: manages GPU resources for compute-intensive operations such as model training and inference.
General resource group: handles regular SQL queries, such as generating training data and running prediction functions.
When you submit an ML-related SQL statement, it is first processed by the general resource group. If the statement requires AI compute, it is automatically forwarded to the associated AI resource group.
The full workflow spans five steps:
-- Step 1: Set up resource groups (one-time console setup)
-- Step 2: (Optional) Transform raw data into the required format
-- Step 3: Create and train a model
/*+resource_group=itrain*/
CREATE MODEL bstdemo.bst
OPTIONS (
model_type='bst_classification',
feature_cols=(event_list),
target_cols=(target),
hyperparameters = (
use_best_ckpt = 'False',
early_stopping_patience='0'
)
)
AS SELECT event_list, target FROM bstdemo.adb;
-- Step 4: Evaluate the model
/*+resource_group=rg1*/
EVALUATE MODEL bstdemo.bst
OPTIONS (
feature_cols=(event_list),
target_cols=(target)
)
AS SELECT event_list, target FROM bstdemo.adb01;
-- Step 5: Run predictions
SELECT ML_PREDICT('bstdemo.bst', event_list) FROM bstdemo.adb02;
Use cases
BST models are suited for scenarios where you need to analyze sequential user behavior patterns to predict outcomes or provide personalized recommendations:
Gaming: Capture long-term dependencies between player actions (login, accept task, fight, recharge) to predict behavior categories such as churn risk or purchase likelihood.
E-commerce: Analyze browsing and purchase sequences to recommend products or predict conversion.
About the BST model
The BST model processes behavior sequence data. It accepts a sequence of behavior event IDs in string format and returns 0 or 1 as the classification result.
For example, a player's in-game activity might produce this behavior sequence: log on, receive logon rewards, accept tasks, fight, fight, fight, complete tasks, recharge, fight, and log out. This sequence maps to the following event ID string passed to the model: 0,1,2,3,3,3,4,5,3,6. The model analyzes the sequence and returns a classification result indicating whether the behavior matches a predefined category.
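The encoding in this example can be sketched in a few lines of Python. This is only an illustration: the `EVENT_IDS` mapping and the `encode_sequence` helper are hypothetical names, and the ID assigned to each event is inferred from the example string above rather than defined by AnalyticDB for MySQL.

```python
# Hypothetical mapping inferred from the example above; real event IDs
# depend on how your application defines its events.
EVENT_IDS = {
    "log on": 0,
    "receive logon rewards": 1,
    "accept tasks": 2,
    "fight": 3,
    "complete tasks": 4,
    "recharge": 5,
    "log out": 6,
}


def encode_sequence(events):
    """Turn an ordered list of event names into a comma-separated ID string."""
    return ",".join(str(EVENT_IDS[name]) for name in events)


session = [
    "log on", "receive logon rewards", "accept tasks",
    "fight", "fight", "fight",
    "complete tasks", "recharge", "fight", "log out",
]
encoded = encode_sequence(session)
print(encoded)  # 0,1,2,3,3,3,4,5,3,6
```

The order of events is preserved, and repeated events (such as the three consecutive fights) produce repeated IDs.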
Prerequisites
Before you begin, make sure you have:
An AnalyticDB for MySQL cluster running Enterprise Edition, Basic Edition, or Data Lakehouse Edition, with minor version 3.2.4.0 or later
To view and update the minor version, log in to the AnalyticDB for MySQL console and go to the Configuration Information section of the Cluster Information page.
The AI resource group feature enabled
Note: The AI resource group feature is in public preview. To enable it, contact technical support.
Limitations
The BST model supports binary classification only — it returns 0 or 1.
Input feature values must be a comma-separated string of integers (for example, '1,2,3').
The result label column must contain binary values (0 or 1).
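Checking rows against these limits before loading them can catch format problems early. The following is a minimal sketch, not part of the product: `is_valid_row` is a hypothetical helper, and it assumes that event IDs are non-negative integers written without spaces or empty entries, which the limitations above do not state explicitly.

```python
import re

# Assumption: IDs are non-negative integers, with no spaces and no empty entries.
_EVENT_LIST_RE = re.compile(r"[0-9]+(,[0-9]+)*")


def is_valid_row(event_list, target):
    """Return True if a (feature, label) pair satisfies the limits listed above."""
    if not isinstance(event_list, str):
        return False
    if _EVENT_LIST_RE.fullmatch(event_list) is None:
        return False
    return target in (0, 1)


print(is_valid_row("1,2,3", 0))   # True
print(is_valid_row("1,,3", 1))    # False: empty entry
print(is_valid_row("1,2,3", 2))   # False: label is not 0 or 1
```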
Step 1: Set up resource groups
To run ML jobs, you need an AI resource group for GPU compute and a general resource group linked to it.
Log in to the AnalyticDB for MySQL console. In the upper-left corner, select a region. In the left navigation pane, click Clusters, then click the cluster ID.
In the left navigation pane, choose Cluster Management > Resource Management. On the Resource Management page, click the Resource Groups tab.
In the upper-right corner, click Create Resource Group. Configure the following parameters:
Resource group name: 2 to 30 characters; may contain letters, digits, and underscores (_); must start with a letter.
Job type: Select AI from the drop-down list. If no AI option appears, contact technical support to enable the AI resource group feature.
Specifications: Select ADB.MLLarge.24, ADB.MLLarge.2, or ADB.MLAdvavced.6.
Minimum resources: The minimum number of resources.
Maximum resources: The maximum number of resources.
Click OK. The AI resource group is created.
Find the general resource group to associate and click Modify in the Actions column. In the Modify Resource Group panel, go to the ML Job Resubmission Rules section and associate the general resource group with the AI resource group you created.
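If you generate resource group names in a script, the naming rule for the resource group name parameter can be checked up front. This is a sketch under one assumption: that "letters" means ASCII letters. `is_valid_resource_group_name` is a hypothetical helper, and the console remains the authority on what it accepts.

```python
import re

# 2 to 30 characters in total: one leading letter followed by 1 to 29
# letters, digits, or underscores. Assumption: letters are ASCII only.
_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{1,29}")


def is_valid_resource_group_name(name):
    """Return True if name follows the naming rule described above."""
    return _NAME_RE.fullmatch(name) is not None


print(is_valid_resource_group_name("itrain"))   # True
print(is_valid_resource_group_name("1train"))   # False: starts with a digit
print(is_valid_resource_group_name("a"))        # False: shorter than 2 characters
```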
Step 2: Prepare your training data
Model training requires data in a specific table schema:
Feature column: a string of comma-separated integers, where each value is a behavior event ID (for example, '1,2,3')
Label column: a binary integer (0 or 1) indicating the classification category
Example rows: ('1,2,3', 0), ('3,2,1', 1).
If your raw data is already in this format, skip to Step 3.
If your raw data needs transformation:
Upload the JAR package containing your Spark data processing program to an Object Storage Service (OSS) bucket.
Submit a Spark job with the required parameters. For details, see Spark application configuration parameters.
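The transformation itself depends on your raw data. As an illustration of its shape only, the sketch below groups per-event records into one row per user in plain Python. It is not a Spark program, and the input layout (`user_id`, `timestamp`, `event_id`), the `labels` dictionary, and the `build_training_rows` helper are all hypothetical.

```python
from collections import defaultdict


def build_training_rows(events, labels):
    """Build (event_list, target) rows.

    events: iterable of (user_id, timestamp, event_id) tuples (assumed layout).
    labels: dict mapping user_id to 0 or 1 (assumed to be known per user).
    """
    per_user = defaultdict(list)
    for user_id, timestamp, event_id in events:
        per_user[user_id].append((timestamp, event_id))

    rows = []
    for user_id in sorted(per_user):
        ordered = sorted(per_user[user_id])  # tuples sort by timestamp first
        event_list = ",".join(str(event_id) for _, event_id in ordered)
        rows.append((event_list, labels[user_id]))
    return rows


raw_events = [
    ("u1", 1, 1), ("u2", 1, 3), ("u1", 2, 2),
    ("u2", 2, 2), ("u1", 3, 3), ("u2", 3, 1),
]
rows = build_training_rows(raw_events, {"u1": 0, "u2": 1})
print(rows)  # [('1,2,3', 0), ('3,2,1', 1)]
```

The output matches the example rows shown earlier in this step: one feature string and one binary label per row.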
Step 3: Create and train a model
In the left navigation pane, choose Job Development > SQL Development.
On the SQLConsole tab, run the following statements. The /*+resource_group=itrain*/ hint routes the job to the AI resource group named itrain.
-- Create and train a BST model
/*+resource_group=itrain*/
CREATE MODEL bstdemo.bst
OPTIONS (
model_type='bst_classification', -- Model type
feature_cols=(event_list), -- Input feature column
target_cols=(target), -- Result label column
hyperparameters = (
use_best_ckpt = 'False', -- Use the last checkpoint rather than the best
early_stopping_patience='0' -- Disable early stopping
)
)
AS SELECT event_list, target FROM bstdemo.adb; -- Training data source
Check training status. Training is complete when the status is READY.
SHOW MODEL bstdemo.bst;
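If you run this step from a script rather than the console, the status check can be wrapped in a polling loop. The sketch below is deliberately database-agnostic: `fetch_status` is a hypothetical callable that you supply, which would run SHOW MODEL through your MySQL client and return the status as a string. How the status appears in the SHOW MODEL output is not described here, so that extraction is left to the caller.

```python
import time


def wait_until_ready(fetch_status, poll_seconds=30.0, timeout_seconds=3600.0):
    """Call fetch_status() repeatedly until it returns 'READY'.

    Raises TimeoutError if the status is still not READY once
    timeout_seconds have elapsed.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status == "READY":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"model not READY, last status: {status!r}")
        time.sleep(poll_seconds)
```

The default interval and timeout are arbitrary placeholders; choose values that fit your training data size.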
Step 4: Evaluate the model
Run EVALUATE MODEL against a held-out evaluation dataset to verify model accuracy. The /*+resource_group=rg1*/ hint routes the job to the resource group named rg1.
/*+resource_group=rg1*/
EVALUATE MODEL bstdemo.bst
OPTIONS (
feature_cols=(event_list),
target_cols=(target)
)
AS SELECT event_list, target FROM bstdemo.adb01;
The query returns evaluation metrics that indicate how well the model classifies behavior sequences on data it has not seen during training. Use these metrics to decide whether the model is ready for production use.
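This guide does not list which metrics EVALUATE MODEL returns. As background for reading them, the sketch below shows how three common binary classification metrics (accuracy, precision, and recall) are computed from true labels and predictions. `binary_metrics` is a hypothetical helper, the sample values are made up, and the result is not a reproduction of the EVALUATE MODEL output.

```python
def binary_metrics(labels, predictions):
    """Compute accuracy, precision, and recall for 0/1 labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # true positives
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # true negatives
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # false positives
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels and predictions, for illustration only.
metrics = binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print(metrics)
```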
Step 5: Run predictions
Pass feature columns from any table to ML_PREDICT() to classify each row. The first argument is the model name; the second is the input feature column.
SELECT ML_PREDICT('bstdemo.bst', event_list) FROM bstdemo.adb02;
The function returns 0 or 1 for each row, indicating the predicted classification category.
What's next
Spark application configuration parameters — configure Spark jobs for data preprocessing
View and update the minor version of a cluster — upgrade your cluster to meet the version requirement