All Products
Search
Document Center

AnalyticDB:Use SQL to implement machine learning prediction

Last Updated:Feb 28, 2026

This topic describes how to run machine learning tasks with SQL in AnalyticDB for MySQL. You will deploy a Behavior Sequence Transformer (BST) model, covering data transformation, model creation, training, evaluation, and prediction.

Scenarios

AnalyticDB for MySQL lets you train and run machine learning models entirely through SQL, without setting up separate ML infrastructure. SQL handles the full pipeline: data preprocessing, training, and inference.

AnalyticDB for MySQL supports BST classification models. BST models are designed for scenarios where you need to understand user behavior patterns, such as:

  • Analyzing user preferences over time.

  • Predicting future user actions.

  • Delivering personalized recommendations.

BST excels at capturing long-term dependencies in sequential behavior data, making it a strong fit for gaming, e-commerce, and similar industries where user interaction sequences drive business decisions.

How the BST model works

The BST model takes a sequence of behavior event IDs as input and outputs a classification result of 0 or 1.

Consider a gaming example. A player generates the following behavior sequence during a session:

Log on, claim logon reward, accept task, battle, battle, battle, complete task, top up, battle, log out.

You convert each action into a numeric event ID, producing the sequence:

0, 1, 2, 3, 3, 3, 4, 5, 3, 6

This sequence is passed to the BST model as a comma-separated string. The model analyzes the pattern and returns a classification result of 0 or 1, indicating which predefined category the behavior sequence belongs to.

Prerequisites

Before you begin, make sure your environment meets the following requirements:

  • The cluster runs Enterprise Edition, Basic Edition, or Data Lakehouse Edition.

  • The cluster kernel version is 3.2.4.0 or later.

    Note

    To view and update the minor version, go to the Configuration Information section on the Cluster Information page in the AnalyticDB for MySQL console.

  • The AI resource group feature is enabled. This feature is in public preview. To enable it, contact technical support.

Step 1: Create an AI resource group and attach it to a regular resource group

Running machine learning tasks through SQL requires two types of resource groups:

  • AI resource group: Manages the GPU resources required for compute-intensive operations such as model training and prediction.

  • Regular resource group: Processes standard SQL queries, including those that generate training data or call prediction functions.

When you submit a SQL statement to the regular resource group, the system detects whether the statement requires AI computation. If it does, the system automatically forwards the task to the attached AI resource group for execution.

Create an AI resource group first, then attach it to a regular resource group.

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.

  2. In the navigation pane on the left, choose Cluster Management > Resource Management, and then click the Resource Groups tab.

  3. In the upper-right corner of the resource group list, click Create Resource Group to create an AI resource group.

    Parameter

    Description

    Resource Group Name

    A custom name for the resource group. The name must be 2 to 30 characters in length, start with a letter, and can contain only letters, digits, and underscores.

    Job Type

    From the drop-down list, select AI.

    Important

    If the AI option is not available, the AI resource group feature is not enabled for your instance. Contact technical support.

    Specifications

    A combination of GPU, CPU, and memory resources. You can select ADB.MLLarge.24 or ADB.MLLarge.2.

    Minimum Resources

    The minimum number of resource units.

    Maximum Resources

    The maximum number of resource units.

  4. Click OK to create the resource group.

  5. Find the target regular resource group. In the Actions column, click Modify. Use the ML Job Resubmission Rules to attach the regular resource group to the newly created AI resource group.

Step 2: Data transformation

Data transformation converts your raw data into the format that the BST model expects for training. Whether you need this step depends on the format and location of your raw data.

Note

If your raw data already meets the format requirements described below, skip this step and proceed to Step 3.

Table schema requirements

The training table must have two columns:

ColumnTypeDescription
Input feature columnStringComma-separated integers representing the behavior event ID sequence for one sample.
Target label columnIntegerA binary value (0 or 1) representing the class label.

Example: ('1,2,3',0), ('3,2,1',1)

How to transform your data

If your raw data does not match the required format, transform it as follows:

  1. Prepare a JAR package: Package your Spark data transformation program into a JAR file and upload it to an OSS bucket.

  2. Configure Spark job parameters: When you submit the Spark job, configure the required parameters. For more information, see Spark Application Configuration Parameters.

Step 3: Model creation and training

After your training data is ready, create and train the BST model using SQL.

  1. In the navigation pane on the left, choose Job Development > SQL Development.

  2. In the SQL Console, run the following statements to create and train the model:

       -- Create a model
       -- Specify the resource group
       /*+resource_group=itrain*/
       CREATE MODEL bstdemo.bst-- Model name
       OPTIONS (
         model_type='bst_classification', -- Model type
         feature_cols=(event_list),  -- Feature column
         target_cols=(target),  -- Target column to predict
         hyperparameters = (  -- Other model hyperparameters
           use_best_ckpt = 'False',
           early_stopping_patience='0'
         )
       )
       AS SELECT event_list, target FROM bstdemo.adb; -- Source of model data. The result of this query is used to train the model with the specified variables.
    
       -- Check the model training status. The training is complete when the status changes to READY.
       SHOW MODEL bstdemo.bst;
  3. After you submit the CREATE MODEL statement, training begins. Run SHOW MODEL bstdemo.bst; to check progress. Training is complete when the status changes to READY.

Step 4: Model evaluation

After training completes, evaluate the model's performance to validate its classification quality.

-- The statement format is similar to the one for creating a model.
/*resource_group=rg1*/
EVALUATE MODEL bstdemo.bst
OPTIONS (
    feature_cols=(event_list),
    target_cols=(target),
)
AS SELECT event_list, target FROM bstdemo.adb01;

Review the returned evaluation metrics to determine whether the model meets your requirements before using it for prediction.

Step 5: Model prediction

After you confirm that the evaluation results are satisfactory, use the ML_PREDICT function to generate predictions on new data. The first parameter of the function is the model name. Subsequent parameters are the model's input columns. The function returns a classification result of 0 or 1 for each row.

-- Use the trained model for prediction.
-- The first parameter of the ml_predict function is the model name. Subsequent parameters are the model's input columns.
SELECT ML_PREDICT('bstdemo.bst', event_list) FROM bstdemo.adb02;