All Products
Search
Document Center

AnalyticDB:Use SQL to implement machine learning prediction

Last Updated:Nov 24, 2025

This topic describes how to use SQL in AnalyticDB for MySQL to perform machine learning tasks by deploying a Behavior Sequence Transformer (BST) model. The process covers the entire workflow, from data transformation and model creation to training, evaluation, and prediction.

Scenarios

You can use SQL queries to quickly perform model training and inference. SQL simplifies the entire process, including data pre-processing, training, and inference.

AnalyticDB for MySQL supports BST models. You can use these models for scenarios that require understanding user behavior patterns, analyzing user preferences, predicting future trends, or providing personalized recommendations. For example, in the gaming or e-commerce industries, you can capture long-term dependencies in user behavior. This helps you better understand and predict user actions and preferences to deliver personalized services and recommendations.

More about BST models

The BST model processes behavior sequence data. It takes a sequence of behavior event IDs as input and outputs a 0 or 1 as the classification result.

For example, in a game, you can record a player's interactions and convert them into a series of behavior events.

A player in a game might generate the following behavior sequence: log on, claim logon reward, accept task, battle, battle, battle, complete task, top up, battle, log out.

This behavior sequence is then converted into the following sequence of event IDs: 0, 1, 2, 3, 3, 3, 4, 5, 3, 6. The sequence is then passed to the BST model as a string. The model analyzes the sequence and returns a classification result of 0 or 1. This result indicates which predefined category the behavior sequence belongs to.

Prerequisites

  • The cluster is an Enterprise Edition, Basic Edition, or Data Lakehouse Edition cluster.

  • The cluster kernel version must be 3.2.4.0 or later.

    Note

    To view and update the minor version of an AnalyticDB for MySQL cluster, log on to the AnalyticDB for MySQL console and go to the Configuration Information section of the Cluster Information page.

  • The AI resource group feature must be enabled. This feature is in public preview. To enable it, contact technical support.

Step 1: Create an AI resource group and attach it to a regular resource group

To run machine learning tasks using SQL, you need two types of resource groups:

  • AI resource group: This group is designed to support machine learning tasks. It manages the GPU resources required for these tasks and supports compute-intensive operations in model training and prediction.

  • Regular resource group: This group is used to process standard SQL queries, such as generating training data or running prediction functions.

SQL statements are first submitted to the regular resource group. If the system detects that a SQL statement requires AI computation, the task is automatically forwarded to the attached AI resource group for execution.

Therefore, you must first create an AI resource group and attach it to a regular resource group.

  1. Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.

  2. In the navigation pane on the left, choose Cluster Management > Resource Management, and then click the Resource Groups tab.

  3. In the upper-right corner of the resource group list, click Create Resource Group to create an AI resource group.

    Parameter

    Description

    Resource Group Name

    A custom name for the resource group. The name must be 2 to 30 characters in length, start with a letter, and can contain only letters, digits, and underscores.

    Job Type

    From the drop-down list, select AI.

    Important

    If the AI option is not available, the AI resource group feature is not enabled for your instance. Contact technical support.

    Specifications

    A combination of GPU, CPU, and memory resources. You can select ADB.MLLarge.24, ADB.MLLarge.2, or ADB.MLAdvavced.6.

    Minimum Resources

    The minimum number of resource units.

    Maximum Resources

    The maximum number of resource units.

  4. Click OK to create the resource group.

  5. Find the target regular resource group. In the Actions column, click Modify. Use the ML Job Resubmission Rules to attach the regular resource group to the newly created AI resource group.

Step 2: Data transformation

Data transformation is the process of converting raw data into a format suitable for model training. This process depends on three key factors: the format of the raw data, its storage location, and the input data format required by the model.

Table schema requirements for model training data

  • Input feature column: This column must be a string that contains a series of comma-separated integers. Each entry represents the feature vector of a sample.

  • Target label column: This column contains two possible values, such as 0 or 1, to represent the class labels in a classification task.

Example table schema: ('1,2,3',0), ('3,2,1',1).

Data transformation method

If your raw data format differs from the format required for model training, use the following method to transform the data:

  1. Prepare a JAR package: Package your Spark data transformation program into a JAR file and upload it to an OSS bucket.

  2. Configure Spark job parameters: When you submit the Spark job, configure the required parameters. For more information about the parameters, see Spark Application Configuration Parameters.

If your raw data already meets the format requirements for model training, you can skip this step.

Step 3: Model creation and training

Create a machine learning model, define its parameters, specify the training data, and check the model status.

  1. In the navigation pane on the left, choose Job Development > SQL Development.

  2. In the SQL Console, use model statements to create and train the model.

-- Create a model
-- Specify the resource group
/*+resource_group=itrain*/
CREATE MODEL bstdemo.bst-- Model name
OPTIONS (
  model_type='bst_classification', -- Model type
  feature_cols=(event_list),  -- Feature column
  target_cols=(target),  -- Target column to predict
  hyperparameters = (  -- Other model hyperparameters
    use_best_ckpt = 'False',
    early_stopping_patience='0'
  )
)
AS SELECT event_list, target FROM bstdemo.adb; -- Source of model data. The result of this query is used to train the model with the specified variables.

-- Check the model training status. The training is complete when the status changes to READY.
SHOW MODEL bstdemo.bst;

Step 4: Model evaluation

Evaluate the created machine learning model to validate its performance.

-- The statement format is similar to the one for creating a model.
/*resource_group=rg1*/
EVALUATE MODEL bstdemo.bst
OPTIONS (
    feature_cols=(event_list), 
    target_cols=(target),  
)
AS SELECT event_list, target FROM bstdemo.adb01;

Step 5: Model prediction

Select feature columns from a table, pass the data to the trained model, and obtain prediction results based on this input data.

-- Use the trained model for prediction.
-- The first parameter of the ml_predict function is the model name. Subsequent parameters are the model's input columns.
SELECT ML_PREDICT('bstdemo.bst', event_list) FROM bstdemo.adb02;