Platform For AI: CTR prediction with consistent offline and online inference

Last Updated: Mar 11, 2026

Build a CTR prediction pipeline that maintains consistent feature transformations between offline training and online serving.

Consistency challenge

Prediction errors often stem from mismatched feature transformations between training and serving. When normalization parameters or encoding mappings are rebuilt at serving time, the model receives inputs that differ from the ones it saw during training.

PAI Designer packages preprocessing, feature engineering, and prediction into one deployable unit. Normalization parameters and one-hot encoding mappings from training are preserved and reused at inference.
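
The mismatch can be illustrated with a minimal sketch. It assumes min-max scaling as the normalization method and uses made-up values for one numerical column; the helper names (`MinMaxParams`, `fit_min_max`, `apply_min_max`) are hypothetical and are not part of PAI Designer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MinMaxParams:
    """Normalization parameters learned once, from training data."""
    lo: float
    hi: float

def fit_min_max(values):
    # Learn the parameters from whatever data is passed in.
    return MinMaxParams(min(values), max(values))

def apply_min_max(params, x):
    # Scale x with frozen parameters; a constant column maps to 0.0.
    span = params.hi - params.lo
    return 0.0 if span == 0 else (x - params.lo) / span

train_c15 = [216.0, 300.0, 320.0, 728.0]  # illustrative training values
params = fit_min_max(train_c15)

serving_batch = [300.0, 320.0]

# Consistent: the serving path reuses the training parameters.
consistent = [apply_min_max(params, x) for x in serving_batch]

# Inconsistent: the serving path refits on its own small batch,
# so the same raw values map to different model inputs.
refit = fit_min_max(serving_batch)
skewed = [apply_min_max(refit, x) for x in serving_batch]

print(consistent)
print(skewed)
```

In the second case the raw value 320.0 becomes 1.0 instead of roughly 0.2, so the model would be scoring an input it never saw in that form during training.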

Prerequisites

Prepare the following resources:

Dataset

This tutorial uses a 200,000-sample subset of the Avazu CTR prediction dataset: 160,000 training samples and 40,000 test samples.

Column name        Type     Description
id                 STRING   Advertisement ID
click              DOUBLE   Click indicator (1 = clicked, 0 = not clicked)
dt_year            INT      Year
dt_month           INT      Month
dt_day             INT      Day
dt_hour            INT      Hour
c1                 STRING   Anonymized categorical variable
banner_pos         INT      Banner position
site_id            STRING   Site ID
site_domain        STRING   Site domain
site_category      STRING   Site category
app_id             STRING   Application ID
app_domain         STRING   Application domain
app_category       STRING   Application category
device_id          STRING   Device ID
device_ip          STRING   Device IP address
device_model       STRING   Device model
device_type        STRING   Device type
device_conn_type   STRING   Device connection type
c14 - c21          DOUBLE   Anonymized categorical variables (8 columns)

Open Designer

  1. Log on to the PAI console.

  2. In the left-side navigation pane, click Workspaces. On the Workspaces page, click the workspace name.

  3. In the left-side navigation pane, choose Model Training > Visualized Modeling (Designer).

Create a pipeline from a template

  1. On the Designer page, click the Preset Templates tab.

  2. Find Click-Through Rate Prediction and click Create.

  3. In the Create Pipeline dialog box, set Data Storage to an OSS bucket path for temporary data and models. Keep default values for other parameters.

  4. Click OK. Pipeline creation takes approximately 10 seconds.

  5. In the pipeline list, select Click-Through Rate Prediction and click Open.

Workflow structure

The template processes features in two parallel paths before combining them for training:

  • Numerical features: Normalized to a common range.

  • Categorical features: One-hot encoded into binary vectors, then combined with normalized numerical features using Vector Assembler.
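
The two paths above can be sketched in plain Python. This is an illustration under stated assumptions, not the Designer components themselves: it assumes min-max scaling for the numerical path, maps categories unseen in training to an all-zero vector, and uses made-up training values. All function names are hypothetical.

```python
def fit_min_max(values):
    # Record the bounds seen in training.
    return min(values), max(values)

def normalize(bounds, x):
    lo, hi = bounds
    return 0.0 if hi == lo else (x - lo) / (hi - lo)

def fit_one_hot(values):
    # Assign each category seen in training a fixed index.
    return {cat: i for i, cat in enumerate(sorted(set(values)))}

def one_hot(mapping, value):
    vec = [0.0] * len(mapping)
    idx = mapping.get(value)
    if idx is not None:  # assumption: unseen categories stay all-zero
        vec[idx] = 1.0
    return vec

def assemble(*parts):
    # Concatenate the per-feature vectors into one feature vector.
    return [x for part in parts for x in part]

# Illustrative training columns.
train_c15 = [216.0, 320.0, 728.0]
train_conn = ["0", "2", "0", "3"]

c15_bounds = fit_min_max(train_c15)
conn_map = fit_one_hot(train_conn)

record = {"c15": 320.0, "device_conn_type": "2"}
features = assemble(
    [normalize(c15_bounds, record["c15"])],
    one_hot(conn_map, record["device_conn_type"]),
)
print(features)
```

The fitted bounds and the category mapping are the artifacts that must be carried over unchanged from training to inference.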

The combined feature vector feeds into the Factorization Machine algorithm for training and prediction.
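
For orientation, the following sketch scores one feature vector with the standard second-order Factorization Machine formula: a bias, a linear term, and pairwise interactions weighted by dot products of latent vectors. The weights are made up, and the Designer component's internal implementation and parameters may differ; `fm_score` and `sigmoid` are hypothetical helper names.

```python
import math

def fm_score(x, w0, w, v):
    """Second-order FM: w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j."""
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(v[0])):
        # Per latent dimension: 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise

def sigmoid(z):
    # Map the raw score to a click probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

x = [1.0, 0.0, 1.0]                        # assembled feature vector
w0 = 0.0                                   # bias (illustrative)
w = [0.1, 0.2, 0.3]                        # linear weights (illustrative)
v = [[1.0, 2.0], [0.5, 0.5], [3.0, 1.0]]   # latent vectors, k = 2

raw = fm_score(x, w0, w, v)
print(raw, sigmoid(raw))
```

The pairwise term lets the model learn interactions between sparse one-hot features, such as a particular site category combined with a particular device type, without storing a weight for every pair.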

Run and evaluate

  1. At the top of the canvas, click the run button to start execution.

  2. After completion, right-click Binary Classification Evaluation-1 and select Visual Analytics. Alternatively, click the evaluation icon at the top.

  3. In the Binary Classification Evaluation-1 dialog box, view prediction accuracy on the Metrics Data tab.

The evaluation displays AUC, KS (Kolmogorov-Smirnov statistic), and F1 Score. AUC is the primary metric for CTR prediction. An AUC above 0.70 indicates reasonable performance on this dataset subset.
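
As a rough guide to what these metrics measure, the sketch below computes them on a handful of made-up labels and scores. It is a small illustration, not the evaluation component's implementation; in particular, the 0.5 threshold used for F1 is an assumption.

```python
def auc(labels, scores):
    # Probability that a random positive outranks a random negative
    # (ties count as half). Quadratic in the sample count; fine for a sketch.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ks(labels, scores):
    # Largest gap between true positive rate and false positive rate
    # over all score thresholds.
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    best = 0.0
    for t in sorted(set(scores)):
        tpr = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t) / n_pos
        fpr = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t) / n_neg
        best = max(best, tpr - fpr)
    return best

def f1(labels, scores, threshold=0.5):
    # Harmonic mean of precision and recall at a fixed threshold.
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom

labels = [1, 0, 1, 0, 0]             # illustrative click labels
scores = [0.9, 0.8, 0.7, 0.3, 0.2]   # illustrative predicted scores

print(auc(labels, scores), ks(labels, scores), f1(labels, scores))
```

AUC and KS depend only on how the scores rank the records, while F1 depends on the chosen threshold, which is one reason AUC is convenient for CTR data where clicks are rare.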

Package and deploy

When the evaluation metrics meet your requirements, package the entire pipeline (preprocessing, feature engineering, and prediction) and deploy it to Elastic Algorithm Service (EAS).

Package model

  1. At the top of the canvas, click Create Pipeline Model to start packaging.

  2. Select Normalization Batch Prediction-2. The downstream pipeline is automatically selected. Click Next to package the selected pipeline and models.

  3. Confirm the packaging information and click Next. Packaging takes 3-5 minutes.

Deploy service

Deploy the packaged model using either method:

  • Method 1: After Run Status shows Successful, click Deploy to EAS. Configure Service Name and Resource Deployment Information, then click Deploy. See Deploy a pipeline as an online service.

  • Method 2: If you closed the dialog box, click View All Tasks in the upper-right corner. In Historical Tasks, wait for Status to show Success, then use one of the following options:

    • Click Actions > Model > Deploy.

    • Alternatively, click Model List at the top. Select the packaged model and click Deploy to EAS.

Test service

  1. In the EAS console, find your service and click Online Debugging in the Actions column. See Debug a service online.

  2. In Request Body, enter test data matching the dataset schema:

    [{"id":"10000169349117863715","click":0.0,"dt_year":14,"dt_month":10,"dt_day":21,"dt_hour":0,"C1":"1005","banner_pos":0,"site_id":"1fbe01fe","site_domain":"f3845767","site_category":"28905ebd","app_id":"ecad2386","app_domain":"7801e8d9","app_category":"07d7df22","device_id":"a99f214a","device_ip":"96809ac8","device_model":"711ee120","device_type":"1","device_conn_type":"0","c14":15704.0,"c15":320.0,"c16":50.0,"c17":1722.0,"c18":0,"c19":35.0,"c20":100084.0,"c21":79.0}]

  3. Click Send Request. The service processes data through the inference pipeline: Normalization Prediction > One-Hot Encoding Prediction > Vector Assembler > FM Prediction.

The response contains a prediction score for each input record. Scores closer to 1.0 indicate a higher click probability; scores closer to 0.0 indicate a lower one. Because the service runs the same packaged pipeline that was evaluated offline, including the stored normalization parameters and encoding mappings, a given record receives the same score online as it does offline.
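
Before pasting a request body into Online Debugging, it can help to check that every record carries all the columns in the dataset schema. The sketch below is one hypothetical way to do that; `EXPECTED_FIELDS`, `missing_fields`, and `build_request_body` are illustrative names, not part of EAS. Field names are compared case-insensitively because the sample request writes `C1` while the schema table lists `c1`.

```python
import json

# Column names taken from the dataset schema table in this tutorial.
EXPECTED_FIELDS = (
    ["id", "click", "dt_year", "dt_month", "dt_day", "dt_hour", "c1",
     "banner_pos", "site_id", "site_domain", "site_category",
     "app_id", "app_domain", "app_category",
     "device_id", "device_ip", "device_model", "device_type",
     "device_conn_type"]
    + [f"c{i}" for i in range(14, 22)]  # c14 through c21
)

def missing_fields(record):
    # Compare lowercased keys so "C1" and "c1" are treated alike.
    present = {key.lower() for key in record}
    return [name for name in EXPECTED_FIELDS if name not in present]

def build_request_body(records):
    # Fail early if any record lacks a schema column, then serialize
    # the records as a JSON array like the sample request above.
    for i, rec in enumerate(records):
        missing = missing_fields(rec)
        if missing:
            raise ValueError(f"record {i} is missing fields: {missing}")
    return json.dumps(records)

sample = {
    "id": "10000169349117863715", "click": 0.0, "dt_year": 14,
    "dt_month": 10, "dt_day": 21, "dt_hour": 0, "C1": "1005",
    "banner_pos": 0, "site_id": "1fbe01fe", "site_domain": "f3845767",
    "site_category": "28905ebd", "app_id": "ecad2386",
    "app_domain": "7801e8d9", "app_category": "07d7df22",
    "device_id": "a99f214a", "device_ip": "96809ac8",
    "device_model": "711ee120", "device_type": "1",
    "device_conn_type": "0", "c14": 15704.0, "c15": 320.0, "c16": 50.0,
    "c17": 1722.0, "c18": 0, "c19": 35.0, "c20": 100084.0, "c21": 79.0,
}

body = build_request_body([sample])
print(body)
```

A record with a missing column raises `ValueError` before any request is sent, which is usually easier to diagnose than an error returned by the service.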

Clean up

Remove resources to avoid ongoing charges:

  1. EAS service: In the EAS console, stop or delete the deployed service.

  2. Pipeline model: In Designer, click Model List and delete the packaged model.

  3. Workflow data: Remove temporary data stored in the OSS bucket path specified in Data Storage.

  4. Workspace resources: If this workspace was created solely for this tutorial, delete the workspace and associated MaxCompute resources.
