Train a click-through rate (CTR) prediction model using the Avazu dataset and deploy the complete preprocessing pipeline to Elastic Algorithm Service (EAS) for consistent offline-online inference.
Prerequisites
A workspace is created. For more information, see Create and manage workspaces.
MaxCompute resources are associated with the workspace. For more information, see Create and manage workspaces.
Dataset
This tutorial uses a subset of the Avazu dataset containing 200,000 samples: 160,000 for training and 40,000 for prediction. For complete data, see Avazu.
| Column Name | Type | Description |
| --- | --- | --- |
| id | STRING | Ad ID |
| click | DOUBLE | Whether the ad was clicked |
| dt_year | INT | The year of the click |
| dt_month | INT | The month of the click |
| dt_day | INT | The day of the click |
| dt_hour | INT | The hour of the click |
| c1 | STRING | Anonymized categorical variable |
| banner_pos | INT | Banner position |
| site_id | STRING | Site ID |
| site_domain | STRING | Site domain |
| site_category | STRING | Site category |
| app_id | STRING | Application ID |
| app_domain | STRING | Application domain |
| app_category | STRING | Application category |
| device_id | STRING | Device ID |
| device_ip | STRING | Device IP address |
| device_model | STRING | Device model |
| device_type | STRING | Device type |
| device_conn_type | STRING | Device connection type |
| c14 - c21 | DOUBLE | Anonymized categorical variables (8 columns in total) |
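To get oriented before building the workflow, the schema above can be mocked locally and its columns split into the roles the pipeline uses. This is an illustrative sketch only; the column groupings are assumptions read off the table, not the template's exact configuration:

```python
# One mock record following the table above; values mirror the sample
# request body used later in this tutorial.
record = {
    "id": "10000169349117863715", "click": 0.0,
    "dt_year": 14, "dt_month": 10, "dt_day": 21, "dt_hour": 0,
    "c1": "1005", "banner_pos": 0,
    "site_id": "1fbe01fe", "site_domain": "f3845767",
    "site_category": "28905ebd", "app_id": "ecad2386",
    "app_domain": "7801e8d9", "app_category": "07d7df22",
    "device_id": "a99f214a", "device_ip": "96809ac8",
    "device_model": "711ee120", "device_type": "1",
    "device_conn_type": "0",
    "c14": 15704.0, "c15": 320.0, "c16": 50.0, "c17": 1722.0,
    "c18": 0.0, "c19": 35.0, "c20": 100084.0, "c21": 79.0,
}

label_col = "click"
numeric_cols = [f"c{i}" for i in range(14, 22)]  # c14 - c21
categorical_cols = ["c1", "site_id", "site_domain", "site_category",
                    "app_id", "app_domain", "app_category",
                    "device_id", "device_model", "device_type",
                    "device_conn_type"]
print(len(record), len(numeric_cols), len(categorical_cols))  # 27 8 11
```

The 27 keys match the table: one ID, one label, four datetime parts, the anonymized and site/app/device categoricals, and the eight numeric c14 - c21 columns.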
Procedure
Go to the Machine Learning Designer page.
1. Log on to the PAI console.
2. In the left-side navigation pane, click Workspaces. On the Workspaces page, click the name of the workspace that you want to manage.
3. In the left-side navigation pane, choose .
Build a workflow.
On the Designer page, click the Preset Template tab.
In the template list, find the A CTR prediction solution that ensures offline and online consistency template and click Create.
In the New Workflow dialog box, configure the key parameters. You can use the default values for the other parameters.
Set Workflow Data Storage to an OSS bucket path. This path is used to store temporary data and models generated during the workflow run.
Click OK.
The workflow is created in about 10 seconds.
In the workflow list, double-click the A CTR prediction solution that ensures offline and online consistency workflow to open it.
The workflow created from the template is shown in the following figure.

The workflow processes features as follows:
Numerical features: scaled to a common range by the normalization algorithm.
Categorical features: one-hot encoded, then assembled together with the normalized numerical features into a single feature vector. The FM (factorization machine) algorithm then trains the model and performs prediction on this vector.
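The preprocessing the workflow applies can be mimicked locally for intuition. This NumPy sketch uses toy values, not the template's components: min-max normalization stands in for the normalization step, and the one-hot and concatenation steps mirror the encoding and vector assembly:

```python
import numpy as np

# Toy stand-in for the workflow's preprocessing: three samples with two
# numeric columns and one categorical column.
X_num = np.array([[15704.0, 320.0],
                  [ 4687.0, 300.0],
                  [15704.0, 216.0]])
cat = ["1005", "1002", "1005"]

# Min-max normalization of the numeric columns.
mn, mx = X_num.min(axis=0), X_num.max(axis=0)
num_vec = (X_num - mn) / (mx - mn)

# One-hot encoding of the categorical column.
vocab = sorted(set(cat))                                  # ['1002', '1005']
cat_vec = np.eye(len(vocab))[[vocab.index(v) for v in cat]]

# Vector assembly: one dense feature vector per sample, ready for FM.
features = np.hstack([num_vec, cat_vec])
print(features.shape)  # (3, 4): 2 normalized numeric + 2 one-hot columns
```

The FM component then learns pairwise feature interactions over this assembled vector, which is why the same normalization and encoding must run online at inference time: the deployed pipeline model packages these steps so the serving input is transformed identically.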
Run the workflow and view the output.
At the top of the canvas, click the Run icon.
After the workflow completes, right-click Binary Classification Evaluation-1 on the canvas and select Visual Analytics. Alternatively, click the Visual Analytics icon at the top of the canvas.
In the Binary Classification Evaluation-1 dialog box, view the prediction accuracy on the Metrics Data tab.

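For binary classification, the Metrics Data tab typically includes ranking metrics such as AUC alongside accuracy. As a reference point, this is how AUC is computed offline with scikit-learn; the labels and scores below are hypothetical toy numbers, not results from this tutorial:

```python
from sklearn.metrics import roc_auc_score

# Hypothetical click labels and FM prediction scores for six samples.
y_true  = [0, 0, 1, 1, 0, 1]
y_score = [0.12, 0.30, 0.62, 0.85, 0.45, 0.40]

# AUC is the probability that a randomly chosen clicked sample is scored
# higher than a randomly chosen non-clicked sample.
auc = roc_auc_score(y_true, y_score)
print(round(auc, 3))  # 0.889
```

Because CTR data is heavily imbalanced (far more non-clicks than clicks), AUC is usually a more informative gate for deployment than raw accuracy.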
If the metrics meet your requirements, package and deploy the entire pipeline (data preprocessing, feature engineering, and model prediction) to EAS.
At the top of the canvas, click Create Pipeline Model.
Select Normalization Batch Prediction-2. The entire downstream pipeline is automatically selected. Click Next. The selected pipeline and models are packaged into a pipeline model.

Confirm the model packaging information and click Next. The packaging task takes approximately 3-5 minutes.

Deploy the model service.
Method 1: After Run Status shows Successful, click Deploy to EAS. In the EAS console, configure Service Name and Resource Deployment Information, then click Deploy. For details, see Deploy a pipeline as an online service.
Method 2: If you closed the dialog box, click View All Tasks in the upper-right corner of the canvas. In Historical Tasks, wait until Status shows Success, then choose Actions > Model > Deploy to deploy the model service.
Alternatively, click Model List at the top of the canvas. Select the packaged model and click Deploy to EAS.
In the EAS console, find your service and click Online Debugging in the Actions column. For details, see Debug a service online.
In Request Body, enter test data matching the dataset structure:
[{"id":"10000169349117863715","click":0.0,"dt_year":14,"dt_month":10,"dt_day":21,"dt_hour":0,"C1":"1005","banner_pos":0,"site_id":"1fbe01fe","site_domain":"f3845767","site_category":"28905ebd","app_id":"ecad2386","app_domain":"7801e8d9","app_category":"07d7df22","device_id":"a99f214a","device_ip":"96809ac8","device_model":"711ee120","device_type":"1","device_conn_type":"0","c14":15704.0,"c15":320.0,"c16":50.0,"c17":1722.0,"c18":0,"c19":35.0,"c20":100084.0,"c21":79.0}]
The service processes the data through the Normalization Prediction → One-Hot Encoding Prediction → Vector Assembler → FM Prediction pipeline. The result is shown below.

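Beyond the console's Online Debugging page, the service can be called programmatically over HTTP. The sketch below builds such a request with the standard library; the endpoint URL and token are placeholders (copy the real values from the service's invocation information in the EAS console), and the actual send is left out so the sketch runs without a live service:

```python
import json
import urllib.request

# Placeholder endpoint and token: replace with the values shown for your
# deployed service in the EAS console. These are NOT real credentials.
ENDPOINT = "http://example.pai-eas.aliyuncs.com/api/predict/ctr_demo"
TOKEN = "<your-service-token>"

# Same sample record as the Online Debugging request body above.
payload = [{
    "id": "10000169349117863715", "click": 0.0,
    "dt_year": 14, "dt_month": 10, "dt_day": 21, "dt_hour": 0,
    "C1": "1005", "banner_pos": 0,
    "site_id": "1fbe01fe", "site_domain": "f3845767",
    "site_category": "28905ebd", "app_id": "ecad2386",
    "app_domain": "7801e8d9", "app_category": "07d7df22",
    "device_id": "a99f214a", "device_ip": "96809ac8",
    "device_model": "711ee120", "device_type": "1",
    "device_conn_type": "0",
    "c14": 15704.0, "c15": 320.0, "c16": 50.0, "c17": 1722.0,
    "c18": 0, "c19": 35.0, "c20": 100084.0, "c21": 79.0,
}]

body = json.dumps(payload).encode("utf-8")
req = urllib.request.Request(
    ENDPOINT, data=body,
    headers={"Authorization": TOKEN, "Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send the request and return the FM
# prediction; it is omitted here so the sketch runs offline.
print(req.get_method())  # POST
```

Because the deployed pipeline model embeds the same normalization, one-hot encoding, and vector assembly used during training, the raw record can be sent as-is; no client-side feature engineering is needed, which is what guarantees offline-online consistency.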