Platform for AI: EasyRec Processor

Last Updated: May 18, 2026

Elastic Algorithm Service (EAS) includes a built-in EasyRec processor that lets you deploy recommendation models trained with EasyRec or TensorFlow as a scoring service with integrated feature engineering. By jointly optimizing feature engineering and the TensorFlow model, the EasyRec processor delivers a high-performance scoring service. This topic describes how to deploy and call an EasyRec model service.

Background information

The EasyRec Processor is an inference service built on the PAI-EAS processor specification (see Develop custom processors by using C or C++). It is used in two scenarios:

  • For deep learning models trained with feature generation (FG) and EasyRec, the EasyRec Processor significantly improves scoring performance by caching item features in memory and optimizing feature transformation and inference performance. You can also use FeatureStore to manage online and real-time features. Custom recommendation solutions built on the PAI-Rec recommendation system development platform generate code that streamlines training, feature transformation, and inference optimization. Combined with the PAI-Rec DPI engine, this approach enables rapid model deployment and service integration, reducing costs and improving development efficiency.

  • The EasyRec Processor can also serve models trained with EasyRec or TensorFlow without the Feature Generator. This process is called running the EasyRec Processor in bypass mode.

The following figure shows the architecture of a recommendation engine based on the EasyRec Processor.

[Figure: architecture of a recommendation engine based on the EasyRec Processor]

Note: The processor also supports offline data from MaxCompute.

The EasyRec Processor includes the following modules:

  • Item Feature Cache: Caches features from FeatureStore in memory to reduce network overhead and load on FeatureStore. It also supports incremental updates, such as real-time feature updates.

  • Feature Generator: A feature engineering module (see Feature generation overview and configuration) that uses the same implementation for offline and online feature processing, which keeps the two consistent. The implementation builds on feature engineering solutions proven at Taobao. For FG-related concepts, see Concepts of data fields, data features, and FG features in EasyRec. To extend FG with custom feature operators, see Customize feature operators.

  • TFModel: Loads SavedModel files exported from EasyRec and uses Blade to optimize model inference on CPUs and GPUs.

  • Feature Instrumentation and Incremental Model Update modules: These modules primarily support real-time training scenarios. For more information, see Real-time training.
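The Item Feature Cache behavior described above (load features into memory, reload them once per period, and accept incremental updates in between) can be pictured with a small sketch. This is a conceptual illustration only, not the processor's implementation: the class name, the `loader` callback, and the `clock` parameter are all hypothetical names introduced for this example.

```python
import time
from typing import Callable, Dict, Optional


class ItemFeatureCache:
    """Toy in-memory cache that reloads all item features once per period."""

    def __init__(self, loader: Callable[[], Dict[str, dict]],
                 period_minutes: float,
                 clock: Callable[[], float] = time.monotonic):
        self._loader = loader                  # hypothetical: returns {item_id: features}
        self._period_s = period_minutes * 60   # the period is expressed in minutes
        self._clock = clock                    # injectable so the sketch can be tested
        self._features: Dict[str, dict] = {}
        self._loaded_at: Optional[float] = None

    def _refresh_if_stale(self) -> None:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._period_s:
            # Full reload: copy each item's features so later updates stay local.
            self._features = {k: dict(v) for k, v in self._loader().items()}
            self._loaded_at = now

    def get(self, item_id: str) -> Optional[dict]:
        """Return cached features for one item, reloading first if the cache expired."""
        self._refresh_if_stale()
        return self._features.get(item_id)

    def apply_update(self, item_id: str, changed: dict) -> None:
        """Merge an incremental (for example, real-time) update into one item."""
        self._refresh_if_stale()
        self._features.setdefault(item_id, {}).update(changed)
```

Between reloads, every lookup is served from memory, which is where the reduced network overhead and reduced load on the feature source come from.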

Limitations

CPU inference is supported only on g6, g7, and g8 general-purpose instance families (Intel CPUs only).

GPU inference is supported on T4, A10, GU30, L20, 3090, and 4090 GPUs, but not P100.

For details, see General-purpose (g series).

Versions

The EasyRec Processor is actively developed. We recommend using the latest version to deploy your inference service. Newer versions offer more features and improved performance.

Version list

| Processor name | Release date | TensorFlow version | New features |
| --- | --- | --- | --- |
| easyrec | 20230608 | 2.10 | Supports FeatureGenerator and the item feature cache; supports online deep learning; supports Faiss vector retrieval; supports GPU inference. |
| easyrec-1.2 | 20230721 | 2.10 | Optimizes weighted category embedding. |
| easyrec-1.3 | 20230802 | 2.10 | Supports loading item features from MaxCompute to the item feature cache. |
| easyrec-1.6 | 20231006 | 2.10 | Automatic feature extension; GPU placement optimization; supports saving requests to the model directory with save_req. |
| easyrec-1.7 | 20231013 | 2.10 | Optimizes Keras model performance. |
| easyrec-1.8 | 20231101 | 2.10 | Supports the cloud version of the feature store. |
| easyrec-kv-1.8 | 20231220 | DeepRec (deeprec2310) | Supports DeepRec EmbeddingVariable. |
| easyrec-1.9 | 20231222 | 2.10 | Fixes graph optimization issues for TagFeature and RawFeature. |
| easyrec-2.4 | 20240826 | 2.10 | Supports FeatureDB in the feature store C++ SDK; supports STS tokens in the feature store C++ SDK; supports the double (float64) data type for requests. |
| easyrec-2.9 | 20250718 | 2.10 | Integrates version 0.7.0 of the FeatureGenerator library. |
| easyrec-3.0 | 20251025 | 2.10 | Integrates version 0.7.4 of the FeatureGenerator library; improves performance; fixes an issue with parsing new operators from the updated FG library. |
| easyrec-3.1 | 20260116 | 2.10.1 | Upgrades the FG library to version 1.0.1; upgrades the FS SDK to version 20251117. |
| easyrec-3.2 | 20260209 | 2.10.1 | Upgrades the FS SDK to version 20260202. |
| easyrec-3.3 | 20260330 | 2.10.1 | Upgrades the FG library to version 1.0.2; upgrades the FS SDK to version 20260305. |
| easyrec-3.4 | 20260415 | 2.10.1 | Upgrades the FG library to version 1.0.3. |
| easyrec-3.5 | 20260515 | 2.10.1 | Upgrades the FG library to version 1.0.5. |

Step 1: Deploy the service

When deploying an EasyRec model service with the eascmd client, set the Processor type to easyrec-{version}. For more information about how to deploy a service by using the client, see Service deployment: EASCMD. The following sections provide example service configuration files.

New FeatureGenerator library (fg_mode=normal)

This example uses the PyOdps3 node type. In this mode, you can use the new version of the FeatureGenerator library. This library provides a rich set of built-in transformation operators and supports custom operators, complex input types like arrays and maps, and DAG-based feature dependencies.

The following example uses PAI-FeatureStore to manage feature data. Replace the ${fs_project} and ${fs_model} variables in the script with their actual values. For more details, see Step 2: Create and deploy an EAS model service.

import json
import os

service_name = 'ali_rec_rnk_with_fg'

config = {
  'name': service_name,
  'metadata': {
    "cpu": 8,
    #"cuda": "11.2",
    "gateway": "default",
    "gpu": 0,
    "memory": 32000,
    "rolling_strategy": {
        "max_unavailable": 1
    },
    "rpc": {
        "enable_jemalloc": 1,
        "max_queue_size": 256
    }
  },
  "processor_envs": [
    {
      "name": "ADAPTE_FG_CONFIG",
      "value": "true"
    }
  ],
  "model_path": "",
  "processor": "easyrec-3.5",
  "storage": [
    {
      "mount_path": "/home/admin/docker_ml/workspace/model/",
      "oss": {
        "path": "oss://easyrec/ali_rec_sln_acc_rnk/20250722/export/final_with_fg"
      }
    }
  ],
  # When you change fg_mode, the invocation method must also be changed.
  # If fg_mode is 'normal' or 'tf', use the EasyRecRequest SDK.
  # If fg_mode is 'bypass', use the TFRequest SDK.
  'model_config': {
    'outputs': 'probs_ctr,probs_cvr',
    'fg_mode': 'normal',
    'steady_mode': True,
    'period': 2880,
    'access_key_id': f'{o.account.access_id}',
    'access_key_secret': f'{o.account.secret_access_key}',
    "load_feature_from_offlinestore": True,
    'region': 'cn-shanghai',
    'fs_project': '${fs_project}',
    'fs_model': '${fs_model}',
    'fs_entity': 'item',
    'featuredb_username': 'guest',
    'featuredb_password': '123456',
    'log_iterate_time_threshold': 100,
    'iterate_featuredb_interval': 5,
    'mc_thread_pool_num': 1,
  }
}

with open('echo.json', 'w') as output_file:
    json.dump(config, output_file)

os.system(f'/home/admin/usertools/tools/eascmd -i {o.account.access_id} -k {o.account.secret_access_key} -e pai-eas.cn-shanghai.aliyuncs.com create echo.json')
# os.system(f'/home/admin/usertools/tools/eascmd -i {o.account.access_id} -k {o.account.secret_access_key} -e pai-eas.cn-shanghai.aliyuncs.com modify {service_name} -s echo.json')

Note: Change the values of the featuredb_username and featuredb_password parameters to a valid username and password.

TF operator version of FeatureGenerator (fg_mode=tf)

Important: The TF operator version of FeatureGenerator supports only a limited set of built-in features: id_feature, raw_feature, combo_feature, lookup_feature, match_feature, and sequence_feature. Custom FeatureGenerator operators are not supported.

The following shell script is used for deployment. The script includes the AccessKey ID and AccessKey Secret in plaintext. This approach is straightforward but does not demonstrate how to use PAI-FeatureStore or load data from MaxCompute to reduce the load on Hologres.

We recommend that you use PAI-FeatureStore and load data from MaxCompute. For more information, see Step 2: Create and deploy an EAS model service. The referenced topic describes a more secure deployment method using a Python script, the DataWorks built-in o object, and a temporary Security Token Service (STS) token. In that example, load_feature_from_offlinestore is set to True.

bizdate=$1
# When you change fg_mode, the invocation method must also be changed. If fg_mode is 'normal' or 'tf', use the EasyRecRequest SDK. If fg_mode is 'bypass', use the TFRequest SDK.
cat << EOF > echo.json
{
  "name":"ali_rec_rnk_with_fg",
  "metadata": {
    "instance": 2,
    "rpc": {
      "enable_jemalloc": 1,
      "max_queue_size": 100
    }
  },
  "cloud": {
    "computing": {
      "instance_type": "ecs.g7.large",
      "instances": null
    }
  },
  "model_config": {
    "remote_type": "hologres",
    "url": "postgresql://<AccessKeyID>:<AccessKeySecret>@<endpoint>:<port>/<database>",
    "tables": [{"name":"<schema>.<table_name>","key":"<index_column_name>","value": "<column_name>"}],
    "period": 2880,
    "fg_mode": "tf",
    "outputs":"probs_ctr,probs_cvr"
  },
  "model_path": "",
  "processor": "easyrec-3.5",
  "storage": [
    {
      "mount_path": "/home/admin/docker_ml/workspace/model/",
      "oss": {
        "path": "oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final_with_fg"
      }
    }
  ]
}

EOF
# Run the deployment command.
eascmd  create echo.json
# eascmd -i <AccessKeyID>  -k  <AccessKeySecret>   -e <endpoint> create echo.json
# Run the update command.
eascmd modify ali_rec_rnk_with_fg -s echo.json
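Because the service description is hand-written JSON inside a heredoc, a quick syntax check before calling eascmd can catch mistakes such as a trailing comma. The following is a minimal sketch; the function name and the list of required keys are assumptions drawn from the examples on this page, not an official schema.

```python
import json

# Keys that every example on this page sets (an assumption, not an official schema).
REQUIRED_TOP_LEVEL = ("name", "processor", "model_config")


def check_service_config(text: str) -> dict:
    """Parse a service description and check the fields the examples rely on.

    Raises ValueError with a readable message if the text is not strict JSON
    or if an expected key is missing.
    """
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(
            f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
        )
    if not isinstance(config, dict):
        raise ValueError("the service description must be a JSON object")
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in config]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")
    if "fg_mode" not in config["model_config"]:
        raise ValueError("model_config.fg_mode is required")
    return config
```

A typical use would be to read echo.json, pass its contents to `check_service_config`, and only then run the create or modify command.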

Bypass FeatureGenerator (fg_mode=bypass)

If you do not use FeatureGenerator, you must assemble the request on the client before you call the EasyRec processor. For more information, see How to use EAS for inference without training with EasyRec.

bizdate=$1
# When you change fg_mode, the invocation method must also be changed. If fg_mode is 'normal' or 'tf', use the EasyRecRequest SDK. If fg_mode is 'bypass', use the TFRequest SDK.
cat << EOF > echo.json
{
  "name":"ali_rec_rnk_no_fg",
  "metadata": {
    "instance": 2,
    "rpc": {
      "enable_jemalloc": 1,
      "max_queue_size": 100
    }
  },
  "cloud": {
    "computing": {
      "instance_type": "ecs.g7.large",
      "instances": null
    }
  },
  "model_config": {
    "fg_mode": "bypass"
  },
  "processor": "easyrec-3.5",
  "processor_envs": [
    {
      "name": "INPUT_TILE",
      "value": "2"
    }
  ],
  "storage": [
    {
      "mount_path": "/home/admin/docker_ml/workspace/model/",
      "oss": {
        "path": "oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final/"
      }
    }
  ],
  "warm_up_data_path": "oss://easyrec/ali_rec_sln_acc_rnk/rnk_warm_up.bin"
}

EOF
# Run the deployment command.
eascmd  create echo.json
# eascmd -i <AccessKeyID>  -k  <AccessKeySecret>   -e <endpoint> create echo.json
# Run the update command.
eascmd modify ali_rec_rnk_no_fg -s echo.json

The following table describes the key parameters. For information about other parameters, see JSON deployment.

Parameter

Required

Description

Example

processor

Yes

The name of the EasyRec processor.

"processor": "easyrec"

fg_mode

Yes

The feature engineering mode. Based on the selected mode, you must use the corresponding SDK and request construction method when calling the service.

  • normal (recommended):

    • Description: Uses the FeatureGenerator library for feature transformation and feeds the output to the model. This mode provides a rich set of built-in FeatureGenerator operators, supports custom operators, and supports DAG-based feature dependencies.

    • Invocation method: The client must use the EasyRecRequest SDK and pass only high-level features, such as user IDs and item ID lists.

  • tf:

    • Description: Embeds FeatureGenerator as a TensorFlow operator into the TensorFlow computation graph and performs graph optimization for higher performance.

    • Invocation method: Same as the normal mode. The client must use the EasyRecRequest SDK.

  • bypass:

    • Description: Skips the built-in FeatureGenerator, and the service acts only as a TensorFlow model inference engine. This mode is suitable for scenarios where you use custom feature processing. In this mode, you do not need to configure parameters related to the item feature cache or processor access to PAI-FeatureStore.

    • Invocation method: The client must use the TFRequest SDK. The caller must prepare and assemble all raw feature data that is required by the model on the client and organize the data into the Tensor format. This mode is suitable for advanced users who have an external feature processing system.

"fg_mode": "normal"

outputs

Yes

The names of the output variables of the TensorFlow model, such as probs_ctr. Separate multiple names with commas (,). To find the output variable names, run the TensorFlow command saved_model_cli.

"outputs":"probs_ctr,probs_cvr"

save_req

No

Specifies whether to save the data file obtained from the request to the model directory. The saved file can be used for warm-up and performance testing. Valid values:

  • true: The file is saved.

  • false (default): The file is not saved. Set this parameter to false in production environments to prevent performance degradation.

"save_req": "false"

Item feature cache parameters

period

Yes

The update interval for the item feature cache, in minutes. If item features are updated daily, you can set this parameter to a value greater than 1,440 (the number of minutes in a day), such as 2,880 for a two-day interval. This prevents unnecessary feature updates within the same day because features are also updated during routine daily service deployments.

"period": 2880

remote_type

Yes

The data source for item features. Valid values:

  • hologres: Reads and writes data using SQL interfaces. This is suitable for storing and querying large amounts of data.

  • none: Does not use the item feature cache. Item features are passed in the request. In this case, you must set tables to [].

"remote_type": "hologres"

tables

No

The item feature table. This parameter is required when remote_type is set to hologres. It includes the following sub-parameters:

  • key: Required. The name of the item ID column.

  • name: Required. The name of the feature table.

  • value: Optional. The names of the columns to load. Separate multiple column names with a comma (,).

  • condition: Optional. The WHERE clause used to filter items. Example: style_id<10000.

  • timekey: Optional. Specifies the timestamp or integer value for incremental item updates. Supported formats: timestamp and int.

  • static: Optional. Indicates a static feature that does not require periodic updates.

You can read input item data from multiple tables. The configuration must be in the following format:

"tables": [{"name":"table1", ...},{"name":"table2", ...}]

If tables share column names, the columns from the table that appears later in the list overwrite those from the table that appears earlier.

"tables": [{

"key": "goods_id",

"name": "public.ali_rec_item_feature"

}]

url

No

The Hologres endpoint.

"url": "postgresql://LTAI************@hgprecn-cn-xxxxx-cn-hangzhou-vpc.hologres.aliyuncs.com:80/bigdata_rec"

Parameters for processor access to PAI-FeatureStore

fs_project

No

The name of the PAI-FeatureStore project. This parameter is required when you use PAI-FeatureStore. For more information, see Configure a FeatureStore project.

"fs_project": "fs_demo"

fs_model

No

The name of the model feature in PAI-FeatureStore.

"fs_model": "fs_rank_v1"

fs_entity

No

The entity name in PAI-FeatureStore.

"fs_entity": "item"

region

No

The region where the PAI-FeatureStore project resides.

"region": "cn-beijing"

access_key_id

No

The AccessKey ID that is used to access PAI-FeatureStore.

"access_key_id": "xxxxx"

access_key_secret

No

The AccessKey Secret that is used to access PAI-FeatureStore.

"access_key_secret": "xxxxx"

featuredb_username

No

The username for FeatureDB.

"featuredb_username": "xxxxx"

featuredb_password

No

The password for FeatureDB.

"featuredb_password": "xxxxx"

load_feature_from_offlinestore

No

Specifies whether to obtain offline features directly from the PAI-FeatureStore OfflineStore. Valid values:

  1. True: Obtains data from the PAI-FeatureStore OfflineStore by reading from MaxCompute.

  2. False (default): Obtains data from the PAI-FeatureStore OnlineStore.

"load_feature_from_offlinestore": True

iterate_featuredb_interval

No

The interval, in seconds, at which to update real-time statistical features.

A shorter interval improves feature freshness but increases read costs when features change frequently. Balance accuracy and cost.

"iterate_featuredb_interval": 5

input_tile: Parameters for automatic feature broadcasting

INPUT_TILE

No

Set the INPUT_TILE environment variable to 1 to enable automatic broadcasting of item features. This allows you to pass a single value for features that remain constant within a request, such as user_id. When the INPUT_TILE environment variable is set to 2, the tile operation is delayed until after the feature embeddings are retrieved, which further reduces the computational load.

  • Benefits: Reduces request size, network transfer time, and computation time.

  • To enable this feature, set the INPUT_TILE environment variable to 1 or 2.

Note
  • This optimization is supported in EasyRec 1.3 and later.

  • When fg_mode is tf, this optimization is automatically enabled, and you do not need to set this environment variable.

  • When fg_mode is normal, INPUT_TILE is automatically set to 1 in EasyRec 2.9 and later.

"processor_envs":

[

{

"name": "INPUT_TILE",

"value": "2"

}

]

ADAPTE_FG_CONFIG

No

Enables compatibility with models trained with an older version of FeatureGenerator.

"processor_envs":

[

{

"name": "ADAPTE_FG_CONFIG",

"value": "true"

}

]

DISABLE_FG_PRECISION

No

For compatibility with models trained with an older version of FeatureGenerator. The old version limits float-type features to six significant digits by default, whereas the new version removes this limit. To apply the old behavior (6-digit limit), set this variable to true.

"processor_envs":

[

{

"name": "DISABLE_FG_PRECISION",

"value": "false"

}

]
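Several rules in the table above depend on each other: fg_mode accepts three values, tables is required when remote_type is hologres, and tables must be [] when remote_type is none. The following sketch collects those rules into one check. The function name is hypothetical, and the check covers only the rules stated in the table; it is not an official validator.

```python
VALID_FG_MODES = {"normal", "tf", "bypass"}


def check_model_config(model_config: dict) -> list:
    """Return a list of readable problems; an empty list means none were found."""
    problems = []

    fg_mode = model_config.get("fg_mode")
    if fg_mode not in VALID_FG_MODES:
        problems.append(
            f"fg_mode must be one of {sorted(VALID_FG_MODES)}, got {fg_mode!r}"
        )

    remote_type = model_config.get("remote_type")
    tables = model_config.get("tables")
    if remote_type == "hologres":
        if not isinstance(tables, list) or not tables:
            problems.append("tables must be a non-empty list when remote_type is hologres")
        else:
            for index, table in enumerate(tables):
                # key (item ID column) and name (feature table) are the required sub-parameters.
                for field in ("key", "name"):
                    if field not in table:
                        problems.append(f"tables[{index}] is missing '{field}'")
    elif remote_type == "none" and tables != []:
        problems.append("tables must be [] when remote_type is none")

    return problems
```

Running the check on the model_config section before deployment gives a short list of issues to fix instead of a failed service start.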

EasyRec processor inference optimization

Parameter

Required

Description

Example

TF_XLA_FLAGS

No

For GPU inference, this parameter enables XLA to compile and optimize the model and automatically perform operator fusion.

"processor_envs":

[

{

"name": "TF_XLA_FLAGS",

"value": "--tf_xla_auto_jit=2"

},

{

"name": "XLA_FLAGS",

"value": "--xla_gpu_cuda_data_dir=/usr/local/cuda/"

},

{

"name": "XLA_ALIGN_SIZE",

"value": "64"

}

]

TF scheduling parameters

No

inter_op_parallelism_threads: Controls the number of threads for running different operations in parallel.

intra_op_parallelism_threads: Controls the number of threads used within a single operation.

For a 32-core CPU, setting these parameters to 16 usually improves performance. Note that the sum of the two thread counts cannot exceed the total number of CPU cores.

"model_config": {

"inter_op_parallelism_threads": 16,

"intra_op_parallelism_threads": 16

}

rpc.worker_threads

No

A parameter under metadata in the EAS configuration. Set this parameter to the number of CPU cores of the instance. For example, if the instance has 15 CPU cores, set worker_threads to 15.

"metadata": {

"rpc": {

"worker_threads": 15

}

}
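The thread guidance in the table above can be turned into a small helper. This is a sketch under one assumption: it generalizes the 32-core example (16 and 16) to an even split for any core count, which is a starting point for tuning rather than a documented rule. The function name is hypothetical.

```python
def suggest_thread_settings(cpu_cores: int) -> dict:
    """Derive thread settings from the instance's CPU core count.

    inter_op and intra_op each get half the cores (rounded down), so their
    sum never exceeds cpu_cores; rpc.worker_threads equals cpu_cores.
    """
    if cpu_cores < 2:
        raise ValueError("need at least 2 CPU cores to split threads")
    half = cpu_cores // 2
    return {
        "model_config": {
            "inter_op_parallelism_threads": half,
            "intra_op_parallelism_threads": half,
        },
        "metadata": {"rpc": {"worker_threads": cpu_cores}},
    }
```

For a 32-core instance this yields 16 for both TensorFlow thread pools and 32 worker threads. The returned fragments would be merged into the corresponding sections of the service description.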

Step 2: Call the service

2.1 Network configuration

The PAI-Rec engine and the model scoring service are deployed on PAI-EAS and require a direct network connection. On the PAI-EAS instance page, click 'VPC' in the upper-right corner to configure the same VPC, vSwitch, and security group. For more information, see Access public or on-premises resources from EAS. If you use Hologres, you must also configure the same VPC information. The following figure shows an example.

[Figure: example VPC configuration]

2.2 Obtain service information

After the EasyRec model service is deployed, go to the Elastic Algorithm Service (EAS) page. Find the service that you want to call, and in the Service Method column, click Invocation Information to view the service endpoint and token.

2.3 SDK code examples

The EasyRec model service uses the Protocol Buffers (Protobuf) format for both input and output. Therefore, you cannot test the service from the PAI-EAS console.

Before you call the service, confirm the fg_mode configured in model_config during deployment in Step 1. Different modes require different client invocation methods.

| Mode (fg_mode) | Request class |
| --- | --- |
| normal or tf (with built-in feature engineering) | EasyRecRequest |
| bypass (without built-in feature engineering) | TFRequest |
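When client code is shared across services deployed with different modes, the mapping above can be encoded once so that a mismatch fails early. The following sketch returns the request class name as a string so that it runs without the SDK installed; the function name is hypothetical.

```python
# Mapping taken from the mode and request class pairs listed above.
_REQUEST_CLASS_BY_MODE = {
    "normal": "EasyRecRequest",
    "tf": "EasyRecRequest",
    "bypass": "TFRequest",
}


def request_class_for(fg_mode: str) -> str:
    """Return the name of the SDK request class that matches a deployed fg_mode."""
    try:
        return _REQUEST_CLASS_BY_MODE[fg_mode]
    except KeyError:
        raise ValueError(f"unknown fg_mode: {fg_mode!r}") from None
```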

With FG (fg_mode=normal or tf)

Java

For Maven environment configuration, see the Java SDK guide. The following code shows how to send a request to the ali_rec_rnk_with_fg service:

import java.util.Map;

import com.aliyun.openservices.eas.predict.http.*;
import com.aliyun.openservices.eas.predict.request.EasyRecRequest;

PredictClient client = new PredictClient(new HttpConfig());
// When you access the service through a public gateway, use the endpoint that starts with your user ID (UID). You can obtain this endpoint from the invocation information of the service in the EAS console.
client.setEndpoint("xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com");
client.setModelName("ali_rec_rnk_with_fg");
// Replace this with your service token.
client.setToken("******");

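// separator, userFeatures, contextFeatures, and itemIdStr are placeholders that you must define before you run this snippet.
EasyRecRequest easyrecRequest = new EasyRecRequest(separator);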
// userFeatures: User features. Features are separated by \u0002 (CTRL_B). Feature names and values are separated by a colon (:).
//  user_fea0:user_fea0_val\u0002user_fea1:user_fea1_val
// For more information about the feature value format, see: https://easyrec.readthedocs.io/en/latest/feature/rtp_fg.html
easyrecRequest.appendUserFeatureString(userFeatures);
// You can also add one user feature at a time:
// easyrecRequest.addUserFeature(String userFeaName, T userFeaValue).
// The data type T of the feature value can be String, float, long, or int.

// contextFeatures: Context features. Features are separated by \u0002 (CTRL_B). Feature names and their values are separated by a colon (:). Multiple values for the same feature are also separated by colons.
//   ctxt_fea0:ctxt_fea0_ival0:ctxt_fea0_ival1:ctxt_fea0_ival2\u0002ctxt_fea1:ctxt_fea1_ival0:ctxt_fea1_ival1:ctxt_fea1_ival2
easyrecRequest.appendContextFeatureString(contextFeatures);
// You can also add one context feature at a time:
// easyrecRequest.addContextFeature(String ctxtFeaName, List<Object> ctxtFeaValue).
// The data type of ctxtFeaValue can be String, Float, Long, or Integer.

// itemIdStr: A list of item IDs to predict, separated by a comma (,).
easyrecRequest.appendItemStr(itemIdStr, ",");
// You can also add one item ID at a time:
// easyrecRequest.appendItemId(String itemId)

easyrecPredictProtos.PBResponse response = client.predict(easyrecRequest);

for (Map.Entry<String, easyrecPredictProtos.Results> entry : response.getResultsMap().entrySet()) {
    String key = entry.getKey();
    easyrecPredictProtos.Results value = entry.getValue();
    System.out.print("key: " + key);
    for (int i = 0; i < value.getScoresCount(); i++) {
        System.out.format("value: %.6g\n", value.getScores(i));
    }
}

// Get the features after FG processing to check for consistency with offline features.
// Set DebugLevel to 1 to return the generated features.
easyrecRequest.setDebugLevel(1);
easyrecPredictProtos.PBResponse debugResponse = client.predict(easyrecRequest);
Map<String, String> genFeas = debugResponse.getGenerateFeaturesMap();
for (String itemId : genFeas.keySet()) {
    System.out.println(itemId);
    System.out.println(genFeas.get(itemId));
}

Python

For information about how to configure the environment, see the Python SDK guide. We recommend that you use the Java client in production environments for better performance. The following code provides an example:

from eas_prediction import PredictClient

from eas_prediction.easyrec_request import EasyRecRequest
from eas_prediction.easyrec_predict_pb2 import PBFeature
from eas_prediction.easyrec_predict_pb2 import PBRequest

if __name__ == '__main__':
    endpoint = 'http://xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com'
    service_name = 'ali_rec_rnk_with_fg'
    token = '******'

    client = PredictClient(endpoint, service_name)
    client.set_token(token)
    client.init()

    req = PBRequest()
    uid = PBFeature()
    uid.string_feature = 'u0001'
    req.user_features['user_id'] = uid
    age = PBFeature()
    age.int_feature = 12
    req.user_features['age'] = age
    weight = PBFeature()
    weight.float_feature = 129.8
    req.user_features['weight'] = weight

    req.item_ids.extend(['item_0001', 'item_0002', 'item_0003'])
    
    easyrec_req = EasyRecRequest()
    easyrec_req.add_feed(req, debug_level=0)
    res = client.predict(easyrec_req)
    print(res)

The parameters are described as follows:

  • endpoint: The service endpoint. To obtain it, go to the Elastic Algorithm Service (EAS) page, find your service, and click Invocation Information in the Service Method column.

  • service_name: The service name. Obtain it from the Elastic Algorithm Service (EAS) page.

  • token: The service token. Find it in the Invocation Information dialog box.
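The comments in the Java example describe a delimiter convention for feature strings: features are joined by \u0002 (CTRL_B), and a feature name is joined to its value or values by a colon. The following sketch shows that convention with two hypothetical helper functions. They only build strings in the described format and are not part of either SDK.

```python
FEATURE_SEP = "\u0002"  # CTRL_B, placed between features
KV_SEP = ":"            # placed between a feature name and each of its values


def build_user_feature_string(features: dict) -> str:
    """Join {name: value} pairs as name:value, separated by CTRL_B."""
    return FEATURE_SEP.join(
        f"{name}{KV_SEP}{value}" for name, value in features.items()
    )


def build_context_feature_string(features: dict) -> str:
    """Join {name: [v0, v1, ...]} as name:v0:v1:..., separated by CTRL_B."""
    parts = []
    for name, values in features.items():
        parts.append(KV_SEP.join([name] + [str(v) for v in values]))
    return FEATURE_SEP.join(parts)
```

Because the colon is the separator, a raw value that itself contains a colon would be ambiguous in this simple sketch; the feature value format reference linked in the Java comments is the place to check how such values should be written.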

Without FG (fg_mode=bypass)

Java

For Maven environment configuration, see the Java SDK guide. The following code shows how to send a request to the ali_rec_rnk_no_fg service:

import java.util.List;

import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.request.TFDataType;
import com.aliyun.openservices.eas.predict.request.TFRequest;
import com.aliyun.openservices.eas.predict.response.TFResponse;

public class TestEasyRec {
    public static TFRequest buildPredictRequest() {
        TFRequest request = new TFRequest();
 
        request.addFeed("user_id", TFDataType.DT_STRING,
                        new long[]{3}, new String[]{"u0001", "u0001", "u0001"});
        request.addFeed("age", TFDataType.DT_FLOAT,
                        new long[]{3}, new float[]{18.0f, 18.0f, 18.0f});
        // Note: If you set INPUT_TILE=2, for features that have the same value, you only need to pass the value once:
        //    request.addFeed("user_id", TFDataType.DT_STRING,
        //            new long[]{1}, new String[]{"u0001"});
        //    request.addFeed("age", TFDataType.DT_FLOAT,
        //            new long[]{1}, new float[]{18.0f});
        request.addFeed("item_id", TFDataType.DT_STRING,
                        new long[]{3}, new String[]{"i0001", "i0002", "i0003"});
        request.addFetch("probs");
        return request;
    }

    public static void main(String[] args) throws Exception {
        PredictClient client = new PredictClient(new HttpConfig());

        // To use a direct network connection, use the setDirectEndpoint method. Example: 
        //   client.setDirectEndpoint("pai-eas-vpc.cn-shanghai.aliyuncs.com");
        // You must enable the direct network connection in the EAS console and provide the source vSwitch used to access the EAS service.
        // A direct network connection offers better stability and performance.
        client.setEndpoint("xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com");
        client.setModelName("ali_rec_rnk_no_fg");
        client.setToken("");
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < 100; i++) {
            try {
                TFResponse response = client.predict(buildPredictRequest());
                // "probs" is an output field of the model. You can use the curl command to view the model's inputs and outputs:
                //   curl xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com -H "Authorization:{token}"
                List<Float> result = response.getFloatVals("probs");
                System.out.print("Predict Result: [");
                for (int j = 0; j < result.size(); j++) {
                    System.out.print(result.get(j).floatValue());
                    if (j != result.size() - 1) {
                        System.out.print(", ");
                    }
                }
                System.out.print("]\n");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Spend Time: " + (endTime - startTime) + "ms");
        client.shutdown();
    }
}

Python

For instructions, see the Python SDK guide. Due to lower performance, the Python SDK is recommended for debugging only. Use the Java SDK in production environments. The following code shows how to send a request to the ali_rec_rnk_no_fg service:

#!/usr/bin/env python

from eas_prediction import PredictClient
from eas_prediction import TFRequest

if __name__ == '__main__':
    client = PredictClient('http://xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com', 'ali_rec_rnk_no_fg')
    client.set_token('')
    client.init()
    
    # Note: Replace serving_default with the actual signature_name of your model. For more information, see the SDK guide mentioned above.
    req = TFRequest('serving_default')
    req.add_feed('user_id', [3], TFRequest.DT_STRING, ['u0001'] * 3)
    req.add_feed('age', [3], TFRequest.DT_FLOAT, [18.0] * 3)
    # Note: After enabling the INPUT_TILE=2 optimization, you can pass a single value for the preceding features.
    #   req.add_feed('user_id', [1], TFRequest.DT_STRING, ['u0001'])
    #   req.add_feed('age', [1], TFRequest.DT_FLOAT, [18.0])
    req.add_feed('item_id', [3], TFRequest.DT_STRING, 
        ['i0001', 'i0002', 'i0003'])
    for x in range(0, 100):
        resp = client.predict(req)
        print(resp)
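The INPUT_TILE notes in the bypass examples say that a feature whose value is the same for every item can be sent once. Conceptually, the single value is repeated (tiled) to match the number of items. The sketch below illustrates that idea on plain Python lists; it is not the processor's implementation, and the function name is hypothetical.

```python
def tile_feeds(feeds: dict, batch_size: int) -> dict:
    """Repeat length-1 feature lists so that every feature has batch_size values."""
    tiled = {}
    for name, values in feeds.items():
        if len(values) == 1:
            # Request-level feature (for example, user_id): broadcast to all items.
            tiled[name] = list(values) * batch_size
        elif len(values) == batch_size:
            # Item-level feature: already one value per item.
            tiled[name] = list(values)
        else:
            raise ValueError(
                f"feature {name!r} has {len(values)} values, expected 1 or {batch_size}"
            )
    return tiled
```

With three items, a request that carries one user_id and one age expands to the same tensors as the full request in the examples, while sending less data over the network.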

2.4 Build a custom service request

To call the service from languages other than Python and Java, you must manually generate the prediction request code from the .proto files. To build the service request, use the following Protocol Buffers definitions to generate the relevant code:

  • tf_predict.proto: Request definition for a TensorFlow model.

    syntax = "proto3";
    
    option cc_enable_arenas = true;
    option go_package = ".;tf";
    option java_package = "com.aliyun.openservices.eas.predict.proto";
    option java_outer_classname = "PredictProtos";
    
    enum ArrayDataType {
      // Not a legal value for DataType. Used to indicate a DataType field
      // has not been set.
      DT_INVALID = 0;
    
      // Data types that all computation devices are expected to be
      // capable to support.
      DT_FLOAT = 1;
      DT_DOUBLE = 2;
      DT_INT32 = 3;
      DT_UINT8 = 4;
      DT_INT16 = 5;
      DT_INT8 = 6;
      DT_STRING = 7;
      DT_COMPLEX64 = 8;  // Single-precision complex
      DT_INT64 = 9;
      DT_BOOL = 10;
      DT_QINT8 = 11;     // Quantized int8
      DT_QUINT8 = 12;    // Quantized uint8
      DT_QINT32 = 13;    // Quantized int32
      DT_BFLOAT16 = 14;  // Float32 truncated to 16 bits.  Only for cast ops.
      DT_QINT16 = 15;    // Quantized int16
      DT_QUINT16 = 16;   // Quantized uint16
      DT_UINT16 = 17;
      DT_COMPLEX128 = 18;  // Double-precision complex
      DT_HALF = 19;
      DT_RESOURCE = 20;
      DT_VARIANT = 21;  // Arbitrary C++ data types
    }
    
    // Dimensions of an array
    message ArrayShape {
      repeated int64 dim = 1 [packed = true];
    }
    
    // Protocol buffer representing an array
    message ArrayProto {
      // Data Type.
      ArrayDataType dtype = 1;
    
      // Shape of the array.
      ArrayShape array_shape = 2;
    
      // DT_FLOAT.
      repeated float float_val = 3 [packed = true];
    
      // DT_DOUBLE.
      repeated double double_val = 4 [packed = true];
    
      // DT_INT32, DT_INT16, DT_INT8, DT_UINT8.
      repeated int32 int_val = 5 [packed = true];
    
      // DT_STRING.
      repeated bytes string_val = 6;
    
      // DT_INT64.
      repeated int64 int64_val = 7 [packed = true];
    
      // DT_BOOL.
      repeated bool bool_val = 8 [packed = true];
    }
    
    // PredictRequest specifies which TensorFlow model to run, as well as
    // how inputs are mapped to tensors and how outputs are filtered before
    // returning to user.
    message PredictRequest {
      // A named signature to evaluate. If unspecified, the default signature
      // will be used
      string signature_name = 1;
    
      // Input tensors.
      // Names of input tensor are alias names. The mapping from aliases to real
      // input tensor names is expected to be stored as named generic signature
      // under the key "inputs" in the model export.
      // Each alias listed in a generic signature named "inputs" should be provided
      // exactly once in order to run the prediction.
      map<string, ArrayProto> inputs = 2;
    
      // Output filter.
      // Names specified are alias names. The mapping from aliases to real output
      // tensor names is expected to be stored as named generic signature under
      // the key "outputs" in the model export.
      // Only tensors specified here will be run/fetched and returned, with the
      // exception that when none is specified, all tensors specified in the
      // named signature will be run/fetched and returned.
      repeated string output_filter = 3;
      
      // Debug flags
      // 0: return prediction results only, without debug information
      // 100: return prediction results and save the request to model_dir
      // 101: save the timeline to model_dir
      int32 debug_level = 100;
    }
    
    // Response for PredictRequest on successful run.
    message PredictResponse {
      // Output tensors.
      map<string, ArrayProto> outputs = 1;
    }

  • easyrec_predict.proto: Request and response definitions for a TensorFlow model with feature generation (FG).

    syntax = "proto3";
    
    option cc_enable_arenas = true;
    option go_package = ".;easyrec";
    option java_package = "com.aliyun.openservices.eas.predict.proto";
    option java_outer_classname = "EasyRecPredictProtos";
    
    import "tf_predict.proto";
    
    // context features
    message ContextFeatures {
      repeated PBFeature features = 1;
    }
    
    message PBFeature {
      oneof value {
        int32 int_feature = 1;
        int64 long_feature = 2;
        string string_feature = 3;
        float float_feature = 4;
      }
    }
    
    // PBRequest specifies the request for aggregator
    message PBRequest {
      // Debug flags
      // 0: return prediction results only, without debug information
      // 3: return the features generated by the FG module as strings, with feature
      //    values separated by \u0002; useful for checking feature consistency and
      //    for generating online deep learning samples
      // 100: return prediction results and save the request to model_dir
      // 101: save the timeline to model_dir
      // 102: for recall models such as DSSM and MIND, return the user embedding
      //      vectors in addition to the Faiss retrieval results
      int32 debug_level = 1;
    
      // user features
      map<string, PBFeature> user_features = 2;
    
      // item IDs; static (daily updated) item features are
      // looked up by item ID in the feature cache that
      // resides on each processor node
      repeated string item_ids = 3;
    
      // context features for each item; real-time item features
      //    can be passed as context features.
      map<string, ContextFeatures> context_features = 4;
    
      // embedding retrieval neighbor number.
      int32 faiss_neigh_num = 5;
    }
    
    // return results
    message Results {
      repeated double scores = 1 [packed = true];
    }
    
    enum StatusCode {
      OK = 0;
      INPUT_EMPTY = 1;
      EXCEPTION = 2;
    }
    
    // PBResponse specifies the response for aggregator
    message PBResponse {
      // results
      map<string, Results> results = 1;
    
      // item features
      map<string, string> item_features = 2;
    
      // features generated by FG
      map<string, string> generate_features = 3;
    
      // context features
      map<string, ContextFeatures> context_features = 4;
    
      string error_msg = 5;
    
      StatusCode status_code = 6;
    
      // item ids
      repeated string item_ids = 7;
    
      repeated string outputs = 8;
    
      // all fg input features
      map<string, string> raw_features = 9;
    
      // output tensors
      map<string, ArrayProto> tf_outputs = 10;
    }
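
As a quick reference for the two definitions above, the following sketch maps Python values onto the typed fields that `ArrayProto` and `PBFeature` declare. It builds plain dictionaries rather than real Protocol Buffers messages, so it runs without generated code. The helper names `array_proto_dict` and `pb_feature_dict` are hypothetical, and the choice of `PBFeature` member for each Python type (for example, sending every Python int as `long_feature`) is an assumption made for illustration.

```python
from math import prod

# Value field used by each dtype, taken from the comments in ArrayProto.
VALUE_FIELD = {
    "DT_FLOAT": "float_val",
    "DT_DOUBLE": "double_val",
    "DT_INT32": "int_val",
    "DT_INT16": "int_val",
    "DT_INT8": "int_val",
    "DT_UINT8": "int_val",
    "DT_STRING": "string_val",
    "DT_INT64": "int64_val",
    "DT_BOOL": "bool_val",
}


def array_proto_dict(dtype, shape, values):
    """Build a dict shaped like ArrayProto (hypothetical helper)."""
    if dtype not in VALUE_FIELD:
        raise ValueError(f"no value field declared for {dtype}")
    # The number of values must equal the product of the dimensions.
    if prod(shape) != len(values):
        raise ValueError("shape does not match the number of values")
    return {
        "dtype": dtype,
        "array_shape": {"dim": list(shape)},
        VALUE_FIELD[dtype]: list(values),
    }


def pb_feature_dict(value):
    """Pick one member of the PBFeature oneof for a Python value."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("PBFeature has no bool member")
    if isinstance(value, int):
        return {"long_feature": value}  # assumption: always use int64
    if isinstance(value, float):
        return {"float_feature": value}
    if isinstance(value, str):
        return {"string_feature": value}
    raise TypeError(f"unsupported feature type: {type(value).__name__}")


# A dict shaped like PBRequest, reusing the sample values from this topic.
pb_request = {
    "debug_level": 0,
    "user_features": {
        "user_id": pb_feature_dict("u0001"),
        "age": pb_feature_dict(18.0),
    },
    "item_ids": ["i0001", "i0002", "i0003"],
}

print(array_proto_dict("DT_STRING", [3], ["i0001", "i0002", "i0003"]))
print(pb_request["user_features"])
```

In a real client, the same decisions apply when you populate the classes generated from the .proto files: set exactly one member of the `PBFeature` oneof, and fill the `ArrayProto` value field that matches `dtype`.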