Elastic Algorithm Service (EAS) of Platform for AI (PAI) provides a built-in TorchEasyRec processor. This processor facilitates the deployment of TorchEasyRec or Torch recommendation models as scoring services, and integrates feature engineering capabilities. You can use the TorchEasyRec processor to deploy high-performance scoring services that are optimized for feature engineering and Torch models. This topic describes how to deploy and call a TorchEasyRec model service.
Background information
The following figure shows the architecture of a recommendation engine based on the TorchEasyRec processor.
The TorchEasyRec processor includes the following modules:
Item Feature Cache: This module caches item features from FeatureStore in memory, which reduces the load that frequent requests place on FeatureStore and improves inference performance. If item features include real-time features, FeatureStore keeps them synchronized.
Feature Generator (FG): This module uses a configuration file to define the feature engineering process and uses the same C++ implementation for both real-time and offline feature engineering to ensure consistency.
TorchModel: The scripted model file that is exported from TorchEasyRec or Torch model training.
Limits
The TorchEasyRec processor can be used on T4 and A10 GPU instances and on general-purpose instance families, including g6, g7, and g8. If you use GPU instances, make sure that the CUDA driver version is 535 or later.
Processor versions
The TorchEasyRec processor is continuously improved. Later versions provide enhanced features and inference performance. For optimal results, we recommend that you use the latest version to deploy your inference service. The following table describes the released versions.
| Processor name | Release date | Torch version | FG version | New features |
| --- | --- | --- | --- | --- |
| easyrec-torch-0.1 | 20240910 | 2.4 | 0.2.9 | |
| easyrec-torch-0.2 | 20240930 | 2.4 | 0.2.9 | |
| easyrec-torch-0.3 | 20241014 | 2.4 | 0.2.9 | |
| easyrec-torch-0.4 | 20241028 | 2.4 | 0.3.1 | |
| easyrec-torch-0.5 | 20241114 | 2.4 | 0.3.1 | |
| easyrec-torch-0.6 | 20241118 | 2.4 | 0.3.6 | |
| easyrec-torch-0.7 | 20241206 | 2.5 | 0.3.9 | |
| easyrec-torch-0.8 | 20241225 | 2.5 | 0.3.9 | |
| easyrec-torch-0.9 | 20250115 | 2.5 | 0.4.1 | |
| easyrec-torch-1.0 | 20250206 | 2.5 | 0.4.2 | |
| easyrec-torch-1.1 | 20250423 | 2.5 | 0.5.9 | |
Step 1: Deploy a model service
Prepare the torcheasyrec.json service configuration file.
You must set the processor parameter to easyrec-torch-{version} and replace {version} with a version listed in the Processor versions section. The following sample code provides examples of the JSON configuration file:
Sample code when the fg_mode parameter is set to normal
{ "metadata": { "instance": 1, "name": "alirec_rank_with_fg", "rpc": { "enable_jemalloc": 1, "max_queue_size": 256, "worker_threads": 16 } }, "cloud": { "computing": { "instance_type": "ecs.gn6i-c16g1.4xlarge" } }, "model_config": { "fg_mode": "normal", "fg_threads": 8, "region": "YOUR_REGION", "fs_project": "YOUR_FS_PROJECT", "fs_model": "YOUR_FS_MODEL", "fs_entity": "item", "load_feature_from_offlinestore": true, "access_key_id":"YOUR_ACCESS_KEY_ID", "access_key_secret":"YOUR_ACCESS_KEY_SECRET" }, "storage": [ { "mount_path": "/home/admin/docker_ml/workspace/model/", "oss": { "path": "oss://xxx/xxx/export", "readOnly": false }, "properties": { "resource_type": "code" } } ], "processor":"easyrec-torch-0.3" }
Sample code when the fg_mode parameter is set to bypass
{ "metadata": { "instance": 1, "name": "alirec_rank_no_fg", "rpc": { "enable_jemalloc": 1, "max_queue_size": 256, "worker_threads": 16 } }, "cloud": { "computing": { "instance_type": "ecs.gn6i-c16g1.4xlarge" } }, "model_config": { "fg_mode": "bypass" }, "storage": [ { "mount_path": "/home/admin/docker_ml/workspace/model/", "oss": { "path": "oss://xxx/xxx/export", "readOnly": false }, "properties": { "resource_type": "code" } } ], "processor":"easyrec-torch-0.3" }
The following table describes the key parameters. For information about other parameters, see Parameters for JSON deployment.
| Parameter | Required | Description | Example |
| --- | --- | --- | --- |
| processor | Yes | The TorchEasyRec processor. | "processor":"easyrec-torch-0.3" |
| path | Yes | The Object Storage Service (OSS) path to which the model file is mounted. | "path": "oss://examplebucket/xxx/export" |
| fg_mode | No | The feature engineering mode. Valid values:<br>bypass (default): The FG module is not used, and only a Torch model is deployed. This mode is suitable for custom feature engineering scenarios. In bypass mode, you do not need to configure the FeatureStore-related parameters.<br>normal: The FG module is used. In most cases, the FG module is used together with a model trained with TorchEasyRec. | "fg_mode": "normal" |
| fg_threads | No | The number of concurrent threads that run the FG module for a single request. | "fg_threads": 15 |
| outputs | No | The names of the output variables of the Torch model, such as probs_ctr. Separate multiple variables with commas (,). By default, all variables are output. | "outputs":"probs_ctr,probs_cvr" |
| item_empty_score | No | The default score that is used when an item ID does not exist. Default value: 0. | "item_empty_score": -1 |
| **Parameters related to retrieval** | | | |
| faiss_neigh_num | No | The number of vectors to retrieve. By default, the value of the faiss_neigh_num field in a request is used. If the field is not set in the request, the value of faiss_neigh_num in model_config is used. Default value: 1. | "faiss_neigh_num": 200 |
| faiss_nprobe | No | The number of clusters that are searched during retrieval. Default value: 800. In FAISS, an inverted file index divides the data into multiple small clusters and maintains an inverted list for each cluster. A larger value improves retrieval accuracy but increases computing costs and retrieval time. A smaller value accelerates retrieval at the cost of accuracy. | "faiss_nprobe": 700 |
| **Parameters related to FeatureStore** | | | |
| fs_project | No | The name of the FeatureStore project. This parameter is required if you use FeatureStore. For more information, see Configure a FeatureStore project. | "fs_project": "fs_demo" |
| fs_model | No | The name of the model feature in FeatureStore. | "fs_model": "fs_rank_v1" |
| fs_entity | No | The name of the feature entity in FeatureStore. | "fs_entity": "item" |
| region | No | The region in which FeatureStore resides. For example, set this parameter to cn-beijing if FeatureStore resides in the China (Beijing) region. For more information, see Endpoints. | "region": "cn-beijing" |
| access_key_id | No | The AccessKey ID that is used to access FeatureStore. | "access_key_id": "xxxxx" |
| access_key_secret | No | The AccessKey secret that is used to access FeatureStore. | "access_key_secret": "xxxxx" |
| load_feature_from_offlinestore | No | Specifies whether to obtain offline feature data from an offline data store in FeatureStore. Valid values:<br>true: Offline feature data is obtained from an offline data store in FeatureStore.<br>false (default): Offline feature data is obtained from an online data store in FeatureStore. | "load_feature_from_offlinestore": true |
| featuredb_username | No | The username of FeatureDB. | "featuredb_username":"xxx" |
| featuredb_password | No | The password of the FeatureDB username. | "featuredb_password":"xxx" |
| **Parameters related to automatic broadcasting** | | | |
| INPUT_TILE | No | Enables automatic broadcasting for features. If the values of a feature, such as user_id, are the same for all items in a request, you need to specify the value only once. This reduces the request size, network transfer time, and computation time.<br>This feature must be used together with TorchEasyRec in normal mode, and the related environment variables must also be configured when the model is exported. By default, the system reads the INPUT_TILE value from the model_acc.json file in the model directory that is exported from TorchEasyRec. If the file does not exist, the system reads the value from the environment variable.<br>After the feature is enabled:<br>If you set this parameter to 2, FG computes user features only once per request.<br>If you set this parameter to 3, the embedding information of user features is also computed only once, and user and item embeddings are computed separately. This is suitable for scenarios that involve a large number of user features. | "processor_envs": [ { "name": "INPUT_TILE", "value": "2" } ] |
| NO_GRAD_GUARD | No | Disables gradient calculation during inference: gradient tracking stops and no computation graph is built.<br>Note: If you set this parameter to 1, specific models may be incompatible. If inference gets stuck, add the configuration PYTORCH_TENSOREXPR_FALLBACK=2 to skip the compilation step while retaining specific graph optimization capabilities. | "processor_envs": [ { "name": "NO_GRAD_GUARD", "value": "1" } ] |
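You can enable both environment variables at the same time. The following is a minimal sketch of the corresponding processor_envs section in the service configuration file; the values shown are illustrative:
"processor_envs": [
  { "name": "INPUT_TILE", "value": "2" },
  { "name": "NO_GRAD_GUARD", "value": "1" }
]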
Deploy a TorchEasyRec model service. You can use one of the following deployment methods:
(Recommended) Deploy a model service by using JSON
Perform the following steps:
Log on to the PAI console. In the top navigation bar, select a region. Then, select the desired workspace and click Enter Elastic Algorithm Service (EAS).
On the Elastic Algorithm Service (EAS) page, click Deploy Service. On the Deploy Service page, click JSON Deployment in the Custom Model Deployment section.
On the JSON Deployment page, enter the content of the JSON configuration file that you prepared in the JSON text editor and click Deploy.
Deploy a model service by using the EASCMD client
Download the EASCMD client and complete identity authentication. In this example, Windows 64 is used.
Launch the client and run the following command in the directory in which the JSON configuration file is located to create a model service. For more information, see Run commands to use the EASCMD client.
eascmdwin64.exe create <service.json>
Replace <service.json> with the name of the JSON configuration file that you created, such as torcheasyrec.json.
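After the service is created, you can check its status or apply an updated configuration file. The following commands are a sketch based on the EASCMD command set; see Run commands to use the EASCMD client for the exact syntax:
eascmdwin64.exe desc alirec_rank_with_fg
eascmdwin64.exe modify alirec_rank_with_fg -s torcheasyrec.json
Replace alirec_rank_with_fg with the name of your model service.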
Step 2: Call the model service
After you deploy a TorchEasyRec model service, you can perform the following steps to view and call the model service:
Log on to the PAI console. In the top navigation bar, select the desired region. Choose Model Deployment > Elastic Algorithm Service (EAS). On the page that appears, select the desired workspace and click Enter Elastic Algorithm Service (EAS).
On the Elastic Algorithm Service (EAS) page, find the desired model service and click Invocation Method in the Service Type column. In the Invocation Method dialog box, view the endpoint and token of the model service.
The input and output of the TorchEasyRec model service are in the Protocol Buffers (protobuf) format. You can call a model service based on whether FG is used.
Call a model service when FG is used
You can call a model service by using one of the following methods:
EAS SDK for Java
Before you run the code, you must configure the Maven environment and add the EAS SDK for Java dependency. For more information, see SDK for Java.
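A minimal sketch of the Maven dependency, assuming the com.aliyun.openservices:eas-sdk artifact described in the SDK for Java topic; replace the version placeholder with the latest release:
<dependency>
    <groupId>com.aliyun.openservices</groupId>
    <artifactId>eas-sdk</artifactId>
    <!-- Placeholder: use the latest eas-sdk release listed in the SDK for Java topic. -->
    <version>LATEST_RELEASE</version>
</dependency>
Sample code for calling the alirec_rank_with_fg model service: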
package com.aliyun.openservices.eas.predict;

import com.aliyun.openservices.eas.predict.http.Compressor;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.proto.TorchRecPredictProtos;
import com.aliyun.openservices.eas.predict.request.TorchRecRequest;
import com.aliyun.openservices.eas.predict.proto.TorchPredictProtos.ArrayProto;
import java.util.*;

public class TorchRecPredictTest {
    public static PredictClient InitClient() {
        return new PredictClient(new HttpConfig());
    }

    public static TorchRecRequest buildPredictRequest() {
        TorchRecRequest TorchRecRequest = new TorchRecRequest();
        TorchRecRequest.appendItemId("7033");
        TorchRecRequest.addUserFeature("user_id", 33981, "int");

        ArrayList<Double> list = new ArrayList<>();
        list.add(0.24689289764507472);
        list.add(0.005758482924454689);
        list.add(0.6765301324940026);
        list.add(0.18137273055602343);
        TorchRecRequest.addUserFeature("raw_3", list, "List<double>");

        Map<String, Integer> myMap = new LinkedHashMap<>();
        myMap.put("866", 4143);
        myMap.put("1627", 2451);
        TorchRecRequest.addUserFeature("map_1", myMap, "map<string,int>");

        ArrayList<ArrayList<Float>> list2 = new ArrayList<>();
        ArrayList<Float> innerList1 = new ArrayList<>();
        innerList1.add(1.1f);
        innerList1.add(2.2f);
        innerList1.add(3.3f);
        list2.add(innerList1);
        ArrayList<Float> innerList2 = new ArrayList<>();
        innerList2.add(4.4f);
        innerList2.add(5.5f);
        list2.add(innerList2);
        TorchRecRequest.addUserFeature("click", list2, "list<list<float>>");

        TorchRecRequest.addContextFeature("id_2", list, "List<double>");
        TorchRecRequest.addContextFeature("id_2", list, "List<double>");
        System.out.println(TorchRecRequest.request);
        return TorchRecRequest;
    }

    public static void main(String[] args) throws Exception {
        PredictClient client = InitClient();
        client.setToken("tokenGeneratedFromService");
        client.setEndpoint("175805416243****.cn-beijing.pai-eas.aliyuncs.com");
        client.setModelName("alirec_rank_with_fg");
        client.setRequestTimeout(100000);
        testInvoke(client);
        testDebugLevel(client);
        client.shutdown();
    }

    public static void testInvoke(PredictClient client) throws Exception {
        long startTime = System.currentTimeMillis();
        TorchRecPredictProtos.PBResponse response = client.predict(buildPredictRequest());
        for (Map.Entry<String, ArrayProto> entry : response.getMapOutputsMap().entrySet()) {
            System.out.println("Key: " + entry.getKey() + ", Value: " + entry.getValue());
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Spend Time: " + (endTime - startTime) + "ms");
    }

    public static void testDebugLevel(PredictClient client) throws Exception {
        long startTime = System.currentTimeMillis();
        TorchRecRequest request = buildPredictRequest();
        request.setDebugLevel(1);
        TorchRecPredictProtos.PBResponse response = client.predict(request);
        Map<String, String> genFeas = response.getGenerateFeaturesMap();
        for (String itemId : genFeas.keySet()) {
            System.out.println(itemId);
            System.out.println(genFeas.get(itemId));
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Spend Time: " + (endTime - startTime) + "ms");
    }
}
Take note of the following parameters:
client.setToken("tokenGeneratedFromService"): Replace tokenGeneratedFromService with the token of your model service, such as
MmFiMDdlO****wYjhhNjgwZmZjYjBjMTM1YjliZmNkODhjOGVi****
.client.setEndpoint("175805416243****.cn-beijing.pai-eas.aliyuncs.com"): Replace the content enclosed in double quotation marks with the endpoint of your model service, such as
175805416243****.cn-beijing.pai-eas.aliyuncs.com
.client.setModelName("alirec_rank_with_fg"): Replace the content enclosed in double quotation marks with the name of your model service.
EAS SDK for Python
Before you run the code, run the pip install -U eas-prediction --user command to install or update the eas-prediction library. For more information, see SDK for Python. Sample code:
from eas_prediction import PredictClient
from eas_prediction.torchrec_request import TorchRecRequest

if __name__ == '__main__':
    endpoint = 'http://localhost:6016'
    client = PredictClient(endpoint, '<YOUR_SERVICE_NAME>')
    client.set_token('<your_service_token>')
    client.init()

    torchrec_req = TorchRecRequest()
    torchrec_req.add_user_fea('user_id', 'u001d', "STRING")
    torchrec_req.add_user_fea('age', 12, "INT")
    torchrec_req.add_user_fea('weight', 129.8, "FLOAT")
    torchrec_req.add_item_id('item_0001')
    torchrec_req.add_item_id('item_0002')
    torchrec_req.add_item_id('item_0003')
    torchrec_req.add_user_fea("raw_3", [0.24689289764507472, 0.005758482924454689, 0.6765301324940026, 0.18137273055602343], "list<double>")
    torchrec_req.add_user_fea("raw_4", [0.9965264740966043, 0.659596586238391, 0.16396649403055896, 0.08364986620265635], "list<double>")
    torchrec_req.add_user_fea("map_1", {"0": 0.37845234405201145}, "map<int,float>")
    torchrec_req.add_user_fea("map_2", {"866": 4143, "1627": 2451}, "map<int,int>")
    torchrec_req.add_context_fea("id_2", [866], "list<int>")
    torchrec_req.add_context_fea("id_2", [7022, 1], "list<int>")
    torchrec_req.add_context_fea("id_2", [7022, 1], "list<int>")
    torchrec_req.add_user_fea("click", [[0.94433516, 0.49145547], [0.94433516, 0.49145597]], "list<list<float>>")

    res = client.predict(torchrec_req)
    print(res)
Take note of the following parameters:
endpoint: Set the value to the endpoint of your model service, such as http://175805416243****.cn-beijing.pai-eas.aliyuncs.com/.
<YOUR_SERVICE_NAME>: Replace the value with the name of your model service.
<your_service_token>: Replace the value with the token of your model service, such as MmFiMDdlO****wYjhhNjgwZmZjYjBjMTM1YjliZmNkODhjOGVi****.
Call a model service when FG is not used
EAS SDK for Java
Before you run the code, you must configure the Maven environment. For more information, see SDK for Java. Sample code for calling the alirec_rank_no_fg model service:
package com.aliyun.openservices.eas.predict;

import java.util.List;
import java.util.Arrays;
import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.request.TorchDataType;
import com.aliyun.openservices.eas.predict.request.TorchRequest;
import com.aliyun.openservices.eas.predict.response.TorchResponse;

public class Test_Torch {
    public static PredictClient InitClient() {
        return new PredictClient(new HttpConfig());
    }

    public static TorchRequest buildPredictRequest() {
        TorchRequest request = new TorchRequest();
        float[] content = new float[2304000];
        for (int i = 0; i < content.length; i++) {
            content[i] = (float) 0.0;
        }
        long[] content_i = new long[900];
        for (int i = 0; i < content_i.length; i++) {
            content_i[i] = 0;
        }
        long[] a = Arrays.copyOfRange(content_i, 0, 300);
        float[] b = Arrays.copyOfRange(content, 0, 230400);
        request.addFeed(0, TorchDataType.DT_INT64, new long[]{300, 3}, content_i);
        request.addFeed(1, TorchDataType.DT_FLOAT, new long[]{300, 10, 768}, content);
        request.addFeed(2, TorchDataType.DT_FLOAT, new long[]{300, 768}, b);
        request.addFeed(3, TorchDataType.DT_INT64, new long[]{300}, a);
        request.addFetch(0);
        request.setDebugLevel(903);
        return request;
    }

    public static void main(String[] args) throws Exception {
        PredictClient client = InitClient();
        client.setToken("tokenGeneratedFromService");
        client.setEndpoint("175805416243****.cn-beijing.pai-eas.aliyuncs.com");
        client.setModelName("alirec_rank_no_fg");
        client.setIsCompressed(false);
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < 10; i++) {
            TorchResponse response = null;
            try {
                response = client.predict(buildPredictRequest());
                List<Float> result = response.getFloatVals(0);
                System.out.print("Predict Result: [");
                for (int j = 0; j < result.size(); j++) {
                    System.out.print(result.get(j).floatValue());
                    if (j != result.size() - 1) {
                        System.out.print(", ");
                    }
                }
                System.out.print("]\n");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Spend Time: " + (endTime - startTime) + "ms");
        client.shutdown();
    }
}
Take note of the following parameters:
client.setToken("tokenGeneratedFromService"): Replace tokenGeneratedFromService with the token of your model service, such as
MmFiMDdlO****wYjhhNjgwZmZjYjBjMTM1YjliZmNkODhjOGVi****
.client.setEndpoint("175805416243****.cn-beijing.pai-eas.aliyuncs.com"): Replace the content enclosed in double quotation marks with the endpoint of your model service, such as
175805416243****.cn-beijing.pai-eas.aliyuncs.com
.client.setModelName("alirec_rank_no_fg"): Replace the content enclosed in double quotation marks with the name of your model service.
EAS SDK for Python
Before you run the code, run the pip install -U eas-prediction --user command to install or update the eas-prediction library. For more information, see SDK for Python. Sample code for calling the alirec_rank_no_fg model service:
from eas_prediction import PredictClient
from eas_prediction import TorchRequest
# Disable Snappy compression for the request data.
req = TorchRequest(False)
req.add_feed(0, [300, 3], TorchRequest.DT_INT64, [1] * 900)
req.add_feed(1, [300, 10, 768], TorchRequest.DT_FLOAT, [1.0] * 3 * 768000)
req.add_feed(2, [300, 768], TorchRequest.DT_FLOAT, [1.0] * 3 * 76800)
req.add_feed(3, [300], TorchRequest.DT_INT64, [1] * 300)
client = PredictClient('<your_endpoint>', '<your_service_name>')
client.set_token('<your_service_token>')
client.init()
resp = client.predict(req)
print(resp)
Take note of the following parameters:
<your_endpoint>: Replace the value with the endpoint of your model service, such as http://175805416243****.cn-beijing.pai-eas.aliyuncs.com/.
<your_service_name>: Replace the value with the name of your model service.
<your_service_token>: Replace the value with the token of your model service, such as MmFiMDdlO****wYjhhNjgwZmZjYjBjMTM1YjliZmNkODhjOGVi****.
For information about the status codes returned when you call a model service, see Status codes. You can also create custom service requests. For more information, see Request syntax.
Request syntax
If you call a model service on a client, you must manually generate prediction code from a .proto file. To generate code for custom service requests, use the following protobuf definitions:
pytorch_predict.proto: protobuf definition for a Torch model
syntax = "proto3";
package pytorch.eas;
option cc_enable_arenas = true;
option java_package = "com.aliyun.openservices.eas.predict.proto";
option java_outer_classname = "TorchPredictProtos";
enum ArrayDataType {
// Not a legal value for DataType. Used to indicate a DataType field
// has not been set.
DT_INVALID = 0;
// Data types that all computation devices are expected to be
// capable to support.
DT_FLOAT = 1;
DT_DOUBLE = 2;
DT_INT32 = 3;
DT_UINT8 = 4;
DT_INT16 = 5;
DT_INT8 = 6;
DT_STRING = 7;
DT_COMPLEX64 = 8; // Single-precision complex
DT_INT64 = 9;
DT_BOOL = 10;
DT_QINT8 = 11; // Quantized int8
DT_QUINT8 = 12; // Quantized uint8
DT_QINT32 = 13; // Quantized int32
DT_BFLOAT16 = 14; // Float32 truncated to 16 bits. Only for cast ops.
DT_QINT16 = 15; // Quantized int16
DT_QUINT16 = 16; // Quantized uint16
DT_UINT16 = 17;
DT_COMPLEX128 = 18; // Double-precision complex
DT_HALF = 19;
DT_RESOURCE = 20;
DT_VARIANT = 21; // Arbitrary C++ data types
}
// Dimensions of an array
message ArrayShape {
repeated int64 dim = 1 [packed = true];
}
// Protocol buffer representing an array
message ArrayProto {
// Data Type.
ArrayDataType dtype = 1;
// Shape of the array.
ArrayShape array_shape = 2;
// DT_FLOAT.
repeated float float_val = 3 [packed = true];
// DT_DOUBLE.
repeated double double_val = 4 [packed = true];
// DT_INT32, DT_INT16, DT_INT8, DT_UINT8.
repeated int32 int_val = 5 [packed = true];
// DT_STRING.
repeated bytes string_val = 6;
// DT_INT64.
repeated int64 int64_val = 7 [packed = true];
}
message PredictRequest {
// Input tensors.
repeated ArrayProto inputs = 1;
// Output filter.
repeated int32 output_filter = 2;
// Input tensors for rec
map<string, ArrayProto> map_inputs = 3;
// debug_level for rec
int32 debug_level = 100;
}
// Response for PredictRequest on successful run.
message PredictResponse {
// Output tensors.
repeated ArrayProto outputs = 1;
// Output tensors for rec.
map<string, ArrayProto> map_outputs = 2;
}
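To illustrate how the generated code can be used for the bypass (Torch-only) case, the following is a minimal Python sketch. It assumes that protoc --python_out=. pytorch_predict.proto has generated a module named pytorch_predict_pb2; the tensor shapes and values are placeholders:
import pytorch_predict_pb2 as pb  # generated by: protoc --python_out=. pytorch_predict.proto

# Build a PredictRequest that carries one float input tensor of shape [300, 768].
req = pb.PredictRequest()
tensor = req.inputs.add()
tensor.dtype = pb.DT_FLOAT
tensor.array_shape.dim.extend([300, 768])
tensor.float_val.extend([0.0] * (300 * 768))
req.output_filter.append(0)  # return only output tensor 0

body = req.SerializeToString()  # bytes to send as the HTTP request body

# Parse the protobuf body returned by the service:
# resp = pb.PredictResponse()
# resp.ParseFromString(http_response_body)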
torchrec_predict.proto: protobuf definition for a Torch model and FG
syntax = "proto3";
option go_package = ".;torch_predict_protos";
option java_package = "com.aliyun.openservices.eas.predict.proto";
option java_outer_classname = "TorchRecPredictProtos";
package com.alibaba.pairec.processor;
import "pytorch_predict.proto";
//long->others
message LongStringMap {
map<int64, string> map_field = 1;
}
message LongIntMap {
map<int64, int32> map_field = 1;
}
message LongLongMap {
map<int64, int64> map_field = 1;
}
message LongFloatMap {
map<int64, float> map_field = 1;
}
message LongDoubleMap {
map<int64, double> map_field = 1;
}
//string->others
message StringStringMap {
map<string, string> map_field = 1;
}
message StringIntMap {
map<string, int32> map_field = 1;
}
message StringLongMap {
map<string, int64> map_field = 1;
}
message StringFloatMap {
map<string, float> map_field = 1;
}
message StringDoubleMap {
map<string, double> map_field = 1;
}
//int32->others
message IntStringMap {
map<int32, string> map_field = 1;
}
message IntIntMap {
map<int32, int32> map_field = 1;
}
message IntLongMap {
map<int32, int64> map_field = 1;
}
message IntFloatMap {
map<int32, float> map_field = 1;
}
message IntDoubleMap {
map<int32, double> map_field = 1;
}
// list
message IntList {
repeated int32 features = 1;
}
message LongList {
repeated int64 features = 1;
}
message FloatList {
repeated float features = 1;
}
message DoubleList {
repeated double features = 1;
}
message StringList {
repeated string features = 1;
}
// lists
message IntLists {
repeated IntList lists = 1;
}
message LongLists {
repeated LongList lists = 1;
}
message FloatLists {
repeated FloatList lists = 1;
}
message DoubleLists {
repeated DoubleList lists = 1;
}
message StringLists {
repeated StringList lists = 1;
}
message PBFeature {
oneof value {
int32 int_feature = 1;
int64 long_feature = 2;
string string_feature = 3;
float float_feature = 4;
double double_feature=5;
LongStringMap long_string_map = 6;
LongIntMap long_int_map = 7;
LongLongMap long_long_map = 8;
LongFloatMap long_float_map = 9;
LongDoubleMap long_double_map = 10;
StringStringMap string_string_map = 11;
StringIntMap string_int_map = 12;
StringLongMap string_long_map = 13;
StringFloatMap string_float_map = 14;
StringDoubleMap string_double_map = 15;
IntStringMap int_string_map = 16;
IntIntMap int_int_map = 17;
IntLongMap int_long_map = 18;
IntFloatMap int_float_map = 19;
IntDoubleMap int_double_map = 20;
IntList int_list = 21;
LongList long_list =22;
StringList string_list = 23;
FloatList float_list = 24;
DoubleList double_list = 25;
IntLists int_lists = 26;
LongLists long_lists =27;
StringLists string_lists = 28;
FloatLists float_lists = 29;
DoubleLists double_lists = 30;
}
}
// context features
message ContextFeatures {
repeated PBFeature features = 1;
}
// PBRequest specifies the request for aggregator
message PBRequest {
// debug mode
int32 debug_level = 1;
// user features, key is user input name
map<string, PBFeature> user_features = 2;
// item ids
repeated string item_ids = 3;
// context features for each item, key is context input name
map<string, ContextFeatures> context_features = 4;
// number of nearest neighbors(items) to retrieve
// from faiss
int32 faiss_neigh_num = 5;
// item features for each item, key is item input name
map<string, ContextFeatures> item_features = 6;
}
// PBResponse specifies the response for aggregator
message PBResponse {
// torch output tensors
map<string, pytorch.eas.ArrayProto> map_outputs = 1;
// fg output features
map<string, string> generate_features = 2;
// all fg input features
map<string, string> raw_features = 3;
// item ids
repeated string item_ids = 4;
}
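To show how these messages fit together, the following is a minimal Python sketch that builds a PBRequest and posts it to a service over HTTP. It assumes modules generated by protoc --python_out=. pytorch_predict.proto torchrec_predict.proto; the endpoint, service name, and token are placeholders:
import requests  # any HTTP client that can POST raw bytes works

import torchrec_predict_pb2 as rec_pb  # generated from torchrec_predict.proto

req = rec_pb.PBRequest()
req.debug_level = 0  # see the debug_level table below

# User features: the key is the user input name.
req.user_features["user_id"].int_feature = 33981
req.user_features["raw_3"].double_list.features.extend([0.2468, 0.0057])

# Items to score.
req.item_ids.extend(["7033", "7034"])

# Context features: one PBFeature per item, in item_ids order.
ctx = req.context_features["id_2"]
for _ in req.item_ids:
    feat = ctx.features.add()
    feat.long_list.features.extend([866])

resp = requests.post(
    "http://<endpoint>/api/predict/<service_name>",  # placeholder service URL
    headers={"Authorization": "<service_token>"},    # placeholder service token
    data=req.SerializeToString(),
)
pb_resp = rec_pb.PBResponse()
pb_resp.ParseFromString(resp.content)
for name, tensor in pb_resp.map_outputs.items():
    print(name, tensor.float_val)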
The following table describes the debug_level parameter.
By default, you do not need to configure the debug_level parameter. Configure this parameter only if you need to perform debugging.
| debug_level | Description |
| --- | --- |
| 0 | The default value. No debugging information is generated. |
| 1 | In normal mode, shape verification is performed on the input and output of FG, and the input and output features are saved. |
| 2 | In normal mode, shape verification is performed on the input and output of FG, the input and output features are saved, and the input tensor of the model service is saved. |
| 100 | In normal mode, the request to call the model service is saved. |
| 102 | In normal mode, shape verification is performed on the input and output of FG, the input and output features are saved, and the input tensor and user embedding information of the model service are saved. |
| 903 | The time required by each phase of a model service call is printed. |
Status codes
The following table describes the status codes that are returned when you call a TorchEasyRec model service. For more information about the status codes, see Appendix: Status codes.
| Status code | Description |
| --- | --- |
| 200 | A model service is successfully called. |
| 400 | The request information is incorrect. |
| 500 | A model service failed to be called. For more information, see the service log. |
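When you send raw protobuf requests as in the sketch above, a simple status check maps these codes to actions. A minimal continuation of that sketch, where resp is the requests.Response object:
if resp.status_code == 200:
    pb_resp = rec_pb.PBResponse()
    pb_resp.ParseFromString(resp.content)
elif resp.status_code == 400:
    print("Bad request:", resp.text)  # inspect and fix the request payload
else:  # 500 and other codes
    print("Service error:", resp.status_code, "- check the service log")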