Deploy and call a TorchEasyRec model service - Platform For AI

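The TorchEasyRec Processor built into EAS deploys recommendation models trained with TorchEasyRec or Torch as scoring services, and it can run feature engineering inside the service. By optimizing feature engineering and the Torch model together, the Processor delivers a high-performance scoring service. This topic describes how to deploy and call a TorchEasyRec model service.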

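Background information

The following figure shows the architecture of a recommendation engine built on the TorchEasyRec Processor:

The TorchEasyRec Processor consists of the following modules:

Item Feature Cache: caches item-side features from FeatureStore in memory. This reduces the network overhead and load caused by FeatureStore requests and improves inference performance. When item-side features include real-time features, FeatureStore keeps the real-time features synchronized.

Feature Generator (FG): the feature generation module. The feature transformation process is defined in a configuration file, and a single C++ code base keeps offline and online feature processing consistent.
TorchModel: the Torch model, that is, the ScriptedModel exported after training with TorchEasyRec or Torch.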

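Limits

Only the general-purpose instance families g6, g7, and g8 are supported, along with GPU models such as T4 and A10. For more information, see General-purpose instance families (g series). If you deploy a GPU service, make sure that the CUDA driver version is 535 or later.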
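On a machine you manage yourself, the driver requirement can be checked with a few lines of code. The following is a minimal sketch, assuming the version string has the usual `major.minor.patch` form reported by `nvidia-smi --query-gpu=driver_version --format=csv,noheader`. The helper name `driver_meets_minimum` is hypothetical and is not part of EAS.

```python
def driver_meets_minimum(version: str, minimum_major: int = 535) -> bool:
    """Return True if a driver version string such as '535.104.05'
    has a major component of at least `minimum_major`."""
    # Only the major component is compared, because the requirement
    # is stated as "535 or later".
    major = int(version.strip().split(".")[0])
    return major >= minimum_major


# Illustrative version strings, not output captured from a real node.
for candidate in ("535.104.05", "470.82.01", "550.54.15"):
    print(candidate, driver_meets_minimum(candidate))
```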

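Version list

The TorchEasyRec Processor is still being iterated on. We recommend deploying inference services with the latest version, which provides more features and higher inference performance. The released versions are listed below:

Processor name	Release date	Torch version	FG version	New features
easyrec-torch-0.1	20240910	2.4	0.2.9	Supports Feature Generator (FG) and the FeatureStore Item Feature Cache. Supports CPU and GPU inference for Torch models. Supports automatic expansion of user-side features with Input_Tile. Supports Faiss vector recall. Supports warmup in normal mode.
easyrec-torch-0.2	20240930	2.4	0.2.9	FeatureDB supports complex types. Speeds up initial data loading from FeatureStore. Optimizes debug_level in bypass mode. Optimizes H2D (host-to-device) copies.
easyrec-torch-0.3	20241014	2.4	0.2.9	FeatureStore supports JSON initialization. Redefines the proto.
easyrec-torch-0.4	20241028	2.4	0.3.1	Fixes an issue with complex types in Feature Generator (FG).
easyrec-torch-0.5	20241114	2.4	0.3.1	Optimizes the offline-online consistency logic. When debug is enabled, the features produced by FG are generated whether or not the item exists.
easyrec-torch-0.6	20241118	2.4	0.3.6	Optimizes the packaging process and removes redundant header files.
easyrec-torch-0.7	20241206	2.5	0.3.9	The sequence primary key supports arrays. Upgrades Torch to 2.5. Upgrades FG to 0.3.9.
easyrec-torch-0.8	20241225	2.5	0.3.9	Upgrades the TensorRT SDK to 2.5. TorchEasyRec model inputs support the int64 type. Upgrades FeatureStore to fix an issue with reading features from Holo. Optimizes debug efficiency and logic. Adds item_features to the proto so that item features can be passed in the request.
easyrec-torch-0.9	20250115	2.5	0.4.1	Upgrades Feature Generator (FG) to 0.4.1 and reduces multi-threaded FG initialization time.
easyrec-torch-1.0	20250206	2.5	0.4.2	Supports Weighted Feature. Upgrades Feature Generator (FG) to 0.4.2. Supports AMD CPUs.
easyrec-torch-1.1	20250423	2.5	0.5.9	Upgrades the FeatureStore SDK: adds support for high-speed FeatureDB connections over VPC and for filtering expired real-time feature data in memory by event_time and ttl. Upgrades Feature Generator (FG): adds support for custom sequence features and fixes issues with combo features.
easyrec-torch-1.2	20250512	2.5	0.6.0	Upgrades FG to 0.6.0. Supports reading features from multiple FeatureStore entities, for example config["fs_entity"] = "item,raw". In debug mode, outputs the item IDs in the request that are not in FeatureStore.
easyrec-torch-1.3	20250529	2.5	0.6.5	Upgrades FG to 0.6.5. Weighted ID features support FSMAP. Supports WordPiece tokenization. Supports the boolean_mask filter operator. Optimizes and upgrades the expression feature operator.
easyrec-torch-1.4	20250715	2.5	0.6.9	Upgrades FG to 0.6.9. Adds several new functions to the expression feature operator. Moves debug string generation from the Processor into the FG library.
easyrec-torch-1.5	20250918	2.5	0.7.3	Upgrades FG to 0.7.3. Supports capturing online requests for model warmup. Upgrades the FeatureStore SDK: supports MaxCompute three-level schema tables, supports zero-trust calls without an AccessKey, and is compatible with adding features to feature views.
easyrec-torch-1.6	20251021	2.5	0.7.4	Optimizes log control to avoid heavy log output that hurts performance when there are many callback requests. Optimizes context feature processing. Feature preprocessing shares a thread pool with FG to save thread resources. Upgrades FG to 0.7.4.
easyrec-torch-1.7	20251104	2.5	0.7.4	Optimizes the save debug tensor logic to prevent callbacks from saving too many files.
easyrec-torch-1.8	20251201	2.5	0.7.4	Optimizes the thread pool of the FeatureStore SDK so that thread creation does not fail when resources are tight.
easyrec-torch-1.9	20260109	2.5	1.0.0	GPU inference supports CUDA multi-stream, which improves throughput and performance. Upgrades FG to 1.0.0.
easyrec-torch-1.10	20260123	2.5	1.0.1	Logs automatically record the execution time of slow requests. Adds a configuration parameter to save request data when a slow request occurs.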
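When picking the latest Processor from this list in a script, note that the names do not sort correctly as plain strings: `easyrec-torch-1.10` is newer than `easyrec-torch-1.9`, but a string comparison ranks it lower. The following is a minimal sketch that compares the numeric parts instead. The helper name `version_key` is hypothetical, and the list holds only a few names copied from the table.

```python
import re


def version_key(processor_name: str) -> tuple:
    """Turn 'easyrec-torch-1.10' into (1, 10) so versions compare numerically."""
    match = re.fullmatch(r"easyrec-torch-(\d+)\.(\d+)", processor_name)
    if match is None:
        raise ValueError(f"unexpected processor name: {processor_name}")
    return int(match.group(1)), int(match.group(2))


released = [
    "easyrec-torch-0.9",
    "easyrec-torch-1.0",
    "easyrec-torch-1.9",
    "easyrec-torch-1.10",
]

latest = max(released, key=version_key)  # numeric comparison
naive_latest = max(released)             # string comparison, picks the wrong one
print(latest, naive_latest)
```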

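Step 1: Deploy the service

Prepare the service configuration file torcheasyrec.json.

Set the processor type to easyrec-torch-{version}, where {version} is chosen from the version list. Sample JSON configurations are shown below:

Example with FG (fg_mode='normal')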

{
  "metadata": {
    "instance": 1,
    "name": "alirec_rank_with_fg",
    "rpc": {
      "enable_jemalloc": 1,
      "max_queue_size": 256,
      "worker_threads": 16
    }
  },
  "cloud": {
        "computing": {
            "instance_type": "ecs.gn6i-c16g1.4xlarge"
        }
  },
  "model_config": {
    "fg_mode": "normal",
    "fg_threads": 8,
    "region": "YOUR_REGION",
    "fs_project": "YOUR_FS_PROJECT",
    "fs_model": "YOUR_FS_MODEL",
    "fs_entity": "item",
    "load_feature_from_offlinestore": true,
    "access_key_id":"YOUR_ACCESS_KEY_ID",
    "access_key_secret":"YOUR_ACCESS_KEY_SECRET"
  },
  "storage": [
    {
      "mount_path": "/home/admin/docker_ml/workspace/model/",
      "oss": {
        "path": "oss://xxx/xxx/export",
        "readOnly": false
      },
      "properties": {
        "resource_type": "code"
      }
    }
  ],
  "processor":"easyrec-torch-0.3"
}

Example without FG (fg_mode='bypass')

{
  "metadata": {
    "instance": 1,
    "name": "alirec_rank_no_fg",
    "rpc": {
      "enable_jemalloc": 1,
      "max_queue_size": 256,
      "worker_threads": 16
    }
  },
  "cloud": {
        "computing": {
            "instance_type": "ecs.gn6i-c16g1.4xlarge"
        }
  },
  "model_config": {
    "fg_mode": "bypass"
  },
  "storage": [
    {
      "mount_path": "/home/admin/docker_ml/workspace/model/",
      "oss": {
        "path": "oss://xxx/xxx/export",
        "readOnly": false
      },
      "properties": {
        "resource_type": "code"
      }
    }
  ],
  "processor":"easyrec-torch-0.3"
}

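The key parameters are described below. For other parameters, see JSON deployment.

Parameter	Required	Description	Example
processor	Yes	The TorchEasyRec Processor.	"processor":"easyrec-torch-0.3"
path	Yes	The Object Storage Service (OSS) path mounted as service storage, used to store the model files.	"path": "oss://examplebucket/xxx/export"
fg_mode	No	The feature engineering mode. Valid values: bypass (default): FG is not used and only the Torch model is deployed. Suitable for scenarios with custom feature processing. In this mode, you do not need to configure the parameters for Processor access to FeatureStore. normal: FG is used. Typically paired with models trained with TorchEasyRec.	"fg_mode": "normal"
fg_threads	No	The number of concurrent threads used to run FG for a single request.	"fg_threads": 15
outputs	No	The names of the output variables predicted by the Torch model, such as probs_ctr. Separate multiple names with commas (,). By default, all variables are output.	"outputs":"probs_ctr,probs_cvr"
item_empty_score	No	The default score returned when an item ID does not exist. Default value: 0.	"item_empty_score": -1
Processor recall parameters
faiss_neigh_num	No	The number of vectors to recall with FAISS. The value is read from the `faiss_neigh_num` field of the request first. If that field is not provided, the `faiss_neigh_num` value in `model_config` is used, which defaults to 1.	"faiss_neigh_num": 200
faiss_nprobe	No	The number of clusters searched during retrieval. Default value: 800. An inverted file index in FAISS partitions the data into many small clusters (groups) and maintains an inverted list for each cluster. A larger `nprobe` value usually yields higher retrieval accuracy at a higher compute cost and longer search time. A smaller value lowers accuracy but speeds up the search.	"faiss_nprobe" : 700
Parameters for Processor access to FeatureStore
fs_project	No	The FeatureStore project name. Required when FeatureStore is used. For more information about FeatureStore, see Configure a FeatureStore project.	"fs_project": "fs_demo"
fs_model	No	The FeatureStore model feature name.	"fs_model": "fs_rank_v1"
fs_entity	No	The FeatureStore entity name.	"fs_entity": "item"
region	No	The region where FeatureStore resides. For example, set cn-beijing for China (Beijing). For more region settings, see Endpoints.	"region": "cn-beijing"
access_key_id	No	The AccessKey ID for FeatureStore.	"access_key_id": "xxxxx"
access_key_secret	No	The AccessKey secret for FeatureStore.	"access_key_secret": "xxxxx"
load_feature_from_offlinestore	No	Whether offline features are read directly from the FeatureStore OfflineStore. Valid values: true: data is read from the FeatureStore OfflineStore. false (default): data is read from the FeatureStore OnlineStore.	"load_feature_from_offlinestore": true
featuredb_username	No	The FeatureDB username.	"featuredb_username":"xxx"
featuredb_password	No	The FeatureDB password.	"featuredb_passwd":"xxx"
input_tile: parameters for automatic feature expansion
INPUT_TILE	No	Enables automatic feature expansion. For features whose value is the same across one request (such as user_id), you pass the value only once, which reduces request size, network transfer time, and compute time. This feature must be used in normal mode together with TorchEasyRec, and the corresponding environment variable must be set at export time. By default, the system reads the INPUT_TILE value from the model_acc.json file in the TorchEasyRec export directory. If that file does not exist, it reads the environment variable. When enabled: Value 2: user-side FG is computed only once. Value 3: user-side FG is computed only once, user and item embeddings are computed separately, and the user-side embedding is computed only once. Suitable when there are many user-side features.	"processor_envs": [ { "name": "INPUT_TILE", "value": "2" } ]
NO_GRAD_GUARD	No	Disables gradient computation during inference: operations are no longer tracked and no computation graph is built. Note: when set to 1, some models may be incompatible. If inference stalls on the second run, add the environment variable `PYTORCH_TENSOREXPR_FALLBACK=2`, which skips the compilation step while keeping some graph optimizations.	"processor_envs": [ { "name": "NO_GRAD_GUARD", "value": "1" } ]
Model warmup parameters
warmup_data_path	No	Enables warmup and specifies the path where warmup files are stored. To keep warmup files from being lost, add an OSS mount to the storage configuration and mount it at this path.	"warmup_data_path": "/warmup"
warmup_cnt_per_file	No	The number of warmup iterations per warmup pb file. Increasing the value makes warmup more thorough but lengthens warmup time. Default value: 20.	"warmup_cnt_per_file": 20
warmup_pb_files_count	No	The number of online requests to save as pb files for warmup at the next startup. Files are saved to the path specified by warmup_data_path. Default value: 64.	"warmup_pb_files_count": 64
Slow request logging and saving
long_request_threshold	No	The time threshold for slow requests, in ms. For requests that exceed the threshold, the execution time of each stage is recorded in the log automatically. Default value: 200.	"long_request_threshold": 200
save_long_request	No	Boolean. Whether to save a request as a pb file when it is slow (exceeds long_request_threshold). Default value: false.	"save_long_request": true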
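The lookup order described for `faiss_neigh_num` (request field first, then `model_config`, then the default of 1) can be sketched as follows. This mirrors the documented order only and is not the Processor's actual code. The function name is hypothetical, and treating `0` or `None` as "not provided" is an assumption based on the fact that an unset proto3 `int32` field reads as 0.

```python
DEFAULT_FAISS_NEIGH_NUM = 1  # default stated in the parameter table


def resolve_faiss_neigh_num(request_value, model_config):
    """Pick the recall size using the documented precedence."""
    # Assumption: a request that does not set the field carries 0 (or None here).
    if request_value:
        return request_value
    # Fall back to model_config, then to the documented default.
    return model_config.get("faiss_neigh_num", DEFAULT_FAISS_NEIGH_NUM)


print(resolve_faiss_neigh_num(200, {"faiss_neigh_num": 50}))
print(resolve_faiss_neigh_num(0, {"faiss_neigh_num": 50}))
print(resolve_faiss_neigh_num(None, {}))
```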

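Deploy the TorchEasyRec model service. Choose either of the following deployment methods:
JSON deployment (recommended)
Procedure:
1. Log on to the PAI console, select the target region at the top of the page, select the target workspace on the right, and then click to enter EAS.
2. On the Elastic Algorithm Service (EAS) page, click Deploy Service, and then in the Custom Model Deployment section, click JSON Deployment.
3. Paste the prepared JSON configuration into the JSON editor, and then click Deploy.
eascmd client deployment
1. Download the client and complete identity authentication. The Windows 64-bit version is used as an example.
2. Open a terminal, go to the directory that contains the JSON file, and run the following command to create the service. For more operations, see Command reference.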
```
eascmdwin64.exe create <service.json>
```
  In the command, replace <service.json> with the name of the JSON file that you created, for example torcheasyrec.json.
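If you maintain several services, it can help to generate the configuration text from code instead of editing JSON by hand. The following is a minimal sketch that rebuilds the bypass-mode sample from Step 1. It uses only keys that appear in that sample; the function name `build_service_config` is hypothetical, and the OSS path is the same placeholder used in the sample.

```python
import json


def build_service_config(name, processor, oss_path, instance_type, model_config):
    """Assemble a service configuration with the same layout as the samples."""
    return {
        "metadata": {
            "instance": 1,
            "name": name,
            "rpc": {"enable_jemalloc": 1, "max_queue_size": 256, "worker_threads": 16},
        },
        "cloud": {"computing": {"instance_type": instance_type}},
        "model_config": model_config,
        "storage": [
            {
                "mount_path": "/home/admin/docker_ml/workspace/model/",
                "oss": {"path": oss_path, "readOnly": False},
                "properties": {"resource_type": "code"},
            }
        ],
        "processor": processor,
    }


config = build_service_config(
    name="alirec_rank_no_fg",
    processor="easyrec-torch-0.3",
    oss_path="oss://xxx/xxx/export",  # placeholder path, as in the sample
    instance_type="ecs.gn6i-c16g1.4xlarge",
    model_config={"fg_mode": "bypass"},
)

# Save this text as torcheasyrec.json before running eascmd.
config_text = json.dumps(config, indent=2)
print(config_text)
```

Serializing with `json.dumps` also guarantees JSON literals such as `false`, which avoids the Python-style `False` slipping into a hand-written file.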

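Step 2: Call the service

After the TorchEasyRec model service is deployed, view the invocation information as follows:

Log on to the PAI console, select the target region at the top of the page, select the target workspace on the right, and then click to enter EAS.
In the Service Type column of the target service, click Invocation Information to view the endpoint and token of the service.

The input and output of the TorchEasyRec model service use the Protobuf format. The calling method depends on whether FG is used:

With FG (fg_mode='normal')

The following two methods are supported:

Use the EAS Java SDK

Before running the code, configure the Maven environment. For details, see Java SDK usage instructions. The latest Java SDK is available at https://github.com/pai-eas/eas-java-sdk. Sample code for calling the alirec_rank_with_fg service: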

package com.aliyun.openservices.eas.predict;

import com.aliyun.openservices.eas.predict.http.Compressor;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.proto.TorchRecPredictProtos;
import com.aliyun.openservices.eas.predict.request.TorchRecRequest;
import com.aliyun.openservices.eas.predict.proto.TorchPredictProtos.ArrayProto;

import java.util.*;


public class TorchRecPredictTest {
    public static PredictClient InitClient() {
        return new PredictClient(new HttpConfig());
    }

    public static TorchRecRequest buildPredictRequest() {
        TorchRecRequest TorchRecRequest = new TorchRecRequest();
        TorchRecRequest.appendItemId("7033");

        TorchRecRequest.addUserFeature("user_id", 33981,"int");

        ArrayList<Double> list = new ArrayList<>();
        list.add(0.24689289764507472);
        list.add(0.005758482924454689);
        list.add(0.6765301324940026);
        list.add(0.18137273055602343);
        TorchRecRequest.addUserFeature("raw_3", list,"List<double>");

        Map<String,Integer> myMap =new LinkedHashMap<>();
        myMap.put("866", 4143);
        myMap.put("1627", 2451);
        TorchRecRequest.addUserFeature("map_1", myMap,"map<string,int>");

        ArrayList<ArrayList<Float>> list2 = new ArrayList<>();
        ArrayList<Float> innerList1 = new ArrayList<>();
        innerList1.add(1.1f);
        innerList1.add(2.2f);
        innerList1.add(3.3f);
        list2.add(innerList1);
        ArrayList<Float> innerList2 = new ArrayList<>();
        innerList2.add(4.4f);
        innerList2.add(5.5f);
        list2.add(innerList2);
        TorchRecRequest.addUserFeature("click", list2,"list<list<float>>");

        TorchRecRequest.addContextFeature("id_2", list,"List<double>");
        TorchRecRequest.addContextFeature("id_2", list,"List<double>");

        System.out.println(TorchRecRequest.request);
        return TorchRecRequest;
    }

    public static void main(String[] args) throws Exception{
        PredictClient client = InitClient();
        client.setToken("tokenGeneratedFromService");
        client.setEndpoint("175805416243****.cn-beijing.pai-eas.aliyuncs.com");
        client.setModelName("alirec_rank_with_fg");
        client.setRequestTimeout(100000);


        testInvoke(client);
        testDebugLevel(client);
        client.shutdown();
    }

    public static void testInvoke(PredictClient client) throws Exception {
        long startTime = System.currentTimeMillis();
        TorchRecPredictProtos.PBResponse response = client.predict(buildPredictRequest());
        for (Map.Entry<String, ArrayProto> entry : response.getMapOutputsMap().entrySet()) {

            System.out.println("Key: " + entry.getKey() + ", Value: " + entry.getValue());
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Spend Time: " + (endTime - startTime) + "ms");

    }

    public static void testDebugLevel(PredictClient client) throws Exception {
        long startTime = System.currentTimeMillis();
        TorchRecRequest request = buildPredictRequest();
        request.setDebugLevel(1);
        TorchRecPredictProtos.PBResponse response = client.predict(request);
        Map<String, String> genFeas = response.getGenerateFeaturesMap();
        for(String itemId: genFeas.keySet()) {
            System.out.println(itemId);
            System.out.println(genFeas.get(itemId));
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Spend Time: " + (endTime - startTime) + "ms");

    }
}

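Where:

client.setToken("tokenGeneratedFromService"): replace the value in parentheses with your service token, for example MmFiMDdlO****wYjhhNjgwZmZjYjBjMTM1YjliZmNkODhjOGVi****.
client.setEndpoint("175805416243****.cn-beijing.pai-eas.aliyuncs.com"): replace the value in parentheses with your service endpoint, for example 175805416243****.cn-beijing.pai-eas.aliyuncs.com.
client.setModelName("alirec_rank_with_fg"): replace the value in parentheses with your service name.

Use the EAS Python SDK

Before running the code, run pip install -U eas-prediction --user to install or update the eas-prediction library. For more configuration details, see Python SDK usage instructions. Sample code: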

from eas_prediction import PredictClient
from eas_prediction.torchrec_request import TorchRecRequest


if __name__ == '__main__':
    endpoint = 'http://localhost:6016'

    client = PredictClient(endpoint, '<your_service_name>')
    client.set_token('<your_service_token>')
    client.init()
    torchrec_req = TorchRecRequest()

    torchrec_req.add_user_fea('user_id', 'u001d', "STRING")
    torchrec_req.add_user_fea('age', 12, "INT")
    torchrec_req.add_user_fea('weight', 129.8, "FLOAT")
    torchrec_req.add_item_id('item_0001')
    torchrec_req.add_item_id('item_0002')
    torchrec_req.add_item_id('item_0003')
    torchrec_req.add_user_fea("raw_3", [0.24689289764507472, 0.005758482924454689, 0.6765301324940026, 0.18137273055602343], "list<double>")
    torchrec_req.add_user_fea("raw_4", [0.9965264740966043, 0.659596586238391, 0.16396649403055896, 0.08364986620265635], "list<double>")
    torchrec_req.add_user_fea("map_1", {"0":0.37845234405201145}, "map<int,float>")
    torchrec_req.add_user_fea("map_2", {"866":4143,"1627":2451}, "map<int,int>")
    torchrec_req.add_context_fea("id_2", [866], "list<int>" )
    torchrec_req.add_context_fea("id_2", [7022,1], "list<int>" )
    torchrec_req.add_context_fea("id_2", [7022,1], "list<int>" )
    torchrec_req.add_user_fea("click", [[0.94433516,0.49145547], [0.94433516, 0.49145597]], "list<list<float>>")

    res = client.predict(torchrec_req)
    print(res)

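The key settings are described below:

endpoint: set this to your service endpoint, for example http://175805416243****.cn-beijing.pai-eas.aliyuncs.com/.
<your_service_name>: replace with your service name.
<your_service_token>: replace with your service token, for example MmFiMDdlO****wYjhhNjgwZmZjYjBjMTM1YjliZmNkODhjOGVi****.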
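In the Python sample above, three item IDs are added and the context feature `id_2` is added three times. That matches the comment in torchrec_predict.proto that context features are given "for each item". The following is a minimal sketch of a pre-flight check based on that reading. It works on plain Python data rather than the SDK request object, the helper name is hypothetical, and the rule that every context feature needs exactly one entry per item is an assumption drawn from the sample, not a documented requirement.

```python
def misaligned_context_features(item_ids, context_features):
    """Return the names of context features whose entry count
    differs from the number of item IDs."""
    return [
        name
        for name, per_item_values in context_features.items()
        if len(per_item_values) != len(item_ids)
    ]


item_ids = ["item_0001", "item_0002", "item_0003"]
# One entry per item, in the same order as item_ids.
context_features = {"id_2": [[866], [7022, 1], [7022, 1]]}

print(misaligned_context_features(item_ids, context_features))  # prints []
```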

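Without FG (fg_mode='bypass')

Use the EAS Java SDK

Before running the code, configure the Maven environment. For details, see Java SDK usage instructions. Sample code for calling the alirec_rank_no_fg service: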

package com.aliyun.openservices.eas.predict;

import java.util.List;
import java.util.Arrays;


import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.request.TorchDataType;
import com.aliyun.openservices.eas.predict.request.TorchRequest;
import com.aliyun.openservices.eas.predict.response.TorchResponse;

public class Test_Torch {
    public static PredictClient InitClient() {
        return new PredictClient(new HttpConfig());
    }

    public static TorchRequest buildPredictRequest() {
        TorchRequest request = new TorchRequest();
        float[] content = new float[2304000];
        for (int i = 0; i < content.length; i++) {
            content[i] = (float) 0.0;
        }
        long[] content_i = new long[900];
        for (int i = 0; i < content_i.length; i++) {
            content_i[i] = 0;
        }

        long[] a = Arrays.copyOfRange(content_i, 0, 300);
        float[] b = Arrays.copyOfRange(content, 0, 230400);
        request.addFeed(0, TorchDataType.DT_INT64, new long[]{300,3}, content_i);
        request.addFeed(1, TorchDataType.DT_FLOAT, new long[]{300,10,768}, content);
        request.addFeed(2, TorchDataType.DT_FLOAT, new long[]{300,768}, b);
        request.addFeed(3, TorchDataType.DT_INT64, new long[]{300}, a);
        request.addFetch(0);
        request.setDebugLevel(903);
        return request;
    }

    public static void main(String[] args) throws Exception {
        PredictClient client = InitClient();
        client.setToken("tokenGeneratedFromService");
        client.setEndpoint("175805416243****.cn-beijing.pai-eas.aliyuncs.com");
        client.setModelName("alirec_rank_no_fg");
        client.setIsCompressed(false);
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < 10; i++) {
            TorchResponse response = null;
            try {
                response = client.predict(buildPredictRequest());
                List<Float> result = response.getFloatVals(0);
                System.out.print("Predict Result: [");
                for (int j = 0; j < result.size(); j++) {
                    System.out.print(result.get(j).floatValue());
                    if (j != result.size() - 1) {
                        System.out.print(", ");
                    }
                }
                System.out.print("]\n");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Spend Time: " + (endTime - startTime) + "ms");
        client.shutdown();
    }
}

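Where:

client.setToken("tokenGeneratedFromService"): replace the value in parentheses with your service token, for example MmFiMDdlO****wYjhhNjgwZmZjYjBjMTM1YjliZmNkODhjOGVi****.
client.setEndpoint("175805416243****.cn-beijing.pai-eas.aliyuncs.com"): replace the value in parentheses with your service endpoint, for example 175805416243****.cn-beijing.pai-eas.aliyuncs.com.
client.setModelName("alirec_rank_no_fg"): replace the value in parentheses with your service name.

Use the EAS Python SDK

Before running the code, run pip install -U eas-prediction --user to install or update the eas-prediction library. For more configuration details, see Python SDK usage instructions. Sample code for calling the alirec_rank_no_fg service: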

from eas_prediction import PredictClient
from eas_prediction import TorchRequest

# snappy data
req = TorchRequest(False)

req.add_feed(0, [300, 3], TorchRequest.DT_INT64, [1] * 900)
req.add_feed(1, [300, 10, 768], TorchRequest.DT_FLOAT, [1.0] * 3 * 768000)
req.add_feed(2, [300, 768], TorchRequest.DT_FLOAT, [1.0] * 3 * 76800)
req.add_feed(3, [300], TorchRequest.DT_INT64, [1] * 300)


client = PredictClient('<your_endpoint>', '<your_service_name>')
client.set_token('<your_service_token>')

client.init()

resp = client.predict(req)
print(resp)

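The key settings are described below:

<your_endpoint>: replace with your service endpoint, for example http://175805416243****.cn-beijing.pai-eas.aliyuncs.com/.
<your_service_name>: replace with your service name.
<your_service_token>: replace with your service token, for example MmFiMDdlO****wYjhhNjgwZmZjYjBjMTM1YjliZmNkODhjOGVi****.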
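In bypass mode each feed passes a shape and a flat list of values, so the number of values has to equal the product of the shape dimensions (for example, 300 x 10 x 768 = 2,304,000). The following is a minimal sketch of a local check you could run before calling `add_feed`. The helper names are hypothetical, and how the service reacts to a mismatch is not covered by this document.

```python
import math


def expected_length(shape):
    """Number of elements a tensor of the given shape holds."""
    return math.prod(shape)


def check_feed(shape, values):
    """Raise ValueError if the flat value list does not fill the shape."""
    needed = expected_length(shape)
    if len(values) != needed:
        raise ValueError(
            f"shape {shape} needs {needed} values, got {len(values)}"
        )


# The same feeds as in the sample above.
check_feed([300, 3], [1] * 900)
check_feed([300, 10, 768], [1.0] * 3 * 768000)
check_feed([300, 768], [1.0] * 3 * 76800)
check_feed([300], [1] * 300)
print(expected_length([300, 10, 768]))
```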

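For details about the status codes returned when you access the service, see Service status codes. You can also build service requests yourself by following the request format.

Request format

Clients can generate the prediction request code from the .proto files. If you want to build service requests yourself, use the following protobuf definitions to generate the code:

pytorch_predict.proto: request definition for Torch models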

syntax = "proto3";

package pytorch.eas;
option cc_enable_arenas = true;
option java_package = "com.aliyun.openservices.eas.predict.proto";
option java_outer_classname = "TorchPredictProtos";

enum ArrayDataType {
  // Not a legal value for DataType. Used to indicate a DataType field
  // has not been set.
  DT_INVALID = 0;
  
  // Data types that all computation devices are expected to be
  // capable to support.
  DT_FLOAT = 1;
  DT_DOUBLE = 2;
  DT_INT32 = 3;
  DT_UINT8 = 4;
  DT_INT16 = 5;
  DT_INT8 = 6;
  DT_STRING = 7;
  DT_COMPLEX64 = 8;  // Single-precision complex
  DT_INT64 = 9;
  DT_BOOL = 10;
  DT_QINT8 = 11;     // Quantized int8
  DT_QUINT8 = 12;    // Quantized uint8
  DT_QINT32 = 13;    // Quantized int32
  DT_BFLOAT16 = 14;  // Float32 truncated to 16 bits.  Only for cast ops.
  DT_QINT16 = 15;    // Quantized int16
  DT_QUINT16 = 16;   // Quantized uint16
  DT_UINT16 = 17;
  DT_COMPLEX128 = 18;  // Double-precision complex
  DT_HALF = 19;
  DT_RESOURCE = 20;
  DT_VARIANT = 21;  // Arbitrary C++ data types
}

// Dimensions of an array
message ArrayShape {
  repeated int64 dim = 1 [packed = true];
}

// Protocol buffer representing an array
message ArrayProto {
  // Data Type.
  ArrayDataType dtype = 1;

  // Shape of the array.
  ArrayShape array_shape = 2;

  // DT_FLOAT.
  repeated float float_val = 3 [packed = true];

  // DT_DOUBLE.
  repeated double double_val = 4 [packed = true];

  // DT_INT32, DT_INT16, DT_INT8, DT_UINT8.
  repeated int32 int_val = 5 [packed = true];

  // DT_STRING.
  repeated bytes string_val = 6;

  // DT_INT64.
  repeated int64 int64_val = 7 [packed = true];

}


message PredictRequest {

  // Input tensors.
  repeated ArrayProto inputs = 1;

  // Output filter.
  repeated int32 output_filter = 2;

  // Input tensors for rec
  map<string, ArrayProto> map_inputs = 3;

  // debug_level for rec
  int32 debug_level = 100;
}

// Response for PredictRequest on successful run.
message PredictResponse {
  // Output tensors.
  repeated ArrayProto outputs = 1;
  // Output tensors for rec.
  map<string, ArrayProto> map_outputs = 2;
}
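The comments in `ArrayProto` say which repeated field carries the values for each data type. The following is a minimal sketch of that mapping as a lookup table, which can be handy when you fill or read an `ArrayProto` by hand. It covers only the types named in those comments; the table and function names are hypothetical.

```python
# dtype name -> ArrayProto field that holds the values,
# taken from the field comments in pytorch_predict.proto.
VALUE_FIELD_BY_DTYPE = {
    "DT_FLOAT": "float_val",
    "DT_DOUBLE": "double_val",
    "DT_INT32": "int_val",
    "DT_INT16": "int_val",
    "DT_INT8": "int_val",
    "DT_UINT8": "int_val",
    "DT_STRING": "string_val",
    "DT_INT64": "int64_val",
}


def value_field_for(dtype_name):
    """Return the ArrayProto field name used for the given dtype."""
    try:
        return VALUE_FIELD_BY_DTYPE[dtype_name]
    except KeyError:
        # The message defines no value field for the remaining enum members.
        raise ValueError(f"no value field documented for {dtype_name}") from None


print(value_field_for("DT_INT64"))
print(value_field_for("DT_UINT8"))
```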

torchrec_predict.proto: request definition for Torch model + FG

syntax = "proto3";

option go_package = ".;torch_predict_protos";
option java_package = "com.aliyun.openservices.eas.predict.proto";
option java_outer_classname = "TorchRecPredictProtos";
package com.alibaba.pairec.processor;
import "pytorch_predict.proto";

//long->others
message LongStringMap {
  map<int64, string> map_field = 1;
}
message LongIntMap {
  map<int64, int32> map_field = 1;
}
message LongLongMap {
  map<int64, int64> map_field = 1;
}
message LongFloatMap {
  map<int64, float> map_field = 1;
}
message LongDoubleMap {
  map<int64, double> map_field = 1;
}

//string->others
message StringStringMap {
  map<string, string> map_field = 1;
}
message StringIntMap {
  map<string, int32> map_field = 1;
}
message StringLongMap {
  map<string, int64> map_field = 1;
}
message StringFloatMap {
  map<string, float> map_field = 1;
}
message StringDoubleMap {
  map<string, double> map_field = 1;
}

//int32->others
message IntStringMap {
  map<int32, string> map_field = 1;
}
message IntIntMap {
  map<int32, int32> map_field = 1;
}
message IntLongMap {
  map<int32, int64> map_field = 1;
}
message IntFloatMap {
  map<int32, float> map_field = 1;
}
message IntDoubleMap {
  map<int32, double> map_field = 1;
}

// list
message IntList {
  repeated int32 features = 1;
}
message LongList {
  repeated int64 features  = 1;
}

message FloatList {
  repeated float features = 1;
}
message DoubleList {
  repeated double features = 1;
}
message StringList {
  repeated string features = 1;
}

// lists
message IntLists {
  repeated IntList lists = 1;
}
message LongLists {
  repeated LongList lists = 1;
}

message FloatLists {
  repeated FloatList lists = 1;
}
message DoubleLists {
  repeated DoubleList lists = 1;
}
message StringLists {
  repeated StringList lists = 1;
}

message PBFeature {
  oneof value {
    int32 int_feature = 1;
    int64 long_feature = 2;
    string string_feature = 3;
    float float_feature = 4;
    double double_feature=5;

    LongStringMap long_string_map = 6; 
    LongIntMap long_int_map = 7; 
    LongLongMap long_long_map = 8; 
    LongFloatMap long_float_map = 9; 
    LongDoubleMap long_double_map = 10; 
    
    StringStringMap string_string_map = 11; 
    StringIntMap string_int_map = 12; 
    StringLongMap string_long_map = 13; 
    StringFloatMap string_float_map = 14; 
    StringDoubleMap string_double_map = 15; 

    IntStringMap int_string_map = 16; 
    IntIntMap int_int_map = 17; 
    IntLongMap int_long_map = 18; 
    IntFloatMap int_float_map = 19; 
    IntDoubleMap int_double_map = 20; 

    IntList int_list = 21; 
    LongList long_list =22;
    StringList string_list = 23;
    FloatList float_list = 24;
    DoubleList double_list = 25;

    IntLists int_lists = 26;
    LongLists long_lists =27;
    StringLists string_lists = 28;
    FloatLists float_lists = 29;
    DoubleLists double_lists = 30;
    
  }
}

// context features
message ContextFeatures {
  repeated PBFeature features = 1;
}

// PBRequest specifies the request for aggregator
message PBRequest {
  // debug mode
  int32 debug_level = 1;

  // user features, key is user input name
  map<string, PBFeature> user_features = 2;

  // item ids
  repeated string item_ids = 3;

  // context features for each item, key is context input name 
  map<string, ContextFeatures> context_features = 4;

  // number of nearest neighbors(items) to retrieve
  // from faiss
  int32 faiss_neigh_num = 5;

  // item features for each item, key is item input name 
  map<string, ContextFeatures> item_features = 6;
}

// PBResponse specifies the response for aggregator
message PBResponse {
  // torch output tensors
  map<string, pytorch.eas.ArrayProto> map_outputs = 1;

  // fg output features
  map<string, string> generate_features = 2;

  // all fg input features
  map<string, string> raw_features = 3;

  // item ids
  repeated string item_ids = 4;

}
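The 30 members of the `PBFeature` oneof follow a regular naming pattern: `<type>_feature` for scalars, `<key>_<value>_map` for maps, `<type>_list` for lists, and `<type>_lists` for nested lists. The following is a minimal sketch that derives the field name from a type string such as `list<double>`. The type-string syntax is borrowed from the SDK samples above, but whether the SDKs map strings to fields in exactly this way is an assumption, as is the use of `long` for int64. The function name is hypothetical.

```python
import re

SCALARS = ["int", "long", "string", "float", "double"]
MAP_KEYS = ["long", "string", "int"]

# All oneof member names defined in PBFeature, rebuilt from the naming pattern.
PB_FEATURE_FIELDS = (
    {f"{s}_feature" for s in SCALARS}
    | {f"{k}_{v}_map" for k in MAP_KEYS for v in SCALARS}
    | {f"{s}_list" for s in SCALARS}
    | {f"{s}_lists" for s in SCALARS}
)


def pb_feature_field(type_string):
    """Map a type string such as 'map<string,int>' to a PBFeature field name."""
    t = type_string.replace(" ", "").lower()
    nested = re.fullmatch(r"list<list<(\w+)>>", t)
    flat = re.fullmatch(r"list<(\w+)>", t)
    mapping = re.fullmatch(r"map<(\w+),(\w+)>", t)
    if nested:
        name = f"{nested.group(1)}_lists"
    elif flat:
        name = f"{flat.group(1)}_list"
    elif mapping:
        name = f"{mapping.group(1)}_{mapping.group(2)}_map"
    else:
        name = f"{t}_feature"
    if name not in PB_FEATURE_FIELDS:
        raise ValueError(f"no PBFeature field for type {type_string!r}")
    return name


for example in ("STRING", "List<double>", "map<string,int>", "list<list<float>>"):
    print(example, "->", pb_feature_field(example))
```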

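The debug_level values are described below:

Note

No configuration is needed by default. Pass debug_level only when you need to debug.

debug_level	Description
0	The service predicts normally.
1	In normal mode, validates the request keys and the shapes of the FG inputs and outputs, and saves the input and output features. No prediction is made.
2	In normal mode, validates the request keys and the shapes of the FG inputs and outputs, saves the input features, the output features, and the model input tensors, and makes a prediction.
3	In normal mode, validates the request keys and the shapes of the FG inputs and outputs, and outputs the features. No prediction is made.
100	In normal mode, saves the prediction request.
102	In normal mode, performs vector recall, validates the request keys and the shapes of the FG inputs and outputs, and saves the input features, the output features, the model input tensors, and the user embedding results.
903	Prints the prediction time of each stage.
904	Checks the request for missing feature fields and records them in the log.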
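Because some debug levels skip prediction, client code that expects scores should know which level it sent. The following is a minimal sketch that encodes only what the table states: levels 0 and 2 predict, levels 1 and 3 do not, and the table is silent for the rest. The names are hypothetical.

```python
KNOWN_LEVELS = {0, 1, 2, 3, 100, 102, 903, 904}

# Levels for which the table states explicitly whether a prediction is made.
PREDICTS = {0: True, 1: False, 2: True, 3: False}


def prediction_expected(debug_level):
    """Return True or False where the table says so, or None where it does not."""
    if debug_level not in KNOWN_LEVELS:
        raise ValueError(f"debug_level {debug_level} is not listed in the table")
    return PREDICTS.get(debug_level)


for level in sorted(KNOWN_LEVELS):
    print(level, prediction_expected(level))
```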

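Service status codes

The main status codes that may be returned when you access a TorchEasyRec service are described below. For more status codes returned by EAS services, see Appendix: Service status codes and common errors.

Status code	Description
200	The service returned normally.
400	The request input is invalid.
500	Prediction failed. Check the service logs for details.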
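A client can turn these three codes into a next step for whoever is on call. The following is a minimal sketch with hypothetical action labels. It covers only the codes in the table and reports any other code as unknown.

```python
# Status code -> suggested next step, following the table above.
NEXT_STEP = {
    200: "ok",
    400: "fix_request",         # the request input is invalid
    500: "check_service_logs",  # prediction failed on the service side
}


def next_step(status_code):
    """Return a short action label for a status code from the table."""
    return NEXT_STEP.get(status_code, "unknown")


for code in (200, 400, 500, 404):
    print(code, next_step(code))
```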
