Platform For AI: Built-in processors

Last Updated: Apr 01, 2026

A processor is a package of online prediction logic. Elastic Algorithm Service (EAS) provides built-in processors to deploy standard models, eliminating the need to develop this logic yourself.

The following table lists the processor names and codes in EAS. Provide the processor code when deploying a service with EASCMD.

| Processor name | Processor code, CPU edition (EASCMD only) | Processor code, GPU edition (EASCMD only) | Reference |
|---|---|---|---|
| EasyRec | easyrec-2.4 | easyrec-2.4 | EasyRec processor |
| TorchEasyRec | easyrec-torch-1.0 | easyrec-torch-1.0 | TorchEasyRec processor |
| PMML | pmml | None | PMML processor |
| TensorFlow 1.12 | tensorflow_cpu_1.12 | tensorflow_gpu_1.12 | TensorFlow 1.12 processor |
| TensorFlow 1.14 | tensorflow_cpu_1.14 | tensorflow_gpu_1.14 | TensorFlow 1.14 processor |
| TensorFlow 1.15 | tensorflow_cpu_1.15 | tensorflow_gpu_1.15 | TensorFlow 1.15 processor (includes the PAI-Blade Agility Edition optimization engine) |
| TensorFlow 2.3 | tensorflow_cpu_2.3 | None | TensorFlow 2.3 processor |
| PyTorch 1.6 | pytorch_cpu_1.6 | pytorch_gpu_1.6 | PyTorch 1.6 processor (includes the PAI-Blade Agility Edition optimization engine) |
| Caffe | caffe_cpu | caffe_gpu | Caffe processor |
| Parameter Server | parameter_server | None | Parameter Server processor |
| Alink | alink_pai_processor | None | None |
| xNN | xnn_cpu | None | None |
| EasyVision | easy_vision_cpu_tf1.12_torch151 | easy_vision_gpu_tf1.12_torch151 | EasyVision processor |
| EasyTransfer | easytransfer_cpu | easytransfer_gpu | EasyTransfer processor |
| EasyNLP | easynlp | easynlp | EasyNLP processor |
| EasyCV | easycv | easycv | EasyCV processor |
| Blade | blade_cpu | blade_cuda10.0_beta | None |
| MediaFlow | None | mediaflow | MediaFlow processor |
| Triton | None | triton | Triton processor |

PMML processor

The PMML processor in EAS:

  • Loads a PMML model file as a service.

  • Processes requests to the model service.

  • Calculates and returns prediction results to the client.

The PMML processor provides a default strategy for handling missing values. If no isMissing policy is specified for the feature columns in the PMML model file, the system imputes them with the following defaults.

| Type | Default |
|---|---|
| BOOLEAN | false |
| DOUBLE | 0.0 |
| FLOAT | 0.0 |
| INT | 0 |
| STRING | "" |

Deploy a PMML model in any of the following ways:

  • Console

    Set the Processor Type parameter to PMML. For more information, see Deploy a model service by using the console.

  • EASCMD client

    In the service.json configuration file, set processor to pmml. EAS allocates 4 GB of memory per CPU core (1 Quota). Example:

    {
      "processor": "pmml",
      "generate_token": "true",
      "model_path": "http://xxxxx/lr.pmml",
      "name": "eas_lr_example",
      "metadata": {
        "instance": 1,
        "cpu": 1
      }
    }
  • Data Science Workshop (DSW)

    Similar to using the EASCMD client. Create a service.json configuration file. For more information, see Deploy a model service by using EASCMD.
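
The default missing-value strategy described earlier in this section can be expressed as a small lookup. The following Python sketch only illustrates the rule and is not the processor's implementation; the function name fill_missing and the schema dictionary format are hypothetical.

```python
# Defaults the PMML processor applies when a feature column has no
# missing-value policy, copied from the defaults table in this section.
PMML_DEFAULTS = {
    "BOOLEAN": False,
    "DOUBLE": 0.0,
    "FLOAT": 0.0,
    "INT": 0,
    "STRING": "",
}


def fill_missing(record, schema):
    """Return a copy of record with absent or None features set to defaults.

    schema maps feature name -> PMML type name (hypothetical input format).
    """
    filled = dict(record)
    for name, pmml_type in schema.items():
        if filled.get(name) is None:
            filled[name] = PMML_DEFAULTS[pmml_type]
    return filled


schema = {"age": "INT", "income": "DOUBLE", "vip": "BOOLEAN", "city": "STRING"}
print(fill_missing({"age": 31, "income": None}, schema))
# {'age': 31, 'income': 0.0, 'vip': False, 'city': ''}
```

Features that are already present keep their values; only absent or None entries are replaced.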

TensorFlow 1.12 processor

The EAS TensorFlow 1.12 processor loads TensorFlow models in SavedModel (recommended) or SessionBundle format. Convert Keras and Checkpoint models to SavedModel format before deployment. For more information, see TensorFlow FAQ.

Note

This processor does not support custom TensorFlow operations.

Deploy a TensorFlow model in one of the following ways:

  • Console

    Set Processor Type to TensorFlow1.12. For more information, see Deploy a custom inference service.

  • EASCMD client

    In the service.json configuration file, set processor to tensorflow_cpu_1.12 or tensorflow_gpu_1.12. Select the code based on deployment resources. A mismatch between processor and resource type causes deployment failure. Example:

    {
      "name": "tf_serving_test",
      "generate_token": "true",
      "model_path": "http://xxxxx/savedmodel_example.zip",
      "processor": "tensorflow_cpu_1.12",
      "metadata": {
        "instance": 1,
        "cpu": 1,
        "gpu": 0,
        "memory": 2000
      }
    }
  • DSW

    Similar to using the EASCMD client. Create a service.json configuration file. For more information, see Deploy model services by using EASCMD.
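
Because a mismatch between the processor code and the resource type only surfaces as a deployment failure, it can help to check service.json locally first. The sketch below is a hypothetical pre-check and not part of EASCMD. It assumes the naming convention visible in the processor codes on this page (a _cpu or _gpu marker) and treats metadata.gpu greater than 0 as a GPU deployment. Configurations that select GPUs through cloud.computing.instance_type are not covered.

```python
import json


def expected_edition(config):
    """Return "gpu" if the config requests GPUs in metadata, else "cpu"."""
    return "gpu" if config.get("metadata", {}).get("gpu", 0) > 0 else "cpu"


def check_processor(config):
    """Raise ValueError when the processor code names the other edition.

    Codes without a _cpu/_gpu marker (for example pmml) are not checked.
    """
    processor = config["processor"]
    edition = expected_edition(config)
    other = "cpu" if edition == "gpu" else "gpu"
    if f"_{other}_" in processor or processor.endswith(f"_{other}"):
        raise ValueError(
            f"processor {processor!r} does not match a {edition} deployment"
        )
    return processor


service = json.loads("""
{
  "name": "tf_serving_test",
  "processor": "tensorflow_cpu_1.12",
  "metadata": {"instance": 1, "cpu": 1, "gpu": 0, "memory": 2000}
}
""")
print(check_processor(service))  # tensorflow_cpu_1.12
```

The same check applies to the other processors on this page that have separate CPU and GPU codes.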

TensorFlow 1.14 processor

The EAS TensorFlow 1.14 processor loads TensorFlow models in SavedModel (recommended) or SessionBundle format. Convert Keras and Checkpoint models to SavedModel format before deployment. For more information, see TensorFlow FAQ.

Note

This processor does not support custom TensorFlow operations.

Deploy a TensorFlow model in one of the following ways:

  • Console

    Set Processor Type to TensorFlow1.14. For more information, see Deploy a custom inference service.

  • EASCMD client

    In the service.json configuration file, set processor to tensorflow_cpu_1.14 or tensorflow_gpu_1.14. Select the code that matches your deployment resources. A mismatch between processor and resource type causes deployment failure. Example:

    {
      "name": "tf_serving_test",
      "generate_token": "true",
      "model_path": "http://xxxxx/savedmodel_example.zip",
      "processor": "tensorflow_cpu_1.14",
      "metadata": {
        "instance": 1,
        "cpu": 1,
        "gpu": 0,
        "memory": 2000
      }
    }
  • DSW

    Similar to using the EASCMD client. Create a service.json configuration file. For more information, see Deploy model services by using EASCMD.

TensorFlow 1.15 processor (PAI-Blade Agility Edition)

The EAS TensorFlow 1.15 processor loads TensorFlow models in SavedModel (recommended) or SessionBundle format. Convert Keras and Checkpoint models to SavedModel format before deployment. For more information, see TensorFlow FAQ.

Note
  • This processor does not support custom TensorFlow operations.

  • This processor includes the PAI-Blade Agility Edition optimization engine for deploying PAI-Blade-optimized TensorFlow models.

Deploy a TensorFlow model in one of the following ways:

  • Console

    Set Processor Type to TensorFlow1.15. For more information, see Deploy a custom inference service.

  • EASCMD

    In the service.json configuration file, set processor to tensorflow_cpu_1.15 or tensorflow_gpu_1.15. Select the code that matches your deployment resources. A mismatch between processor and resource type causes deployment failure. Example:

    {
      "name": "tf_serving_test",
      "generate_token": "true",
      "model_path": "http://xxxxx/savedmodel_example.zip",
      "processor": "tensorflow_cpu_1.15",
      "metadata": {
        "instance": 1,
        "cpu": 1,
        "gpu": 0,
        "memory": 2000
      }
    }
  • DSW

    Similar to using EASCMD. Create a service.json configuration file. For more information, see Deploy model services by using EASCMD. For parameter descriptions, see Create a service.

TensorFlow 2.3 processor

The EAS TensorFlow 2.3 processor loads TensorFlow models in SavedModel (recommended) or SessionBundle format. Convert Keras and Checkpoint models to SavedModel format before deployment. For more information, see TensorFlow FAQ.

Note

This processor does not support custom TensorFlow operations.

Deploy a TensorFlow model in one of the following ways:

  • Console

    Set Processor Type to TensorFlow2.3. For more information, see Deploy a service by using the console.

  • EASCMD

    In the service.json configuration file, set processor to tensorflow_cpu_2.3. Example:

    {
      "name": "tf_serving_test",
      "generate_token": "true",
      "model_path": "http://xxxxx/savedmodel_example.zip",
      "processor": "tensorflow_cpu_2.3",
      "metadata": {
        "instance": 1,
        "cpu": 1,
        "gpu": 0,
        "memory": 2000
      }
    }
  • DSW

    Similar to using EASCMD. Create a service.json configuration file. For more information, see Deploy model services by using EASCMD.

PyTorch 1.6 processor (PAI-Blade Agility Edition)

The EAS PyTorch 1.6 processor loads models in TorchScript format. For more information, see the official TorchScript documentation.

Note
  • This processor does not support PyTorch extensions or non-tensor model inputs and outputs.

  • This processor includes the PAI-Blade (Agility Edition) optimization engine for deploying optimized PyTorch models.

Deploy a TorchScript model in one of the following ways:

  • Console

    Set Processor Type to PyTorch 1.6. For more information, see Deploy a custom inference service.

  • EASCMD client

    In the service.json configuration file, set processor to pytorch_cpu_1.6 or pytorch_gpu_1.6. Select a value based on deployment resources. A mismatch between processor and resource type causes deployment failure. Example:

    {
      "name": "pytorch_serving_test",
      "generate_token": "true",
      "model_path": "http://xxxxx/torchscript_model.pt",
      "processor": "pytorch_gpu_1.6",
      "metadata": {
        "instance": 1,
        "cpu": 1,
        "gpu": 1,
        "cuda": "10.0",
        "memory": 2000
      }
    }
  • DSW

    Similar to using the EASCMD client. Create a service.json configuration file. For more information, see Deploy model services by using EASCMD. For parameter descriptions, see Create a service.

Caffe processor

The EAS Caffe processor loads deep learning models trained with Caffe. Specify the model and weight file names in the model package.

Note

This processor does not support custom data layers.

Deploy a Caffe model in one of the following ways:

  • Console

    Set Processor Type to Caffe. For more information, see Deploy a custom inference service.

  • EASCMD client

    In the service.json configuration file, set processor to caffe_cpu or caffe_gpu based on the resource type. A mismatch between processor and resource type causes deployment failure. Example:

    {
      "name": "caffe_serving_test",
      "generate_token": "true",
      "model_path": "http://xxxxx/caffe_model.zip",
      "processor": "caffe_cpu",
      "model_config": {
        "model": "deploy.prototxt",
        "weight": "bvlc_reference_caffenet.caffemodel"
      },
      "metadata": {
        "instance": 1,
        "cpu": 1,
        "gpu": 0,
        "memory": 2000
      }
    }
  • DSW

    Similar to using the EASCMD client. Create a service.json configuration file. For more information, see Deploy model services by using EASCMD.
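
Since model_config must name files that exist in the model package, a quick local check before upload can catch typos. The following sketch assumes the package is a zip archive and that model and weight are paths relative to the archive root. Both the helper name and that layout are assumptions, not documented EAS behavior.

```python
import io
import zipfile


def missing_from_package(archive, model_config):
    """Return the model_config file names that are absent from the zip."""
    with zipfile.ZipFile(archive) as package:
        names = set(package.namelist())
    wanted = [model_config["model"], model_config["weight"]]
    return [name for name in wanted if name not in names]


# Build a tiny in-memory package to demonstrate the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("deploy.prototxt", "name: 'demo'")

config = {"model": "deploy.prototxt", "weight": "bvlc_reference_caffenet.caffemodel"}
print(missing_from_package(buffer, config))
# ['bvlc_reference_caffenet.caffemodel']
```

An empty list means both files named in model_config were found in the archive.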

PS processor

The EAS PS (Parameter Server) processor loads models in the PS format.

The following describes how to deploy a PS model and send requests to the service.

  • Deploy a PS model in one of the following ways:

    • Console

      Set Processor Type to PS Algorithm. For more information, see Custom deployment.

    • EASCMD client

      In the service.json configuration file, set processor to parameter_sever.

      {
        "name":"ps_smart",
        "model_path": "oss://examplebucket/xlab_m_pai_ps_smart_b_1058272_v0.tar.gz",
        "processor": "parameter_sever",
        "metadata": {
          "region": "beijing",
          "cpu": 1,
          "instance": 1,
          "memory": 2048
        }
      }
    • DSW

      Similar to using the EASCMD client. Create a service.json configuration file. For more information, see Deploy model services using the EASCMD client.

  • Request format

    The processor supports both single and batch predictions. The request format is the same: a JSON array of feature objects.

    • Single request example

      curl "http://eas.location/api/predict/ps_smart" -d '[
        {
          "f0": 1,
          "f1": 0.2,
          "f3": 0.5
        }
      ]'
    • Batch request example

      curl "http://eas.location/api/predict/ps_smart" -d '[
        {
          "f0": 1,
          "f1": 0.2,
          "f3": 0.5
        },
        {
          "f0": 1,
          "f1": 0.2,
          "f3": 0.5
        }
      ]'
    • Response

      The response format is the same for single and batch requests: an array of response objects. Each response object corresponds to the request object at the same position.

      [
        {
          "label":"xxxx",
          "score" : 0.2,
          "details" : [{"k1":0.3}, {"k2":0.5}]
        },
        {
          "label":"xxxx",
          "score" : 0.2,
          "details" : [{"k1":0.3}, {"k2":0.5}]
        }
      ]
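
Because responses are matched to requests by position, client code can pair them with zip. The sketch below uses only the Python standard library. The endpoint is the placeholder from the curl examples, build_request and pair_results are hypothetical helpers, and the demonstration uses a sample response string instead of a live call.

```python
import json
import urllib.request

ENDPOINT = "http://eas.location/api/predict/ps_smart"  # placeholder endpoint


def build_request(rows):
    """Encode one or more feature objects as the JSON array the service expects."""
    body = json.dumps(list(rows)).encode("utf-8")
    return urllib.request.Request(ENDPOINT, data=body, method="POST")


def pair_results(rows, response_text):
    """Match each request object with the response at the same index."""
    results = json.loads(response_text)
    if len(results) != len(rows):
        raise ValueError("response count does not match request count")
    return list(zip(rows, results))


rows = [{"f0": 1, "f1": 0.2, "f3": 0.5}, {"f0": 1, "f1": 0.2, "f3": 0.5}]
request = build_request(rows)  # send with urllib.request.urlopen(request)

# Sample response text in the documented format, used instead of a live call.
sample = '[{"label": "xxxx", "score": 0.2}, {"label": "xxxx", "score": 0.2}]'
for row, result in pair_results(rows, sample):
    print(row["f0"], result["label"], result["score"])
```

A single prediction uses the same helpers with a one-element list.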

EasyTransfer processor

The EAS EasyTransfer processor loads TensorFlow-based NLP models trained with EasyTransfer.

Deploy an EasyTransfer model in one of the following ways:

  • Console

    Select EasyTransfer for the Processor Type parameter. For more information, see Deploy a custom inference service.

  • EASCMD client

    In the service.json configuration file, set processor to easytransfer_cpu or easytransfer_gpu based on deployment resources. A mismatch between processor and resources causes deployment failure. In model_config, set type to the model type used during training. The following example uses a text classification model. For other parameters, see Create a service.

    • Configuration for GPU deployment (using a public resource group as an example)

      {
        "name": "et_app_demo",
        "metadata": {
          "instance": 1
        },
        "cloud": {
          "computing": {
            "instance_type": "ecs.gn6i-c4g1.xlarge"
          }
        },
        "model_path": "http://xxxxx/your_model.zip",
        "processor": "easytransfer_gpu",
        "model_config": {
          "type": "text_classify_bert"
        }
      }
    • Configuration for CPU deployment

      {
        "name": "et_app_demo",
        "model_path": "http://xxxxx/your_model.zip",
        "processor": "easytransfer_cpu",
        "model_config": {
          "type":"text_classify_bert"
        },
        "metadata": {
          "instance": 1,
          "cpu": 1,
          "memory": 4000
        }
      }

    Supported task types:

    | Task type | Value of type |
    |---|---|
    | Text matching | text_match_bert |
    | Text classification | text_classify_bert |
    | Sequence labeling | sequence_labeling_bert |
    | Text vectorization | vectorization_bert |

EasyNLP processor

The EAS EasyNLP processor loads PyTorch-based NLP models trained with EasyNLP.

Deploy an EasyNLP model in one of the following ways:

  • Console

    Set Processor Type to EasyNLP. For more information, see Deploy a custom inference service.

  • EASCMD client

    In the service.json configuration file, set processor to easynlp. In model_config, set type to the training task type. The following example uses a single-label text classification model. For other parameters, see Create a service.

    {
      "name": "easynlp_app_demo",
      "metadata": {
        "instance": 1
      },
      "cloud": {
        "computing": {
          "instance_type": "ecs.gn6i-c4g1.xlarge"
        }
      },
      "model_config": {
        "app_name": "text_classify",
        "type": "text_classify"
      },
      "model_path": "http://xxxxx/your_model.tar.gz",
      "processor": "easynlp"
    }

    Supported task types:

    | Task type | Value |
    |---|---|
    | Single-label text classification | text_classify |
    | Multi-label text classification | text_classify_multi |
    | Text matching | text_match |
    | Sequence labeling | sequence_labeling |
    | Text vectorization | vectorization |
    | Chinese text summarization (GPU) | sequence_generation_zh |
    | English text summarization (GPU) | sequence_generation_en |
    | Machine reading comprehension (Chinese) | machine_reading_comprehension_zh |
    | Machine reading comprehension (English) | machine_reading_comprehension_en |
    | WUKONG_CLIP (GPU) | wukong_clip |
    | CLIP (GPU) | clip |

After deployment, on the Elastic Algorithm Service (EAS) page, click Invocation Information in the Service Type column of the target service to view the endpoint and token. Call the service using the following Python example.

import requests
# Replace with your service endpoint.
url = '<eas-service-url>'
# Replace with your token.
token = '<eas-service-token>'
# Prepare the request data. The following example is for text classification.
request_body = {
    "first_sequence": "hello"
}
 
headers = {"Authorization": token}
resp = requests.post(url=url, headers=headers, json=request_body)
print(resp.content.decode())

EasyCV processor

The EAS EasyCV processor loads deep learning models trained with EasyCV.

Deploy an EasyCV model in one of the following ways:

  • Console

    Set Processor Type to EasyCV. For more information, see Deploy a custom inference service.

  • EASCMD client

    In the service.json configuration file, set processor to easycv. In model_config, set type to the model type used during training. The following example uses an image classification model. For other parameters, see Create a service.

    {
      "name": "easycv_classification_example",
      "processor": "easycv",
      "model_path": "oss://examplebucket/epoch_10_export.pt",
      "model_config": {"type":"TorchClassifier"},
      "metadata": {
        "instance": 1
      },
      "cloud": {
        "computing": {
          "instance_type": "ecs.gn5i-c4g1.xlarge"
        }
      }
    }

    Supported job types:

    | Job type | model_config |
    |---|---|
    | Image classification | {"type":"TorchClassifier"} |
    | Object detection | {"type":"DetectionPredictor"} |
    | Semantic segmentation | {"type":"SegmentationPredictor"} |
    | YOLOX | {"type":"YoloXPredictor"} |
    | Video classification | {"type":"VideoClassificationPredictor"} |

After deployment, go to the Elastic Algorithm Service (EAS) page. Find the service, and in the Service Type column, click Invocation Information to view the endpoint and token. The following Python example shows how to call the service.

import base64
import json

import requests

# Download a sample image and Base64-encode it for transmission.
resp = requests.get('http://examplebucket.oss-cn-zhangjiakou.aliyuncs.com/images/000000123213.jpg')
ENCODING = 'utf-8'
datas = json.dumps({
    "image": base64.b64encode(resp.content).decode(ENCODING)
})
# Replace with your authentication token.
head = {
    "Authorization": "NTFmNDJlM2E4OTRjMzc3OWY0NzI3MTg5MzZmNGQ5Yj***"
}
for x in range(0, 10):
    # Replace with your service endpoint.
    resp = requests.post("http://150231884461***.cn-hangzhou.pai-eas.aliyuncs.com/api/predict/easycv_classification_example", data=datas, headers=head)
    print(resp.text)

Base64-encode the image or video data for transmission. Use the image key for image data and the video key for video data.
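
The key selection can be wrapped in a small helper. This is a sketch under the assumption that the request body is a JSON object with a single image or video key, as in the example above; build_payload is a hypothetical name.

```python
import base64
import json


def build_payload(data, kind="image"):
    """Base64-encode raw bytes and place them under the image or video key."""
    if kind not in ("image", "video"):
        raise ValueError("kind must be 'image' or 'video'")
    encoded = base64.b64encode(data).decode("utf-8")
    return json.dumps({kind: encoded})


# The bytes here stand in for real file content read from disk or a URL.
payload = build_payload(b"\x89PNG", kind="image")
print(payload)
```

The returned string can be passed as the data argument of requests.post in the example above.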

EasyVision processor

The EAS EasyVision processor loads deep learning models trained with EasyVision.

Deploy an EasyVision model in one of the following ways:

  • Console

    Set Processor Type to EasyVision. For more information, see Deploy a custom inference service.

  • EASCMD client

    In the service.json configuration file, set processor to easy_vision_cpu_tf1.12_torch151 or easy_vision_gpu_tf1.12_torch151. Select the code that matches your deployment resources. A mismatch between processor and resource type causes deployment failure. In model_config, set type to the model type used for training. For other parameters, see Create a service. Examples:

    • Configuration for GPU deployment

      {
        "name": "ev_app_demo",
        "processor": "easy_vision_gpu_tf1.12_torch151",
        "model_path": "oss://path/to/your/model",
        "model_config": "{\"type\":\"classifier\"}",
        "metadata": {
          "resource": "your_resource_name",
          "cuda": "9.0",
          "instance": 1,
          "memory": 4000,
          "gpu": 1,
          "cpu": 4,
          "rpc.worker_threads" : 5
        }
      }
    • Configuration for CPU deployment

      {
        "name": "ev_app_cpu_demo",
        "processor": "easy_vision_cpu_tf1.12_torch151",
        "model_path": "oss://path/to/your/model",
        "model_config": "{\"type\":\"classifier\"}",
        "metadata": {
          "resource": "your_resource_name",
          "instance": 1,
          "memory": 4000,
          "gpu": 0,
          "cpu": 4,
          "rpc.worker_threads" : 5
        }
      }

MediaFlow processor

The EAS MediaFlow processor is an orchestration engine for analyzing and processing video, audio, and images.

Deploy a MediaFlow model in one of the following ways:

  • Console

    Set Processor Type to MediaFlow. For more information, see Deploy a custom inference service.

  • EASCMD client

    In the service.json configuration file, set processor to mediaflow. For other fields, see Create a service. This processor also requires the following fields in model_config:

    • graph_pool_size: Number of graph pools.

    • worker_threads: Number of worker threads.

    Examples:

    • Configuration for deploying a video classification model.

      {
        "model_entry": "video_classification/video_classification_ext.js",
        "name": "video_classification",
        "model_path": "oss://path/to/your/model",
        "generate_token": "true",
        "processor": "mediaflow",
        "model_config": {
          "graph_pool_size": 8,
          "worker_threads": 16
        },
        "metadata": {
          "eas.handlers.disable_failure_handler": true,
          "resource": "your_resource_name",
          "rpc.worker_threads": 30,
          "rpc.enable_jemalloc": true,
          "rpc.keepalive": 500000,
          "cpu": 4,
          "instance": 1,
          "cuda": "9.0",
          "rpc.max_batch_size": 64,
          "memory": 10000,
          "gpu": 1
        }
      }
    • Configuration for deploying an automated speech recognition (ASR) model.

      {
        "model_entry": "asr/video_asr_ext.js",
        "name": "video_asr",
        "model_path": "oss://path/to/your/model",
        "generate_token": "true",
        "processor": "mediaflow",
        "model_config": {
          "graph_pool_size": 8,
          "worker_threads": 16
        },
        "metadata": {
          "eas.handlers.disable_failure_handler": true,
          "resource": "your_resource_name",
          "rpc.worker_threads": 30,
          "rpc.enable_jemalloc": true,
          "rpc.keepalive": 500000,
          "cpu": 4,
          "instance": 1,
          "cuda": "9.0",
          "rpc.max_batch_size": 64,
          "memory": 10000,
          "gpu": 1
        }
      }

    The configurations for ASR and video classification differ mainly in model_entry, name, and model_path. Modify these fields for your model.
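
Since only those three fields change between the two examples, the shared part can be kept in one template. The following is a minimal sketch, assuming the remaining fields are reused unchanged; mediaflow_config is a hypothetical helper and the values are copied from the examples above.

```python
import copy
import json

# Fields shared by both MediaFlow examples above.
BASE_CONFIG = {
    "generate_token": "true",
    "processor": "mediaflow",
    "model_config": {"graph_pool_size": 8, "worker_threads": 16},
    "metadata": {
        "eas.handlers.disable_failure_handler": True,
        "resource": "your_resource_name",
        "rpc.worker_threads": 30,
        "rpc.enable_jemalloc": True,
        "rpc.keepalive": 500000,
        "cpu": 4,
        "instance": 1,
        "cuda": "9.0",
        "rpc.max_batch_size": 64,
        "memory": 10000,
        "gpu": 1,
    },
}


def mediaflow_config(name, model_entry, model_path):
    """Return a service.json dict that differs only in the three given fields."""
    config = copy.deepcopy(BASE_CONFIG)
    config.update(name=name, model_entry=model_entry, model_path=model_path)
    return config


asr = mediaflow_config("video_asr", "asr/video_asr_ext.js", "oss://path/to/your/model")
print(json.dumps(asr, indent=2))
```

The deep copy keeps later edits to one generated configuration from leaking into the template or into other configurations.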

Triton processor

Triton Inference Server is an NVIDIA online serving framework. It provides an interface for deploying and managing models on GPUs and is compatible with the KFServing API standard. Key features:

  • Deploys models from various frameworks, such as TensorFlow, PyTorch, ONNX Runtime, TensorRT, and custom backends.

  • Runs multiple models concurrently on a GPU to improve utilization.

  • Supports HTTP/gRPC protocols and binary format extension to reduce request size.

  • Supports Dynamic Batching to improve service throughput.

Triton Inference Server is available on EAS as a built-in Triton processor.

Note
  • The Triton processor is in public preview and is available only in the China (Shanghai) region.

  • All models must be stored in OSS. Activate OSS and upload your model files to an OSS bucket first. For more information, see Simple Upload.

Deploy and call a Triton processor service.

  • Deploy with the Triton processor

    Triton model services can be deployed only by using EASCMD. For more information, see Create a service. In the service.json configuration file, set processor to triton. Because Triton retrieves models from OSS, configure the required OSS parameters. Example service.json:

    {
      "name": "triton_test",
      "processor": "triton",
      "processor_params": [
        "--model-repository=oss://triton-model-repo/models",
        "--allow-http=true"
      ],
      "metadata": {
        "instance": 1,
        "cpu": 4,
        "gpu": 1,
        "memory": 10000,
        "resource": "<your resource id>"
      }
    }

    Triton-specific parameters are listed below. For other parameters, see Parameters in service.json.

    | Parameter | Description |
    |---|---|
    | processor_params | Parameters passed to Triton Server at startup. Unsupported parameters are automatically filtered out. Supported parameters are listed in Table 1. model-repository is required. For optional parameters, see main.cc. |
    | oss_endpoint | OSS endpoint. If not specified, the system uses OSS in the same region as the EAS service. Specify this for cross-region OSS. For values, see Regions and Endpoints. |
    | resource (in metadata) | ID of the EAS exclusive resource group for deploying the model service. The Triton processor requires an EAS exclusive resource group. For more information, see Use EAS exclusive resource groups. |

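    Table 1. Supported parameters for the Triton server

    | Parameter | Required | Description |
    |---|---|---|
    | model-repository | Yes | The path must be an OSS path. The bucket root directory cannot be used as the model-repository; specify a subdirectory in the bucket. For example, in oss://triton-model-repo/models, triton-model-repo is the bucket name and models is a subdirectory in the bucket. |
    | log-verbose | No | For more information, see main.cc. |
    | log-info | No | For more information, see main.cc. |
    | log-warning | No | For more information, see main.cc. |
    | log-error | No | For more information, see main.cc. |
    | exit-on-error | No | For more information, see main.cc. |
    | strict-model-config | No | For more information, see main.cc. |
    | strict-readiness | No | For more information, see main.cc. |
    | allow-http | No | For more information, see main.cc. |
    | http-thread-count | No | For more information, see main.cc. |
    | pinned-memory-pool-byte-size | No | For more information, see main.cc. |
    | cuda-memory-pool-byte-size | No | For more information, see main.cc. |
    | min-supported-compute-capability | No | For more information, see main.cc. |
    | buffer-manager-thread-count | No | For more information, see main.cc. |
    | backend-config | No | For more information, see main.cc. |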

  • Call the service with the native Triton client

    Install NVIDIA's official Triton client:

    pip3 install nvidia-pyindex
    pip3 install tritonclient[all]

    Download a test image:

    wget http://pai-blade.oss-cn-zhangjiakou.aliyuncs.com/doc-assets/cat.png

    Send a binary-format request to the Triton processor service using the Python client:

    import numpy as np
    import time
    from PIL import Image
    
    import tritonclient.http as httpclient
    from tritonclient.utils import InferenceServerException
    
    URL = "<service url>"  # Replace <service url> with your service endpoint.
    HEADERS = {"Authorization": "<service token>"} # Replace <service token> with your service access token.
    input_img = httpclient.InferInput("input", [1, 299, 299, 3], "FP32")
    img = Image.open('./cat.png').resize((299, 299))
    img = np.asarray(img).astype('float32') / 255.0
    input_img.set_data_from_numpy(img.reshape([1, 299, 299, 3]), binary_data=True)
    
    output = httpclient.InferRequestedOutput(
        "InceptionV3/Predictions/Softmax", binary_data=True
    )
    triton_client = httpclient.InferenceServerClient(url=URL, verbose=False)
    
    start = time.time()
    for i in range(10):
        results = triton_client.infer(
            "inception_graphdef", inputs=[input_img], outputs=[output], headers=HEADERS
        )
        res_body = results.get_response()
        elapsed_ms = (time.time() - start) * 1000
        if i == 0:
            print("model name: ", res_body["model_name"])
            print("model version: ", res_body["model_version"])
            print("output name: ", res_body["outputs"][0]["name"])
            print("output shape: ", res_body["outputs"][0]["shape"])
        print("[{}] Avg rt(ms): {:.2f}".format(i, elapsed_ms))
        start = time.time()
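
On the deployment side, the rules for processor_params (only the parameters in Table 1 are kept, and model-repository must point to a subdirectory of an OSS bucket) can be checked before running EASCMD. The following is a rough local sketch, not the filtering that EAS itself performs; the function name and error messages are made up for illustration.

```python
from urllib.parse import urlparse

# Parameter names from Table 1.
SUPPORTED = {
    "model-repository", "log-verbose", "log-info", "log-warning", "log-error",
    "exit-on-error", "strict-model-config", "strict-readiness", "allow-http",
    "http-thread-count", "pinned-memory-pool-byte-size",
    "cuda-memory-pool-byte-size", "min-supported-compute-capability",
    "buffer-manager-thread-count", "backend-config",
}


def clean_processor_params(params):
    """Drop unsupported flags and validate the required model-repository."""
    kept = []
    repository = None
    for param in params:
        name, _, value = param.lstrip("-").partition("=")
        if name not in SUPPORTED:
            continue  # imitates the documented filtering of unsupported flags
        if name == "model-repository":
            repository = value
        kept.append(param)
    if repository is None:
        raise ValueError("model-repository is required")
    parsed = urlparse(repository)
    if parsed.scheme != "oss" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError("model-repository must be oss://<bucket>/<subdirectory>")
    return kept


print(clean_processor_params([
    "--model-repository=oss://triton-model-repo/models",
    "--allow-http=true",
    "--not-a-real-flag=1",
]))
# ['--model-repository=oss://triton-model-repo/models', '--allow-http=true']
```

The returned list can be written to the processor_params field of service.json.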