A processor is a package that contains online prediction logic. Elastic Algorithm Service (EAS) of Machine Learning Platform for AI (PAI) provides a set of commonly used processors as built-in official processors. You can use these official processors to deploy common models without developing the online prediction logic yourself.

The following table describes the names and codes of processors that EAS provides. If you use the EASCMD client to deploy a model, a processor code is required.
Processor name | Processor code (CPU edition) | Processor code (GPU edition) | References
PMML | pmml | N/A | PMML processor
TensorFlow1.12 | tensorflow_cpu_1.12 | tensorflow_gpu_1.12 | TensorFlow1.12 processor
TensorFlow1.14 | tensorflow_cpu_1.14 | tensorflow_gpu_1.14 | TensorFlow1.14 processor
TensorFlow1.15 | tensorflow_cpu_1.15 | tensorflow_gpu_1.15 | TensorFlow1.15 processor with a built-in optimization engine based on PAI-Blade of the agility edition
TensorFlow2.3 | tensorflow_cpu_2.3 | N/A | TensorFlow2.3 processor
PyTorch1.6 | pytorch_cpu_1.6 | pytorch_gpu_1.6 | PyTorch1.6 processor with a built-in optimization engine based on PAI-Blade of the agility edition
Caffe | caffe_cpu | caffe_gpu | Caffe processor
Parameter server algorithm | parameter_sever | N/A | N/A
Alink | alink_pai_processor | N/A | N/A
xNN | xnn_cpu | N/A | N/A
EasyVision | easy_vision_cpu_tf1.12_torch151 | easy_vision_gpu_tf1.12_torch151 | EasyVision processor
EasyNLP | easy_nlp_cpu_tf1.12 | easy_nlp_gpu_tf1.12 | EasyNLP processor
Blade | blade_cpu | blade_cuda10.0_beta | N/A
MediaFlow | N/A | mediaflow | MediaFlow processor
Triton | N/A | triton | Triton processor

PMML processor

Predictive Model Markup Language (PMML) is an XML-based language that is used to describe predictive models. Traditional machine learning models that are trained in Machine Learning Studio can be exported in the PMML format. To export a PMML model from Machine Learning Studio, perform the following steps:
  1. Before model training, choose Settings > General in the left-side navigation pane of Machine Learning Studio and select Auto Generate PMML.
  2. After model training is complete, right-click the model training node in the canvas and choose Model Option > Export PMML.
Note In Machine Learning Studio, the following algorithms can be used to generate PMML models: Gradient Boosting Decision Tree (GBDT) for binary classification, Support Vector Machine (SVM), logistic regression for binary classification, logistic regression for multiclass classification, random forest, k-means clustering, linear regression, GBDT regression, and scorecard training.
The built-in PMML processor in EAS provides the following features:
  • Loads a model file of the PMML type as a service.
  • Processes requests that are sent to call a model service.
  • Calculates the request result based on the model and returns the result to the client.
The PMML processor provides a default policy to impute missing values. If the isMissing policy is not specified for the feature fields in the PMML model file, the following values are imputed by default.
DataType | Default imputed value
BOOLEAN | false
DOUBLE | 0.0
FLOAT | 0.0
INT | 0
STRING | ""
You can deploy PMML models by using one of the following methods:
  • Deploy models in the PAI console

    Set the Processor Type parameter to PMML. For more information, see Upload and deploy models in the console.

  • Use Machine Learning Studio to deploy models

    For more information, see Use Machine Learning Studio to deploy models.

  • Use the EASCMD client to deploy models
    In the service.json service configuration file, set the processor parameter to pmml. The following sample code shows you how to modify the service configuration file:
    {
      "processor": "pmml",
      "generate_token": "true",
      "model_path": "http://xxxxx/lr.pmml",
      "name": "eas_lr_example",
      "metadata": {
        "instance": 1,
        "cpu": 1 # Configure a quota of 4 GB memory for each CPU. 
      }
    }
  • Use DSW to deploy models

    Modify the service.json service configuration file. This method is similar to the method of deploying models by using the EASCMD client. For more information, see Deploy models.
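After the PMML model service is deployed, you can send prediction requests to it over HTTP. The following Python sketch uses the requests library and assumes that the PMML processor accepts a JSON array in which each element maps feature names to feature values; the endpoint URL, token, and feature names are placeholders that you must replace with the values of your own service.
    # A minimal sketch of calling a deployed PMML service over HTTP.
    # The endpoint URL, token, and feature names are placeholders.
    import json
    import requests

    SERVICE_URL = "http://<your-endpoint>/api/predict/eas_lr_example"  # Placeholder endpoint.
    TOKEN = "<service token>"  # Placeholder token.

    # Assumption: the request body is a JSON array in which each element maps
    # feature names (as defined in the PMML file) to feature values.
    payload = [{"feature_1": 1.0, "feature_2": 0.5}]

    response = requests.post(
        SERVICE_URL,
        headers={"Authorization": TOKEN},
        data=json.dumps(payload),
        timeout=10,
    )
    response.raise_for_status()
    print(response.json())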

TensorFlow1.12 processor

The TensorFlow1.12 processor that EAS provides can load TensorFlow models in the SavedModel or SessionBundle format. We recommend that you use the SavedModel format. You must convert a Keras or Checkpoint model to a SavedModel model before deployment. For more information, see Export TensorFlow models in the SavedModel format.
Note This official processor does not support custom TensorFlow operations.
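For reference, the following minimal sketch shows one way to export a TensorFlow 1.x graph in the SavedModel format by using tf.saved_model.simple_save. The graph is an arbitrary example; replace the input and output tensors with those of your own model, and compress the export directory into a .zip package before you upload it.
    # Minimal sketch: export a TensorFlow 1.x graph as a SavedModel.
    # The graph below is an arbitrary example.
    import tensorflow as tf

    x = tf.placeholder(tf.float32, shape=[None, 10], name="input")
    w = tf.Variable(tf.random_normal([10, 1]), name="weight")
    y = tf.matmul(x, w, name="output")

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Writes the graph and variables to ./savedmodel_example with a
        # default serving signature built from the inputs and outputs maps.
        tf.saved_model.simple_save(
            sess,
            "./savedmodel_example",
            inputs={"input": x},
            outputs={"output": y},
        )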
You can deploy TensorFlow models by using one of the following methods:
  • Deploy models in the PAI console

    Set the Processor Type parameter to TensorFlow1.12. For more information, see Upload and deploy models in the console.

  • Use Machine Learning Studio to deploy models

    For more information, see Use Machine Learning Studio to deploy models.

  • Use the EASCMD client to deploy models
    In the service.json service configuration file, set the processor parameter to tensorflow_cpu_1.12 or tensorflow_gpu_1.12 based on the service resources. If the value of the processor parameter does not match the resources, a deployment error occurs. The following sample code shows you how to modify the service configuration file:
    {
      "name": "tf_serving_test",
      "generate_token": "true",
      "model_path": "http://xxxxx/savedmodel_example.zip",
      "processor": "tensorflow_cpu_1.12",
      "metadata": {
        "instance": 1,
        "cpu": 1,
        "gpu": 0,
        "memory": 2000
      }
    }
  • Use DSW to deploy models

    Modify the service.json service configuration file. This method is similar to the method of deploying models by using the EASCMD client. For more information, see Deploy models.
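After the TensorFlow service is deployed, you can send requests to it. The following sketch assumes the eas-prediction Python SDK (installed by running pip install eas-prediction); the endpoint, service name, token, signature name, and tensor names are placeholders that you must replace with the values of your own service and model.
    # Minimal sketch: call a deployed TensorFlow service by using the
    # eas-prediction SDK (assumed to be installed: pip install eas-prediction).
    # The endpoint, service name, token, and tensor names are placeholders.
    from eas_prediction import PredictClient, TFRequest

    client = PredictClient("http://<your-endpoint>", "tf_serving_test")
    client.set_token("<service token>")
    client.init()

    request = TFRequest("serving_default")  # Signature name of the SavedModel.
    # Feed a 1 x 10 float tensor named "input"; adjust to your model's inputs.
    request.add_feed("input", [1, 10], TFRequest.DT_FLOAT, [0.1] * 10)
    request.add_fetch("output")

    response = client.predict(request)
    print(response)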

TensorFlow1.14 processor

The TensorFlow1.14 processor that EAS provides can load TensorFlow models in the SavedModel or SessionBundle format. We recommend that you use the SavedModel format. You must convert a Keras or Checkpoint model to a SavedModel model before deployment. For more information, see Export TensorFlow models in the SavedModel format.
Note This official processor does not support custom TensorFlow operations.
You can deploy TensorFlow models by using one of the following methods:
  • Deploy models in the PAI console

    Set the Processor Type parameter to TensorFlow1.14. For more information, see Upload and deploy models in the console.

  • Use Machine Learning Studio to deploy models

    For more information, see Use Machine Learning Studio to deploy models.

  • Use the EASCMD client to deploy models
    In the service.json service configuration file, set the processor parameter to tensorflow_cpu_1.14 or tensorflow_gpu_1.14 based on the service resources. If the value of the processor parameter does not match the resources, a deployment error occurs. The following sample code shows you how to modify the service configuration file:
    {
      "name": "tf_serving_test",
      "generate_token": "true",
      "model_path": "http://xxxxx/savedmodel_example.zip",
      "processor": "tensorflow_cpu_1.14",
      "metadata": {
        "instance": 1,
        "cpu": 1,
        "gpu": 0,
        "memory": 2000
      }
    }
  • Use DSW to deploy models

    Modify the service.json service configuration file. This method is similar to the method of deploying models by using the EASCMD client. For more information, see Deploy models.

TensorFlow1.15 processor with a built-in optimization engine based on PAI-Blade of the agility edition

The TensorFlow1.15 processor that EAS provides can load TensorFlow models in the SavedModel or SessionBundle format. We recommend that you use the SavedModel format. You must convert a Keras or Checkpoint model to a SavedModel model before deployment. For more information, see Export TensorFlow models in the SavedModel format.
Note
  • This official processor does not support custom TensorFlow operations.
  • The TensorFlow1.15 processor provides a built-in optimization engine based on PAI-Blade of the agility edition. You can use this processor to deploy TensorFlow models that have been optimized by PAI-Blade of the agility edition.
You can deploy TensorFlow models by using one of the following methods:
  • Deploy models in the PAI console

    Set the Processor Type parameter to TensorFlow1.15. For more information, see Upload and deploy models in the console.

  • Use Machine Learning Studio to deploy models

    For more information, see Use Machine Learning Studio to deploy models.

  • Use the EASCMD client to deploy models
    In the service.json service configuration file, set the processor parameter to tensorflow_cpu_1.15 or tensorflow_gpu_1.15 based on the service resources. If the value of the processor parameter does not match the resources, a deployment error occurs. The following sample code shows you how to modify the service configuration file:
    {
      "name": "tf_serving_test",
      "generate_token": "true",
      "model_path": "http://xxxxx/savedmodel_example.zip",
      "processor": "tensorflow_cpu_1.15",
      "metadata": {
        "instance": 1,
        "cpu": 1,
        "gpu": 0,
        "memory": 2000
      }
    }
  • Use DSW to deploy models

    Modify the service.json service configuration file. This method is similar to the method of deploying models by using the EASCMD client. For more information, see Deploy models. For more information about the parameters in the service configuration file, see Create a service.

TensorFlow2.3 processor

The TensorFlow2.3 processor that EAS provides can load TensorFlow models in the SavedModel or SessionBundle format. We recommend that you use the SavedModel format. You must convert a Keras or Checkpoint model to a SavedModel model before deployment. For more information, see Export TensorFlow models in the SavedModel format.
Note This official processor does not support custom TensorFlow operations.
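For reference, the following minimal sketch exports a Keras model to the SavedModel format in TensorFlow 2.x; the model architecture is an arbitrary example, and the export directory can be compressed into a .zip package for upload.
    # Minimal sketch: export a Keras model as a SavedModel in TensorFlow 2.x.
    # The model architecture below is an arbitrary example.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation="relu", input_shape=(10,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

    # Writes a SavedModel directory that can be compressed into a .zip
    # package and referenced by the model_path parameter.
    tf.saved_model.save(model, "./savedmodel_example")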
You can deploy TensorFlow models by using one of the following methods:
  • Deploy models in the PAI console

    Set the Processor Type parameter to TensorFlow2.3. For more information, see Upload and deploy models in the console.

  • Use Machine Learning Studio to deploy models

    For more information, see Use Machine Learning Studio to deploy models.

  • Use the EASCMD client to deploy models
    In the service.json service configuration file, set the processor parameter to tensorflow_cpu_2.3. The following sample code shows you how to modify the service configuration file:
    {
      "name": "tf_serving_test",
      "generate_token": "true",
      "model_path": "http://xxxxx/savedmodel_example.zip",
      "processor": "tensorflow_cpu_2.3",
      "metadata": {
        "instance": 1,
        "cpu": 1,
        "gpu": 0,
        "memory": 2000
      }
    }
  • Use DSW to deploy models

    Modify the service.json service configuration file. The method is similar to the method of deploying models by using the EASCMD client. For more information, see Deploy models.

PyTorch1.6 processor with a built-in optimization engine based on PAI-Blade of the agility edition

The PyTorch1.6 processor that EAS provides can load models in the TorchScript format. For more information, see TorchScript.
Note
  • This official processor does not support PyTorch extensions. Models whose inputs or outputs are of non-tensor types are not supported.
  • The PyTorch1.6 processor provides a built-in optimization engine based on PAI-Blade of the agility edition. You can use this processor to deploy PyTorch models that have been optimized by PAI-Blade of the agility edition.
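For reference, the following minimal sketch converts a PyTorch model to the TorchScript format through tracing; the model and the example input are arbitrary placeholders.
    # Minimal sketch: convert a PyTorch model to TorchScript through tracing.
    # The model and the example input are arbitrary placeholders.
    import torch
    import torchvision

    model = torchvision.models.resnet18(pretrained=True)
    model.eval()

    example_input = torch.rand(1, 3, 224, 224)
    traced = torch.jit.trace(model, example_input)

    # The resulting .pt file can be referenced by the model_path parameter.
    traced.save("torchscript_model.pt")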
You can deploy TorchScript models by using one of the following methods:
  • Deploy models in the PAI console

    Set the Processor Type parameter to PyTorch1.6. For more information, see Upload and deploy models in the console.

  • Use Machine Learning Studio to deploy models

    For more information, see Use Machine Learning Studio to deploy models.

  • Use the EASCMD client to deploy models
    In the service.json service configuration file, set the processor parameter to pytorch_cpu_1.6 or pytorch_gpu_1.6 based on the service resources. If the value of the processor parameter does not match the resources, a deployment error occurs. The following sample code shows you how to modify the service configuration file:
    {
      "name": "pytorch_serving_test",
      "generate_token": "true",
      "model_path": "http://xxxxx/torchscript_model.pt",
      "processor": "pytorch_gpu_1.6",
      "metadata": {
        "instance": 1,
        "cpu": 1,
        "gpu": 1,
        "cuda": "10.0",
        "memory": 2000
      }
    }
  • Use DSW to deploy models

    Modify the service.json service configuration file. This method is similar to the method of deploying models by using the EASCMD client. For more information, see Deploy models. For more information about the parameters in the service configuration file, see Create a service.
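After the PyTorch service is deployed, you can call it in a similar way. The following sketch assumes the eas-prediction Python SDK; the endpoint, service name, token, and input shape are placeholders that you must replace with the values of your own service and model.
    # Minimal sketch: call a deployed PyTorch service by using the
    # eas-prediction SDK (assumed to be installed: pip install eas-prediction).
    # The endpoint, service name, token, and input shape are placeholders.
    from eas_prediction import PredictClient, TorchRequest

    client = PredictClient("http://<your-endpoint>", "pytorch_serving_test")
    client.set_token("<service token>")
    client.init()

    request = TorchRequest()
    # Inputs and outputs of TorchScript models are addressed by index.
    request.add_feed(0, [1, 3, 224, 224], TorchRequest.DT_FLOAT, [0.5] * (3 * 224 * 224))
    request.add_fetch(0)

    response = client.predict(request)
    print(response)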

Caffe processor

The Caffe processor that EAS provides can load deep learning models that are trained by the Caffe framework. Because the Caffe framework is highly flexible, you must specify the names of the model file and the weight file in the model package when you deploy a model.
Note This official processor does not support custom data layers.
You can deploy Caffe models by using one of the following methods:
  • Deploy models in the PAI console

    Set the Processor Type parameter to Caffe. For more information, see Upload and deploy models in the console.

  • Use Machine Learning Studio to deploy models

    For more information, see Use Machine Learning Studio to deploy models.

  • Use the EASCMD client to deploy models
    In the service.json service configuration file, set the processor parameter to caffe_cpu or caffe_gpu based on the service resources. If the value of the processor parameter does not match the resources, a deployment error occurs. The following sample code shows you how to modify the service configuration file:
    {
      "name": "caffe_serving_test",
      "generate_token": "true",
      "model_path": "http://xxxxx/caffe_model.zip",
      "processor": "caffe_cpu",
      "model_config": {
        "model": "deploy.prototxt",
        "weight": "bvlc_reference_caffenet.caffemodel"
      },
      "metadata": {
        "instance": 1,
        "cpu": 1,
        "gpu": 0,
        "memory": 2000
      }
    }
  • Use DSW to deploy models

    Modify the service.json service configuration file. This method is similar to the method of deploying models by using the EASCMD client. For more information, see Deploy models.
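For reference, the following sketch packages the Caffe network definition and weight files into a .zip archive that can be uploaded and referenced by the model_path parameter; the file names match the model_config example above.
    # Minimal sketch: package a Caffe model definition and its weights into
    # a .zip archive for upload. The file names match the model_config above.
    import zipfile

    with zipfile.ZipFile("caffe_model.zip", "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write("deploy.prototxt")
        zf.write("bvlc_reference_caffenet.caffemodel")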

EasyNLP processor

The EasyNLP processor that EAS provides can load deep learning natural language processing (NLP) models that are trained by the EasyTransfer framework.

You can deploy EasyTransfer models by using one of the following methods:
  • Deploy models in the PAI console

    Set the Processor Type parameter to EasyNLP. For more information, see Upload and deploy models in the console.

  • Use the EASCMD client to deploy models
    In the service.json service configuration file, set the processor parameter to easy_nlp_cpu_tf1.12 or easy_nlp_gpu_tf1.12 based on the service resources. If the value of the processor parameter does not match the resources, a deployment error occurs. Set the type field that is nested under model_config to the type of the model that is used for training. The following sample code shows you how to modify the service configuration file. For more information about other parameters, see Create a service.
    • Deploy models by using the EasyNLP processor of the GPU edition
      {
        "name": "ev_app_demo",
        "generate_token": "true",
        "model_path": "http://xxxxx/your_model.zip",
        "processor": "easy_nlp_gpu_tf1.12",
        "model_config": "{\"type\":\"text_classify_bert\"}",
        "metadata": {
          "resource": "your_resource_name",
          "cuda": "9.0",
          "instance": 1,
          "memory": 4000,
          "gpu": 1,
          "cpu": 4,
          "rpc.worker_threads" : 5
        }
      }
    • Deploy models by using the EasyNLP processor of the CPU edition
      {
        "name": "easynlp_serving_test",
        "generate_token": "true",
        "model_path": "http://xxxxx/your_model.zip",
        "processor": "easy_nlp_cpu_tf1.12",
        "model_config": "{\"type\":\"text_classify_bert\"}",
        "metadata": {
          "resource": "your_resource_name",
          "instance": 1,
          "gpu": 0,
          "cpu": 4,
          "rpc.worker_threads" : 5
        }
      }

EasyVision processor

The EasyVision processor that EAS provides can load deep learning models that are trained by the EasyVision framework.

You can deploy EasyVision models by using one of the following methods:
  • Deploy models in the PAI console

    Set the Processor Type parameter to EasyVision. For more information, see Upload and deploy models in the console.

  • Use the EASCMD client to deploy models
    In the service.json service configuration file, set the processor parameter to easy_vision_cpu_tf1.12_torch151 or easy_vision_gpu_tf1.12_torch151 based on the service resources. If the value of the processor parameter does not match the resources, a deployment error occurs. Set the type field that is nested under model_config to the type of the model that is used for training. The following sample code shows you how to modify the service configuration file. For more information about other parameters, see Create a service.
    • Deploy models by using the EasyVision processor of the GPU edition
      {
        "name": "ev_app_demo",
        "processor": "easy_vision_gpu_tf1.12_torch151",
        "model_path": "oss://path/to/your/model",
        "model_config": "{\"type\":\"classifier\"}",
        "metadata": {
          "resource": "your_resource_name",
          "cuda": "9.0",
          "instance": 1,
          "memory": 4000,
          "gpu": 1,
          "cpu": 4,
          "rpc.worker_threads" : 5
        }
      }
    • Deploy models by using the EasyVision processor of the CPU edition
      {
        "name": "ev_app_cpu_demo",
        "processor": "easy_vision_cpu_tf1.12_torch151",
        "model_path": "oss://path/to/your/model",
        "model_config": "{\"type\":\"classifier\"}",
        "metadata": {
          "resource": "your_resource_name",
          "instance": 1,
          "memory": 4000,
          "gpu": 0,
          "cpu": 4,
          "rpc.worker_threads" : 5
        }
      }

MediaFlow processor

The MediaFlow processor that EAS provides is a general-purpose orchestration engine that can analyze and process videos, audio, and images.

You can deploy MediaFlow models by using one of the following methods:
  • Deploy models in the PAI console

    Set the Processor Type parameter to MediaFlow. For more information, see Upload and deploy models in the console.

  • Use the EASCMD client to deploy models
    In the service.json service configuration file, set the processor parameter to mediaflow. In addition, you must set the following parameters to deploy models by using the MediaFlow processor. For more information about other parameters, see Create a service.
    • graph_pool_size: the number of graph pools.
    • worker_threads: the number of worker threads.
    The following sample code shows you how to modify the service configuration file:
    • Deploy a model for video classification
      {
        "model_entry": "video_classification/video_classification_ext.js", 
        "name": "video_classification", 
        "model_path": "oss://path/to/your/model", 
        "generate_token": "true", 
        "processor": "mediaflow", 
        "model_config" : {
            "graph_pool_size":8,
            "worker_threads":16
        },
        "metadata": {
          "eas.handlers.disable_failure_handler" :true,
          "resource": "your_resource_name", 
            "rpc.worker_threads": 30,
            "rpc.enable_jemalloc": true,
          "rpc.keepalive": 500000, 
          "cpu": 4, 
          "instance": 1, 
          "cuda": "9.0", 
          "rpc.max_batch_size": 64, 
          "memory": 10000, 
          "gpu": 1 
        }
      }
    • Deploy a model for speech recognition
      {
        "model_entry": "asr/video_asr_ext.js", 
        "name": "video_asr", 
        "model_path": "oss://path/to/your/model", 
        "generate_token": "true", 
        "processor": "mediaflow", 
        "model_config" : {
            "graph_pool_size":8,
            "worker_threads":16
        },
        "metadata": {
          "eas.handlers.disable_failure_handler" :true,
          "resource": "your_resource_name", 
            "rpc.worker_threads": 30,
            "rpc.enable_jemalloc": true,
          "rpc.keepalive": 500000, 
          "cpu": 4, 
          "instance": 1, 
          "cuda": "9.0", 
          "rpc.max_batch_size": 64, 
          "memory": 10000, 
          "gpu": 1 
        }
      }
    In the service.json service configuration file, the values of the model_entry, name, and model_path parameters are different for video classification and speech recognition. You must modify the parameters based on the type of model that you want to deploy.

Triton processor

Triton Inference Server is a new-generation online inference serving framework launched by NVIDIA. It simplifies the deployment and management of GPU-based models and is compatible with the API standards of KFServing. In addition, Triton Inference Server provides the following features:
  • Supports multiple open source frameworks such as TensorFlow, PyTorch, ONNX Runtime, TensorRT, and custom framework backends.
  • Runs models concurrently on GPUs to maximize GPU utilization.
  • Supports the HTTP and gRPC protocols and allows you to send a request in binary format to reduce the request size.
  • Supports the dynamic batching feature to improve service throughput.
EAS integrates Triton Inference Server into the built-in Triton processor.
Note
  • The Triton processor is in public preview in the China (Shanghai) region. Other regions do not support the processor.
  • The models that are deployed by using the Triton processor must be stored in Object Storage Service (OSS). Therefore, you must activate OSS before you use the Triton processor and upload model files to OSS. For more information about how to upload objects to OSS, see Upload objects.
  • Only exclusive resource groups in EAS support the Triton processor.
The following content describes how to use the Triton processor to deploy a model as a service and how to call the service:
  • Use the Triton processor to deploy a model
    You can use the Triton processor to deploy models only on the EASCMD client. For more information about how to use the EASCMD client to deploy models, see Create a service. In the service.json service configuration file, set the processor parameter to triton. In addition, you must set the parameters related to OSS so that the Triton processor can obtain model files from OSS. The following sample code shows you how to modify the service.json service configuration file:
    {
      "name": "triton_test",                          
      "processor": "triton",
      "processor_params": [
        "--model-repository=oss://triton-model-repo/models", 
        "--allow-http=true", 
      ],
      "metadata": {
        "instance": 1,
        "cpu": 4,
        "gpu": 1,
        "memory": 10000,
        "resource":"<your resource id>"
      }
    }
    The following table describes the parameters that are specific to the Triton processor. For more information about common parameters, see Run commands to use the EASCMD client.
    Parameter | Description
    processor_params | The parameters that are passed to Triton Inference Server when the service starts. Parameters that are not supported are automatically filtered out. Table 1 describes the parameters that can be passed to Triton Inference Server. The model-repository parameter is required. For more information about optional parameters, see main.cc.
    oss_endpoint | The endpoint of OSS. If you do not specify this parameter, the system automatically uses the OSS service in the region where the EAS service is deployed. If you want to use the OSS service that is activated in another region, you must set this parameter. For more information about valid values, see Regions and endpoints.
    metadata.resource | The ID of the exclusive resource group that is used to deploy the model in EAS. If you want to deploy a model by using the Triton processor, the resources that you use must belong to an exclusive resource group in EAS. For more information about how to create an exclusive resource group in EAS, see Dedicated resource groups.
    Table 1. Parameters that can be passed to Triton Inference Server
    Parameter | Required | Description
    model-repository | Yes | The OSS path of the model. You must set this parameter to a subdirectory of an OSS bucket rather than the root directory of the bucket. For example, if you set the parameter to oss://triton-model-repo/models, triton-model-repo is the name of the OSS bucket and models is a subdirectory of that bucket.
    log-verbose, log-info, log-warning, log-error, exit-on-error, strict-model-config, strict-readiness, allow-http, http-thread-count, pinned-memory-pool-byte-size, cuda-memory-pool-byte-size, min-supported-compute-capability, buffer-manager-thread-count, backend-config | No | For more information, see main.cc.
  • Use the native Triton client to call the service deployed by using the Triton processor
    Before you use the Triton client for Python to call the deployed service, run the following commands to install the native Triton client:
    pip3 install nvidia-pyindex
    pip3 install tritonclient[all]
    Run the following command to download a test image to the current directory:
    wget http://pai-blade.oss-cn-zhangjiakou.aliyuncs.com/doc-assets/cat.png
    The following sample code shows that the Triton client for Python sends a request in the binary format to the service that is deployed by using the Triton processor:
    import numpy as np
    import time
    from PIL import Image
    
    import tritonclient.http as httpclient
    from tritonclient.utils import InferenceServerException
    
    URL = "<servcice url>"  # Replace <servcice url> with the endpoint of the deployed service. 
    HEADERS = {"Authorization": "<service token>"} # Replace <service token> with the token that is used to access the deployed service. 
    input_img = httpclient.InferInput("input", [1, 299, 299, 3], "FP32")
    img = Image.open('./cat.png').resize((299, 299))
    img = np.asarray(img).astype('float32') / 255.0
    input_img.set_data_from_numpy(img.reshape([1, 299, 299, 3]), binary_data=True)
    
    output = httpclient.InferRequestedOutput(
        "InceptionV3/Predictions/Softmax", binary_data=True
    )
    triton_client = httpclient.InferenceServerClient(url=URL, verbose=False)
    
    start = time.time()
    for i in range(10):
        results = triton_client.infer(
            "inception_graphdef", inputs=[input_img], outputs=[output], headers=HEADERS
        )
        res_body = results.get_response()
        elapsed_ms = (time.time() - start) * 1000
        if i == 0:
            print("model name: ", res_body["model_name"])
            print("model version: ", res_body["model_version"])
            print("output name: ", res_body["outputs"][0]["name"])
            print("output shape: ", res_body["outputs"][0]["shape"])
        print("[{}] Avg rt(ms): {:.2f}".format(i, elapsed_ms))
        start = time.time()