
Platform For AI: Develop custom processors by using Python

Last Updated: Apr 12, 2024

You can use Python to develop a custom processor and debug the processor together with the model on on-premises machines. When debugging is complete, you can upload the files of the processor and model to Object Storage Service (OSS) and mount the files when you deploy the service. This topic describes how to use Python to develop custom processors.

Background information

    Note
    • We recommend that you separate the model file from the prediction logic of the processor. This helps you reuse the processor for future deployment when the model requires modification. To obtain the path of the model file within the processor, call the get_model_path() method. This path is then used to load the model.

    • If a custom processor has excessive dependencies or the processor package is large in size, we recommend that you use an image to deploy the model. For more information about the two deployment methods, see the "Deployment method" section in the Overview topic.

Procedure for developing a custom processor by using Python:

  1. Step 1: Create a Python environment

    Elastic Algorithm Service (EAS) SDK for Python supports multiple machine learning frameworks and can integrate with data analysis and processing frameworks such as Pandas. This topic describes how to create and upload an on-premises environment for Python development.

  2. Step 2: Add prediction logic

    EAS SDK for Python adopts a high-performance remote procedure call (RPC) framework and contains the internal interfaces that are required to facilitate interaction among EAS clusters. You only need to implement several functions in the prediction logic before you deploy the model in EAS.

  3. Step 3: Run on-premises tests

    After you add the prediction logic, you must run on-premises tests to ensure that the service can work as expected after you deploy the service.

  4. Step 4: Package the Python code and environment

    Package the Python code and environment in the required format.

  5. Step 5: Upload the package

    Upload the package file and the model file to OSS.

  6. Step 6: Deploy and test the service

    Use the custom processor to deploy the model service.

Prerequisites

The model file is prepared.

Note

To facilitate management, we recommend that you separate the model file from the custom processor files. After you develop the model and the processor, upload the files to OSS and mount them when you deploy the service.

Step 1: Create a Python environment

Platform for AI (PAI) provides three methods to create a Python environment. You can use the EASCMD client or an official image for faster creation, or use package management tools such as Conda and Pyenv to manually create a Python environment that meets your business requirements. If you have customization requirements, you can manually build a development environment.

(Recommended) Method 1: Use the EASCMD client (Linux only)

The EASCMD client provided by EAS encapsulates the logic of initializing EAS SDK for Python. After you download EASCMD, you need to only run one command to initialize the environment for the SDK and generate relevant template files.

# Install and initialize EASCMD. In this example, the EASCMD client for Linux is installed. 
$ wget https://eas-data.oss-cn-shanghai.aliyuncs.com/tools/eascmd/v2/eascmd64
# After you download EASCMD, make the client executable and authenticate by configuring your AccessKey pair. 
$ chmod +x eascmd64
$ ./eascmd64 config -i <access_id> -k <access_key>

# Initialize the environment. 
$ ./eascmd64 pysdk init ./pysdk_demo
Note

If the following message appears in the command output, run the source ~/.bashrc command and then rerun the command that initializes the environment.

[PYSDK] conda install complete, execute 'source ~/.bashrc' , and try again!

Select the Python version that you want to use in the command output. The default version is 3.6. After you select a version, the following directory and files are automatically created in the ./pysdk_demo directory: an ENV directory, the app.py file (a template for the prediction logic), and the app.json file (a template for service deployment).

(Recommended) Method 2: Use an official pre-built image

  • EAS provides two images in which Conda and Python are installed and the corresponding Python environment is created. Sample images:

    # This image contains only Conda. 
    registry.cn-shanghai.aliyuncs.com/eas/eas-python-base-image:latest
    # This image contains Conda, Python 3.6, and Allspark 0.15. 
    registry.cn-shanghai.aliyuncs.com/eas/eas-python-base-image:py3.6-allspark-0.15

    Use the docker run command to obtain the Python environment of the image. The following code uses the Linux operating system as an example:

    $ sudo docker run -ti registry.cn-shanghai.aliyuncs.com/eas/eas-python-base-image:py3.6-allspark-0.15
    (/data/eas/ENV) [root@487a04df**** eas]# ENV/bin/python app.py

    You can install dependencies such as TensorFlow 1.12 based on your business requirements and submit the modified container as a new image.

    ENV/bin/pip install tensorflow==1.12
    docker commit $container_id $image_tag

    You can also create a Python environment in an ENV directory and copy the content to the /data/eas/ directory of a Docker image. This method prevents repeated uploads of the entire ENV directory upon deployment.

Method 3: Manually create an environment

If EASCMD cannot meet your business requirements or an error occurs during the initialization process, you can manually initialize a Python environment. We recommend that you use Conda. The following steps use the Linux operating system as an example:

  1. Optional. Run the following commands to install Conda. If you have installed Conda in your local system, skip this step.

    $ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
    $ sh Miniconda3-latest-Linux-x86_64.sh
  2. Run the following commands to initialize the development environment:

    mkdir pysdk_demo
    cd pysdk_demo
    # Use Conda to create a Python environment. You must specify the directory as ENV. 
    conda create -p ENV python=3.7
    # Install EAS SDK for Python. 
    ENV/bin/pip install http://eas-data.oss-cn-shanghai.aliyuncs.com/sdk/allspark-0.15-py2.py3-none-any.whl
    # Install other required dependencies, such as TensorFlow 1.14. 
    ENV/bin/pip install tensorflow==1.14

    If the commands succeed, an ENV directory is created in the ./pysdk_demo directory.
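
    After the environment is created, you can run a quick sanity check to confirm that EAS SDK for Python imports correctly. The following sketch is a minimal, hypothetical check script; the file name sanity_check.py is an assumption.

    # sanity_check.py: run with ENV/bin/python sanity_check.py
    import allspark

    # BaseProcessor is the base class that your custom processor inherits from in Step 2.
    print(allspark.BaseProcessor)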

Step 2: Add prediction logic

To add the prediction logic, create a file named app.py within the directory that contains the ENV directory. Sample file:

Note

If you use the EASCMD client or an official image to create the Python environment, the app.py file is automatically created. You can modify the file content based on your business requirements.

# -*- coding: utf-8 -*-
import allspark


class MyProcessor(allspark.BaseProcessor):
    """ MyProcessor is an example processor.
        You can send a message like the following to run a prediction:
        curl -v http://127.0.0.1:8080/api/predict/service_name -d '2 105'
    """
    def initialize(self):
        """ Load the model. This function is executed once when the service starts.
            Perform service initialization and load models in this function.
        """
        self.module = {'w0': 100, 'w1': 2}
        # model_dir = self.get_model_path().decode()
        # Define the load_model function. If you need to load the model.pt model file, you can implement the function as torch.load(model_dir + "/model.pt"). 
        # self.model = load_model(model_dir)

    def pre_process(self, data):
        """ Pre-process the request data.
        """
        x, y = data.split(b' ')
        return int(x), int(y)

    def post_process(self, data):
        """ Post-process the result before it is returned.
        """
        return bytes(data, encoding='utf8')

    def process(self, data):
        """ process the request data
        """
        x, y = self.pre_process(data)
        w0 = self.module['w0']
        w1 = self.module['w1']
        y1 = w1 * x + w0
        if y1 >= y:
            return self.post_process("True"), 200
        else:
            return self.post_process("False"), 400


if __name__ == '__main__':
    # parameter worker_threads indicates concurrency of processing
    runner = MyProcessor(worker_threads=10)
    runner.run()

The preceding sample code provides an example on how to use EAS SDK for Python: Create a class that inherits the base class BaseProcessor and implement the initialize() and process() functions. The following table describes the main functions provided by the SDK.

Function

Description

Usage notes

initialize()

Initializes the processor. This function is called during service startup to load the model.

Add the following code to the initialize() function to separate the model file from the implementation of the processor.

model_dir = self.get_model_path().decode()
self.model = load_model(model_dir)
  • Use the get_model_path() method to obtain the path of the model file. The path is of the BYTES type and is the storage path of the uploaded model file in the service instance.

  • Define the load_model() function to load the model for deployment. For example, if you need to load the model.pt model file, you can implement the function as torch.load(model_dir + "/model.pt").

get_model_path()

Retrieves the storage path of the model file. The returned path is of the BYTES type.

If you upload the model file by specifying the model_path parameter in the JSON file, you can use the get_model_path() method to obtain the storage path of the uploaded model file in the service instance.

process(data)

Processes a request. This function accepts the request body as an argument and returns the response to the client.

The data input parameter specifies the request body, which is of the BYTES type. The response_data output parameter is of the BYTES type and the status_code output parameter is of the INT type. In a success response, the returned value of status_code is 0 or 200.

__init__(worker_threads=5, worker_processes=1, endpoint=None)

The constructor of the processor.

  • worker_threads: the number of threads. Default value: 5.

  • worker_processes: the number of processes. Default value: 1, which specifies that the single-process multi-thread mode is used. A value greater than 1 specifies that multiple processes concurrently handle requests, and the corresponding threads only read the request data. Each process invokes the initialize() function.

  • endpoint: the endpoint on which the service listens. You can specify the IP address and port number. Example: endpoint='0.0.0.0:8079'.

    Note

    Do not use port 8080 or 9090 because EAS listens on these ports.

run()

Starts the service.

N/A
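
The following sketch shows how these functions fit together when you separate the model file from the prediction logic. It is a minimal example under assumptions: the model is stored as a pickled object named model.pkl, and the object exposes a predict() method. Both names are hypothetical; replace the loading code with the loader that matches your framework, such as torch.load(model_dir + "/model.pt") for PyTorch.

# -*- coding: utf-8 -*-
import pickle

import allspark


class ModelProcessor(allspark.BaseProcessor):
    def initialize(self):
        # get_model_path() returns the BYTES-type storage path of the mounted model file.
        model_dir = self.get_model_path().decode()
        # model.pkl is a hypothetical file name; adapt this to your model format.
        with open(model_dir + "/model.pkl", "rb") as f:
            self.model = pickle.load(f)

    def process(self, data):
        # data is the raw request body (BYTES); adapt the parsing to your input format.
        result = self.model.predict(data)  # predict() is a hypothetical interface
        return bytes(str(result), encoding="utf8"), 200


if __name__ == "__main__":
    ModelProcessor(worker_threads=10).run()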

Step 3: Run on-premises tests

  1. In the directory that contains the app.py file, run the following command to launch the Python project:

    ./ENV/bin/python app.py

    Sample success output:

    [INFO] waiting for service initialization to complete...
    [INFO] service initialization complete
    [INFO] create service
    [INFO] rpc binds to predefined port 8080
    [INFO] install builtin handler call to /api/builtin/call
    [INFO] install builtin handler eastool to /api/builtin/eastool
    [INFO] install builtin handler monitor to /api/builtin/monitor
    [INFO] install builtin handler ping to /api/builtin/ping
    [INFO] install builtin handler prop to /api/builtin/prop
    [INFO] install builtin handler realtime_metrics to /api/builtin/realtime_metrics
    [INFO] install builtin handler tell to /api/builtin/tell
    [INFO] install builtin handler term to /api/builtin/term
    [INFO] Service start successfully
  2. Open another terminal and run the following command to send a test request.

    Verify the response against the sample code in the "Step 2: Add prediction logic" section of this topic. For the input '10 20', the sample logic computes y1 = 2 × 10 + 100 = 120. Because 120 >= 20, the service returns True with status code 200.

    curl http://127.0.0.1:8080/test  -d '10 20'
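
    You can also send the test requests from Python. The following sketch assumes that the third-party requests library is installed in your test environment; it is not part of EAS SDK for Python.

    # test_client.py: send test requests to the locally running service.
    import requests

    # With the sample logic (y1 = 2 * x + 100), the input '10 20' returns True
    # because 120 >= 20, and the status code is 200.
    resp = requests.post("http://127.0.0.1:8080/test", data=b"10 20")
    print(resp.status_code, resp.content)   # expected: 200 b'True'

    # The input '10 300' returns False because 120 < 300, and the status code is 400.
    resp = requests.post("http://127.0.0.1:8080/test", data=b"10 300")
    print(resp.status_code, resp.content)   # expected: 400 b'False'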

Step 4: Package the Python code and environment

You can use the following methods to package the Python code and environment:

  • Run the pack command provided by the EASCMD client (Linux only).

    $ ./eascmd64 pysdk pack ./pysdk_demo

    Sample success output:

    [PYSDK] Creating package: /home/xi****.lwp/code/test/pysdk_demo.tar.gz
  • Manually package the Python code and environment according to the requirements listed in the following table.

    Requirement

    Description

    Format

    The package must be compressed in the .zip or .tar.gz format.

    Content

    • The package must contain the ENV directory and the app.py file in its root directory.

    • Example: .tar.gz package.

Step 5: Upload the package

After you package the Python code and environment, upload the package and the model file to OSS. This way, you can use the files when you deploy the service. The package is in the .zip or .tar.gz format. For information about how to upload files to OSS, see Get started with ossutil.
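
If you prefer to upload the files from Python instead of ossutil, you can use the OSS SDK for Python (oss2). The following sketch uses placeholder values for the AccessKey pair, endpoint, bucket name, and object keys; the model file name model.pkl is hypothetical.

# Upload the package and the model file to OSS by using the oss2 SDK.
# Install the SDK first: pip install oss2
import oss2

auth = oss2.Auth("<access_key_id>", "<access_key_secret>")
bucket = oss2.Bucket(auth, "https://oss-cn-shanghai.aliyuncs.com", "examplebucket")

# Upload the packaged Python code and environment.
bucket.put_object_from_file("exampledirectory/pysdk_demo.tar.gz", "pysdk_demo.tar.gz")
# Upload the model file that you mount by using the model_path parameter.
bucket.put_object_from_file("exampledirectory/model/model.pkl", "model.pkl")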

Step 6: Deploy and test the service

You can deploy the model service in the PAI console or on the EASCMD client.

  1. Deploy the service.

    Use the PAI console

    1. Go to the Deploy Service page. For more information, see Model service deployment by using the PAI console.

    2. On the Deploy Service page, configure the parameters. The following table describes key parameters. For more information, see Model service deployment by using the PAI console.

      Parameter

      Description

      Deployment Method

      Select Deploy Service by Using Model and Processor.

      Model File

      Configure this parameter based on your business requirements.

      Processor Type

      Select Custom Processor.

      Processor Language

      Select python.

      Processor Package

      Select Import OSS File and select the OSS path where the package file is stored.

      Processor Main File

      Set the value to ./app.py.

    3. Optional. Add the data_image parameter in the Configuration Editor section. Set the value to the image path that you configure when you package the files.

      Note

      Configure the data_image parameter only if you use an image to upload the development environment. For more information, see Step 4: Package the Python code and environment.

    4. Click Deploy.

    Use EASCMD

    The following section uses the Linux operating system as an example.

    1. Download the EASCMD client and perform identity authentication. For more information, see Download the EASCMD client and complete user authentication.

    2. Create a JSON file named app.json in the directory where the client is downloaded. The file contains the following content:

      • Sample file if the processor is packaged manually or by using the EASCMD client:

        {
          "name": "pysdk_demo",
          "processor_entry": "./app.py",
          "processor_type": "python",
          "processor_path": "oss://examplebucket/exampledirectory/pysdk_demo.tar.gz",
          "model_path": "oss://examplebucket/exampledirectory/model",
          "cloud": {
                "computing": {
                    "instance_type": "ecs.c7.large"
                }
          },
          "metadata": {
            "instance": 1,
            }
        }
      • Sample file if the development environment is uploaded by using an image:

        {
          "name": "pysdk_demo",
          "processor_entry": "./app.py",
          "processor_type": "python",
          "processor_path": "http://eas-data.oss-cn-shanghai.aliyuncs.com/demo/app.py",
          "data_image": "registry.cn-shanghai.aliyuncs.com/eas-service/develop:latest",
          "model_path": "oss://examplebucket/exampledirectory/model",
          "cloud": {
                "computing": {
                    "instance_type": "ecs.c7.large"
                }
          },
          "metadata": {
            "instance": 1,
            }
        }
    3. Open a terminal window. In the directory where the JSON file resides, run the following command to deploy the service:

      $ ./eascmd64 create app.json

      Sample success output:

      [RequestId]: 1202D427-8187-4BCB-8D32-D7096E95B5CA
      +-------------------+-------------------------------------------------------------------+
      | Intranet Endpoint | http://182848887922****.vpc.cn-beijing.pai-eas.aliyuncs.com/api/predict/pysdk_demo |
      |             Token | ZTBhZTY3ZjgwMmMyMTQ5OTgyMTQ5YmM0NjdiMmNiNmJkY2M5ODI0****          |
      +-------------------+-------------------------------------------------------------------+
      [OK] Waiting task server to be ready
      [OK] Fetching processor from [oss://eas-model-beijing/195557026392****/pysdk_demo.tar.gz]
      [OK] Building image [registry-vpc.cn-beijing.aliyuncs.com/eas/pysdk_demo_cn-beijing:v0.0.1-20190806082810]
      [OK] Pushing image [registry-vpc.cn-beijing.aliyuncs.com/eas/pysdk_demo_cn-beijing:v0.0.1-20190806082810]
      [OK] Waiting [Total: 1, Pending: 1, Running: 0]
      [OK] Service is running
  2. Test the service.

    1. Go to the EAS-Online Model Services page. For more information, see Model service deployment by using the PAI console.

    2. Find the service that you want to test and click Invocation Method in the Service Type column to obtain the public endpoint and token.

    3. Run the following command in the terminal window to call the service.

      $ curl <service_url> -H 'Authorization: <token>' -d '10 20'

      Modify the following parameters:

      • Replace <service_url> with the public endpoint that you obtained in step 2. Example: http://182848887922****.vpc.cn-beijing.pai-eas.aliyuncs.com/api/predict/pysdk_demo.

      • Replace <token> with the token that you obtained in step 2. Example: ZTBhZTY3ZjgwMmMyMTQ5OTgyMTQ5YmM0NjdiMmNiNmJkY2M5ODI0****.

      • Use the -d option to specify the input parameters of the service.
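
      You can also call the deployed service from Python. The following sketch assumes that the requests library is installed; replace the endpoint and token with the values that you obtained from the Invocation Method dialog box.

      import requests

      url = "http://182848887922****.vpc.cn-beijing.pai-eas.aliyuncs.com/api/predict/pysdk_demo"
      headers = {"Authorization": "ZTBhZTY3ZjgwMmMyMTQ5OTgyMTQ5YmM0NjdiMmNiNmJkY2M5ODI0****"}

      # The request body is passed to the process() function of the processor.
      resp = requests.post(url, headers=headers, data=b"10 20")
      print(resp.status_code, resp.content)  # expected: 200 b'True' for the sample logic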
