The visual modeling platform plug-in allows you to label training data, train common computer vision models, and deploy the trained models. The plug-in deeply optimizes models for mobile platforms. You can test the performance of a model on your mobile phone by scanning the QR code of the model, and you can also deploy the model on your mobile device. This topic describes how to use the visual modeling platform plug-in to detect objects.

Prerequisites

  • AutoLearning is authorized to access Object Storage Service (OSS). For more information, see OSS authorization.
  • An instance of the object detection type is created. For more information, see Create instances.
  • The image dataset for model training is uploaded to OSS and meets the requirements and specifications for object detection. For more information, see Limits. We recommend that you use the graphical management tool ossbrowser to upload images in bulk. For more information, see Use ossbrowser. If you prefer to upload images with a script, see the sketch after this list.
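
As an alternative to ossbrowser, the following sketch uploads a local folder of images and annotation files to OSS by using the OSS SDK for Python (oss2). The AccessKey pair, endpoint, bucket name, and paths are placeholder values that you must replace with your own; the flat folder layout only mirrors the dataset specifications described in Limits.

  # Bulk-upload a local image folder to OSS with the OSS SDK for Python (oss2).
  # All credentials, names, and paths below are placeholders.
  import os
  import oss2

  auth = oss2.Auth('<your-access-key-id>', '<your-access-key-secret>')
  bucket = oss2.Bucket(auth, 'https://oss-cn-shanghai.aliyuncs.com', '<your-bucket-name>')

  local_dir = 'your_image_dir'            # folder that contains the images and XML files
  oss_prefix = 'datasets/your_image_dir'  # destination path (prefix) in the bucket

  for name in os.listdir(local_dir):
      local_path = os.path.join(local_dir, name)
      if os.path.isfile(local_path):
          # Keep the flat folder structure that the dataset specifications expect.
          bucket.put_object_from_file(f'{oss_prefix}/{name}', local_path)
          print(f'Uploaded {name}')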

Background information

Test data: Demo data of object detection

Limits

The image dataset for object detection must meet the following dataset requirements and specifications:
  • Dataset requirements
    • Image quality: The images must not be damaged, and the image resolution must be higher than 30 pixels per inch (PPI). AutoLearning supports images in the JPG and JPEG formats.
    • Data balance: We recommend that you balance the number of images across categories and include more than 50 images in each category.
    • Generalization: The images must be taken in real scenes and from different perspectives.
  • Dataset specifications
    |-- your_image_dir/
        |-- a.jpg
        |-- a.xml
        |-- b.png
        |-- b.xml
        |-- c.png
        ...            
    The images stored in OSS for model training must meet the preceding format requirements. your_image_dir refers to the folder that stores all the images for model training. The image labeling results are stored in the XML format that is supported by Pattern Analysis, Statistical Modelling, and Computational Learning (PASCAL) Visual Object Classes (VOC).
    The following example describes the XML format:
    <?xml version="1.0" encoding="utf-8"?>
    <annotation>
        <size>
            <width>1280</width>
            <height>720</height>
            <depth>3</depth>
        </size>
        <object>
            <name>dog</name>
            <bndbox>
                <xmin>549</xmin>
                <xmax>715</xmax>
                <ymin>257</ymin>
                <ymax>289</ymax>
            </bndbox>
            <truncated>0</truncated>
            <difficult>0</difficult>
        </object>
        <object>
            <name>cat</name>
            <bndbox>
                <xmin>842</xmin>
                <xmax>1009</xmax>
                <ymin>138</ymin>
                <ymax>171</ymax>
            </bndbox>
            <truncated>0</truncated>
            <difficult>0</difficult>
        </object>
        <segmented>0</segmented>
    </annotation>
    In the preceding example, two objects are labeled: dog and cat.
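    The following sketch shows one way to generate an annotation file in this format by using the xml.etree.ElementTree module from the Python standard library. The function name and the output file name are illustrative, and the object values simply reproduce the example above; they are not part of the AutoLearning specification.
    # Write a minimal PASCAL VOC style annotation file with the fields shown above.
    import xml.etree.ElementTree as ET

    def write_voc_annotation(path, width, height, objects, depth=3):
        """objects is a list of (name, xmin, xmax, ymin, ymax) tuples."""
        root = ET.Element('annotation')

        size = ET.SubElement(root, 'size')
        ET.SubElement(size, 'width').text = str(width)
        ET.SubElement(size, 'height').text = str(height)
        ET.SubElement(size, 'depth').text = str(depth)

        for name, xmin, xmax, ymin, ymax in objects:
            obj = ET.SubElement(root, 'object')
            ET.SubElement(obj, 'name').text = name
            bndbox = ET.SubElement(obj, 'bndbox')
            ET.SubElement(bndbox, 'xmin').text = str(xmin)
            ET.SubElement(bndbox, 'xmax').text = str(xmax)
            ET.SubElement(bndbox, 'ymin').text = str(ymin)
            ET.SubElement(bndbox, 'ymax').text = str(ymax)
            ET.SubElement(obj, 'truncated').text = '0'
            ET.SubElement(obj, 'difficult').text = '0'

        ET.SubElement(root, 'segmented').text = '0'
        ET.ElementTree(root).write(path, encoding='utf-8', xml_declaration=True)

    # Reproduces the example above: a dog and a cat in a 1280 x 720 image.
    write_voc_annotation('a.xml', 1280, 720,
                         [('dog', 549, 715, 257, 289), ('cat', 842, 1009, 138, 171)])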

Procedure

To use the visual modeling platform plug-in to detect objects, perform the following steps:
  1. Step 1: Create a dataset

    Create a training dataset for object detection.

  2. Step 2: Label images

    If unlabeled data exists, label it on the AutoLearning platform.

  3. Step 3: Create a task

    Create a model training task.

  4. Step 4: View training details

    You can view the training progress, node details, and training logs during model training.

  5. Step 5: Generate a mini program to test the model

    You can use Alipay on your mobile phone to scan the QR code to test the model performance.

  6. Step 6: Deploy the model

    The visual modeling platform plug-in is highly compatible with Elastic Algorithm Service (EAS) of Machine Learning Platform for AI (PAI). You can use the plug-in to deploy a model as a RESTful service with ease.

Step 1: Create a dataset

  1. Go to the Computer Vision Model Training page.
    1. Log on to the PAI console.
    2. In the left-side navigation pane, choose AI Industry Plug-In > Visual Modeling Platform Plug-in.
  2. On the Computer Vision Model Training page, find the instance that you want to manage and click Open in the Operation column.
  3. In the Data Preparation step, click New Dataset.
  4. In the New Dataset panel, set the parameters.
    Parameter Description
    Dataset name The name of the dataset. The name must be 1 to 30 characters in length, and can contain letters, digits, underscores (_), and hyphens (-). It must start with a letter or digit.
    Description The description of the dataset. The description helps distinguish different datasets.
    Storage type Only OSS is supported. You cannot change the value.
    OSS path The OSS path where the images for model training are stored.
  5. Click Confirm.
    The visual modeling platform plug-in automatically creates indexes on the images and labeling data, but does not save the images themselves. The plug-in can retrieve your images from OSS to train models only after it is authorized to access OSS. You can view the information about datasets in the Dataset list section. If the status of a dataset changes from Data import to To be manually marked or Labeling completed, the dataset is created.

Step 2: Label images

If your dataset contains unlabeled images, you can label them on the AutoLearning platform.

  1. In the Dataset list section of the Data Preparation step, find the created dataset and click Labeling in the Operation column.
  2. On the Labeling tab, label all images and click Submit.
  3. Click Preview to view the labeling results.

Step 3: Create a task

  1. In the Data Preparation step, click Training tasks in the upper part of the page to go to the Training tasks step.
  2. In the Training tasks step, click New task.
  3. In the New task panel, set the parameters.
    Step Parameter Description
    Basic information Task name The name of the task. The task name must be 1 to 30 characters in length, and can contain letters, digits, underscores (_), and hyphens (-). It must start with a letter or digit.
    Description The description of the task. This helps distinguish different tasks.
    Dataset Select dataset Select the created dataset as the training dataset.
    Select label Select labels for object detection.
    Algorithm and training Select algorithm The following algorithms are supported:
    • Object Detection (High Performance): balances inference performance between cloud servers and clients, and provides fast prediction services.
    • Object Detection (High Precision): provides higher precision but a lower prediction speed than the high-performance algorithm.
    Resource configuration Set the Number of GPUs and GPU type parameters for the training task.
    Show Advanced Settings Click Show Advanced Settings and customize the algorithm parameters that are involved in model training. For more information, see Table 1. If you do not customize the parameters in the Show Advanced Settings section, the default values are used.
    Table 1. Parameters in the Show Advanced Settings section
    Parameter Description Default value
    Data Enhancement The data enhancement methods. Valid values:
    • Rotate: rotates an image.
    • Blur: blurs an image.
    • Noise: adds noise to an image.
    • Shear: applies a shear transformation to an image.
    • FlipLR: flips an image horizontally (left and right).
    • FlipUD: flips an image vertically (up and down).
    Default value: Noise and FlipLR
    Model width The width of the model. Valid values: 0.35, 0.5, 0.75, and 1. 0.5
    Epoch Training The number of epochs for model training. 150
    Optimizer The optimization algorithm used for model training. Valid values:
    • Adam
    • RmsProp
    • Momentum
    Default value: Adam
    Initialize learning rate The initial learning rate used during model training. 0.001
    Quantization compression Specifies whether to perform quantization compression. Yes
  4. Click Start training.

Step 4: View training details

  1. In the Training tasks step, find the created task and click Training details in the Operation column.
  2. On the page that appears, you can perform the operations that are described in the following table.
    Operation Description
    View the training progress
    1. On the training details page of the task, click the Training process tab.
    2. On the Training process tab, view the training progress and relevant information in the Basic information section.
    Terminate the training task On the Training process tab, click Terminate task.
    View the node information
    1. On the Training process tab, click a node.
    2. In the Node Information panel, view the status of the node and the information in the Basic information and Step information sections.
    View training logs
    1. On the Training process tab, click a node.
    2. In the Node Information panel, click the Log tab.

Step 5: Generate a mini program to test the model

  1. After the training is complete, click Model and deploy in the upper-right corner of the training details page.
  2. In the Model and deploy step, scan the QR code by using the Alipay app.
    The values of the following model metrics are calculated on a validation set, which is a portion of the training data. By default, 10% of the training data is set aside as the validation set.
    • mAP@IoU0.5: the mean average precision over all object categories, computed from precision-recall (PR) curves at different score thresholds, where a detection is counted as correct if its intersection over union (IoU) with the ground-truth box is at least 0.5. A higher value indicates a more precise detection model. For an illustration of how IoU is computed, see the sketch at the end of this step.
    • loss: the value of the loss function that measures the difference between the ground truth and the predicted values. A lower loss indicates a more precise model.
    • model_size: the size of the model after optimization methods such as training, quantization, and encoding are applied.
  3. Use the mini program to scan objects and test how the model recognizes and classifies objects in real time.
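
As background for the mAP@IoU0.5 metric, the following sketch computes the IoU between a predicted box and a ground-truth box. The boxes and the 0.5 threshold are illustrative values; the metric itself is computed by the platform on the validation set.

  # Compute the intersection over union (IoU) of two boxes given as (xmin, ymin, xmax, ymax).
  def iou(box_a, box_b):
      # Coordinates of the intersection rectangle.
      x1 = max(box_a[0], box_b[0])
      y1 = max(box_a[1], box_b[1])
      x2 = min(box_a[2], box_b[2])
      y2 = min(box_a[3], box_b[3])

      inter = max(0, x2 - x1) * max(0, y2 - y1)
      area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
      area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
      union = area_a + area_b - inter
      return inter / union if union > 0 else 0.0

  # For mAP@IoU0.5, a prediction counts as correct only if its IoU with the ground truth is >= 0.5.
  predicted = (540, 250, 720, 295)     # hypothetical predicted box
  ground_truth = (549, 257, 715, 289)  # box taken from the annotation example in Limits
  print(iou(predicted, ground_truth) >= 0.5)  # True, IoU is about 0.66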

Step 6: Deploy the model

  1. In the Model and deploy step, click Go to PAI-EAS deployment.
  2. Set model parameters.
    1. In the Model Configuration panel, set the Custom Model Name and Resource Group Type parameters. Other parameters are automatically set.
    2. Click Next.
  3. In the Deployment details and confirmation step, set the parameters.
    Parameter Description
    Number Of Instances Click the Upward arrow or Downward arrow icon to adjust the number of instances.
    Quota The specifications of an instance. This parameter is displayed only when the resource type is set to CPU. One quota contains one CPU core and 4 GB of memory.
    Resources on a single instance The specifications of a GPU server. This parameter is displayed only when the resource type is set to GPU.
  4. Click Deploy.
    Go to the Elastic Algorithm Service page. If the status of the model changes to Running in the State column, the model is deployed.
  5. Call the model service.
    Service call description

    EAS of PAI provides SDKs for you to call services. You can use EAS SDK for Java, Python, or Go to call a deployed model service based on your preferences. For more information, see SDK for Java, SDK for Python, and SDK for Go. The following content describes the key information for calling a model service:

    • HTTP method: POST.
    • Request URL: After the model is deployed as a service on the server, a public endpoint is automatically generated. To view the values of the Access address and Token parameters, perform the following steps:
      1. On the Elastic Algorithm Service page, find the deployed service and click Invoke Intro in the Service Method column.
      2. In the Invoke Intro dialog box, click the Public Network Invoke tab to view the values of the Access address and Token parameters.
    • Request body
      {
        "dataArray":[
          {
            "name":"image",
            "type":"stream",
            "body":"Base64-encoded data"
          }
        ]
      }
      Parameter Required Type Description
      name No STRING N/A
      type No STRING The type of the data. The value is fixed to stream and cannot be changed.
      body Yes STRING The data of an image. The data must be Base64-encoded. Images in the JPG, PNG, and BMP formats are supported.
    • Response parameters
      Parameter Type Description
      success BOOL Indicates whether the call is successful.
      result OBJECT The return result.
      output ARRAY The detection result that is returned in an array.
      label STRING The label of the image. The label represents the category of the image.
      conf NUMBER The confidence level.
      pos ARRAY The relative coordinates (x,y) of a detection frame. The relative coordinates are stored in the order of upper left, upper right, lower right, and lower left.
      meta OBJECT The information about the image.
      height NUMBER The height of the image.
      width NUMBER The width of the image.
    • Error codes
      Error code Error message Description
      1001 INPUT_FORMAT_ERROR The error message returned because the input format is invalid. For example, a required parameter is missing. Check whether the input format is valid.
      1002 IMAGE_DECODE_ERROR The error message returned because the image failed to be decoded. The image is not in a supported format, such as JPG or PNG. Check the image format.
      2001 UNKNOWN_ERROR The error message returned because an internal server error has occurred.
      2002 GET_INSTANCE_ERROR The error message returned because the system failed to find the instance. This error may occur due to insufficient resources. Increase resources such as the number of CPU cores and memory size.
      2003 MODEL_FORWARD_ERROR The error message returned because a model inference failure has occurred. This error is an internal server error.
    Examples
    • Sample request
      curl http://****.cn-shanghai.pai-eas.aliyuncs.com/api/predict/**** -H 'Authorization:****==' -d '{"dataArray": [{"body": "****", "type": "stream", "name": "image"}]}'
      Replace the URL, token, and Base64-encoded data in this example with actual values. For a Python equivalent, see the Sample request in Python at the end of the examples.
    • Sample response
      {
        "success":true,
        "result":{
          "output":[
            {
              "type":"cv_common",
              "body":[
                {
                  "label":"car",
                  "conf":0.64,
                  "pos":[[0.034,0.031],[0.98,0.031],[0.98,0.97],[0.034,0.97]]
                }
              ]
            }
          ],
          "meta":{
            "height":1920,
            "width":1080
          }
        }
      }
    • Sample error response
      If a request error occurs, the response contains the following parameters:
      • errorCode: the error code.
      • errorMsg: the error message.
      For example, if the request does not contain the dataArray parameter, the following response is returned:
      {
        "success":false,
        "errorCode":"1001",
        "errorMsg":"INPUT_FORMAT_ERROR"
      }
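    • Sample request in Python
      The following sketch shows the same call made with the Python requests library instead of curl, and converts the relative pos coordinates in the response to pixel coordinates by using the meta fields. The endpoint, token, and image path are placeholders that you must replace with actual values.
      # Call the deployed object detection service and parse the response.
      # The endpoint, token, and image path below are placeholders.
      import base64
      import requests

      url = 'http://****.cn-shanghai.pai-eas.aliyuncs.com/api/predict/****'
      token = '****=='
      image_path = 'test.jpg'

      # Base64-encode the image and build the request body described above.
      with open(image_path, 'rb') as f:
          body = base64.b64encode(f.read()).decode('utf-8')

      payload = {'dataArray': [{'name': 'image', 'type': 'stream', 'body': body}]}
      response = requests.post(url, headers={'Authorization': token}, json=payload)
      result = response.json()

      if result.get('success'):
          meta = result['result']['meta']
          width, height = meta['width'], meta['height']
          for output in result['result']['output']:
              for detection in output['body']:
                  # pos holds relative (x, y) corners; scale them to pixel coordinates.
                  corners = [(x * width, y * height) for x, y in detection['pos']]
                  print(detection['label'], detection['conf'], corners)
      else:
          print(result.get('errorCode'), result.get('errorMsg'))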