The visual modeling platform plug-in allows you to label training data, train common computer vision models, and deploy the trained models. The plug-in provides deep optimization for models that run on mobile platforms. You can test the performance of a model on your mobile phone by scanning the QR code of the model, and you can also deploy the model on your mobile device. This topic describes how to use the visual modeling platform plug-in to detect objects.

Prerequisites

  • AutoLearning is authorized to access Object Storage Service (OSS). For more information, see OSS authorization.
  • An instance of the object detection type is created. For more information, see Create instances.
  • The image dataset for model training is uploaded to OSS. The image dataset must meet the dataset requirements and specifications for object detection. For more information, see Usage notes. We recommend that you use the graphical management tool ossbrowser to upload images in bulk. For more information, see Use ossbrowser.

Background information

Test data: Demo data of object detection.

Usage notes

The image dataset for object detection must meet the following dataset requirements and specifications:
  • Dataset requirements
    • Image quality: The images must not be damaged, and the image resolution must be higher than 30 pixels. AutoLearning supports images in JPG and PNG formats.
    • Data balance: We recommend that you balance the image quantity among image categories and include more than 50 images in each category.
    • Generalization: Select images that are taken in real scenes from different perspectives.
  • Dataset specifications
    |-- your_image_dir/
        |-- a.jpg
        |-- a.xml
        |-- b.png
        |-- b.xml
        |-- c.png
        ...
    The images stored in OSS for model training must be organized in the preceding structure. your_image_dir refers to the folder that stores all the images for model training. The image labeling results are stored in XML files in the Pattern Analysis, Statistical Modelling, and Computational Learning (PASCAL) Visual Object Classes (VOC) format.
    The following example describes the XML format.
    <?xml version="1.0" encoding="utf-8"?>
    <annotation>
        <size>
            <width>1280</width>
            <height>720</height>
            <depth>3</depth>
        </size>
        <object>
            <name>dog</name>
            <bndbox>
                <xmin>549</xmin>
                <xmax>715</xmax>
                <ymin>257</ymin>
                <ymax>289</ymax>
            </bndbox>
            <truncated>0</truncated>
            <difficult>0</difficult>
        </object>
        <object>
            <name>cat</name>
            <bndbox>
                <xmin>842</xmin>
                <xmax>1009</xmax>
                <ymin>138</ymin>
                <ymax>171</ymax>
            </bndbox>
            <truncated>0</truncated>
            <difficult>0</difficult>
        </object>
        <segmented>0</segmented>
    </annotation>
    In the preceding example, two objects are labeled: dog and cat.
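    If you want to check your labeling data before you upload it, the following Python sketch shows one way to read a PASCAL VOC annotation file of this kind and list the labeled objects. It is only a reference sketch that uses the Python standard library; the file name a.xml refers to one of your annotation files.
    import xml.etree.ElementTree as ET

    def read_voc_annotation(xml_path):
        # Parse a PASCAL VOC annotation file and return the image size and labeled objects.
        root = ET.parse(xml_path).getroot()
        size = root.find("size")
        width = int(size.find("width").text)
        height = int(size.find("height").text)
        objects = []
        for obj in root.findall("object"):
            box = obj.find("bndbox")
            objects.append({
                "name": obj.find("name").text,
                "xmin": int(box.find("xmin").text),
                "ymin": int(box.find("ymin").text),
                "xmax": int(box.find("xmax").text),
                "ymax": int(box.find("ymax").text),
            })
        return (width, height), objects

    # Example: list the objects labeled in a.xml.
    (width, height), objects = read_voc_annotation("a.xml")
    print("image size:", width, "x", height)
    for obj in objects:
        print(obj["name"], obj["xmin"], obj["ymin"], obj["xmax"], obj["ymax"])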

Procedure

To use the visual modeling platform plug-in to detect objects, perform the following steps:
  1. Step 1: Create a dataset

    Create a training dataset for object detection.

  2. Step 2: Label images

    If unlabeled data exists, label it on the AutoLearning platform.

  3. Step 3: Create a task

    Create a model training task.

  4. Step 4: View training details

    You can view the training progress, node details, and training logs during model training.

  5. Step 5: Generate a mini program to test the model

    You can use Alipay on your mobile phone to scan the QR code to test the model performance.

  6. Step 6: Deploy the model

    The visual modeling platform plug-in is highly compatible with Elastic Algorithm Service (EAS) of Machine Learning Platform for AI (PAI). You can use the plug-in to deploy a model as a RESTful service with ease.

Step 1: Create a dataset

  1. Go to the Computer Vision Model Training page.
    1. Log on to the Machine Learning Platform for AI (PAI) console.
    2. In the left-side navigation pane, choose AI Industry Plug-In > Visual Modeling Platform Plug-in.
  2. On the Computer Vision Model Training page, find the instance that you want to manage and click Open in the Operation column.
  3. In the Data Preparation step, click New Dataset.
  4. In the New Dataset panel, set the parameters.
    The following list describes the parameters.
    • Dataset name: The name of the dataset. The name must be 1 to 30 characters in length and can contain letters, digits, underscores (_), and hyphens (-). It must start with a letter or digit.
    • Description: The description of the dataset. The description helps you distinguish between datasets.
    • Storage type: The storage type. Only OSS is supported. You cannot change the value.
    • OSS path: The OSS path where the images for model training are stored.
  5. Click Confirm.
    The visual modeling platform plug-in automatically creates indexes on images and labeling data, but does not save the indexed images. The plug-in can retrieve your images in OSS to train models only after the plug-in is authorized. You can view the information of datasets in the Dataset list section. If the status of the dataset changes from Data import to To be manually marked or Labeling completed, the dataset is created.

Step 2: Label images

If your dataset contains unlabeled images, you can label them on the AutoLearning platform.

  1. In the Dataset list section of the Data Preparation step, click Labeling in the Operation column.
  2. On the Labeling tab, label all images and click Submit.
  3. Click Preview to view the labeling results.

Step 3: Create a task

  1. In the Data Preparation step, click the Training tasks step in the upper part of the page.
  2. In the Training tasks step, click New task.
  3. In the New task panel, set the parameters.
    The following list describes the parameters by step.
    • Basic Information
      • Task name: The name of the task. The name must be 1 to 30 characters in length and can contain letters, digits, underscores (_), and hyphens (-). It must start with a letter or digit.
      • Description: The description of the task. The description helps you distinguish between tasks.
    • Dataset
      • Select dataset: Select the created dataset as the training dataset.
      • Select label: Select the labels for object detection.
    • Algorithm and training
      • Select algorithm: The following algorithms are supported:
        • Object Detection (High Performance): balances inference performance between cloud servers and clients and provides fast prediction services.
        • Object Detection (High Precision): provides higher precision but a lower prediction speed than the high-performance algorithm.
      • Resource configuration: Set the Number of GPUs and GPU type parameters for the training task.
      • Show Advanced Settings: Click Show Advanced Settings to customize the algorithm parameters that are used in model training. For more information, see Table 1. If you do not customize any parameters in the Show Advanced Settings section, default values are used.
    Table 1. Parameters in the Show Advanced Settings section
    • Data Enhancement: The following data enhancement methods are supported:
      • Rotate: Rotates an image.
      • Blur: Blurs an image.
      • Noise: Adds noise to an image.
      • Shear: Performs a shearing transformation on an image.
      • FlipLR: Flips an image left and right.
      • FlipUD: Flips an image up and down.
      Default value: Noise and FlipLR.
    • Model width: Valid values: 0.35, 0.5, 0.75, and 1. Default value: 0.5.
    • Epoch: The number of epochs for model training. Default value: 150.
    • Optimizer: The optimization algorithm for model training. Valid values: Adam, RmsProp, and Momentum. Default value: Adam.
    • Initialize learning rate: The initial learning rate for model training. Default value: 0.001.
    • Quantization compression: Specifies whether to perform quantization compression. Default value: Yes.
  4. Click Start training.

Step 4: View training details

  1. In the Training tasks step, click Training details in the Operation column of the task that you created.
  2. On the page that appears, you can perform the following operations:
    • View the training progress:
      1. On the training details page of the task, click the Training process tab.
      2. On the Training process tab, view the training progress and relevant information in the Basic information section.
    • Terminate the training task: On the Training process tab, click Terminate task.
    • View the node information:
      1. On the Training process tab, click a node.
      2. In the Node Information panel, view the status of the node and the information in the Basic information and Step information sections.
    • View training logs:
      1. On the Training process tab, click a node.
      2. In the Node Information panel, click the Log tab.

Step 5: Generate a mini program to test the model

  1. After the training is complete, click Model and deploy in the upper-right corner of the training details page.
  2. In the Model and deploy step, scan the QR code by using the Alipay app.
    The values of the following model metrics are calculated based on a validation set. A validation set is a portion of the training data. By default, 10% of the training data is extracted and used as a validation set.
    • mAP@IoU0.5: the mean average precision of the model. For each category, precision and recall are calculated at different score thresholds, and a prediction is counted as correct only if its intersection over union (IoU) with a ground-truth box is at least 0.5. The areas under the resulting precision-recall (PR) curves are averaged across categories. A higher mAP@IoU0.5 value indicates a more precise detection model. For an illustration of how IoU is computed, see the sketch after this procedure.
    • loss: the loss between the ground truth and the predicted values, calculated by using the loss function. A lower loss value indicates a more precise model.
    • model_size: the size of the model after optimization methods such as quantization and encoding are applied during training.
  3. Use the mini program to scan objects to test how the model recognizes and classifies objects in real time.
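
The following minimal Python sketch illustrates how the IoU value that mAP@IoU0.5 relies on is computed from two bounding boxes. It is provided only to clarify the metric and is not part of the plug-in. The boxes are given in (xmin, ymin, xmax, ymax) pixel coordinates; the first box is the dog box from the XML example in this topic, and the second box is a hypothetical prediction.
  def iou(box_a, box_b):
      # Intersection over union of two (xmin, ymin, xmax, ymax) boxes.
      ix_min = max(box_a[0], box_b[0])
      iy_min = max(box_a[1], box_b[1])
      ix_max = min(box_a[2], box_b[2])
      iy_max = min(box_a[3], box_b[3])
      inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
      area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
      area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
      union = area_a + area_b - inter
      return inter / union if union > 0 else 0.0

  # For mAP@IoU0.5, a prediction counts as a true positive only if its IoU
  # with a ground-truth box is at least 0.5.
  print(iou((549, 257, 715, 289), (560, 260, 720, 295)))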

Step 6: Deploy the model

  1. In the Model and deploy step, click Go to PAI-EAS deployment.
  2. Set model parameters.
    1. In the Model Configuration panel, set the Custom Model Name and Resources Type parameters. Other parameters are automatically set.
    2. Click Next.
  3. In the Deployment details and confirmation panel, set the parameters.
    The following list describes the parameters.
    • Number Of Instances: Click the upward arrow or downward arrow icon to adjust the number of instances.
    • Quota: The specifications of an instance. This parameter is displayed only when the Resources Type parameter is set to CPU. One quota contains one core and 4 GB of memory.
    • Resources on a single instance: The specifications of a GPU server. This parameter is displayed only when the Resources Type parameter is set to GPU.
  4. Click Deploy.
    The Elastic Algorithm Service page appears. If the status of the model changes to Running in the State column, the model is deployed.
  5. Call the model service.
    Make an API call
    • HTTP method: POST.
    • Request URL: After the model is deployed as a service on the server, a public endpoint is automatically generated. To view the values of the Access address and Token parameters, perform the following steps:
      1. On the Elastic Algorithm Service page, click Invoke Intro in the Service Method column.
      2. In the Invoke Intro dialog box, click the Public Network Invoke tab to view the values of the Access address and Token parameters.
    • Request body
      {
        "dataArray":[
          {
            "name":"image",
            "type":"stream",
            "body":"Base64-encoded data"
          }
        ]
      }
      The request body contains the following parameters:
      • name: Optional. Type: STRING.
      • type: Optional. Type: STRING. The type of the data. The default type is stream and cannot be changed.
      • body: Required. Type: STRING. The data of an image. The data must be Base64-encoded. Images in JPG, PNG, and BMP formats are supported.
    • Response parameters
      The response contains the following parameters:
      • success: Type: BOOL. Indicates whether the call was successful.
      • result: Type: OBJECT. The returned result.
      • output: Type: ARRAY. The detection results, returned as an array.
      • label: Type: STRING. The label of a detected object. The label represents the category of the object.
      • conf: Type: NUMBER. The confidence level of the detection.
      • pos: Type: ARRAY. The relative coordinates (x, y) of a detection box. The coordinates are stored in the order of upper left, upper right, lower right, and lower left.
      • meta: Type: OBJECT. The information about the image.
      • height: Type: NUMBER. The height of the image.
      • width: Type: NUMBER. The width of the image.
    • Error codes
      • 1001 INPUT_FORMAT_ERROR: The input format is invalid. For example, a required parameter is missing. Check whether the input format is valid.
      • 1002 IMAGE_DECODE_ERROR: The image failed to be decoded because it is not in a supported format, such as JPG or PNG. Check the image format.
      • 2001 UNKNOWN_ERROR: An internal server error occurred.
      • 2002 GET_INSTANCE_ERROR: The system failed to find the instance. This error may occur due to insufficient resources. Increase resources such as the number of CPU cores and the memory size.
      • 2003 MODEL_FORWARD_ERROR: A model inference failure occurred. This error is caused by an internal server error.
    Examples
    • Sample request
      curl http://****.cn-shanghai.pai-eas.aliyuncs.com/api/predict/**** -H 'Authorization:****==' -d '{"dataArray": [{"body": "****", "type": "stream", "name": "image"}]}'
      Replace the URL, token, and Base64-encoded information in this example with actual values.
    • Sample response
      {
        "success":true,
        "result":{
          "output":[
            {
              "type":"cv_common",
              "body":[
                {
                  "label":"car",
                  "conf":0.64,
                  "pos":[[0.034,0.031],[0.98,0.031],[0.98,0.97],[0.034,0.97]]
                }
              ]
            }
          ],
          "meta":{
            "height":1920,
            "width":1080
          }
        }
      }
    • Sample error response
      If a request error occurs, the response contains the following parameters:
      • errorCode: the error code.
      • errorMsg: the error message.
      For example, if the request does not contain the dataArray field, the following response is returned:
      {
        "success":false,
        "errorCode":"1001",
        "errorMsg":"INPUT_FORMAT_ERROR"
      }
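    • Sample call in Python
      If you prefer to call the service from code instead of curl, the following sketch sends a local image to the service and scales the relative pos coordinates back to pixels by using the meta fields. The endpoint URL, the token, and the image path test.jpg are placeholders that you must replace with your own values; the sketch uses only the Python standard library.
      import base64
      import json
      import urllib.request

      ENDPOINT = "http://****.cn-shanghai.pai-eas.aliyuncs.com/api/predict/****"  # replace with your Access address
      TOKEN = "****=="  # replace with your Token

      # Base64-encode the image and build the request body described above.
      with open("test.jpg", "rb") as f:
          image_b64 = base64.b64encode(f.read()).decode("utf-8")

      payload = {"dataArray": [{"name": "image", "type": "stream", "body": image_b64}]}
      request = urllib.request.Request(
          ENDPOINT,
          data=json.dumps(payload).encode("utf-8"),
          headers={"Authorization": TOKEN},
          method="POST",
      )

      # Send the request and parse the JSON response.
      with urllib.request.urlopen(request) as response:
          result = json.loads(response.read())["result"]

      # Convert the relative pos coordinates of each detection to pixel coordinates.
      height = result["meta"]["height"]
      width = result["meta"]["width"]
      for detection in result["output"][0]["body"]:
          corners = [(round(x * width), round(y * height)) for x, y in detection["pos"]]
          print(detection["label"], detection["conf"], corners)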