Machine Learning Platform for AI provides the following labeling templates: object detection, semantic segmentation, comprehensive image annotation, Optical Character Recognition (OCR), and image classification. When you create a labeling job, select the template that matches your task.

Object detection

Object detection locates specific objects in an image. Objects are typically marked with the rectangle selection tool.

  • Scenarios

    Vehicle detection, passenger detection, and image search.

  • Data structure
    • Input data
      Each row in the manifest file contains a topic. The topic must contain the picUrl field.
      {"data":{"picUrl":"oss://****/pics/fruit/apple-1.jpg"}}
      ...
    • Output data
      Each row in the manifest file contains a topic and the labeling result. The following code provides an example of the JSON string in each row:
      {
          "data": {
              "picUrl": "oss://****/pics/fruit/apple-1.jpg"
          },
          "label-****(Labeling job ID)": {
              "results": [{
                  "data": [{
                      "id":"Znyumd-*****",
                      "type":"image/rectangleLabel",
                      "value":{
                          "rotation":0,
                          "x":40.68320610687023,
                          "width":327.52035623409665,
                          "y":5.762467474590647,
                          "height":296.68117192104745
                      },
                      "labelColor":"#72bf7d",
                      "labels":["apple"]
                  }],
                  "id":"44****",
                  "type":"image"
              }]
          }
      }
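
The output row above can be read back with a few lines of Python. The following is a minimal sketch, not an official client: it assumes that the result key always starts with `label-`, and the key `label-1234` is a made-up stand-in for the masked job ID. The helper names `iter_annotations` and `extract_rectangles` are hypothetical. The sample does not state the unit of x, y, width, and height, so the values are returned unchanged.

```python
import json

# One output manifest row, mirroring the example above.
# "label-1234" is a hypothetical stand-in for "label-<labeling job ID>".
row = json.dumps({
    "data": {"picUrl": "oss://****/pics/fruit/apple-1.jpg"},
    "label-1234": {
        "results": [{
            "data": [{
                "id": "Znyumd-*****",
                "type": "image/rectangleLabel",
                "value": {
                    "rotation": 0,
                    "x": 40.68320610687023,
                    "width": 327.52035623409665,
                    "y": 5.762467474590647,
                    "height": 296.68117192104745,
                },
                "labelColor": "#72bf7d",
                "labels": ["apple"],
            }],
            "id": "44****",
            "type": "image",
        }]
    },
})


def iter_annotations(record):
    """Yield each annotation under any top-level key that starts with 'label-'."""
    for key, job in record.items():
        if not key.startswith("label-"):  # assumption about the key format
            continue
        for result in job.get("results", []):
            yield from result.get("data", [])


def extract_rectangles(line):
    """Return (picUrl, boxes) where each box is (labels, x, y, width, height)."""
    record = json.loads(line)
    boxes = []
    for ann in iter_annotations(record):
        if ann.get("type") != "image/rectangleLabel":
            continue
        v = ann["value"]
        boxes.append((ann["labels"], v["x"], v["y"], v["width"], v["height"]))
    return record["data"]["picUrl"], boxes


pic_url, boxes = extract_rectangles(row)
print(pic_url)
print(boxes)
```

To process a whole manifest file, call `extract_rectangles` once per non-empty line.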

Semantic segmentation

Semantic segmentation recognizes an object in an image and retrieves its coordinates by examining every pixel that belongs to the object. Commonly used tools are the polygon selection tool, the brush tool, and the superpixel tool.

  • Scenarios

    Autonomous driving, facial expression recognition, and apparel classification.

  • Data structure
    • Input data
      Each row in the manifest file contains a topic. The topic must contain the picUrl field.
      {"data":{"picUrl":"oss://****/pics/fruit/apple-1.jpg"}}
      ...
    • Output data
      Each row in the manifest file contains a topic and the labeling result. The following code provides an example of the JSON string in each row:
      {
          "data": {
              "picUrl": "oss://****/pics/fruit/apple-1.jpg"
          },
          "label-****(Labeling job ID)": {
              "results": [{
                  "data": [{
                      "id":"Znyumd-*****",
                      "type":"image/polygonLabel",
                      "value":{
                          "points": [
                              [110, 46],
                              [52, 196],
                              [48, 168],
                              [48, 145],
                              [54, 120],
                              [63, 93],
                              [76, 74]
                          ]
                      },
                      "labelColor":"#72bf7d",
                      "labels":["apple"]
                  }],
                  "id":"44****",
                  "type":"image"
              }]
          }
      }
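
A polygon result is a list of [x, y] vertices, so common post-processing steps such as computing a bounding box or an enclosed area work directly on the `points` array. The sketch below uses the vertices from the example above. It assumes that the vertices are listed in drawing order and that the polygon does not intersect itself; the function names are hypothetical, and the coordinate unit is not stated in the sample.

```python
def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) of a list of [x, y] vertices."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


def polygon_area(points):
    """Shoelace formula: area enclosed by the vertices taken in order."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# The "points" array from the example output row.
points = [
    [110, 46], [52, 196], [48, 168], [48, 145],
    [54, 120], [63, 93], [76, 74],
]

print(bounding_box(points))
print(polygon_area(points))
```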

Comprehensive image annotation

Comprehensive image annotation is used to match the content of the input images against a set of labels. This template allows you to use all image labeling tools.

  • Scenarios

    Autonomous driving, content moderation, and content recognition.

  • Data structure
    • Input data
      Each row in the manifest file contains a topic. The topic must contain the picUrl field.
      {"data":{"picUrl":"oss://****/pics/fruit/apple-10.jpg"}}
    • Output data
      Each row in the manifest file contains a topic and the labeling result. The following code provides an example of the JSON string in each row:
      {
          "data": {
              "picUrl": "oss://****/pics/fruit/apple-10.jpg"
          },
          "label-****(Labeling job ID)": {
              "results": [{
                  "data": [{
                      "id":"Znyumd-****",
                      "type":"image/rectangleLabel",
                      "value":{
                          "rotation":0,
                          "x":40.68320610687023,
                          "width":327.52035623409665,
                          "y":5.762467474590647,
                          "height":296.68117192104745
                      },
                      "labelColor":"#72bf7d",
                      "labels":["Ripe apple"]
                  }],
                  "id":"44****",
                  "type":"image"
              }]
          }
      }
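
Because this template allows every image labeling tool, a consumer of the output usually has to branch on the `type` field of each annotation. The following sketch groups annotations by type. The sample row is an assumption made for illustration: it combines a rectangle and a polygon in one result, which the source does not show, and it uses only the two type strings that appear in the examples on this page. The key `label-1234` and the function name `group_by_type` are hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical row that mixes two annotation types in one result.
row = json.dumps({
    "data": {"picUrl": "oss://****/pics/fruit/apple-10.jpg"},
    "label-1234": {  # stand-in for "label-<labeling job ID>"
        "results": [{
            "data": [
                {
                    "type": "image/rectangleLabel",
                    "value": {"rotation": 0, "x": 40.68320610687023,
                              "width": 327.52035623409665,
                              "y": 5.762467474590647,
                              "height": 296.68117192104745},
                    "labels": ["Ripe apple"],
                },
                {
                    "type": "image/polygonLabel",
                    "value": {"points": [[110, 46], [52, 196], [48, 168]]},
                    "labels": ["Ripe apple"],
                },
            ],
            "type": "image",
        }]
    },
})


def group_by_type(line):
    """Return {annotation type: [annotation, ...]} for one manifest row."""
    record = json.loads(line)
    groups = defaultdict(list)
    for key, job in record.items():
        if not key.startswith("label-"):  # assumption about the key format
            continue
        for result in job.get("results", []):
            for ann in result.get("data", []):
                groups[ann.get("type", "unknown")].append(ann)
    return dict(groups)


groups = group_by_type(row)
for ann_type, anns in groups.items():
    print(ann_type, len(anns))
```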

OCR

OCR is used to extract text from input images and classify the images based on the text.

  • Scenarios

    Identity card, ticket, license plate, and bank card recognition.

  • Data structure
    • Input data
      Each row in the manifest file contains a topic. The topic must contain the picUrl field.
      {"data":{"picUrl":"oss://****/img/ocr_card/img0.jpeg"}}
    • Output data
      Each row in the manifest file contains a topic and the labeling result. The following code provides an example of the JSON string in each row:
      {
          "data": {
              "picUrl": "oss://****/img/ocr_card/img0.jpeg"
          },
          "label-****(Labeling job ID)": {
              "results": [{
                  "data": [{
                      "direction_of_picture":"downward",
                      "type":"ocr/meta"
                  },
                  {
                      "id": "Y4ZFoC-****",
                      "direction_of_text": "downward",
                      "text": "Alibaba Cloud Intelligence",
                      "type": "ocr/polygonLabel",
                      "value": {
                          "points": [[325.08789110183716,397.47582054138184]]
                      },
                      "labelColor": "#67bd3a",
                      "labels": "Enterprise"
                  }],
                  "id":"24****",
                  "type":"ocr"
              }]
          }
      }
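
An OCR result mixes two kinds of entries: one `ocr/meta` entry that describes the picture, and one `ocr/polygonLabel` entry per text region. The sketch below separates them. It is based only on the example above and makes two assumptions: the result key starts with `label-` (`label-1234` is a made-up stand-in), and `labels` may be either a single string, as in the example, or a list, so both are normalized to a list. The function name `parse_ocr_row` is hypothetical.

```python
import json

row = json.dumps({
    "data": {"picUrl": "oss://****/img/ocr_card/img0.jpeg"},
    "label-1234": {  # stand-in for "label-<labeling job ID>"
        "results": [{
            "data": [
                {"direction_of_picture": "downward", "type": "ocr/meta"},
                {
                    "id": "Y4ZFoC-****",
                    "direction_of_text": "downward",
                    "text": "Alibaba Cloud Intelligence",
                    "type": "ocr/polygonLabel",
                    "value": {"points": [[325.08789110183716, 397.47582054138184]]},
                    "labelColor": "#67bd3a",
                    "labels": "Enterprise",
                },
            ],
            "id": "24****",
            "type": "ocr",
        }]
    },
})


def parse_ocr_row(line):
    """Return (picture direction, list of text regions) for one manifest row."""
    record = json.loads(line)
    picture_direction = None
    regions = []
    for key, job in record.items():
        if not key.startswith("label-"):  # assumption about the key format
            continue
        for result in job.get("results", []):
            for ann in result.get("data", []):
                if ann.get("type") == "ocr/meta":
                    picture_direction = ann.get("direction_of_picture")
                elif ann.get("type") == "ocr/polygonLabel":
                    labels = ann.get("labels", [])
                    if isinstance(labels, str):  # normalize a bare string
                        labels = [labels]
                    regions.append({
                        "text": ann.get("text"),
                        "labels": labels,
                        "points": ann["value"]["points"],
                    })
    return picture_direction, regions


direction, regions = parse_ocr_row(row)
print(direction)
print(regions)
```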

Image classification

Image classification selects one or more labels from a predefined label set that match the content of an input image and attaches them to the image. The template supports single-label and multi-label image classification.

  • Scenarios

    Photo classification, image recognition, image search, and content recommendation.

  • Data structure
    • Input data
      Each row in the manifest file contains a topic. The topic must contain the picUrl field.
      {"data":{"picUrl":"oss://****/img/ocr_card/img0.jpeg"}}
    • Output data
      Each row in the manifest file contains a topic and the labeling result. The following code provides an example of the JSON string in each row:
      {
          "data": {
              "picUrl": "oss://****/img/ocr_card/img0.jpeg"
          },
          "label-****(Labeling job ID)": {
              "results": [{
                  "data": [{
                      "data":"red",
                      "id":"33****",
                      "type":"survey/value"
                  }],
                  "id":"33****",
                  "type":"survey"
              }]
          }
      }
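
For classification, the chosen label is stored in the `data` field of each `survey/value` entry. The sketch below collects those values into a list. The example above shows only a single label, so how a multi-label result is encoded is not confirmed here; the code assumes one `survey/value` entry per chosen label. The key `label-1234` and the function name `extract_class_labels` are hypothetical.

```python
import json

row = json.dumps({
    "data": {"picUrl": "oss://****/img/ocr_card/img0.jpeg"},
    "label-1234": {  # stand-in for "label-<labeling job ID>"
        "results": [{
            "data": [{"data": "red", "id": "33****", "type": "survey/value"}],
            "id": "33****",
            "type": "survey",
        }]
    },
})


def extract_class_labels(line):
    """Return (picUrl, [label, ...]) for one classification manifest row."""
    record = json.loads(line)
    labels = []
    for key, job in record.items():
        if not key.startswith("label-"):  # assumption about the key format
            continue
        for result in job.get("results", []):
            for ann in result.get("data", []):
                if ann.get("type") == "survey/value":
                    labels.append(ann["data"])
    return record["data"]["picUrl"], labels


pic_url, labels = extract_class_labels(row)
print(pic_url, labels)
```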