Platform For AI: Image labeling templates

Last Updated: Oct 12, 2023

iTAG of Machine Learning Platform for AI (PAI) provides image labeling templates for optical character recognition (OCR), object detection, and image classification. When you create an image labeling job, select the template that matches your business scenario. This topic describes the scenarios for each image labeling template and the structure of its input and output data.

Background information

iTAG provides image labeling templates that support the following features:

OCR

OCR extracts text from input images and then classifies the images based on that text.

  • Scenarios

    This labeling template applies to scenarios such as the recognition of identity cards, tickets, license plates, and bank cards.

  • Data structures

    • Input data

      Each row in the .manifest file of input data is a JSON string that represents one object to label. Each row must contain the source field, which in the following example is the OSS path of an image.

      {"data":{"source":"oss://****.oss-cn-hangzhou.aliyuncs.com/demo_test/ocr_pic/img6.jpeg"}}
      ...
    • Output data

      Each row in the .manifest file of output data contains an object and the labeling results for that object. The following code provides an example of the JSON string in each row, formatted here for readability:

      {
          "data": {
              "source": "oss://****.oss-cn-hangzhou.aliyuncs.com/demo_test/ocr_pic/img6.jpeg"
          }, 
          "label-144863699223676****": {
              "results": [
                  {
                      "questionId": "1", 
                      "data": [
                          {
                              "id": "ecdb7552-2a4e-4d0e-8abb-0f1a2dc0****", 
                              "type": "image/polygon", 
                              "value": [
                                  [
                                      368.1112214498511, 
                                      71.72740814299901
                                  ], 
                                  [
                                      444.34359483614696, 
                                      71.72740814299901
                                  ], 
                                  [
                                      444.34359483614696, 
                                      106.26762661370405
                                  ], 
                                  [
                                      368.1112214498511, 
                                      106.26762661370405
                                  ]
                              ], 
                              "labels": {
                                  "OCR result": "Financial consultant", 
                                  "Single-choice": "Label 1"
                              }
                          }
                      ], 
                      "rotation": 0, 
                      "markTitle": "Label configuration for OCR", 
                      "width": 1024, 
                      "type": "image", 
                      "height": 1024
                  }
              ]
          }
      }
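
The OCR output structure above can be read with a few lines of Python. The following is a minimal sketch, not an official client: the function name `iter_ocr_regions` is hypothetical, and it assumes that labeling results sit under a top-level key that starts with `label-` and that the label names are `OCR result` and `Single-choice`, as in the sample. Your job may use different label names.

```python
import json

# Assumption: result keys start with "label-" as in the sample; the suffix varies per job.
LABEL_KEY_PREFIX = "label-"


def iter_ocr_regions(manifest_line):
    """Yield (text, single_choice, polygon) for each labeled region in one output row."""
    row = json.loads(manifest_line)
    for key, value in row.items():
        if not key.startswith(LABEL_KEY_PREFIX):
            continue  # skips the "data" entry that holds the source path
        for result in value.get("results", []):
            for region in result.get("data", []):
                labels = region.get("labels", {})
                # Assumption: label names match the sample ("OCR result", "Single-choice").
                yield labels.get("OCR result"), labels.get("Single-choice"), region.get("value")


# One output row, built from the sample above and serialized to a single line.
sample_row = json.dumps({
    "data": {"source": "oss://****.oss-cn-hangzhou.aliyuncs.com/demo_test/ocr_pic/img6.jpeg"},
    "label-144863699223676****": {
        "results": [{
            "questionId": "1",
            "data": [{
                "id": "ecdb7552-2a4e-4d0e-8abb-0f1a2dc0****",
                "type": "image/polygon",
                "value": [
                    [368.1112214498511, 71.72740814299901],
                    [444.34359483614696, 71.72740814299901],
                    [444.34359483614696, 106.26762661370405],
                    [368.1112214498511, 106.26762661370405],
                ],
                "labels": {"OCR result": "Financial consultant", "Single-choice": "Label 1"},
            }],
            "rotation": 0,
            "markTitle": "Label configuration for OCR",
            "width": 1024,
            "type": "image",
            "height": 1024,
        }]
    },
})

for text, choice, polygon in iter_ocr_regions(sample_row):
    print(text, choice, len(polygon))
```

Matching on the `label-` prefix rather than the full key keeps the sketch independent of the job-specific suffix, which is masked in the sample.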

Object detection

Object detection locates specific objects in an image. Labelers commonly use the rectangle selection tool to mark each object.

  • Scenarios

    This labeling template applies to scenarios such as vehicle detection, passenger detection, and image search.

  • Data structures

    • Input data

      Each row in the .manifest file of input data is a JSON string that represents one object to label. Each row must contain the source field, which in the following example is the OSS path of an image.

      {"data":{"source":"oss://****.oss-cn-hangzhou.aliyuncs.com/pic_ocr/img17.jpeg"}}
      ...
    • Output data

      Each row in the .manifest file of output data contains an object and the labeling results for that object. The following code provides an example of the JSON string in each row, formatted here for readability:

      {
          "data": {
              "source": "oss://****.oss-cn-hangzhou.aliyuncs.com/pic_ocr/img17.jpeg"
          }, 
          "label-144853549785619****": {
              "results": [
                  {
                      "questionId": "1", 
                      "data": [
                          {
                              "id": "e02a574b-9fd9-45e9-8c8a-9682567b****", 
                              "type": "image/polygon", 
                              "value": [
                                  [
                                      499.93454545454546, 
                                      255.0981818181818
                                  ], 
                                  [
                                      911.0109090909091, 
                                      255.0981818181818
                                  ], 
                                  [
                                      911.0109090909091, 
                                      338.6836363636363
                                  ], 
                                  [
                                      499.93454545454546, 
                                      338.6836363636363
                                  ]
                              ], 
                              "labels": {
                                  "Single-choice": "Label 1"
                              }
                          }
                      ], 
                      "rotation": 0, 
                      "markTitle": "Label configuration for object detection", 
                      "width": 1024, 
                      "type": "image", 
                      "height": 1024
                  }
              ]
          }
      }
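
In the sample above, the rectangle is stored as an `image/polygon` region with four corner points. Many training pipelines expect an axis-aligned box instead, so the following sketch shows one way to convert each region to `(x_min, y_min, x_max, y_max)`. The function names `polygon_to_bbox` and `iter_boxes` are hypothetical, and the code assumes the `label-` key prefix and the `Single-choice` label name seen in the sample.

```python
import json


def polygon_to_bbox(points):
    """Return (x_min, y_min, x_max, y_max) for a list of [x, y] points."""
    xs = [point[0] for point in points]
    ys = [point[1] for point in points]
    return min(xs), min(ys), max(xs), max(ys)


def iter_boxes(manifest_line):
    """Yield (label, bbox) for each polygon region in one output row."""
    row = json.loads(manifest_line)
    for key, value in row.items():
        # Assumption: result keys start with "label-" as in the sample.
        if not key.startswith("label-"):
            continue
        for result in value.get("results", []):
            for region in result.get("data", []):
                if region.get("type") != "image/polygon":
                    continue  # this sketch only handles polygon regions
                # Assumption: the label name is "Single-choice", as in the sample.
                label = region.get("labels", {}).get("Single-choice")
                yield label, polygon_to_bbox(region["value"])


# One output row, built from the sample above and serialized to a single line.
detection_row = json.dumps({
    "data": {"source": "oss://****.oss-cn-hangzhou.aliyuncs.com/pic_ocr/img17.jpeg"},
    "label-144853549785619****": {
        "results": [{
            "questionId": "1",
            "data": [{
                "id": "e02a574b-9fd9-45e9-8c8a-9682567b****",
                "type": "image/polygon",
                "value": [
                    [499.93454545454546, 255.0981818181818],
                    [911.0109090909091, 255.0981818181818],
                    [911.0109090909091, 338.6836363636363],
                    [499.93454545454546, 338.6836363636363],
                ],
                "labels": {"Single-choice": "Label 1"},
            }],
            "rotation": 0,
            "markTitle": "Label configuration for object detection",
            "width": 1024,
            "type": "image",
            "height": 1024,
        }]
    },
})

for label, bbox in iter_boxes(detection_row):
    print(label, bbox)
```

Taking the minimum and maximum of the coordinates, rather than relying on the order of the corner points, keeps the conversion correct even if the points are listed in a different order.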

Image classification

Image classification selects, from a set of labels, one or more labels that match an input image and adds them to the image. This template supports single-label and multi-label image classification.

  • Scenarios

    This labeling template applies to scenarios such as image classification, image recognition, image search, and content recommendation.

  • Data structures

    • Input data

      Each row in the .manifest file of input data is a JSON string that represents one object to label. Each row must contain the source field, which in the following example is the OSS path of an image.

      {"data":{"source":"oss://****.oss-cn-hangzhou.aliyuncs.com/iTAG/pic/1.jpg"}}
      ...
    • Output data

      Each row in the .manifest file of output data contains an object and the labeling results for that object. The following code provides an example of the JSON string in each row, formatted here for readability:

      {
          "data": {
              "source": "oss://****.oss-cn-hangzhou.aliyuncs.com/pic/3.jpg"
          }, 
          "label-143082452899667****": {
              "results": [
                  {
                      "questionId": "2", 
                      "data": [
                          "Label 1", 
                          "Label 2"
                      ], 
                      "markTitle": "Multiple-choice", 
                      "type": "survey/multivalue"
                  }
              ]
          }
      }
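
Unlike the region-based templates, the classification sample above stores its results as a plain list of label strings in `data`. The following sketch pairs each image path with its labels. The function name `classification_labels` is hypothetical, and the code assumes the `label-` key prefix and the list-of-strings layout shown in the sample.

```python
import json


def classification_labels(manifest_line):
    """Return (source, labels) for one row of a classification output .manifest file."""
    row = json.loads(manifest_line)
    source = row["data"]["source"]
    labels = []
    for key, value in row.items():
        # Assumption: result keys start with "label-" as in the sample.
        if not key.startswith("label-"):
            continue
        for result in value.get("results", []):
            # Assumption: "data" is a list of label strings, as in the sample.
            labels.extend(item for item in result.get("data", []) if isinstance(item, str))
    return source, labels


# One output row, built from the sample above and serialized to a single line.
classification_row = json.dumps({
    "data": {"source": "oss://****.oss-cn-hangzhou.aliyuncs.com/pic/3.jpg"},
    "label-143082452899667****": {
        "results": [{
            "questionId": "2",
            "data": ["Label 1", "Label 2"],
            "markTitle": "Multiple-choice",
            "type": "survey/multivalue",
        }]
    },
})

source, labels = classification_labels(classification_row)
print(source, labels)
```

To process a whole file, call the function once per non-empty line, since each row of the .manifest file is a separate JSON string.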