iTAG provides annotation templates for image Optical Character Recognition (OCR), object detection, and image classification. When you create an annotation task, select a template that corresponds to your application scenario. This topic describes the application scenarios and data structures for these templates.
Background information
This topic describes the data structures for the following image annotation templates:
Image OCR
An OCR task first extracts text from an input image. Then, it groups the images based on the category of the extracted text.
- Scenarios
Scenarios include recognizing certificates, tickets, license plates, and bank cards.
- Data structure
- Input data
Each row in the manifest file is one JSON object. Each row must contain the source field, nested under data, which specifies the OSS path of the image.
{"data":{"source":"oss://****.oss-cn-hangzhou.aliyuncs.com/demo_test/ocr_pic/img6.jpeg"}}
...
- Outputs
Each row in the output manifest file contains the input object and its annotation results. The following code shows the JSON structure of each row.
{
  "data": {
    "source": "oss://****.oss-cn-hangzhou.aliyuncs.com/demo_test/ocr_pic/img6.jpeg"
  },
  "label-144863699223676****": {
    "results": [
      {
        "questionId": "1",
        "data": [
          {
            "id": "ecdb7552-2a4e-4d0e-8abb-0f1a2dc0****",
            "type": "image/polygon",
            "value": [
              [368.1112214498511, 71.72740814299901],
              [444.34359483614696, 71.72740814299901],
              [444.34359483614696, 106.26762661370405],
              [368.1112214498511, 106.26762661370405]
            ],
            "labels": {
              "OCR result": "Financial consultant",
              "Single-choice": "Label 1"
            }
          }
        ],
        "rotation": 0,
        "markTitle": "OCR label configuration",
        "width": 1024,
        "type": "image",
        "height": 1024
      }
    ]
  }
}
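The output row above nests each recognized region under results, then data. As a minimal sketch of reading such a row, the following Python collects the polygon, the recognized text, and the category of every region. The function name extract_ocr_regions, the sample bucket, and the sample IDs are hypothetical, and matching the result key by its "label-" prefix is an assumption drawn from the sample rather than a documented rule.

```python
import json


def extract_ocr_regions(row: str) -> list:
    """Return one dict per annotated region in a single output-manifest row."""
    record = json.loads(row)
    regions = []
    for key, value in record.items():
        # Assumption: annotation results sit under a key that starts with "label-".
        if not key.startswith("label-"):
            continue
        for result in value.get("results", []):
            for item in result.get("data", []):
                labels = item.get("labels", {})
                regions.append({
                    "source": record["data"]["source"],
                    "polygon": item.get("value", []),
                    "text": labels.get("OCR result"),
                    "category": labels.get("Single-choice"),
                })
    return regions


# Hypothetical row with the same shape as the sample (placeholder bucket and IDs).
sample_row = json.dumps({
    "data": {"source": "oss://example-bucket/ocr_pic/img6.jpeg"},
    "label-example": {
        "results": [{
            "questionId": "1",
            "data": [{
                "id": "example-id",
                "type": "image/polygon",
                "value": [[368.0, 71.5], [444.0, 71.5], [444.0, 106.0], [368.0, 106.0]],
                "labels": {"OCR result": "Financial consultant", "Single-choice": "Label 1"},
            }],
            "rotation": 0,
            "markTitle": "OCR label configuration",
            "width": 1024,
            "type": "image",
            "height": 1024,
        }]
    },
})

regions = extract_ocr_regions(sample_row)
for region in regions:
    print(region["text"], region["category"], region["polygon"])
```

The label names "OCR result" and "Single-choice" come from the sample; a task configured with different label titles would need those keys adjusted.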
Object detection
An object detection annotation task locates specific objects in an image. A rectangular box tool is commonly used for this task.
- Scenarios
Scenarios include vehicle detection, pedestrian detection, and image search.
- Data structure
- Input data
Each row in the manifest file is one JSON object. Each row must contain the source field, nested under data, which specifies the OSS path of the image.
{"data":{"source":"oss://****.oss-cn-hangzhou.aliyuncs.com/pic_ocr/img17.jpeg"}}
...
- Outputs
Each row in the output manifest file contains the input object and its annotation results. The following code shows the JSON structure of each row.
{
  "data": {
    "source": "oss://****.oss-cn-hangzhou.aliyuncs.com/pic_ocr/img17.jpeg"
  },
  "label-144853549785619****": {
    "results": [
      {
        "questionId": "1",
        "data": [
          {
            "id": "e02a574b-9fd9-45e9-8c8a-9682567b****",
            "type": "image/polygon",
            "value": [
              [499.93454545454546, 255.0981818181818],
              [911.0109090909091, 255.0981818181818],
              [911.0109090909091, 338.6836363636363],
              [499.93454545454546, 338.6836363636363]
            ],
            "labels": {
              "Single-choice": "Label 1"
            }
          }
        ],
        "rotation": 0,
        "markTitle": "Object detection label configuration",
        "width": 1024,
        "type": "image",
        "height": 1024
      }
    ]
  }
}
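In the sample above, a rectangular box is stored as a four-point polygon in value. Many training pipelines expect an axis-aligned box instead, so the following Python sketch converts each polygon to (x_min, y_min, width, height). The function names polygon_to_bbox and extract_boxes, the sample bucket, and the rounded coordinates are hypothetical, and the "label-" prefix match is an assumption based on the sample.

```python
import json


def polygon_to_bbox(points):
    """Convert a list of [x, y] vertices to (x_min, y_min, width, height)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    return (x_min, y_min, max(xs) - x_min, max(ys) - y_min)


def extract_boxes(row: str) -> list:
    """Return the label and bounding box of every polygon in one output row."""
    record = json.loads(row)
    boxes = []
    for key, value in record.items():
        # Assumption: annotation results sit under a key that starts with "label-".
        if not key.startswith("label-"):
            continue
        for result in value.get("results", []):
            for item in result.get("data", []):
                if item.get("type") != "image/polygon":
                    continue
                boxes.append({
                    "label": item.get("labels", {}).get("Single-choice"),
                    "bbox": polygon_to_bbox(item["value"]),
                })
    return boxes


# Hypothetical row with the same shape as the sample (placeholder bucket and IDs).
sample_row = json.dumps({
    "data": {"source": "oss://example-bucket/pic_ocr/img17.jpeg"},
    "label-example": {
        "results": [{
            "questionId": "1",
            "data": [{
                "id": "example-id",
                "type": "image/polygon",
                "value": [[500.0, 255.0], [911.0, 255.0], [911.0, 338.5], [500.0, 338.5]],
                "labels": {"Single-choice": "Label 1"},
            }],
            "rotation": 0,
            "markTitle": "Object detection label configuration",
            "width": 1024,
            "type": "image",
            "height": 1024,
        }]
    },
})

boxes = extract_boxes(sample_row)
print(boxes)
```

Taking the minimum and maximum of the vertices also works when the four points are listed in a different order; for a rotated or irregular polygon the result is the enclosing box, not the polygon itself.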
Image classification
Image classification is the process of assigning one or more labels from a predefined set to an image. This template supports both single-label and multi-label classification.
- Scenarios
Scenarios include image classification, image recognition, image search, and content recommendation.
- Data structure
- Input data
Each row in the manifest file is one JSON object. Each row must contain the source field, nested under data, which specifies the OSS path of the image.
{"data":{"source":"oss://****.oss-cn-hangzhou.aliyuncs.com/iTAG/pic/1.jpg"}}
...
- Outputs
Each row in the output manifest file contains the input object and its annotation results. The following code shows the JSON structure of each row.
{
  "data": {
    "source": "oss://****.oss-cn-hangzhou.aliyuncs.com/pic/3.jpg"
  },
  "label-143082452899667****": {
    "results": [
      {
        "questionId": "2",
        "data": [
          "Label 1",
          "Label 2"
        ],
        "markTitle": "Multiple-choice",
        "type": "survey/multivalue"
      }
    ]
  }
}
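Unlike the region-based templates, the classification row above stores the chosen labels directly as a list of strings in data. The following Python sketch reads the labels for each question in one row. The function name extract_labels and the sample bucket are hypothetical. Treating a bare string in data as a single label is an assumption for single-label tasks, since the sample only shows the multi-label form.

```python
import json


def extract_labels(row: str):
    """Return (source, {questionId: [labels]}) for one classification output row."""
    record = json.loads(row)
    labels_by_question = {}
    for key, value in record.items():
        # Assumption: annotation results sit under a key that starts with "label-".
        if not key.startswith("label-"):
            continue
        for result in value.get("results", []):
            data = result.get("data", [])
            # Assumption: a single-label answer may arrive as a bare string.
            if isinstance(data, str):
                data = [data]
            labels_by_question[result.get("questionId")] = list(data)
    return record["data"]["source"], labels_by_question


# Hypothetical row with the same shape as the sample (placeholder bucket and IDs).
sample_row = json.dumps({
    "data": {"source": "oss://example-bucket/pic/3.jpg"},
    "label-example": {
        "results": [{
            "questionId": "2",
            "data": ["Label 1", "Label 2"],
            "markTitle": "Multiple-choice",
            "type": "survey/multivalue",
        }]
    },
})

source, labels = extract_labels(sample_row)
print(source, labels)
```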
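All three templates use the same input row shape: one JSON object per line with the image path in data.source. As a minimal sketch, the following Python turns a list of image paths into manifest text of that shape. The function name build_manifest_lines and the example bucket paths are hypothetical, and the script only produces the text; uploading it to OSS is out of scope here.

```python
import json


def build_manifest_lines(oss_uris):
    """Return one compact JSON row per image path, matching the input samples."""
    return [
        json.dumps({"data": {"source": uri}}, separators=(",", ":"))
        for uri in oss_uris
    ]


# Hypothetical image paths; replace them with the objects in your own bucket.
uris = [
    "oss://example-bucket/pic/1.jpg",
    "oss://example-bucket/pic/2.jpg",
]

lines = build_manifest_lines(uris)
manifest_text = "\n".join(lines) + "\n"
print(manifest_text, end="")
```

Each row must stay on a single line, which is why the sketch serializes every object separately instead of dumping one JSON array.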