The data labeling platform iTAG offers a rich set of templates for images, text, video, audio, and multimodal data.
Usage notes
iTAG provides pre-built templates for the following labeling tasks:
Image: Classification, object detection, optical character recognition (OCR), table recognition, and semantic segmentation.
Text: Classification, named entity recognition (NER), and entity relationship recognition.
Video: Classification, tagging, and OCR.
Audio: Classification, segmentation, and recognition.
Multimodal: Visual question answering (VQA), multimodal reinforcement learning from human feedback (RLHF) labeling, image-to-text, image-text explanation, dialogue rewriting, dialogue sorting, and dialogue grouping.
For more information, see Manage templates.
Procedure
Upload the data to be labeled to Object Storage Service (OSS). Then, use the dataset management module to import the data from the OSS path and create a dataset. The system generates a .manifest index file for your data. This file, in JSON Lines (JSONL) format, contains data paths and metadata for subsequent labeling tasks.
Important: iTAG requires data to be stored in OSS. For seamless access, the OSS bucket must be in the same region as the PAI service.
After creating a dataset, use a general-purpose or custom template to create and distribute a labeling task. The task distribution process has three stages: labeling, review, and acceptance. Labeling is a mandatory stage, while review and acceptance are optional.
Labeling: Annotators claim task packages on the Label Task page, complete the labeling, and submit their work.
Review: Reviewers claim completed task packages on the Quality Inspection Task page to review, modify, or reject them.
Acceptance: The project owner claims the task packages on the Acceptance Task page to accept, modify, or reject them.
Annotators, reviewers, and project owners complete their assigned work on the distributed task packages.
For model training, export the labeling results to a specified OSS directory. The exported output is a .manifest file containing the labeled data.
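The JSON Lines index file described in the first step can be illustrated with a minimal Python sketch. The record shape used here (a "data" object holding a "source" OSS path) and the bucket and object names are assumptions for illustration only, not a confirmed iTAG schema; inspect a file generated for your own dataset to see the actual fields.

```python
import json


def build_index_lines(oss_paths):
    """Serialize each OSS path as one JSON object per line (JSON Lines)."""
    # Assumed record shape: {"data": {"source": <oss path>}}.
    # Verify it against a file generated for your own dataset.
    return [json.dumps({"data": {"source": p}}) for p in oss_paths]


def parse_jsonl(text):
    """Parse JSON Lines text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical bucket and object names, for illustration only.
paths = [
    "oss://example-bucket/images/1.jpg",
    "oss://example-bucket/images/2.jpg",
]

index_text = "\n".join(build_index_lines(paths))
records = parse_jsonl(index_text)
sources = [record["data"]["source"] for record in records]
```

Because each line is an independent JSON object, a file in this format can be read one line at a time, which helps when a dataset is too large to load at once.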
Billing
iTAG platform (free): The iTAG platform is free for in-house teams managing their own labeling projects.
Intelligent labeling service (free): The intelligent labeling service, powered by large models, is currently free for select templates in the multimodal category (such as image-to-text and image-text explanation). You will be notified in advance of any future charges.
OSS (paid): iTAG stores and reads its data in OSS, so OSS storage and data read/write traffic are billed separately according to the OSS billing standards.
Manual labeling outsourcing service (paid): To delegate data labeling to Alibaba Cloud's professional team, submit a ticket or join the DingTalk group (ID: 21930006619) to contact the PAI team. This is a paid service.
Get help
For help with issues such as data loading errors, insufficient permissions, or problems configuring Cross-Origin Resource Sharing (CORS), see the iTAG FAQ.