Platform For AI: Create a labeling job

Last Updated: Aug 09, 2023

After you create a dataset, you can create a labeling job and use iTAG to complete the labeling job. Machine Learning Platform for AI (PAI) provides common labeling templates for you to create labeling jobs. If the common labeling templates cannot meet your requirements, you can combine content and topic components to customize your own labeling templates based on your business scenarios. This topic describes how to use a common labeling template to create a labeling job.

Prerequisites

  • PAI is activated, and a workspace is created.

    You can use the default workspace or create a workspace based on your business plan. For more information about how to create a default workspace, see Activate PAI and create the default workspace. For more information about how to create a regular workspace, see Create a workspace.

  • Object Storage Service (OSS) is activated. The file that contains the data to be labeled is uploaded to an OSS bucket, and a dataset is generated for the file. For more information, see Create a dataset for a labeling job.

Limits

You can manage labeling jobs only as a workspace administrator or a labeling administrator. If you do not have the required permissions, contact a workspace administrator to assign the labeling administrator role to your account. For more information, see Manage the members of a workspace.

Procedure

  1. Go to the iTAG page.
    1. Log on to the PAI console.
    2. In the left-side navigation pane, click Workspaces. On the Workspaces page, click the name of the workspace in which you want to create a labeling job.
    3. In the left-side navigation pane, choose Data Preprocessing > iTAG to go to the iTAG page.
  2. On the Task Center tab of the iTAG page, click Create Task.
  3. In the Select Data and Template step of the Create Labeling Job wizard, configure the parameters that are described in the following table and click Next.
    Parameter | Description
    Input Dataset | Select a dataset that is created on the Dataset management page in the PAI console.
    Template Type | Select a type of labeling template. Valid values:
    • Common Template: common labeling templates provided by PAI.
    • Custom Template: labeling templates that you customize. You can combine content and topic components as prompted to customize your own labeling templates.

      Custom labeling templates are suitable for scenarios in which you have diversified requirements. For more information about the input and output data formats of custom labeling templates, see Custom labeling templates.

    Template | If you select Common Template for the Template Type parameter, select the common labeling template that you want to use. Valid values:
    • Valid values if you select Text:
      • Named Entity Recognition: recognizes named entities.
      • Text Classification: classifies text by adding one or more labels to the text.
      • Relationship Analysis for Named Entities: analyzes relationships between named entities, which is suitable for scenarios such as creating knowledge graphs.
      For more information about the scenarios for which the labeling templates are suitable and the input and output data formats of the labeling templates, see Text labeling templates.
    • Valid values if you select Image:
      • OCR: extracts text from selected parts of images by using optical character recognition (OCR).
      • Object Detection: finds objects in images.
      • Image Classification: classifies images by adding one or more labels to the images.
      For more information about the scenarios for which the labeling templates are suitable and the input and output data formats of the labeling templates, see Image labeling templates.
    • Valid values if you select Video:
      • Video Classification: classifies videos by adding one or more labels to the videos.
      For more information about the scenarios for which the labeling templates are suitable and the input and output data formats of the labeling templates, see Video labeling template.
    • Valid values if you select Audio:
      • Audio Classification: classifies audio files by adding one or more labels to the audio files.
      • Audio Segmentation: divides an audio file into several audio clips and adds labels to the audio clips.
      • Automatic Speech Recognition: converts the content of audio files to text.
      For more information about the scenarios for which the labeling templates are suitable and the input and output data formats of the labeling templates, see Audio labeling templates.
    OCR Identification Result Configuration | This parameter is displayed only if you select Image and then OCR for the Template parameter.

    By default, OCR Identification Result is selected, which specifies that text is extracted from the selected parts of images by using OCR.

    Label Configuration | Enter the names of the labels that labeling workers can use in the labeling job. Press the ENTER key after each label name to add the label.

    For example, when you create a labeling job that is used to recognize cats in images, you can enter the names of labels such as Cat, American Shorthair, and British Shorthair to help with image labeling.

    In this section, you can also specify whether labeling workers can add more than one label to the objects they select in the labeling job.
    • If you want only one label to be added to an object, select Single Choice.
    • If you want more than one label to be added to an object, select Multiple Choice.
    In this example, if you select Multiple Choice, a labeling worker can add both the Cat and American Shorthair labels to a selected cat image at the same time.
    Note: The Single Choice or Multiple Choice option determines only how many labels can be added to an object at a time, not how many times an object can be selected and labeled.
  4. In the Adjust Preview step, preview the labeling job and click Next.
  5. In the Intelligent Labeling Configurations step, configure data pre-labeling. For more information, see Configure intelligent pre-labeling in iTAG. Then, click Next.
  6. In the Distribute Task step, configure the parameters that are described in the following table and click Create.
    Parameter | Description
    Task Name | The name of the labeling job. The name must be 1 to 100 characters in length, and can contain letters, digits, underscores (_), and hyphens (-). It must start with a letter or a digit.
    Task Description | The description of the labeling job, which is used to distinguish different labeling jobs.
    Assign Subtask Packages | The rule based on which the labeling job is divided into multiple job packages. After the job packages are distributed, labeling workers claim job packages and label data entries in the job packages.

    Valid values:

    • Fixed Size: specifies a fixed number of data entries for each job package.
      The following list describes the requirements for the number of data entries in different scenarios:
      • If the dataset has 0 to 20,000 data entries, a job package can contain 1 to 200 data entries.
      • If the dataset has 20,000 to 100,000 data entries, a job package can contain 5 to 200 data entries.
      • If the dataset has 100,000 to 500,000 data entries, a job package can contain 25 to 200 data entries.
      • If the dataset has 500,000 to 1,000,000 data entries, a job package can contain 50 to 200 data entries.
    • Based on Imported Field: divides the labeling job based on the value of the field that you specify. Data entries that have the same field value are placed in the same job package.
    Task Workflow | The phases contained in the labeling job. You can select one of the following combinations of phases based on your business requirements:
    • Labeling: The labeling job is complete after labeling workers label data entries in job packages and submit the job packages.
    • Labeling - Checking: The labeling job is complete after the following two phases: (1) Labeling workers label data entries in job packages and submit the job packages. (2) Reviewers review labeling results and submit the job packages.
    • Labeling - Acceptance: The labeling job is complete after the following two phases: (1) Labeling workers label data entries in job packages and submit the job packages. (2) The acceptance staff accepts the job packages.
    • Labeling - Checking - Acceptance: The labeling job is complete after the following three phases: (1) Labeling workers label data entries in job packages and submit the job packages. (2) Reviewers review labeling results and submit the job packages. (3) The acceptance staff accepts the job packages.
    Check Proportion | The percentage of job packages in the labeling job that are reviewed. This parameter is required if you select Labeling - Checking or Labeling - Checking - Acceptance for the Task Workflow parameter. The default percentage is 100%.
    User Configuration | The members who work on the labeling job. You can specify labeling workers, reviewers, acceptance staff, and job administrators based on the phases that you specify. You can specify multiple members in the current workspace to cooperate on the labeling job. For more information about the roles involved in labeling jobs, see Overview.
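
The Task Name rule and the Fixed Size and Based on Imported Field options described in the preceding table can be checked locally before you fill in the wizard. The following Python sketch is illustrative only and is not part of PAI or iTAG: is_valid_task_name, package_size_range, and group_by_field are hypothetical helper names. It assumes that "letters" means ASCII letters, that the upper bound of each dataset-size tier is inclusive (the documented tiers share their boundary values), and that data entries are represented as dictionaries. No rule is documented for datasets with more than 1,000,000 data entries, so the sketch raises an error in that case.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Task Name rule: 1 to 100 characters; letters, digits, underscores (_),
# and hyphens (-); the first character is a letter or a digit.
# Assumption: "letters" means ASCII letters only.
TASK_NAME_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]{0,99}")

# Fixed Size tiers: (largest dataset size in the tier, minimum package size).
# Assumption: the upper bound of each tier is inclusive.
PACKAGE_SIZE_TIERS = [
    (20_000, 1),
    (100_000, 5),
    (500_000, 25),
    (1_000_000, 50),
]
MAX_PACKAGE_SIZE = 200  # every documented tier allows at most 200 entries


def is_valid_task_name(name: str) -> bool:
    """Return True if the name satisfies the documented Task Name rule."""
    return TASK_NAME_PATTERN.fullmatch(name) is not None


def package_size_range(dataset_size: int) -> Tuple[int, int]:
    """Return the (minimum, maximum) number of data entries per job package."""
    if dataset_size < 0:
        raise ValueError("dataset_size must not be negative")
    for upper_bound, min_size in PACKAGE_SIZE_TIERS:
        if dataset_size <= upper_bound:
            return (min_size, MAX_PACKAGE_SIZE)
    raise ValueError("no documented rule for more than 1,000,000 data entries")


def group_by_field(entries: List[dict], field: str) -> Dict[object, List[dict]]:
    """Mimic Based on Imported Field: entries that share a value share a package."""
    packages = defaultdict(list)
    for entry in entries:
        packages[entry[field]].append(entry)
    return dict(packages)
```

For example, package_size_range(150_000) returns (25, 200), so a Fixed Size value between 25 and 200 would be acceptable for a dataset of that size under the stated assumptions.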

View the job list

After you create a labeling job, you can go to the Task Center tab to view the job list. In the job list, you can view the states of all labeling jobs and select options in the Actions column to manage labeling jobs. For example, you can view the job details and labeling results.

[Figure: View the job list]

  • Process labeling jobs

    In the upper-right corner of the iTAG page, click Go to the iTAG Page to go to the iTAG console. In the iTAG console, you can process, review, and accept the job packages that you claim. For more information, see Process labeling jobs.

  • Transfer and release job packages

    On the Task Center tab, you can view the states of all labeling jobs. If a labeling job is not complete, you can click Subtask Details in the Actions column to view the details of job packages in the labeling job. If a job package is not complete, you can click Transfer to transfer the job package to other labeling workers. You can also click Release to release the job package. Then, the job package can be claimed by other labeling workers.

  • Export and view labeling results

    After a labeling job is complete, you can click Export Labeling Result in the Actions column to export the labeling results as prompted. You can click the Export progress icon in the upper-right corner of the Task Center tab to view the export progress. For more information, see Export labeling results.

  • Other job management operations

    You can click the More icon and select other options in the rectangle marked with 4 in the preceding figure to perform other operations on a labeling job, such as unpublishing or publishing the job.

What to do next

You can claim job packages in a labeling job and process the packages. For more information, see Process labeling jobs.