Updated on 2024-06-12 GMT+08:00

Creating a Labeling Job

Model training requires a large amount of labeled data, so label your data before you train a model. You can create a manual labeling job for one person or for a group of people (team labeling), or enable auto labeling to label images quickly. You can also modify existing labels, or delete them and relabel the data.

Labeling Job Types

Create a labeling job based on the dataset type. ModelArts supports the following types of labeling jobs:

  • Images
    • Image classification: identifies a class of objects in images.
    • Object detection: identifies the position and class of each object in an image.
  • Audio
    • Sound classification: classifies and identifies different sounds.
    • Speech labeling: labels speech content.
    • Speech paragraph labeling: segments and labels speech content.
  • Text
    • Text classification: assigns labels to text according to its content.
    • Named entity recognition: assigns labels to named entities in text, such as times and locations.
    • Text triplet: assigns labels to entity segments and entity relationships in the text.
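
The type hierarchy above can be written down as a small lookup table, for example to check a requested job type in a script before creating the job on the console. The following is only a minimal sketch: the dictionary name, the key strings, and the helper function are hypothetical and are not ModelArts identifiers or part of any ModelArts API.

```python
# Hypothetical lookup table mirroring the list above. The key strings are
# illustrative only and are not identifiers used by ModelArts.
LABELING_JOB_TYPES = {
    "image": ["image classification", "object detection"],
    "audio": ["sound classification", "speech labeling", "speech paragraph labeling"],
    "text": ["text classification", "named entity recognition", "text triplet"],
}


def is_supported(data_type: str, job_type: str) -> bool:
    """Return True if job_type is listed under data_type in the table above."""
    return job_type.lower() in LABELING_JOB_TYPES.get(data_type.lower(), [])
```

For example, `is_supported("image", "object detection")` is true, while `is_supported("text", "object detection")` is false because object detection is listed only under images.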

Prerequisites

Before labeling data, create a dataset.

Data management is being upgraded. It is not visible to users who have not used data management before.

Procedure

  1. Log in to the ModelArts management console. In the left navigation pane, choose Data Management > Label Data.
  2. On the Data Labeling page, click Create Labeling Job in the upper right corner. On the page that is displayed, create a labeling job.
    1. Enter basic information about the labeling job, including Name and Description.
      Figure 1 Basic information about a labeling job
    2. Select a labeling scene and type as required.
      Figure 2 Selecting a labeling scene and type
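    3. Set the parameters based on the labeling job type. For details, see the section for your labeling job type: Images (Image Classification and Object Detection), Audio (Sound Classification, Speech Labeling, and Speech Paragraph Labeling), or Text (Text Classification, Named Entity Recognition, and Text Triplet).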
    4. Click Create in the lower right corner of the page.

      After the labeling job is created, the data labeling management page is displayed. You can perform the following operations on the labeling job: start auto labeling, publish new versions, modify the labeling job, and delete the labeling job.

Images (Image Classification and Object Detection)

Figure 3 Parameters of labeling jobs for image classification and object detection
Table 1 Parameters of an image labeling job

Parameter

Description

Dataset Name

Select a dataset that supports the labeling type.

Label Set

  • Label name: Enter a label name with 1 to 1024 characters.
  • Add Label: Click Add Label to add one or more labels.
  • Label color: Set label colors for object detection and image segmentation labeling jobs. Select a color from the color palette on the right of a label, or enter the hexadecimal color code to set the color.
  • Add Label Attribute: For an object detection labeling job, after setting a label color you can click the plus sign (+) on the right to add label attributes. Label attributes distinguish objects that share the same label. For example, yellow kittens and black kittens share the label cat, and color is the label attribute that tells them apart.
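
The label name and label color rules above can be checked before you type a long list of labels into the console. The following is a minimal sketch, not ModelArts code: the function name is hypothetical, and the `#RRGGBB` color form is an assumption, because the text only says "hexadecimal color code" and the console may accept other forms.

```python
import re
from typing import List, Optional

# Assumption: a hexadecimal color code is written as "#RRGGBB".
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def check_label(name: str, color: Optional[str] = None) -> List[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    # Label names contain 1 to 1024 characters.
    if not 1 <= len(name) <= 1024:
        problems.append("label name must contain 1 to 1024 characters")
    # The color is optional here because not every job type uses label colors.
    if color is not None and not HEX_COLOR.fullmatch(color):
        problems.append("label color must look like #RRGGBB")
    return problems
```

Calling `check_label("cat", "#FFCC00")` returns an empty list, while an empty name or a color such as `yellow` each adds one message to the result.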

Audio (Sound Classification, Speech Labeling, and Speech Paragraph Labeling)

Figure 4 Parameters of labeling jobs for sound classification, speech labeling, and speech paragraph labeling
Table 2 Parameters of an audio labeling job

Parameter

Description

Dataset Name

Select a dataset that supports the labeling type.

Label Set (for sound classification)

You can add a label set for labeling jobs of sound classification.

  • Label name: Enter a label name with 1 to 1024 characters in the Label Set text box.
  • Add Label: Click Add Label to add one or more labels.

Label Management (for speech paragraph labeling)

Label management is available for speech paragraph labeling.

  • Single Label
    A single label is used to label a piece of audio that has only one class.
    • Label: Enter a label name, with 1 to 1024 characters.
    • Label Color: Set the label color in the Label Color column. You can select a color from the color palette or enter a hexadecimal color code to set the color.
  • Multiple Labels
    Multiple labels are suitable for multi-dimensional labeling. For example, you can label a piece of audio as both noise and speech, and label the speech by speaker. Click Add Label Class to add label classes. Each label class can contain multiple labels. A label class name or label name must contain 1 to 256 characters, and only letters, digits, periods (.), underscores (_), and hyphens (-) are allowed.
    • Add Label Class: Enter a label class.
    • Label: Enter a label name.
    • Add Label: Click Add Label to add one or more labels.

Speech Labeling (for speech paragraph labeling)

Only datasets for speech paragraph labeling support speech labeling. By default, speech labeling is disabled. If this function is enabled, you can label speech content.
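
The naming rule for label classes and labels under Multiple Labels (1 to 256 characters; only letters, digits, periods, underscores, and hyphens) maps directly onto a regular expression. The following is a minimal sketch with a hypothetical function name. It assumes that "letters" means ASCII letters; the console may accept a wider set.

```python
import re

# 1 to 256 characters; only letters, digits, periods, underscores, and hyphens.
# Assumption: "letters" is read as ASCII letters A-Z and a-z.
NAME_RULE = re.compile(r"[A-Za-z0-9._-]{1,256}")


def is_valid_name(name: str) -> bool:
    """Return True if name satisfies the Multiple Labels naming rule above."""
    return NAME_RULE.fullmatch(name) is not None
```

For example, `is_valid_name("speaker_1")` is true, while `is_valid_name("speaker 1")` is false because a space is not an allowed character.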

Text (Text Classification, Named Entity Recognition, and Text Triplet)

Figure 5 Parameters of labeling jobs for text classification, named entity recognition, and text triplet
Table 3 Parameters of a text labeling job

Parameter

Description

Dataset Name

Select a dataset that supports the labeling type.

Label Set (for text classification and named entity recognition)

  • Label name: Enter a label name, with 1 to 1024 characters.
  • Add Label: Click Add Label to add one or more labels.
  • Label color: Select a color from the color palette or enter the hexadecimal color code to set the color.

Label Set (for text triplet)

For datasets of the text triplet type, set entity labels and relationship labels.

  • Entity Label: Set the label name and label color. You can click the plus sign (+) on the right of the color area to add multiple labels.
  • Relationship Label: A relationship label defines a relationship between two entities. Set the source entity and the target entity for each relationship label. Because a relationship connects two entities, add at least two entity labels before adding a relationship label.
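
The dependency between entity labels and relationship labels can be checked before you enter a text triplet label set. The following is a minimal sketch under stated assumptions: the function name and the data layout (a list of entity label names, and a mapping from relationship label name to a source and target pair) are hypothetical and do not reflect how ModelArts stores labels.

```python
from typing import Dict, List, Tuple


def check_triplet_labels(
    entity_labels: List[str],
    relationship_labels: Dict[str, Tuple[str, str]],
) -> List[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    # At least two entity labels are needed before adding a relationship label.
    if relationship_labels and len(entity_labels) < 2:
        problems.append("add at least two entity labels before adding a relationship label")
    known = set(entity_labels)
    # Each relationship label names a source entity and a target entity.
    for name, (source, target) in relationship_labels.items():
        for role, entity in (("source", source), ("target", target)):
            if entity not in known:
                problems.append(
                    f"relationship '{name}': {role} entity '{entity}' is not an entity label"
                )
    return problems
```

With entity labels `["person", "city"]` and a relationship `{"lives_in": ("person", "city")}`, the function returns an empty list. If `city` is missing from the entity labels, it reports both the missing second entity label and the unknown target entity.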