Updated on 2024-04-01 GMT+08:00

Labeling Data

After a text classification project is created, the ExeML > Label Data tab page is displayed, with the Labeled tab page shown by default. If the selected dataset contains labeled data, that data is displayed automatically. To view unlabeled data in the directory of the target dataset, click Unlabeled to switch to the Unlabeled tab page.

Data Labeling for Text Classification

  1. Select the text to be labeled in Labeling Objects and click a label in the Label Set area to label the text.

    You can add only one label for a text object.

  2. After confirming the label, click Save Current Page in the lower right corner to save the labeling.

    If Labeling Objects contains a large number of objects, a page turning icon is displayed in the lower part of the area. After labeling the objects on a page, click Save Current Page before you turn to the next page. If you turn pages before saving, the labeling information on the previous page is lost and you must label that text data again.

Figure 1 Data labeling - text classification
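
The two rules above (one label per text object, and unsaved labels are lost when you turn the page) can be pictured with a small model. This is only an illustrative sketch, not ModelArts code: the LabelingPage class, its methods, and the sample texts and labels are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class LabelingPage:
    """Hypothetical model of one page of Labeling Objects (not a ModelArts API)."""
    texts: list[str]
    saved: dict[int, str] = field(default_factory=dict)    # index -> saved label
    pending: dict[int, str] = field(default_factory=dict)  # index -> unsaved label

    def label(self, index: int, label: str) -> None:
        # One label per text object: a new choice replaces the previous one.
        if not 0 <= index < len(self.texts):
            raise IndexError(f"no text object at index {index}")
        self.pending[index] = label

    def save_current_page(self) -> None:
        # Models clicking Save Current Page: pending labels become saved.
        self.saved.update(self.pending)
        self.pending.clear()

    def turn_page(self) -> int:
        # Models turning the page: unsaved labels are discarded.
        # Returns how many labels were lost.
        lost = len(self.pending)
        self.pending.clear()
        return lost


page = LabelingPage(texts=["Great product", "Arrived broken"])
page.label(0, "positive")
page.label(0, "negative")   # replaces "positive"; only one label is kept
page.save_current_page()
page.label(1, "negative")
lost = page.turn_page()     # the label on text 1 was never saved
```

In this sketch, only the label saved before the page turn survives, which mirrors why Save Current Page should be clicked on every page.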

Adding or Deleting Data

In an ExeML project, the data source is the OBS directory corresponding to the input path of the dataset. If the data in the directory does not meet your requirements, you can add or delete data on the ExeML page of ModelArts.

  • Adding a file

    On the Unlabeled tab page, click Add File in the upper left corner. In the dialog box that is displayed, select a local file and upload it.

    The format of the file to be uploaded must meet the requirements for datasets of the text classification type.

  • Deleting a text object

    On the Labeled or Unlabeled tab page, select a text object to be deleted and click Delete in the upper left corner. In the dialog box that is displayed, confirm the deletion information and click OK.

    On the Labeled tab page, you can tick Select Current Page and click Delete to delete all text objects and their labeling information on the current page.

Figure 2 Adding a file or deleting a text object
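
Before uploading a local file, it can help to run a quick local check. The sketch below is not part of ModelArts, and the exact format requirements are defined in the dataset documentation rather than on this page. The rules it checks (a .txt or .csv extension, UTF-8 encoding, and one text object per non-empty line) are assumptions made for illustration, and check_upload is a hypothetical helper name.

```python
from pathlib import Path

# Assumed requirements, for illustration only. Check the dataset
# documentation for the authoritative rules.
ALLOWED_SUFFIXES = {".txt", ".csv"}


def check_upload(filename: str, data: bytes) -> list[str]:
    """Return a list of problems found in a file before upload (empty if none)."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported extension: {suffix or '(none)'}")
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("file is not valid UTF-8")
        return problems
    # Assumption: each non-empty line is one text object.
    if not any(line.strip() for line in text.splitlines()):
        problems.append("file contains no text objects")
    return problems
```

For example, check_upload("reviews.txt", Path("reviews.txt").read_bytes()) returns an empty list when none of the assumed rules are violated.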

Modifying Labeled Data

For labeled text data, you can only delete the label of a text object. On the Labeled tab page, click the cross icon in the upper right corner of the label to delete it from the text object. In the dialog box that is displayed, click OK. After the label is deleted, the text object is displayed on the Unlabeled tab page.

Figure 3 Deleting a labeled text

Modifying a Label

After an ExeML project for text classification is created, you can add, modify, or delete labels as your service requirements change.

  • Adding a label

    On the Labeled tab page, click the plus sign (+) on the right of All Labels. In the displayed Add Label dialog box, set Label Name and Label Color, and click OK.

  • Modifying a label

    In the lower part of All Labels on the Labeled tab page, select the label to be modified and click the editing icon in the Operation column. In the displayed Modify Label dialog box, change the label name or color and click OK.

  • Deleting a label

    In the lower part of All Labels on the Labeled tab page, select the label to be deleted and click the deletion icon in the Operation column. In the displayed Delete dialog box, select Delete label or Delete the label and objects with only the label, and click OK.

    Deleted labels cannot be recovered. Exercise caution when performing this operation.

Figure 4 Modifying a label
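
The difference between the two options in the Delete dialog box can be sketched as follows. This is an interpretation based on the option names, not ModelArts code: it assumes that Delete label leaves the affected text objects in the dataset as unlabeled data, while Delete the label and objects with only the label also removes those objects. The delete_label function and the sample data are hypothetical.

```python
from typing import Optional


def delete_label(objects: dict[str, Optional[str]], label: str,
                 delete_objects: bool = False) -> dict[str, Optional[str]]:
    """Return a new mapping of text object -> label after deleting `label`.

    delete_objects=False models "Delete label": affected objects are kept
    but become unlabeled (None).
    delete_objects=True models "Delete the label and objects with only the
    label": affected objects are removed. A text object has at most one
    label, so every object carrying the label carries only that label.
    """
    result: dict[str, Optional[str]] = {}
    for text, current in objects.items():
        if current != label:
            result[text] = current   # unaffected object
        elif not delete_objects:
            result[text] = None      # kept, but now unlabeled
        # Otherwise the object is dropped together with the label.
    return result


objects = {
    "Great product": "positive",
    "Arrived broken": "negative",
    "It is fine": None,
}
kept = delete_label(objects, "negative")
removed = delete_label(objects, "negative", delete_objects=True)
```

Under these assumptions, kept still contains "Arrived broken" as an unlabeled object, while removed no longer contains it at all.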