Updated on 2023-09-06 GMT+08:00

Confirming Hard Examples

In a labeling task that processes a large amount of data, auto labeling results cannot be used directly for training, because too few images have been labeled at the initial stage of labeling. Adjusting and confirming all auto labeling results one by one takes a lot of time and manpower. To accelerate labeling, ModelArts embeds an auto hard example detection function in auto labeling tasks. This function suggests labeling priorities for the remaining unlabeled images. An image receives a high labeling priority when its auto labeling result is not as expected, and such an image is called a hard example.

The auto hard example detection function automatically detects and labels hard examples during auto labeling and during data collection and filtering. After you confirm and correct the hard example data, add the labeling results to the training dataset to train a model with higher precision. Hard example detection requires no manual intervention. You only need to confirm and modify the labeled data, which improves data management and labeling efficiency. In addition, you can supplement data similar to hard examples to increase the variety of the dataset and further improve model training precision. Hard example management involves two scenarios: confirming hard examples after auto labeling, and labeling data in a dataset as hard examples.

Only datasets of image classification and object detection types support the auto hard example detection function.
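
The idea of a labeling priority can be illustrated with a minimal sketch. This page does not describe the rules ModelArts uses to detect hard examples, so the sketch assumes one common heuristic: an image whose highest predicted class probability is low is treated as less certain and is reviewed first. The function name rank_by_uncertainty and the sample predictions are hypothetical and are not part of any ModelArts API.

```python
from typing import Dict, List, Tuple


def rank_by_uncertainty(predictions: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Return (image_name, top_confidence) pairs, least confident first.

    Assumption: low top-class confidence stands in for "hard example".
    This is an illustrative heuristic, not the ModelArts detection rule.
    """
    scored = [(name, max(probs)) for name, probs in predictions.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical auto labeling output: class probabilities per image.
predictions = {
    "img_001.jpg": [0.97, 0.02, 0.01],
    "img_002.jpg": [0.40, 0.35, 0.25],
    "img_003.jpg": [0.70, 0.20, 0.10],
}

ranking = rank_by_uncertainty(predictions)
for name, confidence in ranking:
    # Images printed first would be suggested for manual confirmation first.
    print(f"{name}: top confidence {confidence:.2f}")
```

In this sketch, img_002.jpg is listed first because no class clearly dominates its prediction, which is the kind of image a reviewer would want to confirm before the others.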

Confirming Hard Examples After Auto Labeling

During the execution of an auto labeling task, ModelArts automatically detects and labels hard examples. After the auto labeling task is complete, the labeling results of hard examples are displayed on the To Be Confirmed tab page. Modify hard example data and confirm the labeling result.

  1. Log in to the ModelArts management console. In the left navigation pane, choose Data Management > Datasets. The Datasets page is displayed.
  2. In the dataset list, select the dataset of the object detection or image classification type and click the dataset name to go to the Dashboard tab page of the dataset.
  3. On the Dashboard page of the dataset, click Label in the upper right corner. The dataset details page is displayed.
  4. On the dataset details page, click the To Be Confirmed tab to view and confirm hard examples.

    Labeling data is displayed on the To Be Confirmed tab page only after the auto labeling task is complete. Otherwise, no data is available on the tab page. For details about auto labeling, see Auto Labeling.

    • For datasets of the object detection type

      On the To Be Confirmed tab page, click an image to expand its labeling details. Check whether the labeling information is correct, for example, whether the label is correct and whether the target bounding box is in the right position. If the auto labeling result is inaccurate, manually adjust the label or target bounding box and click Labeled. The re-labeled data is then displayed on the Labeled tab page.

      In the hard example shown in Figure 1, the target bounding box for the dog label is in the wrong position. Draw a new bounding box around the target (the miss bounding box in the figure), and then delete the incorrect one (the false bounding box in the figure). After the manual adjustment, click Labeled to confirm the hard example.

      Figure 1 Confirming a hard example for object detection
    • For datasets of the image classification type

      On the To Be Confirmed tab page, check whether the labels added to images with the Hard example mark are correct. Select the images that are incorrectly labeled, delete the incorrect labels, and add the correct labels in Label on the right. Click OK. The selected images and their labeling details are displayed on the Labeled tab page.

      As shown in Figure 2, the selected images are incorrectly labeled. Delete the incorrect labels on the right, add the dog label in Label, and click OK to confirm the hard examples.

      Figure 2 Confirming hard examples for image classification
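
For the object detection case above, the difference between a misplaced auto-labeled box and the manually drawn box can be expressed as intersection over union (IoU). The following is a minimal sketch for offline inspection of exported annotations, not a description of how ModelArts evaluates boxes. The box coordinates, the (x_min, y_min, x_max, y_max) layout, and the 0.5 threshold are all assumptions chosen for illustration.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of a box; degenerate boxes count as zero."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: the auto-labeled (false) box and the box drawn by hand.
auto_box: Box = (10, 10, 60, 60)
corrected_box: Box = (40, 40, 120, 120)

overlap = iou(auto_box, corrected_box)
# Assumed threshold: below 0.5 the auto-labeled box is treated as misplaced.
needs_redraw = overlap < 0.5
print(f"IoU = {overlap:.3f}, redraw needed: {needs_redraw}")
```

A low IoU means the auto-labeled box covers little of the real target, which corresponds to the situation in Figure 1 where one box is added and the other is deleted.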

Labeling Data in a Dataset as Hard Examples

In a dataset, labeled or unlabeled image data can be labeled as hard examples. Data labeled as hard examples can be used to improve model precision through built-in rules during subsequent model training.

  1. Log in to the ModelArts management console. In the left navigation pane, choose Data Management > Datasets. The Datasets page is displayed.
  2. In the dataset list, select the dataset of the object detection or image classification type and click the dataset name to go to the Dashboard tab page of the dataset.
  3. On the Dashboard page of the dataset, click Label in the upper right corner. The dataset details page is displayed.
  4. On the dataset details page, click the Labeled, Unlabeled, or All tab, select the images to be labeled as hard examples, and choose Batch process Hard Examples > Confirm Hard Example. After the labeling is complete, a Hard example mark is displayed in the upper right corner of the preview of each selected image.
    Figure 3 Confirming hard examples