Analysis on the Sensitivity of Object Detection Models to Bounding Box Marginalization Degrees and Solution

Symptom

In an object detection task, bounding boxes can appear at different positions on an image: some reside near the center of the image, and others near its edge. This is quantified by the marginalization degree, defined as the ratio of the distance between the center point of a bounding box and the center point of the image to the distance between the center point of the image and the image edge. A larger value indicates that the object is closer to the edge. The following figure shows a scenario where the bounding boxes are far from the center, that is, close to the edge.

Figure 1 Bounding boxes close to the edge
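The definition above can be sketched in code. The text does not specify along which direction the center-to-edge distance is measured, so this sketch makes an assumption: Euclidean distance normalized by the center-to-corner distance, which yields a value in [0, 1].

```python
import math

def marginalization_degree(box, img_w, img_h):
    """Ratio of the box-center-to-image-center distance to the
    center-to-edge distance. Normalizing by the center-to-corner
    distance is an assumption; the text does not fix the direction."""
    x_min, y_min, x_max, y_max = box
    bx, by = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
    cx, cy = img_w / 2.0, img_h / 2.0
    dist = math.hypot(bx - cx, by - cy)
    return dist / math.hypot(cx, cy)

# A centered box scores 0; a box in the corner scores close to 1.
center_deg = marginalization_degree((40, 40, 60, 60), 100, 100)
corner_deg = marginalization_degree((0, 0, 10, 10), 100, 100)
```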

Object detection models perform differently on datasets with different bounding box marginalization degrees. You are advised to refer to the following algorithms and technical descriptions to understand how to reduce the sensitivity of object detection models to bounding box marginalization degrees.

Solution

  • Box loss weight

    In object detection, the loss values of the class branch and the bounding box coordinate branch are not balanced. If they are simply added, small box loss values (for example, from small objects) are likely to be drowned out during backpropagation, which hinders convergence. Weighting the box loss is a technique used to alleviate this imbalance: the box loss weight is adjusted based on the dataset. The following figure shows the unbalanced class loss and box loss values during object detection model training; the two differ by a factor of about 100.

    Figure 2 Unbalanced class loss and box loss values during object detection model training

    The following is TensorFlow pseudo code for weighting the box loss.

    def model_fn(inputs, mode):
        ...
        # params is the detection algorithm hyperparameter configuration.
        # Tune the value of box_loss_weight for your dataset.
        total_loss = cls_loss + params['box_loss_weight'] * box_loss
    
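To make the imbalance concrete, the numbers below are illustrative (not measured values from this document): with a class loss roughly 100 times the box loss, the box term nearly vanishes from an unweighted sum, while a weight restores its influence.

```python
cls_loss = 1.2                       # illustrative classification loss
box_loss = 0.012                     # illustrative box loss, ~100x smaller
params = {'box_loss_weight': 50.0}   # illustrative hyperparameter value

unweighted_total = cls_loss + box_loss  # box term contributes ~1%
weighted_total = cls_loss + params['box_loss_weight'] * box_loss
# The box term now contributes 0.6 of the total, so its gradients are
# no longer drowned out by the class loss during backpropagation.
```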
  • Label smoothing

    Label smoothing was first proposed in InceptionV2 and is widely used in classification tasks. If some labels are incorrect or inaccurate, the network may trust them blindly and learn the errors. To improve network generalization and avoid this, the one-hot encoded label is softened before the loss is calculated: the target class position is multiplied by a coefficient (1 - e), where e is a small value such as 0.05, so the target is kept at a probability of 0.95, and each non-target class is changed from 0 to a small positive value (e, or e/num_classes in the TensorFlow implementation) for the loss calculation.

    The following figure shows the one-hot code of the original label in an object detection model.

    Figure 3 One-hot code of the original label

    The following figure shows the label after label smoothing.

    Figure 4 Code of the label after label smoothing

    The following is TensorFlow pseudo code for label smoothing.

    labels = tf.cast(targets, tf.float32)
    if label_smoothing > 0:
        smooth_positives = 1.0 - label_smoothing
        smooth_negatives = label_smoothing / num_classes
        labels = labels * smooth_positives + smooth_negatives
    cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(
        labels=labels, logits=logits)
    
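As a concrete illustration, the sketch below applies the same smoothing arithmetic in plain Python; the class count and the value of e are illustrative, not taken from the original.

```python
def smooth_labels(one_hot, label_smoothing, num_classes):
    """Soften a one-hot label vector: the target moves toward 1 - e,
    and non-targets move from 0 up to e / num_classes."""
    smooth_positives = 1.0 - label_smoothing
    smooth_negatives = label_smoothing / num_classes
    return [y * smooth_positives + smooth_negatives for y in one_hot]

# Example: 3 classes, e = 0.05 (illustrative values).
smoothed = smooth_labels([0.0, 1.0, 0.0], 0.05, 3)
# Target entry becomes 0.95 + 0.05/3; non-targets become 0.05/3.
```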

Verification

A traffic-signal-light dataset with a single label is used for verification; the task is to check whether traffic signal lights are installed at intersections. The following figure shows the class loss curves before and after label smoothing. The blue curve is the fitting curve before label smoothing; as the number of iteration steps increases, overfitting occurs. With label smoothing (gray curve), overfitting is effectively reduced.

Figure 5 Fitting curve of the class loss

The trained object detection model is evaluated, and recall values are obtained for different marginalization-degree feature distributions. Table 1 describes the sensitivity of bounding boxes to marginalization degrees before label smoothing and box loss weight are used.
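Recall here is the fraction of ground-truth boxes that the model detects within each marginalization-degree bucket. A minimal sketch follows; the counts are hypothetical, chosen only to reproduce the first Table 1 entry.

```python
def recall(true_positives, false_negatives):
    """Fraction of ground-truth boxes the model actually detected."""
    return true_positives / (true_positives + false_negatives)

# Hypothetical counts for the 0% - 20% bucket: 16 boxes found, 1 missed.
r = recall(16, 1)  # ~0.9412, matching the first entry of Table 1
```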

Table 1 Sensitivity of bounding boxes to marginalization degrees before label smoothing and box loss weight are used

Feature Distribution    Recall (light)
0% - 20%                0.9412
20% - 40%               0.8235
40% - 60%               0.7778
60% - 80%               1
80% - 100%              0.8333
Standard deviation      0.0823

Table 2 describes the sensitivity of bounding boxes to marginalization degrees after label smoothing and box loss weight are used.

After label smoothing and box loss weight are used, the sensitivity of the object detection model to bounding box marginalization degrees decreases from 0.0823 to 0.0448.

Table 2 Sensitivity of bounding boxes to marginalization degrees after label smoothing and box loss weight are used

Feature Distribution    Recall (light)
0% - 20%                1
20% - 40%               0.9412
40% - 60%               0.8889
60% - 80%               1
80% - 100%              1
Standard deviation      0.0448
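The sensitivity metric reported in the tables is the standard deviation of the per-bucket recall values. A quick check using the population standard deviation (which matches the reported figures):

```python
import math

def sensitivity(recalls):
    """Population standard deviation of per-bucket recall values."""
    mean = sum(recalls) / len(recalls)
    variance = sum((r - mean) ** 2 for r in recalls) / len(recalls)
    return math.sqrt(variance)

before = sensitivity([0.9412, 0.8235, 0.7778, 1.0, 0.8333])  # ~0.0823
after = sensitivity([1.0, 0.9412, 0.8889, 1.0, 1.0])         # ~0.0448
```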

Suggestions

If the inference results show that the detected classes are highly sensitive to the marginalization degrees of bounding boxes, you are advised to apply label smoothing and box loss weighting during training to optimize and enhance the model.