Analysis of the Sensitivity of Object Detection Models to the Bounding Box Marginalization Degree, and a Solution
Symptom
In an object detection task, the positions of bounding boxes vary across an image: some boxes lie near the middle of the image, while others lie near the edge. This can be quantified by the marginalization degree, defined as the ratio of the distance between the center point of a bounding box and the center point of the image to the distance between the center point of the image and the image edge (measured in the same direction). A larger value indicates that the object is closer to the edge. The following figure shows a scenario where the bounding boxes are far away from the center, that is, close to the edge.
Object detection models perform differently on datasets with different marginalization degrees. The following algorithms and technical description explain how to reduce the sensitivity of object detection models to the bounding box marginalization degree.
Solution
- Box loss weight
In object detection, the loss values for the class and the bounding box coordinates are not balanced. If you simply add them, small bounding box loss values (for example, those of small objects) are likely to be drowned out during backpropagation, which hinders convergence. Weighting the bounding box loss is a technique used to alleviate this imbalance: the box loss weight is adjusted for the dataset at hand. The following figure shows the unbalanced class loss and box loss values during object detection model training; the two values differ by a factor of about 100.
Figure 2 Unbalanced class loss and box loss values during object detection model training
The following is TensorFlow-style pseudo code for the box loss weight.

```python
def model_fn(inputs, mode):
    ...
    # params holds the detection algorithm's hyperparameter configuration.
    # Tune the value of box_loss_weight for your dataset.
    total_loss = cls_loss + params['box_loss_weight'] * box_loss
```
- Label smoothing
Label smoothing was first proposed for Inception-v2 and is widely used in classification tasks. If some labels are incorrect or inaccurate, the network may trust them blindly and learn the errors. To improve generalization and avoid this, when the loss is calculated against the one-hot encoded label, the value at the true class position is replaced by 1 − ε, where ε is a small constant such as 0.05, so the positive class is supervised with a probability of 0.95. Each non-target class is changed from 0 to a small positive value (ε, or ε divided by the number of classes) for loss calculation.
The following figure shows the one-hot code of the original label in an object detection model.
Figure 3 One-hot code of the original label
The following figure shows the label after label smoothing.
Figure 4 Code of the label after label smoothing
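The smoothing shown in the figures can be reproduced numerically. This is an illustrative sketch (the function name is an assumption) using the ε/num_classes variant from the pseudo code that follows; with ε = 0.05 and 4 classes, the true class becomes 0.95 + 0.05/4 = 0.9625 and each other class becomes 0.0125.

```python
def smooth_labels(one_hot, epsilon=0.05):
    """Shrink the one-hot target toward a uniform distribution:
    each class receives epsilon / num_classes of probability mass,
    and the true class keeps 1 - epsilon plus its uniform share."""
    num_classes = len(one_hot)
    return [v * (1.0 - epsilon) + epsilon / num_classes for v in one_hot]

labels = [0.0, 0.0, 1.0, 0.0]  # one-hot label, 4 classes
# True class -> ~0.9625, the other classes -> ~0.0125.
print(smooth_labels(labels))
```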
The following is TensorFlow-style pseudo code after label smoothing is applied.

```python
positive_label_mask = tf.equal(targets, 1.0)
if label_smoothing > 0:
    smooth_positives = 1.0 - label_smoothing
    smooth_negatives = label_smoothing / num_classes
    labels = labels * smooth_positives + smooth_negatives
cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(
    labels=labels, logits=logits)
```
Verification
A dataset of traffic signal lights is used for verification. The dataset has only one label, used to check whether traffic signal lights are installed at intersections. The following figure shows the class loss curves before and after label smoothing is applied. The blue curve is the fitting curve before label smoothing; as the number of iteration steps increases, overfitting occurs. With label smoothing (gray curve), overfitting is effectively reduced.
The trained object detection model is evaluated, and recall values in different inference cases corresponding to different marginalized feature distributions are obtained. Table 1 describes the sensitivity of bounding boxes to marginalization degrees before label smoothing and box loss weight are used.
| Feature Distribution (Marginalization Degree) | Recall (light) |
|---|---|
| 0%–20% | 0.9412 |
| 20%–40% | 0.8235 |
| 40%–60% | 0.7778 |
| 60%–80% | 1.0000 |
| 80%–100% | 0.8333 |
| Standard deviation | 0.0823 |
Table 2 describes the sensitivity of bounding boxes to marginalization degrees after label smoothing and the box loss weight are applied.
After label smoothing and the box loss weight are applied, the sensitivity of the object detection model to the bounding box marginalization degree (the standard deviation of recall across the intervals) decreases from 0.0823 to 0.0448.
Suggestions
If, in the model inference results, the detected classes are highly sensitive to the bounding box marginalization degree, you are advised to apply label smoothing and a box loss weight to optimize and enhance the model during training.