Analysis of the Sensitivity of Object Detection Models to Bounding Box Aspect Ratios and Solution
Symptom
In an object detection task, the bounding boxes in an image come in various shapes, which are reflected by their aspect ratios. A wider range of aspect ratios means a more unbalanced distribution of bounding box shapes in the dataset. Object detection models perform differently on bounding boxes with different aspect ratios. The following describes how to reduce the sensitivity of models to the aspect ratios of bounding boxes.
In the following figure, there are three aspect ratios of the bounding boxes in the image.
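As a minimal sketch of what "aspect ratio" means here, the snippet below computes width divided by height for a few boxes. The `(x_min, y_min, x_max, y_max)` box format, the `aspect_ratio` helper, and the sample coordinates are assumptions made for illustration; they are not taken from the article or from any specific tool.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def aspect_ratio(box: Box) -> float:
    """Return width / height of a bounding box (hypothetical helper)."""
    x_min, y_min, x_max, y_max = box
    width = x_max - x_min
    height = y_max - y_min
    if width <= 0 or height <= 0:
        raise ValueError(f"degenerate box: {box}")
    return width / height


# Three made-up boxes: one wide, one square, one tall.
boxes = [(0, 0, 200, 100), (10, 10, 60, 60), (5, 5, 25, 85)]
ratios = [aspect_ratio(b) for b in boxes]

print(ratios)                     # aspect ratio of each box
print(max(ratios) / min(ratios))  # spread between the widest and tallest shapes
```

A large spread between the maximum and minimum ratio is one simple indicator that the dataset contains very differently shaped boxes.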
Solution
In object detection tasks, the Feature Pyramid Network (FPN) is widely used in one-stage detection models. The FPN fuses feature maps of different scales, and the fused features are then used for classification and bounding box regression. This has become a standard method for object detection models. The EfficientDet paper proposes an FPN block repeats method, in which the FPN is treated as a basic unit and repeated several times for feature extraction and fusion. The following figure shows the Bidirectional Feature Pyramid Network (BiFPN), a type of FPN.
The following figure shows the FPN block repeats method. Repeating the block makes the feature extraction and fusion stage deeper.
The FPN block repeats technology is not only applicable to BiFPN, but also to other FPN structures, such as PANet FPN. The following figure shows that the FPN block repeats technology is applied to PANet.
In this figure, graph a shows a basic structure of PANet, and graph b shows a network with FPN block repeats.
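The idea of repeating an FPN block can be sketched structurally as follows. This is a toy illustration only, assuming NumPy is available: it has no learnable weights, it uses plain element-wise addition for fusion, and the names `fpn_block`, `repeated_fpn`, and `num_repeats` are hypothetical. A real BiFPN or PANet block would add convolutions, normalization, and (for BiFPN) learned fusion weights, with separate parameters for each repeat.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of an (H, W, C) feature map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)


def downsample2x(x):
    """2x downsampling by striding (stand-in for a strided convolution)."""
    return x[::2, ::2]


def fpn_block(features):
    """One toy bidirectional fusion block.

    `features` is ordered from the highest resolution to the lowest,
    and each level is half the size of the previous one.
    """
    # Top-down pass: push coarse, semantic information to finer levels.
    top_down = list(features)
    for i in range(len(top_down) - 2, -1, -1):
        top_down[i] = top_down[i] + upsample2x(top_down[i + 1])

    # Bottom-up pass: push fine, spatial information back to coarser levels.
    out = list(top_down)
    for i in range(1, len(out)):
        out[i] = out[i] + downsample2x(out[i - 1])
    return out


def repeated_fpn(features, num_repeats=3):
    """Apply the block several times; output shapes match input shapes."""
    for _ in range(num_repeats):
        features = fpn_block(features)
    return features


# A made-up three-level pyramid with 8 channels per level.
pyramid = [np.ones((32, 32, 8)), np.ones((16, 16, 8)), np.ones((8, 8, 8))]
fused = repeated_fpn(pyramid, num_repeats=3)
print([f.shape for f in fused])
```

Because each block returns a pyramid with the same shapes it received, blocks can be stacked any number of times, which is what makes the repeat count a tunable depth parameter.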
Verification
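The open source fruit dataset is used for verification. First, the aspect ratio sensitivity of the bounding boxes is analyzed before FPN block repeats is applied. Sensitivity is reported as the standard deviation of the results across the aspect ratio intervals, so a lower value means that the model is less sensitive. The following table indicates that the aspect ratio sensitivity of Apple is 0.0757 and that of Banana is 0.4481.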
| Feature Distribution | Apple | Banana |
|---|---|---|
| 0% - 20% | 1 | 0.5714 |
| 20% - 40% | 1 | 1 |
| 40% - 60% | 0.875 | 0 |
| 60% - 80% | 0.8182 | 1 |
| 80% - 100% | 0.8571 | 0 |
| Standard deviation | 0.0757 | 0.4481 |
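The Standard deviation row can be reproduced from the five interval values in each column. The sketch below assumes the population standard deviation (dividing by N rather than N - 1); that choice is inferred from the published numbers and is not stated in the article. The `sensitivity` helper name is hypothetical. The "after" values are the ones reported later in this article for the model with FPN block repeats.

```python
from statistics import pstdev

# Per-interval values copied from the tables in this article.
before = {
    "Apple": [1, 1, 0.875, 0.8182, 0.8571],
    "Banana": [0.5714, 1, 0, 1, 0],
}
after = {
    "Apple": [1, 0.8333, 1, 1, 1],
    "Banana": [0.7857, 0, 1, 1, 0.25],
}


def sensitivity(scores):
    """Population standard deviation across aspect ratio intervals (assumed definition)."""
    return pstdev(scores)


for name in before:
    s_before = sensitivity(before[name])
    s_after = sensitivity(after[name])
    print(f"{name}: before={s_before:.4f} after={s_after:.4f}")
```

A class whose results are nearly identical in every interval yields a value close to 0, while a class that is detected well in some intervals and missed in others yields a large value.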
After the FPN block repeats technology is applied, the aspect ratio sensitivity of the bounding boxes is analyzed again. As shown in the following table, the aspect ratio sensitivity of the Apple bounding boxes decreases from 0.0757 to 0.0667, and that of the Banana bounding boxes decreases from 0.4481 to 0.4091.
The FPN block repeats technology reduces the aspect ratio sensitivity of bounding boxes for both classes, by about 12% for Apple and about 9% for Banana.
| Feature Distribution | Apple | Banana |
|---|---|---|
| 0% - 20% | 1 | 0.7857 |
| 20% - 40% | 0.8333 | 0 |
| 40% - 60% | 1 | 1 |
| 60% - 80% | 1 | 1 |
| 80% - 100% | 1 | 0.25 |
| Standard deviation | 0.0667 | 0.4091 |
Suggestions
If the model inference results show that a detected class is highly sensitive to the aspect ratios of bounding boxes, you are advised to train an object detection model that uses FPN block repeats.