Analysis on the Sensitivity of Object Detection Models to Bounding Box Aspect Ratios and Solution

Symptom

In an object detection task, the bounding boxes in an image vary in shape, and this variation is reflected in their aspect ratios. A wider range of bounding box aspect ratios means a more unbalanced distribution of bounding box shapes in the dataset. The performance of an object detection model can differ across aspect ratio ranges. The following describes how to reduce the sensitivity of models to the aspect ratios of bounding boxes.

In the following figure, the bounding boxes in the image have three different aspect ratios.

Figure 1 Example of aspect ratios of bounding boxes
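
As a minimal sketch, the aspect ratio of a bounding box can be computed as its width divided by its height. The (xmin, ymin, xmax, ymax) box format and the aspect_ratio helper below are assumptions made for illustration; they are not part of ModelArts.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (xmin, ymin, xmax, ymax) in pixels.
Box = Tuple[float, float, float, float]


def aspect_ratio(box: Box) -> float:
    """Return width / height of a bounding box (hypothetical helper)."""
    xmin, ymin, xmax, ymax = box
    width, height = xmax - xmin, ymax - ymin
    if width <= 0 or height <= 0:
        raise ValueError(f"degenerate bounding box: {box}")
    return width / height


# Three boxes with different shapes: square, wide, and tall.
boxes: List[Box] = [(0, 0, 100, 100), (0, 0, 200, 50), (0, 0, 40, 120)]
ratios = [aspect_ratio(b) for b in boxes]
print(ratios)
```

A ratio above 1 indicates a wide box, and a ratio below 1 indicates a tall box.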

Solution

In object detection tasks, the Feature Pyramid Network (FPN) is widely used in one-stage detection models. The FPN fuses feature maps of different scales and passes the fused features on to the subsequent class prediction and bounding box regression. This has become a standard method for object detection models. The EfficientDet paper proposes the FPN block repeats method, in which the FPN is regarded as a basic unit and is repeated several times for feature extraction and fusion. The following figure shows the Bidirectional Feature Pyramid Network (BiFPN), a type of FPN.

Figure 2 BiFPN structure

The following figure shows the FPN block repeats method. Repeating the block makes the feature extraction layers deeper.

Figure 3 FPN block repeats of EfficientDet

The FPN block repeats technique applies not only to BiFPN but also to other FPN structures, such as PANet. The following figure shows FPN block repeats applied to PANet.

In this figure, graph a shows the basic structure of PANet, and graph b shows a network with FPN block repeats.

Figure 4 Structure of FPN block repeats
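
The repeat pattern can be illustrated with a toy, weight-free sketch. It only shows why a block whose output pyramid has the same shape as its input pyramid can be stacked. The 1-D "feature maps", the additive fusion, and the names panet_style_block and repeat_fpn_block are all assumptions made for illustration. A real detector would use learned convolutions, with separate weights for each repeat.

```python
from typing import Callable, List

FeatureMap = List[float]    # toy 1-D stand-in for a feature map
Pyramid = List[FeatureMap]  # ordered fine -> coarse; each level is half as long


def upsample2x(x: FeatureMap) -> FeatureMap:
    """Nearest-neighbour upsampling: repeat every element twice."""
    return [v for v in x for _ in range(2)]


def downsample2x(x: FeatureMap) -> FeatureMap:
    """Max-pool adjacent pairs (assumes an even length)."""
    return [max(x[i], x[i + 1]) for i in range(0, len(x), 2)]


def add(a: FeatureMap, b: FeatureMap) -> FeatureMap:
    return [p + q for p, q in zip(a, b)]


def panet_style_block(levels: Pyramid) -> Pyramid:
    """One fusion block: a top-down pass followed by a bottom-up pass."""
    top_down = list(levels)
    for i in range(len(levels) - 2, -1, -1):  # coarse -> fine
        top_down[i] = add(levels[i], upsample2x(top_down[i + 1]))
    out = list(top_down)
    for i in range(1, len(out)):              # fine -> coarse
        out[i] = add(top_down[i], downsample2x(out[i - 1]))
    return out


def repeat_fpn_block(levels: Pyramid,
                     block: Callable[[Pyramid], Pyramid],
                     num_repeats: int) -> Pyramid:
    """Apply the same kind of fusion block num_repeats times in sequence."""
    for _ in range(num_repeats):
        levels = block(levels)
    return levels


pyramid: Pyramid = [[1.0] * 8, [1.0] * 4, [1.0] * 2]
fused = repeat_fpn_block(pyramid, panet_style_block, num_repeats=3)
print([len(level) for level in fused])  # level sizes are unchanged by the repeats
```

Because each block preserves the number of levels and the size of every level, the number of repeats becomes a tunable depth parameter of the fusion stage.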

Verification

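The open source dataset fruit is used for verification. Before FPN block repeats are applied, the sensitivity of the model to bounding box aspect ratios is analyzed. The sensitivity of a class is reported as the standard deviation of its values across the feature distribution ranges. The following table shows that the aspect ratio sensitivity of Apple is 0.0757 and that of Banana is 0.4481.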

Table 1 Analysis on the sensitivity of object detection models to bounding box aspect ratios before FPN block repeats are used

| Feature Distribution | Apple | Banana |
| --- | --- | --- |
| 0% - 20% | 1 | 0.5714 |
| 20% - 40% | 1 | 1 |
| 40% - 60% | 0.875 | 0 |
| 60% - 80% | 0.8182 | 1 |
| 80% - 100% | 0.8571 | 0 |
| Standard deviation | 0.0757 | 0.4481 |
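
The values in the Standard deviation row are consistent with the population standard deviation of the five per-range values in each column. The following sketch reproduces them from the Table 1 data. The function name aspect_ratio_sensitivity is an assumption made for illustration.

```python
from statistics import pstdev

# Per-range values copied from Table 1 (before FPN block repeats).
before_repeats = {
    "Apple": [1, 1, 0.875, 0.8182, 0.8571],
    "Banana": [0.5714, 1, 0, 1, 0],
}


def aspect_ratio_sensitivity(per_range_values):
    """Population standard deviation across the aspect ratio ranges."""
    return pstdev(per_range_values)


sensitivity = {
    name: round(aspect_ratio_sensitivity(values), 4)
    for name, values in before_repeats.items()
}
print(sensitivity)
```

A lower value means the model performs more evenly across aspect ratio ranges for that class.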

After FPN block repeats are applied, the aspect ratio sensitivity is analyzed again. As shown in the following table, the aspect ratio sensitivity of the Apple bounding boxes decreases from 0.0757 to 0.0667, and that of the Banana bounding boxes decreases from 0.4481 to 0.4091.

FPN block repeats therefore reduce the aspect ratio sensitivity for both classes, by about 12% for Apple and about 9% for Banana in relative terms.

Table 2 Analysis on the sensitivity of object detection models to bounding box aspect ratios after FPN block repeats are used

| Feature Distribution | Apple | Banana |
| --- | --- | --- |
| 0% - 20% | 1 | 0.7857 |
| 20% - 40% | 0.8333 | 0 |
| 40% - 60% | 1 | 1 |
| 60% - 80% | 1 | 1 |
| 80% - 100% | 1 | 0.25 |
| Standard deviation | 0.0667 | 0.4091 |

Suggestions

If the model inference results show that a detected class is highly sensitive to the aspect ratio of bounding boxes, you are advised to train an object detection model that uses FPN block repeats.