Help Center> ModelArts> User Guide (Senior AI Engineers)> Model Management> Model Evaluation and Diagnosis> Model Optimization Suggestions> Analysis on the Sensitivity of Object Detection Models to Bounding Box Clarity and Solution

Analysis on the Sensitivity of Object Detection Models to Bounding Box Clarity and Solution

Symptom

In an object detection task, the bounding box clarity of different datasets may be different, which can be reflected by bounding box clarity sensitivity. Bounding box clarity affects model training and inference.

The left graph shows the original image, and the right graph shows that the clarity of a bounding box changes.

Figure 1 Example of bounding box clarity

Solution

Dropout is a widely used regularization technology in deep learning tasks. However, this technology randomly drops a unit at the feature layer. As a result, semantic information shared by adjacent feature units is also dropped. DropBlock solves the preceding problem. It drops features in a feature block (a contiguous region of a feature map), and performs regularization on the deep learning network. DropBlock is a simple method similar to Dropout. The main difference between the two is that DropBlock drops the adjacent regions of a feature map instead of a separate random unit. For details, see the DropBlock thesis.

DropBlock has two parameters: block_size and γ.

  • block_size: block size (length and width). When block_size is 1, DropBlock degrades to the traditional Dropout. Available values are 3, 5, and 7.
  • γ: probability during the drop process, that is, the Bernoulli probability.

The following shows the comparison between Dropout and DropBlock. In the figure, graph b indicates Dropout, and graph c indicates DropBlock.

Figure 2 Principle comparison between Dropout and DropBlock

Official implementation of TensorFlow:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
class Dropblock(object):
  """DropBlock: a regularization method for convolutional neural networks.
    DropBlock is a form of structured dropout, where units in a contiguous
    region of a feature map are dropped together. DropBlock works better than
    dropout on convolutional layers due to the fact that activation units in
    convolutional layers are spatially correlated.
    See https://arxiv.org/pdf/1810.12890.pdf for details.
  """

  def __init__(self,
               dropblock_keep_prob=None,
               dropblock_size=None,
               data_format='channels_last'):
    self._dropblock_keep_prob = dropblock_keep_prob
    self._dropblock_size = dropblock_size
    self._data_format = data_format

  def __call__(self, net, is_training=False):
    """Builds Dropblock layer.
    Args:
      net: `Tensor` input tensor.
      is_training: `bool` if True, the model is in training mode.
    Returns:
      A version of input tensor with DropBlock applied.
    """
    if (not is_training or self._dropblock_keep_prob is None or
        self._dropblock_keep_prob == 1.0):
      return net

    logging.info('Applying DropBlock: dropblock_size %d,'
                 'net.shape %s', self._dropblock_size, net.shape)

    if self._data_format == 'channels_last':
      _, height, width, _ = net.get_shape().as_list()
    else:
      _, _, height, width = net.get_shape().as_list()

    total_size = width * height
    dropblock_size = min(self._dropblock_size, min(width, height))
    # Seed_drop_rate is the gamma parameter of DropBlcok.
    seed_drop_rate = (
        1.0 - self._dropblock_keep_prob) * total_size / dropblock_size**2 / (
            (width - self._dropblock_size + 1) *
            (height - self._dropblock_size + 1))

    # Forces the block to be inside the feature map.
    w_i, h_i = tf.meshgrid(tf.range(width), tf.range(height))
    valid_block = tf.logical_and(
        tf.logical_and(w_i >= int(dropblock_size // 2),
                       w_i < width - (dropblock_size - 1) // 2),
        tf.logical_and(h_i >= int(dropblock_size // 2),
                       h_i < width - (dropblock_size - 1) // 2))

    if self._data_format == 'channels_last':
      valid_block = tf.reshape(valid_block, [1, height, width, 1])
    else:
      valid_block = tf.reshape(valid_block, [1, 1, height, width])

    randnoise = tf.random_uniform(net.shape, dtype=tf.float32)
    valid_block = tf.cast(valid_block, dtype=tf.float32)
    seed_keep_rate = tf.cast(1 - seed_drop_rate, dtype=tf.float32)
    block_pattern = (1 - valid_block + seed_keep_rate + randnoise) >= 1
    block_pattern = tf.cast(block_pattern, dtype=tf.float32)

    if self._data_format == 'channels_last':
      ksize = [1, self._dropblock_size, self._dropblock_size, 1]
    else:
      ksize = [1, 1, self._dropblock_size, self._dropblock_size]
    block_pattern = -tf.nn.max_pool(
        -block_pattern,
        ksize=ksize,
        strides=[1, 1, 1, 1],
        padding='SAME',
        data_format='NHWC' if self._data_format == 'channels_last' else 'NCHW')

    percent_ones = tf.cast(tf.reduce_sum(block_pattern), tf.float32) / tf.cast(
        tf.size(block_pattern), tf.float32)

    net = net / tf.cast(percent_ones, net.dtype) * tf.cast(
        block_pattern, net.dtype)
    return net

Verification

The open source dataset Canine Coccidiosis Parasite is used for verification. The dataset has only one class. Before DropBlock is used, Table 1 describes the sensitivity of models to bounding box clarity before DropBlock is used.

Table 1 Clarity sensitivity analysis of bounding boxes

Feature Distribution

coccidia

0% - 20%

0.9355

20% - 40%

0.9355

40% - 60%

0.9355

60% - 80%

0.7742

80% - 100%

0.8065

Standard deviation

0.0718

After DropBlock is used, the clarity sensitivity of the bounding boxes is analyzed. In Table 2, the clarity sensitivity of the bounding boxes decreases from 0.0718 to 0.0204.

The DropBlock technology significantly improves the clarity sensitivity of bounding boxes.

Table 2 Clarity sensitivity analysis of bounding boxes

Feature Distribution

coccidia

0% - 20%

1

20% - 40%

0.9677

40% - 60%

0.9677

60% - 80%

0.9677

80% - 100%

0.9355

Standard deviation

0.0204

Suggestions

In the model inference result, if the detected classes are very sensitive to the clarity of the bounding boxes, you are advised to use DropBlock for model optimization and enhancement during training.