Common Methods of Optimizing Model Precision in Model Optimization

Overview

In deep learning competitions, many tricks have emerged. One of the more debated ones is to augment the input images at test time to generate multiple copies, feed all of them to the model, and combine the inference results. This method is called test time augmentation (TTA). This section describes the principles of TTA and gives suggestions on using it.

Principles

  • TTA process

    The basic TTA process is as follows: augment the original images to obtain multiple augmented samples, and form a data group together with the original images. Run inference on all of these samples, combine the inference results using some method to obtain the final result, and then calculate the precision.

    Figure 1 TTA process
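The three-step flow above (augment, infer, combine) can be sketched as follows. The "model" and the augmentation here are toy stand-ins for illustration, not the document's actual network:

```python
# A minimal sketch of the TTA flow described above. The "model" and the
# augmentation are hypothetical toy stand-ins.
def identity(image):
    return image

def hflip(image):
    # toy augmentation: mirror each row of a 2-D list image
    return [row[::-1] for row in image]

def model(image):
    # toy "model": two-class scores derived from the first pixel column
    s = sum(row[0] for row in image)
    return [s, 100 - s]

def average(outputs):
    # combine per-sample results by element-wise averaging
    n = len(outputs)
    return [sum(col) / n for col in zip(*outputs)]

def tta_predict(image, augmentations, combine):
    samples = [aug(image) for aug in augmentations]   # 1) augment
    outputs = [model(s) for s in samples]             # 2) infer on each sample
    return combine(outputs)                           # 3) combine the results

logits = tta_predict([[1, 2], [3, 4]], [identity, hflip], average)
```

The two open questions below (which augmentation to use, and how to combine the results) correspond to the choice of `augmentations` and `combine` in this sketch.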

    The following problems need to be confirmed:

    1. What augmentation method is used to generate samples for the original images?
    2. What method is used to integrate the inference results obtained using the samples?

    The following describes how TTA works and how to use it with the functions provided by the ModelArts platform.

  • Example of using the TTA
    • Dataset: The following figure shows a dataset example. The left part consists of 754 normal images, and the right part consists of 358 images of abnormal electrical boards. After augmentation, the number of normal images increases to 1508, and the number of abnormal images increases to 1432.
      Figure 2 Dataset example
    • Framework and algorithm: See ImageNet open source code.
    • Training policy: 50 epochs, initial learning rate 0.001, batch size 16, trained with the Adam optimizer
    Model precision

    Precision            Normal    Abnormal
    Recall               97.2%     71.3%
    Accuracy (overall)   89.13%

  • TTA process
    1. Select an augmentation method to obtain the samples. You can select the method as follows:
      1. Select the augmentation method used in training.

        For example, in the ImageNet training code provided by PyTorch, the operator transforms.RandomHorizontalFlip() randomly flips images horizontally during training. The model has therefore seen many horizontally flipped images, so horizontal flipping can be used as an augmentation method.

      2. Evaluate the model and analyze the augmentation method to be used based on the model evaluation result.

        Evaluate the original model. The evaluation code below is obtained by modifying the forward-inference part of the validation section of the open source code:

        pred_list = []    # collected logits for each image
        target_list = []  # collected labels for each image
        with torch.no_grad():
            end = time.time()
            for i, (images, target) in enumerate(val_loader):
                if args.gpu is not None:
                    images = images.cuda(args.gpu, non_blocking=True)
                target = target.cuda(args.gpu, non_blocking=True)

                # compute output
                output_origin = model(images)
                output = output_origin
                loss = criterion(output, target)
                pred_list += output.cpu().numpy()[:, :2].tolist()
                target_list += target.cpu().numpy().tolist()

                # measure accuracy and record loss
                acc1, acc5 = accuracy(output, target, topk=(1, 5))
                losses.update(loss.item(), images.size(0))
                top1.update(acc1[0], images.size(0))
                top5.update(acc5[0], images.size(0))

                # measure elapsed time
                batch_time.update(time.time() - end)
                end = time.time()

                if i % args.print_freq == 0:
                    progress.display(i)

            # TODO: this should also be done with the ProgressMeter
            print(' * Acc@1 {top1.avg:.3f} Acc@5 {top5.avg:.3f}'
                  .format(top1=top1, top5=top5))

        name_list = val_loader.dataset.samples
        for idx in range(len(name_list)):
            name_list[idx] = name_list[idx][0]
        analyse(task_type='image_classification', save_path='./',
                pred_list=pred_list, label_list=target_list, name_list=name_list)

        The evaluation requires three lists. The logits are collected into pred_list, which stores the prediction result of each image, for example, [[8.725419998168945, 21.92235565185547]...[xxx, xxx]]. target_list consists of the label of each image, for example, [0, 1, 0, 1, 1..., 1, 0]. name_list consists of the paths of the original image files, for example, [xxx.jpg, ... xxx.jpg]. The analyse interface in the deep_moxing library is called to generate a model_analysis_results.json file in save_path. Upload this file to the output directory of any training task on the page, and the model evaluation result is displayed on the evaluation page.
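To make the expected shapes concrete, here is a small illustration of the three lists; the paths and values are hypothetical placeholders, not outputs of a real run:

```python
# Illustrative shapes of the three lists required by the evaluation.
# Paths and values below are hypothetical placeholders.
pred_list = [[8.725419998168945, 21.92235565185547],  # logits per image
             [12.1, 3.4]]
target_list = [1, 0]                                  # label per image
name_list = ['abnormal/0001.jpg',                     # file path per image
             'normal/0002.jpg']

# Each list must contain exactly one entry per validation image.
consistent = len(pred_list) == len(target_list) == len(name_list)

# The analyse interface would then be called as in the text:
# analyse(task_type='image_classification', save_path='./',
#         pred_list=pred_list, label_list=target_list, name_list=name_list)
```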

        Figure 3 Viewing the evaluation result

        The model sensitivity needs to be analyzed.

        Table 1 Analysis of the sensitivity of the model to image clarity (F1 score per class)

        Feature Distribution   Class 0   Class 1
        0%-20%                 0.7929    0.8727
        20%-40%                0.8816    0.7429
        40%-60%                0.9363    0.7229
        60%-80%                0.9462    0.7912
        80%-100%               0.9751    0.7619
        Standard deviation     0.0643    0.0523
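The "Standard deviation" row of Table 1 can be reproduced directly from the column values; it is the population standard deviation of each class's F1 scores:

```python
# Quick check on Table 1: the "Standard deviation" row equals the
# population standard deviation of each column's F1 scores.
from statistics import pstdev

f1_class0 = [0.7929, 0.8816, 0.9363, 0.9462, 0.9751]
f1_class1 = [0.8727, 0.7429, 0.7229, 0.7912, 0.7619]

std0 = round(pstdev(f1_class0), 4)  # matches the 0.0643 in Table 1
std1 = round(pstdev(f1_class1), 4)  # matches the 0.0523 in Table 1
```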

        As shown in the preceding table, the F1 score of class 0 (the normal class) increases with image clarity; that is, the model detects the normal class better on clear images. The F1 score of class 1 (the abnormal class) is highest on the blurriest images and generally decreases as clarity increases, so blurring the input can help the model detect the abnormal class. Because this model focuses on identifying abnormal classes, image blurring can be used as a TTA method.

    2. Add the TTA to PyTorch.

      The advantage of PyTorch is that you can directly obtain the tensor before it is fed to the model and perform the required operations on it. For example, in the validation loop:

      with torch.no_grad():
          end = time.time()
          for i, (images, target) in enumerate(val_loader):
              if args.gpu is not None:
                  images = images.cuda(args.gpu, non_blocking=True)

      The images obtained here are the preprocessed image data of one batch. Two augmentation methods were determined in step 1: horizontal flipping and blurring.

      If the version is later than 0.4.0, you can use the following code for flipping in PyTorch:

      def flip(x, dim):
          indices = [slice(None)] * x.dim()
          indices[dim] = torch.arange(x.size(dim) - 1, -1, -1, dtype=torch.long, device=x.device)
          return x[tuple(indices)]

      dim indicates the axis to flip. For an NCHW tensor, 2 flips the image vertically (along the height axis), 3 flips it horizontally (along the width axis), and 1 reverses the channel order. Use img_flip = flip(images, 3) to obtain horizontally flipped images, matching the flipping used during training.
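The behavior of the flip() helper can be verified on a tiny NCHW batch; in recent PyTorch versions, torch.flip gives the same result:

```python
import torch

# Sanity check of the flip() helper from the text on a tiny NCHW batch.
def flip(x, dim):
    indices = [slice(None)] * x.dim()
    indices[dim] = torch.arange(x.size(dim) - 1, -1, -1,
                                dtype=torch.long, device=x.device)
    return x[tuple(indices)]

x = torch.tensor([[[[0., 1.],
                    [2., 3.]]]])  # shape (1, 1, 2, 2)

vertical = flip(x, 2)    # reverses the height axis (rows)
horizontal = flip(x, 3)  # reverses the width axis (columns)
```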

      You can use the blur operation provided by OpenCV (cv2) to blur the batch. Note that cv2.blur expects an HWC image, so each CHW sample is transposed before and after blurring:

      img = images.cpu().numpy()
      for idx in range(img.shape[0]):
          # blur each image of the batch in HWC layout, then restore CHW
          img[idx] = cv2.blur(img[idx].transpose(1, 2, 0), (3, 3)).transpose(2, 0, 1)
      images_blur = torch.from_numpy(img)
    3. Combine the results.

      Three outputs are obtained: origin_result (inference results of the original images), flip_output (inference results of the flipped images), and blur_output (inference results of the blurred images).

      How are they combined?

      For flip_output, consider what proportion of images were flipped during training and what weight a flipped image should contribute to the final output. RandomHorizontalFlip flips each training image with probability 0.5, so about half of the training images are flipped, and the flipped copy is given a contribution weight of 0.5. This yields the following formula:

      logits = 0.5 x origin_result + 0.5 x flip_output
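The 0.5/0.5 combination can be sketched on hypothetical two-class logits for a batch of two images (the numbers below are illustrative, not real model outputs):

```python
# Sketch of the 0.5/0.5 combination above, applied to hypothetical
# two-class logits for a batch of two images.
origin_result = [[8.0, 22.0], [10.0, 3.0]]
flip_output = [[10.0, 20.0], [9.0, 2.0]]

# element-wise weighted sum of the two inference results
logits = [[0.5 * o + 0.5 * f for o, f in zip(o_row, f_row)]
          for o_row, f_row in zip(origin_result, flip_output)]

# the predicted class of each image is the argmax of its combined logits
preds = [row.index(max(row)) for row in logits]
```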

      In this case, the precision of the model is as follows:

      Table 2 Model precision result

      Operation                  Accuracy   Recall of Normal Class   Recall of Abnormal Class
      Originals                  89.13%     97.2%                    71.3%
      Flipping result combining  87.74%     93.7%                    72.7%

      Although the recall of the normal class decreases, the recall of the abnormal class increases.

      For blur_output, Table 1 shows that the model performs best on the abnormal class when image clarity is in the 0%-20% interval, although its performance on the normal class drops there. Since blurring is used precisely to improve detection of the abnormal class, assume that the blurred image also contributes a weight of 0.5:

      logits = 0.5 x origin_result + 0.5 x blur_output

      In this case, the precision of the model is as follows:

      Table 3 Model precision result

      Operation                  Accuracy   Recall of Normal Class   Recall of Abnormal Class
      Originals                  89.13%     97.2%                    71.3%
      Blurring result combining  88.117%    94.8%                    73.3%

      As the preceding table shows, the recall of the normal class decreases while that of the abnormal class increases, which is consistent with the model evaluation analysis.

      In conclusion, the combination lowers the recall of the normal class and the overall accuracy slightly, but this matches the model analysis: the goal of the adjustment is to improve the recall of the abnormal class, and on that metric the TTA result is better than inference on the original images alone.

Summary

In this test, two TTA methods are used: one reuses an augmentation method built into the training pipeline, and the other analyzes model sensitivity to determine the image feature interval that helps inference the most. Note that TTA increases inference time, so for AI algorithms with demanding inference latency requirements, select a solution carefully.