
Example: Implementing Auto Data Augmentation Using a Preset Data Augmentation Policy

The AutoAugment algorithm was used to search the CIFAR-10 dataset for an optimal data augmentation policy. This section demonstrates how to apply the resulting policy in your own training code.

Sample Code

The following code trains an image classification model on MNIST using the ResNet50 network. The changes required to apply the auto data augmentation policy are marked in the comments.

import argparse
import time

import tensorflow as tf
from autosearch.client.augment.offline_search.preprocessor_builder import (
    ImageClassificationTensorflowBuilder,
)    # Change 1: Import the decoder module.
from autosearch.client.nas.backbone.resnet import ResNet50    
from tensorflow.examples.tutorials.mnist import input_data

import autosearch

parser = argparse.ArgumentParser()
parser.add_argument(
    "--max_steps", type=int, default=100, help="Number of steps to run trainer."
)
parser.add_argument("--data_url", type=str, default="MNIST_data")

parser.add_argument(
    "--learning_rate",
    type=float,
    default=0.01,
    help="Initial learning rate.",
)
FLAGS, unparsed = parser.parse_known_args()


def train():
    mnist = input_data.read_data_sets(FLAGS.data_url, one_hot=True)
    with tf.Graph().as_default():
        sess = tf.InteractiveSession()
        with tf.name_scope("input"):
            x = tf.placeholder(tf.float32, [None, 784], name="x-input")
            y_ = tf.placeholder(tf.int64, [None, 10], name="y-input")
        image_shaped_input = tf.multiply(x, 255)
        image_shaped_input = tf.cast(image_shaped_input, tf.uint8)
        image_shaped_input = tf.reshape(image_shaped_input, [-1, 784, 1])
        image_shaped_input = tf.concat([image_shaped_input, image_shaped_input, image_shaped_input], axis=2)
        image_shaped_input = ImageClassificationTensorflowBuilder("offline")(image_shaped_input)    # Change 2: The decoder module automatically parses the parameters delivered by the framework and converts the parameters into corresponding augmentation operations.
        image_shaped_input = tf.cast(image_shaped_input, tf.float32)
        image_shaped_input = tf.reshape(image_shaped_input, [-1, 28, 28, 3])
        image_shaped_input = tf.multiply(image_shaped_input, 1 / 255.0)
        y = ResNet50(image_shaped_input, include_top=True, mode="train")
        with tf.name_scope("cross_entropy"):
            y = tf.reduce_mean(y, [1, 2])
            y = tf.layers.dense(y, 10)
            with tf.name_scope("total"):
                cross_entropy = tf.losses.softmax_cross_entropy(y_, y)

        with tf.name_scope("train"):
            train_step = tf.train.AdamOptimizer(FLAGS.learning_rate).minimize(  
                cross_entropy
            )

        with tf.name_scope("accuracy"):
            with tf.name_scope("correct_prediction"):
                correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
            with tf.name_scope("accuracy"):
                accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        tf.global_variables_initializer().run()

        def feed_dict(train):
            if train:
                xs, ys = mnist.train.next_batch(100)
            else:
                xs, ys = mnist.test.next_batch(10000)
            return {x: xs, y_: ys}

        max_acc = 0
        latencys = []
        for i in range(FLAGS.max_steps):
            if i % 10 == 0:  # Record summaries and test-set accuracy
                loss, acc = sess.run(
                    [cross_entropy, accuracy], feed_dict=feed_dict(False)
                )
                # print('loss at step %s: %s' % (i, loss))
                print("Accuracy at step %s: %s" % (i, acc))
                if acc > max_acc:
                    max_acc = acc
                # autosearch.reporter(loss=loss)
                autosearch.reporter(mean_accuracy=acc)    # Change 3: Report the precision to the AutoSearch framework.
            else:
                start = time.time()
                loss, _ = sess.run(
                    [cross_entropy, train_step], feed_dict=feed_dict(True)
                )
                end = time.time()
                if i % 10 != 1:
                    latencys.append(end - start)
        latency = sum(latencys) / len(latencys)
        autosearch.reporter(mean_accuracy=max_acc, latency=latency)    # Same as change 3.
        sess.close()

def cloud_init(data_url):
    """Copy the training data from OBS to the local cache and return the local path."""
    local_data_url = "/cache/mnist"
    import moxing as mox
    print(
        "Copying from data_url({}) to local path({})".format(data_url, local_data_url)
    )
    mox.file.copy_parallel(data_url, local_data_url)
    return local_data_url

Because the policy was searched on CIFAR-10, it expects data in the CIFAR-10 format: three RGB channels with pixel values in the range 0–255. MNIST data is single-channel by default, with pixel values normalized to 0–1. The sample code therefore includes an extra step that converts the data into the CIFAR-10 format, shown in the following snippet.

        image_shaped_input = tf.multiply(x, 255)
        image_shaped_input = tf.cast(image_shaped_input, tf.uint8)
        image_shaped_input = tf.reshape(image_shaped_input, [-1, 784, 1])
        image_shaped_input = tf.concat([image_shaped_input, image_shaped_input, image_shaped_input], axis=2)
        image_shaped_input = ImageClassificationTensorflowBuilder("offline")(image_shaped_input)    # Change 2: The decoder module automatically parses the parameters delivered by the framework and converts the parameters into corresponding augmentation operations.
        image_shaped_input = tf.cast(image_shaped_input, tf.float32)
        image_shaped_input = tf.reshape(image_shaped_input, [-1, 28, 28, 3])
        image_shaped_input = tf.multiply(image_shaped_input, 1 / 255.0)
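The conversion above can be checked in isolation. The following NumPy sketch mirrors the TensorFlow ops in the snippet (the helper name `mnist_to_cifar_format` is ours, added for illustration; it covers only the shape and dtype conversion, not the augmentation step itself):

```python
import numpy as np

def mnist_to_cifar_format(batch):
    """Convert normalized MNIST images (N, 784) with values in [0, 1]
    into CIFAR-10-style uint8 images (N, 28, 28, 3) with values in [0, 255]."""
    imgs = (batch * 255).astype(np.uint8)      # rescale from [0, 1] to [0, 255]
    imgs = imgs.reshape(-1, 784, 1)            # add a channel axis
    imgs = np.concatenate([imgs] * 3, axis=2)  # replicate the channel to RGB
    return imgs.reshape(-1, 28, 28, 3)         # restore the 28x28 spatial layout

batch = np.random.rand(2, 784).astype(np.float32)
out = mnist_to_cifar_format(batch)
print(out.shape)  # (2, 28, 28, 3)
```

In the training code, the inverse steps (cast back to float32 and divide by 255) run after the augmentation so that the network again sees normalized inputs.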

Compiling the Configuration File

grid_search is used to pass parameters to the decoder module embedded in the code. In this example, the search space contains only one policy, so the grid search simply applies it.

general:
  gpu_per_instance: 1

search_space:
  - type: discrete
    params:
      - name: image_classification_auto_augment
        values: [
            ["4-4-3", "6-6-7", "7-3-9", "6-7-9", "1-6-5", "1-5-1", "5-6-7", "7-6-5", "6-3-7", "0-5-8", "0-9-4", "0-5-6", "14-3-5", "1-6-5", "6-0-8", "4-8-8", "14-2-6", "4-8-6", "14-2-6", "0-8-1", "14-4-1", "1-6-5", "6-0-0", "14-5-2", "0-9-5", "6-5-3", "5-7-5", "6-0-2", "14-2-8", "14-1-5", "0-9-4", "1-8-4", "6-0-7", "1-4-7", "14-2-5", "1-7-5", "1-6-8", "4-6-2", "4-3-7", "4-2-4", "0-5-2", "14-7-2", "0-2-0", "1-1-0", "6-9-3", "0-4-1", "1-8-8", "1-7-7", "1-7-7", "14-5-0", "1-3-7", "0-4-8", "6-9-6", "4-2-8", "0-1-5", "6-0-0", "8-2-4", "1-1-1", "1-7-7", "0-6-4", "1-8-2", "0-9-5", "1-5-0", "14-6-6", "1-9-5", "4-7-0", "0-7-3", "1-7-0", "6-5-1", "5-1-7", "5-1-4", "14-6-5", "0-3-9", "8-5-3", "0-9-2", "2-0-3", "14-4-3", "4-2-4", "1-1-4", "1-7-6", "1-3-8", "0-4-3", "14-6-4", "0-7-6", "0-2-9", "6-4-8", "1-1-0", "1-0-6", "1-8-4", "1-0-4", "1-5-5", "0-1-2", "14-5-5", "0-9-5", "0-6-1", "0-7-8", "1-2-0", "0-1-2", "1-6-9", "1-4-4"]
        ]

search_algorithm:
  type: grid_search
  reward_attr: mean_accuracy

scheduler:
  type: FIFOScheduler
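Each entry in the policy above is a string of three dash-separated integers. The decoder module (`ImageClassificationTensorflowBuilder`) interprets them internally; we assume each triplet encodes an operation index plus two level parameters, but the exact semantics are an implementation detail of the framework. A minimal sketch of splitting such strings into integer triplets (the helper `parse_policy` is ours, for illustration only):

```python
def parse_policy(policy):
    """Split each 'a-b-c' string into an integer triplet.
    We assume (op_id, level1, level2); the real decoding is done
    by ImageClassificationTensorflowBuilder, not by this helper."""
    return [tuple(int(v) for v in item.split("-")) for item in policy]

policy = ["4-4-3", "6-6-7", "14-3-5"]
print(parse_policy(policy))  # [(4, 4, 3), (6, 6, 7), (14, 3, 5)]
```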

Starting a Search Job

This example requires the MNIST dataset. Upload and configure the dataset by following the instructions in Example: Replacing the Original ResNet-50 with a Better Network Architecture, upload the Python script and YAML configuration file, and then start the search job. For details, see Creating an Auto Search Job.