
Usage Example

Updated on: 2020/01/17 GMT+08:00

This section uses a classification task that applies ResNet50 to the MNIST dataset as an example.

Data Preparation

ModelArts provides the MNIST dataset, named "Mnist-Data-Set", in a public OBS bucket, and the example in this section can use it. Perform the following operations to upload the dataset to your OBS directory, for example, to "test-modelarts/dataset-mnist".

  1. Click the dataset download link to download the "Mnist-Data-Set" dataset to your local PC.
  2. Decompress the "Mnist-Data-Set.zip" package locally, for example, to a local "Mnist-Data-Set" folder.
  3. By referring to the instructions for uploading files, use the batch upload method to upload all files in the "Mnist-Data-Set" folder to the "test-modelarts/dataset-mnist" OBS path. (A scripted alternative using MoXing is sketched after the file list below.)

    The "Mnist-Data-Set" dataset contains the following files, where ".gz" indicates the corresponding compressed package.

    This example requires the ".gz" compressed packages. Be sure to upload all four compressed packages of the dataset to the OBS directory.

    • "t10k-images-idx3-ubyte.gz": validation set, containing 10,000 samples.
    • "t10k-labels-idx1-ubyte.gz": validation set labels, containing the class labels of the 10,000 samples.
    • "train-images-idx3-ubyte.gz": training set, containing 60,000 samples.
    • "train-labels-idx1-ubyte.gz": training set labels, containing the class labels of the 60,000 samples.
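
    If you prefer to script the upload instead of using the OBS console, the following is a minimal sketch. It assumes MoXing is available (for example, in a ModelArts notebook) and that the local "Mnist-Data-Set" folder contains the four ".gz" files; the OBS path is the one used throughout this example.

      # Minimal upload sketch (assumes an environment where MoXing is installed, e.g. a ModelArts notebook).
      import moxing as mox

      # Recursively copy every file in the local folder to the OBS path used in this example.
      mox.file.copy_parallel('Mnist-Data-Set', 'obs://test-modelarts/dataset-mnist')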

Modifying the Code

  1. Download the initial code mnist_with_summaries.py.
    # -*- coding: utf-8 -*-
    import argparse
    import tensorflow as tf
    import time
    from tensorflow.examples.tutorials.mnist import input_data
    
    parser = argparse.ArgumentParser()
    parser.add_argument('--max_steps', type=int, default=10,
                        help='Number of steps to run trainer.')
    parser.add_argument('--data_url', type=str, default="MNIST_data")
    FLAGS, unparsed = parser.parse_known_args()
    import autonet
    batch_size = 128
    
    def train():
      local_data = "/cache/data"
      import moxing as mox
      mox.file.copy_parallel(FLAGS.data_url, local_data)
      mnist = input_data.read_data_sets(local_data, one_hot=True)
      with tf.Graph().as_default():
        sess = tf.InteractiveSession()
        with tf.name_scope('input'):
          x = tf.placeholder(tf.float32, [None, 784], name='x-input')
          y_ = tf.placeholder(tf.int64, [None, 10], name='y-input')
        image_shaped_input = tf.reshape(x, [-1, 28, 28, 1])
        # User-defined classic ResNet50 network (definition elided)
        def resnet50(input):
          ...
        y = resnet50(image_shaped_input)
        with tf.name_scope('cross_entropy'):
          y = tf.reduce_mean(y, [1, 2])
          y = tf.layers.dense(y, 10)
          cross_entropy = tf.losses.softmax_cross_entropy(
            y_, y)
    
        with tf.name_scope('train'):
          learning_rate = 0.001
          train_step = tf.train.AdamOptimizer(learning_rate).minimize(
            cross_entropy)
    
        with tf.name_scope('accuracy'):
          with tf.name_scope('correct_prediction'):
            correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
          with tf.name_scope('accuracy'):
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        tf.global_variables_initializer().run()
        def feed_dict(train):
          if train:
            xs, ys = mnist.train.next_batch(100)
          else:
            xs, ys = mnist.test.images, mnist.test.labels
          return {x: xs, y_: ys}
        max_acc = 0
        latencys = []
        for i in range(FLAGS.max_steps):
          if i % 10 == 0:  # Record summaries and test-set accuracy
            loss, acc = sess.run([cross_entropy, accuracy], feed_dict=feed_dict(False))
            print('Accuracy at step %s: %s' % (i, acc))
            if acc > max_acc:
              max_acc = acc
          else:
            start = time.time()
            loss, _ = sess.run([cross_entropy, train_step], feed_dict=feed_dict(True))
            end = time.time()
            if i % 10 != 1:
              latencys.append(end - start)
        latency = sum(latencys) / len(latencys)
        print (latency, max_acc)
    
    
    if __name__ == '__main__':
      train()
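
    The body of the user-defined resnet50() above is intentionally omitted. All the rest of the script relies on is its interface: it takes the 4-D image tensor and returns a 4-D feature map, which the later tf.reduce_mean(y, [1, 2]) averages over the spatial dimensions before the final dense layer. The following deliberately tiny stand-in (purely illustrative, not a real ResNet50) satisfies that contract:

      # Illustrative stand-in only (not a real ResNet50). It just shows the expected
      # interface: 4-D image tensor in, 4-D feature map out.
      def tiny_convnet(images):                                   # [N, 28, 28, 1]
        net = tf.layers.conv2d(images, 32, 3, padding='same',
                               activation=tf.nn.relu)             # [N, 28, 28, 32]
        net = tf.layers.max_pooling2d(net, 2, 2)                  # [N, 14, 14, 32]
        net = tf.layers.conv2d(net, 64, 3, padding='same',
                               activation=tf.nn.relu)             # [N, 14, 14, 64]
        return net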
    
  2. Modify the code by referring to the code compilation specifications.

    Pure NAS search code: save the modified code as example1.py and upload it to your OBS directory. Relative to the initial script, there are only two functional changes: the ResNet50 built into the engine's autonet package replaces the user-defined network, and autonet.client.report feeds the final metrics back to the search manager.

    # -*- coding: utf-8 -*-
    import argparse
    import tensorflow as tf
    import time
    from tensorflow.examples.tutorials.mnist import input_data
    
    parser = argparse.ArgumentParser()
    parser.add_argument('--max_steps', type=int, default=10,
                        help='Number of steps to run trainer.')
    parser.add_argument('--data_url', type=str, default="MNIST_data")
    FLAGS, unparsed = parser.parse_known_args()
    import autonet
    batch_size = 128
    
    def train():
      local_data = "/cache/data"
      import moxing as mox
      mox.file.copy_parallel(FLAGS.data_url, local_data)
      mnist = input_data.read_data_sets(local_data, one_hot=True)
      with tf.Graph().as_default():
        sess = tf.InteractiveSession()
        with tf.name_scope('input'):
          x = tf.placeholder(tf.float32, [None, 784], name='x-input')
          y_ = tf.placeholder(tf.int64, [None, 10], name='y-input')
        image_shaped_input = tf.reshape(x, [-1, 28, 28, 1])
        # Use the ResNet50 built into the engine's autonet package in place of the user-defined ResNet50 network
        from autonet.client import ResNet50
        y = ResNet50(image_shaped_input, include_top=True, mode="train")
        with tf.name_scope('cross_entropy'):
          y = tf.reduce_mean(y, [1, 2])
          y = tf.layers.dense(y, 10)
          cross_entropy = tf.losses.softmax_cross_entropy(
            y_, y)
    
        with tf.name_scope('train'):
          learning_rate = 0.001
          train_step = tf.train.AdamOptimizer(learning_rate).minimize(
            cross_entropy)
    
        with tf.name_scope('accuracy'):
          with tf.name_scope('correct_prediction'):
            correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
          with tf.name_scope('accuracy'):
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        tf.global_variables_initializer().run()
        def feed_dict(train):
          if train:
            xs, ys = mnist.train.next_batch(100)
          else:
            xs, ys = mnist.test.images, mnist.test.labels
          return {x: xs, y_: ys}
        max_acc = 0
        latencys = []
        for i in range(FLAGS.max_steps):
          if i % 10 == 0:  # Record summaries and test-set accuracy
            loss, acc = sess.run([cross_entropy, accuracy], feed_dict=feed_dict(False))
            print('Accuracy at step %s: %s' % (i, acc))
            if acc > max_acc:
              max_acc = acc
          else:
            start = time.time()
            loss, _ = sess.run([cross_entropy, train_step], feed_dict=feed_dict(True))
            end = time.time()
            if i % 10 != 1:
              latencys.append(end - start)
        latency = sum(latencys) / len(latencys)
        # Call autonet.client.report to feed the metrics back to the upper-layer manager
    
        autonet.client.report(max_acc, latency=latency)
    
    
    if __name__ == '__main__':
      train()
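
    One optional hardening step that is not part of the original example: if --max_steps is 1 or 2, no training step gets timed (step 0 is an evaluation step, and step 1 is trained but its timing is deliberately discarded), so latencys stays empty and the averaging above divides by zero. A minimal guard, reusing only names already defined in the script, could look like this:

      # Optional guard (illustrative, not in the original example): report only when
      # at least one training step was timed, so the average cannot divide by zero.
      if latencys:
        latency = sum(latencys) / len(latencys)
        autonet.client.report(max_acc, latency=latency)
      else:
        print('No training steps were timed; increase --max_steps.')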
    

    Multi-dimensional search (hyperparameters + NAS) code: save the modified code as example2.py and upload it to your OBS directory, and also upload the prepared yaml configuration file to the corresponding OBS directory. In addition to the NAS changes made in example1.py, this script accepts a learning_rate argument injected by the hyperparameter search (the key must match the one in the yaml configuration), trains with that learning rate, and reports the loss to the hyperparameter searcher after each training step through hp.METRICS.

    # -*- coding: utf-8 -*-
    import argparse
    import tensorflow as tf
    from tensorflow.examples.tutorials.mnist import input_data
    import autonet.client as autonet  # additional
    import hp_search as hp
    import time
    
    parser = argparse.ArgumentParser()
    parser.add_argument('--max_steps', type=int, default=100,
                        help='Number of steps to run trainer.')
    parser.add_argument('--data_url', type=str, default="MNIST_data")
    
    # Accept the learning_rate hyperparameter passed in by the hyperparameter search;
    # the key must match the one in the config file
    parser.add_argument('--learning_rate', type=float, default=-1,
                        help='Learning rate injected by the hyperparameter search.')
    FLAGS, unparsed = parser.parse_known_args()
    def train():
      local_data = "/cache/data"
      import moxing as mox
      mox.file.copy_parallel(FLAGS.data_url, local_data)
      mnist = input_data.read_data_sets(local_data, one_hot=True)
      with tf.Graph().as_default():
        sess = tf.InteractiveSession()
        with tf.name_scope('input'):
          x = tf.placeholder(tf.float32, [None, 784], name='x-input')
          y_ = tf.placeholder(tf.int64, [None, 10], name='y-input')
        image_shaped_input = tf.reshape(x, [-1, 28, 28, 1])
        # Use the ResNet50 built into the engine's autonet package in place of the user-defined ResNet50 network
        from autonet.client import ResNet50
        y = ResNet50(image_shaped_input, include_top=True, mode="train")
        with tf.name_scope('cross_entropy'):
          y = tf.reduce_mean(y, [1, 2])
          y = tf.layers.dense(y, 10)
          with tf.name_scope('total'):
            cross_entropy = tf.losses.softmax_cross_entropy(
              y_, y)
    
        with tf.name_scope('train'):
          # Train with the learning rate passed in by the hyperparameter searcher
          train_step = tf.train.AdamOptimizer(FLAGS.learning_rate).minimize(
            cross_entropy)
    
        with tf.name_scope('accuracy'):
          with tf.name_scope('correct_prediction'):
            correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
          with tf.name_scope('accuracy'):
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        tf.global_variables_initializer().run()
    
        def feed_dict(train):
          if train:
            xs, ys = mnist.train.next_batch(100)
          else:
            xs, ys = mnist.test.images, mnist.test.labels
          return {x: xs, y_: ys}
        max_acc = 0
        latencys = []
        for i in range(FLAGS.max_steps):
          if i % 10 == 0:
            loss, acc = sess.run([cross_entropy, accuracy], feed_dict=feed_dict(False))
            print('Accuracy at step %s: %s' % (i, acc))
            if acc > max_acc:
              max_acc = acc
          else:
            start = time.time()
            loss, _ = sess.run([cross_entropy, train_step], feed_dict=feed_dict(True))
            end = time.time()
            if i % 10 != 1:
              latencys.append(end - start)
            # Report the metric to the hyperparameter searcher after each training step
            hp.METRICS(loss=loss)
        latency = sum(latencys) / len(latencys)
        # Call autonet.client.report to feed the metrics back to the upper-layer manager
    
        autonet.report(max_acc, latency=latency)
        sess.close()
    
    if __name__ == '__main__':
      train()
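
    The script relies on the search framework to inject --learning_rate; the default of -1 is only a sentinel meaning "not passed". An optional defensive check (not part of the original example) makes a mismatched key in the yaml configuration fail fast instead of training with an invalid learning rate:

      # Optional check (illustrative): -1 means the hyperparameter was not injected,
      # so abort early if the key in the yaml config does not match the argument name.
      if FLAGS.learning_rate <= 0:
        raise ValueError('--learning_rate was not injected by the hyperparameter '
                         'search; check the key name in the yaml configuration.')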
    

Starting a Search Job

  • Pure NAS launch configuration
    By referring to the instructions for creating an automated search job, create an automated search job, set the boot file to the example code file example1.py from step 2 in Modifying the Code, set the running parameters shown in the following figure, and start the job.
    Figure 1 Automated search job parameters (pure NAS launch configuration)
  • Multi-dimensional search (hyperparameters + NAS) launch configuration
    By referring to the instructions for creating an automated search job, create an automated search job, set the boot file to the example code file example2.py from step 2 in Modifying the Code, set the running parameters shown in the following figure, and start the job.
    Figure 2 Automated search job parameters (multi-dimensional search launch configuration)

Viewing Search Results

After the automated search job is complete, click the job name to go to the job details page, and then click the Search Results tab to view the results. The results of the two scenarios are as follows:

  • Pure NAS result
    Figure 3 Viewing search results (pure NAS)
  • Multi-dimensional search (hyperparameters + NAS) result
    Figure 4 Viewing search results (multi-dimensional search)