Developing Code for Training Using a Preset Image

On ModelArts, you can use preset images to build and train models. If a preset image does not meet your needs, you need to customize it. Before creating an algorithm with a preset image, develop the algorithm code and make sure it meets your requirements. Preparing the code up front makes model training and optimization more efficient.

Configuration Path

When creating an algorithm, set the code directory, boot file, input path, and output path. These settings enable the interaction between your code and ModelArts.

Code directory
Specify the code directory in the OBS bucket and upload training files, such as training code, dependency installation packages, and pre-generated models, to the directory. After you create a training job, ModelArts downloads the code directory and its subdirectories to the container.

Take the OBS path obs://obs-bucket/training-test/demo-code as an example. Its content is automatically downloaded to ${MA_JOB_DIR}/demo-code in the training container, where demo-code (customizable) is the last-level directory of the OBS path.

Do not store training data in the code directory. When the training job starts, everything stored in the code directory is downloaded to the backend, and a large amount of training data may cause the download to fail. It is recommended that the code directory not exceed 50 MB.

Boot file
The boot file in the code directory is used to start the training. Only Python boot files are supported. For details about how the boot file of a preset image is started, see Starting Training Using a Preset Image's Boot File.

Input path
The training data must be uploaded to an OBS bucket or stored in a dataset. The training code must parse the input path. ModelArts automatically downloads the data in the input path to a local directory in the container for training. Ensure that you have read permission on the OBS bucket.

After the training job starts, ModelArts mounts a disk to the /cache directory, which you can use to store temporary files. For details about the size of the /cache directory, see What Are Sizes of the /cache Directories for Different Resource Specifications in the Training Environment?

Output path
You are advised to set an empty directory as the training output path. The training code must parse the output path. ModelArts automatically uploads the training output to the output path. Ensure that you have read and write permissions on the OBS bucket.
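
The code directory mapping described above can be sketched in a few lines of Python. This is only an illustration: it assumes that MA_JOB_DIR is exposed as an environment variable inside the training container, and the helper name container_code_dir and the local fallback directory are hypothetical.

```python
import os
import posixpath


def container_code_dir(obs_code_dir: str, job_dir: str) -> str:
    """Return the container path that an OBS code directory is expected to map to."""
    # The last-level directory of the OBS path becomes the local directory name.
    last_level = posixpath.basename(obs_code_dir.rstrip("/"))
    return posixpath.join(job_dir, last_level)


# Assumption: MA_JOB_DIR is set in the training container.
# The fallback value is a hypothetical placeholder for local experiments.
job_dir = os.environ.get("MA_JOB_DIR", "/tmp/ma-job")
code_dir = container_code_dir("obs://obs-bucket/training-test/demo-code", job_dir)
print(code_dir)
```

A boot file could use a path like this to locate other files that were uploaded together with the training code.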

In ModelArts, the training code must include the steps described in Parsing and Setting Input and Output Paths and, if your model has dependencies, the steps in (Optional) Introducing Dependencies.

(Optional) Introducing Dependencies

If your model references other dependencies, place the required files or installation packages in the code directory you set during algorithm creation.

For details about how to install the Python dependency package, see How Do I Create a Training Job When a Dependency Package Is Referenced by the Model to Be Trained?
For details about how to install a C++ dependency library, see How Do I Install a Library That C++ Depends on?
For details about how to load parameters to a pre-trained model, see How Do I Load Some Well Trained Parameters During Job Training?
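
As one possible way to use pure-Python helper modules shipped in the code directory, the boot file can add their folder to the import path before importing them. The sketch below is an assumption for illustration and not a ModelArts API: the subdirectory name third_party and the helper add_to_import_path are hypothetical.

```python
import os
import sys


def add_to_import_path(directory: str) -> str:
    """Put a directory at the front of sys.path (once) and return its absolute path."""
    abs_dir = os.path.abspath(directory)
    if abs_dir not in sys.path:
        sys.path.insert(0, abs_dir)
    return abs_dir


# Locate the directory that holds the boot file; fall back to the working
# directory when __file__ is not defined (for example, in an interactive session).
boot_file = globals().get("__file__")
code_dir = os.path.dirname(os.path.abspath(boot_file)) if boot_file else os.getcwd()

# "third_party" is a hypothetical subdirectory of the code directory.
add_to_import_path(os.path.join(code_dir, "third_party"))
```

Packages that need to be installed rather than imported directly are covered by the FAQs listed above.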

Parsing and Setting Input and Output Paths

When a ModelArts training job reads data stored in OBS or outputs training results to a specified OBS path, perform the following operations to configure the input and output data:

Parse the input and output paths in the training code. The following method is recommended:

import argparse

# Create an argument parser.
parser = argparse.ArgumentParser(description='train mnist')

# Add parameters.
parser.add_argument('--data_url', type=str, default="./Data/mnist.npz", help='path where the dataset is saved')
parser.add_argument('--train_url', type=str, default="./Model", help='path where the model is saved')

# Parse the parameters.
args = parser.parse_args()

After the parameters are parsed, use args.data_url wherever the code refers to the data source path and args.train_url wherever it refers to the output path.

When creating a training job, set the input and output paths.
Select an OBS path or a dataset path as the training input, and an OBS path as the output.
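
The following sketch shows how the parsed values might be used in practice. It is a variation on the recommended method, not a replacement for it: the function name parse_paths is hypothetical, the paths in the usage lines are placeholders, and parse_known_args is used here on the assumption that the job may pass arguments the script does not declare.

```python
import argparse
import os


def parse_paths(argv=None):
    """Parse --data_url and --train_url and return (args, undeclared_arguments)."""
    parser = argparse.ArgumentParser(description='train mnist')
    parser.add_argument('--data_url', type=str, default='./Data/mnist.npz',
                        help='path where the dataset is saved')
    parser.add_argument('--train_url', type=str, default='./Model',
                        help='path where the model is saved')
    # parse_known_args collects arguments the script does not declare
    # instead of exiting with an error.
    return parser.parse_known_args(argv)


# Placeholder values; in a training job these come from the command line.
args, unknown = parse_paths(['--data_url', '/tmp/input/mnist.npz',
                             '--train_url', '/tmp/output'])
model_path = os.path.join(args.train_url, 'model')
print(args.data_url, model_path)
```

Calling parse_paths() with no argument reads the real command line, which is what a boot file would normally do.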

Complete Training Code Example

The training code depends on the AI engine you use. The following example uses TensorFlow. Before running it, download the mnist.npz file and upload it to an OBS bucket. The training input is the OBS path where the mnist.npz file is stored.

The following training code example contains the code for saving the model.

import os
import argparse
import tensorflow as tf

# Parse the input and output paths passed to the training job.
parser = argparse.ArgumentParser(description='train mnist')
parser.add_argument('--data_url', type=str, default="./Data/mnist.npz", help='path where the dataset is saved')
parser.add_argument('--train_url', type=str, default="./Model", help='path where the model is saved')
args = parser.parse_args()

mnist = tf.keras.datasets.mnist

# Load the MNIST dataset using the parsed input path and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = mnist.load_data(args.data_url)
x_train, x_test = x_train / 255.0, x_test / 255.0

# Define a small fully connected classifier that outputs logits for 10 classes.
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])

# The last layer outputs logits, so the loss is configured with from_logits=True.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)

# Save the trained model under the parsed output path.
model.save(os.path.join(args.train_url, 'model'))
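
When trying the example outside a training job, it can help to make sure the output location exists before the model is saved. The sketch below assumes that the parsed train_url is a local directory; the helper name prepare_output_dir is hypothetical and not part of ModelArts or TensorFlow.

```python
import os
import tempfile


def prepare_output_dir(train_url: str, name: str = 'model') -> str:
    """Create the output directory if it is missing and return the save path."""
    os.makedirs(train_url, exist_ok=True)
    return os.path.join(train_url, name)


# A temporary directory stands in for the path parsed from --train_url.
base_dir = tempfile.mkdtemp()
save_path = prepare_output_dir(os.path.join(base_dir, 'Model'))
print(save_path)
```

In the training code, the returned path could then be passed to model.save in place of the inline os.path.join call.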
