Iceberg Detection (Using the MoXing Framework for Image Classification)
This section describes how to use MoXing on ModelArts to classify iceberg images in the Kaggle competition. The images used in this practice are radar images, and the participants need to use algorithms to identify icebergs and ships in the images.
Before using the following sample, complete necessary operations based on Preparations. The following figure shows the process of identifying an iceberg.
- Preparing Data: Obtain the dataset used in this sample, upload it to OBS, and write code to convert the dataset to the TFRecord format.
- Training a Model: Use the MoXing API to build a network model for classifying iceberg images, and create a training job to train the model.
- Performing Prediction: Create another training job, perform prediction on the sample dataset, and save the result to a CSV file.
- Viewing the Result: View the prediction result in the CSV file.
Preparing Data
ModelArts provides an Iceberg dataset named Iceberg-Data-Set. This example uses this dataset to build a model. Perform the following operations to upload the dataset to the OBS directory test-modelarts/iceberg/iceberg-data created in preparation. Then, use a notebook to convert the dataset to the TFRecord format.
- Download the Iceberg-Data-Set dataset to the local PC.
- Decompress the Iceberg-Data-Set.zip file to the Iceberg-Data-Set directory on the local PC.
- Upload all files in the Iceberg-Data-Set directory to the test-modelarts/iceberg/iceberg-data directory on OBS. For details about how to upload files, see Uploading a File.
The train.json file includes four types of data: band_1, band_2, inc_angle, and is_iceberg. The is_iceberg label is available only in the training data, not in the test dataset.
- band_1 and band_2: two channels of the radar image. Each image is 75 x 75 pixels and has two bands.
- inc_angle: incidence angle from which the image was taken, in degrees.
- is_iceberg: label. 1 indicates an iceberg, and 0 indicates a ship.
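As a rough illustration of the fields above, the following sketch reshapes one record into two 75 x 75 grids. It assumes that each band is stored as a flat list of 5,625 values in row-major order and that a missing incidence angle may appear as a non-numeric value; neither detail is stated in this section. The helpers to_grid and parse_record are hypothetical and are not part of data_format_conversion.py.

```python
import json

IMAGE_SIZE = 75  # each band is a 75 x 75 image, as described above


def to_grid(flat_values, size=IMAGE_SIZE):
    """Reshape a flat list of size*size values into a list of rows."""
    if len(flat_values) != size * size:
        raise ValueError("expected %d values, got %d" % (size * size, len(flat_values)))
    return [flat_values[row * size:(row + 1) * size] for row in range(size)]


def parse_record(record):
    """Convert one train.json-style record into grids, an angle, and a label."""
    try:
        angle = float(record["inc_angle"])
    except (TypeError, ValueError):
        angle = None  # assumption: a missing angle is stored as a non-numeric value
    return {
        "band_1": to_grid(record["band_1"]),
        "band_2": to_grid(record["band_2"]),
        "inc_angle": angle,
        # is_iceberg is absent from unlabeled records, so fall back to None
        "is_iceberg": record.get("is_iceberg"),
    }


# Synthetic record for illustration only; real values come from train.json.
sample = json.loads(json.dumps({
    "band_1": [float(i) for i in range(IMAGE_SIZE * IMAGE_SIZE)],
    "band_2": [0.0] * (IMAGE_SIZE * IMAGE_SIZE),
    "inc_angle": 43.9,
    "is_iceberg": 1,
}))
parsed = parse_record(sample)
```

Keeping the label lookup optional lets the same helper handle records from test.json, which carry no is_iceberg field.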
- Choose DevEnviron > Notebooks, click Create, and enter the notebook name in the dialog box that is displayed. Click Next. On the Confirm tab page, check the configurations and click Submit.
In this sample, set Work Environment to Multi-Engine 1.0 (python2). You are advised to use GPU-based resources in the public resource pool. If a CPU is used, the notebook may take a long time to run and errors may occur.
- Click Open in the Operation column. The Jupyter Notebook file directory page is displayed.
- Click New in the upper right corner and choose TensorFlow-1.8 from the drop-down list. The code compiling page is displayed.
Figure 1 Code compiling page
- In the cell, write the code that converts the dataset format. For the complete code, see data_format_conversion.py.
Change the value of BASE_PATH in the script code to the actual storage location of the dataset. In this example, the OBS path is test-modelarts/iceberg/iceberg-data/, that is, the parent directory of train.json and test.json.
- Click Run above the cell to run the code. Running the code may take a long time. If the result is not displayed for a long time, try to execute the code segment by segment. Divide the script sample code into multiple segments and execute them in different cells.
After the code is executed successfully, the following files are created in the test-modelarts/iceberg/iceberg-data/ directory:
- iceberg-train-1176.tfrecord: training dataset
- iceberg-eval-295.tfrecord: validation dataset
- iceberg-test-8424.tfrecord: prediction dataset
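The relationship between BASE_PATH and the input and output files can be sketched as follows. The scheme-free form of the path, the reading of the numbers in the file names as sample counts, and the helper name dataset_paths are assumptions for illustration; the actual conversion script may build its paths differently.

```python
import posixpath

# Assumption: BASE_PATH is the parent directory of train.json and test.json.
# Prefix it with whatever scheme your environment expects for OBS paths.
BASE_PATH = "test-modelarts/iceberg/iceberg-data/"

# Numbers taken from the file names listed above (presumably sample counts).
SPLIT_COUNTS = {"train": 1176, "eval": 295, "test": 8424}


def dataset_paths(base_path, split_counts):
    """Return the input JSON paths and the expected TFRecord output paths."""
    inputs = {
        name: posixpath.join(base_path, name + ".json")
        for name in ("train", "test")
    }
    outputs = {
        split: posixpath.join(base_path, "iceberg-%s-%d.tfrecord" % (split, count))
        for split, count in split_counts.items()
    }
    return inputs, outputs


inputs, outputs = dataset_paths(BASE_PATH, SPLIT_COUNTS)
for split in ("train", "eval", "test"):
    print(outputs[split])
```

Comparing this list with the contents of the OBS directory is a quick way to confirm that the conversion finished.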
- To avoid unnecessary fees, you are advised to access the notebook instance management page and click Stop or Delete in the Operation column to stop or delete the notebook instance.
Training a Model
After preparing the required data, use the MoXing API to write the training script. ModelArts provides a code sample, train_iceberg.py. You can also write the model training script in DevEnviron > Notebooks and convert it into a .py file.
The following uses the train_iceberg.py training script as an example.
- Download the ModelArts-Lab project from Gitee and obtain the training script train_iceberg.py from the \ModelArts-Lab-master\official_examples\Using_MoXing_to_Create_a_Iceberg_Images_Classification_Application\codes directory of the project.
- Upload the train_iceberg.py file to OBS. Assume that the file is uploaded to the test-modelarts/iceberg/iceberg-code/ directory.
- On the ModelArts management console, choose Training Management > Training Jobs, and click Create in the upper left corner.
- On the page that is displayed, set the parameters for the training job.
- Name and Description: Set the parameters as prompted.
- Algorithm Source: Select Frequently-used. Then set AI Engine to TensorFlow and TF-1.8.0-python2.7. Set Code Directory to the OBS parent directory (test-modelarts/iceberg/iceberg-code/) of the model training script file train_iceberg.py, and set Boot File to train_iceberg.py.
- Data Source: Select Data path, and then select the OBS path for saving the dataset.
- Training Output Path: Select an OBS path to store the generated model and prediction file.
- Resource Pool: Select an available resource pool for training. A GPU resource pool outperforms a CPU resource pool, but costs more.
- Compute Nodes: You are advised to set the value to 1 in this practice.
Figure 2 Create a Training Job
- After confirming the specifications, click Submit.
- On the Training Jobs page, when the training job status changes to Running Success, the model training is completed. If any exception occurs, click the job name to go to the job details page and view the training job logs.
The training job may take several tens of minutes to complete. If the training time exceeds a certain period (for example, one hour), manually stop the job to release resources. Otherwise, the account balance may become insufficient, especially for training jobs that use GPUs.
- (Optional) During or after model training, you can create a visualization job to view parameter statistics, such as loss and accuracy. You can also go to the next step Performing Prediction without creating a visualization job.
Choose Training Management > Training Jobs. Click the Visualization Jobs tab, and click Create. On the page that is displayed, enter a visualization job name, and set Training Output Path to the path specified for Training Output Path in step 4. Complete visualization job creation as prompted. When the status changes to Running, the visualization job has been created. You can click the name of the visualization job to go to its GUI and view information about the training job.
Performing Prediction
After the training job is completed, a model file is generated in Training Output Path. Since only one prediction is required, there is no need to deploy the model as a real-time service. Related prediction operations have been written in the train_iceberg.py file, and the prediction results are exported to the submission.csv file.
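For orientation, the shape of such an export might look like the sketch below. The column names id and is_iceberg, the helper names, and the sample values are assumptions for illustration; check the submission.csv file produced by train_iceberg.py for its real layout.

```python
import csv
import io


def write_predictions(stream, predictions):
    """Write (id, probability) pairs as CSV rows with a header line."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["id", "is_iceberg"])  # assumed column names
    for sample_id, probability in predictions:
        writer.writerow([sample_id, "%.6f" % probability])


def read_predictions(stream):
    """Read the CSV back into a list of (id, probability) tuples."""
    reader = csv.DictReader(stream)
    return [(row["id"], float(row["is_iceberg"])) for row in reader]


# Made-up IDs and probabilities, used only to exercise the helpers.
example = [("sample_a", 0.91), ("sample_b", 0.07)]
buffer = io.StringIO()
write_predictions(buffer, example)
buffer.seek(0)
rows = read_predictions(buffer)
```

The same read_predictions helper can be pointed at a downloaded copy of the result file by opening it with open(path, newline="").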
To use a training job for prediction, perform the following steps:
- Choose Training Management > Training Jobs, and click Create in the upper left corner.
- Set related parameters and create a training job as prompted.
- Name: Set the parameter as prompted.
- Algorithm Source and Data Source: Keep the values the same as those set in Training a Model. For details, see step 4.
- Running Parameter: Add the parameter is_training=False, which indicates that the model is not retrained during prediction.
- Training Output Path: The value must be the same as the value specified in step 4 of Training a Model.
- Compute Nodes: Set the value to 1 for prediction.
Figure 3 Creating a prediction
- Go to the Training Jobs page. When the status of the training job changes to Running Success, the prediction is complete. Click the training job name. The job details page is displayed.
On the Logs tab page, you can view the value of loss in the eval dataset.
The training job may take more than 10 minutes to complete. If the training time exceeds a certain period (for example, one hour), manually stop it to release resources. Otherwise, the account balance may be insufficient, especially for the training job using GPUs.
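The effect of the is_training running parameter can be illustrated with a minimal sketch. It assumes that the job passes running parameters to the boot file as --name=value command-line arguments and it uses Python's argparse; the real train_iceberg.py may define and read its flags differently, and parse_bool and select_mode are hypothetical helpers.

```python
import argparse


def parse_bool(text):
    """Interpret common textual spellings of a boolean flag value."""
    value = text.strip().lower()
    if value in ("true", "1", "yes"):
        return True
    if value in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("not a boolean: %r" % text)


def select_mode(argv):
    """Return 'train' or 'predict' depending on the is_training argument."""
    parser = argparse.ArgumentParser()
    # Default to training so that the first job needs no extra parameter.
    parser.add_argument("--is_training", type=parse_bool, default=True)
    # Ignore any other arguments the platform may append.
    args, _unknown = parser.parse_known_args(argv)
    return "train" if args.is_training else "predict"


print(select_mode([]))
print(select_mode(["--is_training=False"]))
```

An explicit parser is used because argparse with type=bool would treat the string "False" as true, which would silently retrain the model instead of running prediction.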