
Updated on 2024-08-14 GMT+08:00


MindInsight Visualization Jobs

The new version of ModelArts notebook supports MindInsight visualization jobs. In a development environment, you can train and debug an algorithm on a small dataset to check whether it converges and to detect training issues early.

MindInsight visualizes information such as scalars, images, computational graphs, and model hyperparameters during training. It also provides functions such as the training dashboard, model lineage, data lineage, and performance debugging, helping you train and debug models efficiently. MindInsight supports MindSpore training jobs. For more information about MindInsight, see the MindSpore official website.

MindSpore can save training data to summary log files, which you can then view on the MindInsight GUI.

Prerequisites

When writing a training script with MindSpore, add code that collects summary records so that summary files are generated along with the training results.

For details, see Collecting Summary Record.
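As a rough illustration of this prerequisite, the sketch below shows one way a training script might attach MindSpore's SummaryCollector callback. The helper names, the directory name summary_dir, and the collection frequency are assumptions made for this example, and the import location of SummaryCollector varies between MindSpore releases, so treat Collecting Summary Record as the authoritative reference.

```python
from pathlib import Path


def prepare_summary_dir(work_dir="/home/ma-user/work", run_name="summary_dir"):
    """Create (if needed) and return the directory that will hold summary files."""
    summary_dir = Path(work_dir) / run_name
    summary_dir.mkdir(parents=True, exist_ok=True)
    return summary_dir


def make_summary_collector(summary_dir, collect_freq=10):
    """Build a SummaryCollector callback. MindSpore is imported lazily so that
    this module can be loaded on machines without MindSpore installed."""
    try:
        # Assumption: newer releases export the callback at the package top level.
        from mindspore import SummaryCollector
    except ImportError:
        # Assumption: older releases keep it under mindspore.train.callback.
        from mindspore.train.callback import SummaryCollector
    return SummaryCollector(summary_dir=str(summary_dir), collect_freq=collect_freq)


# Inside a training script (model and dataset are hypothetical placeholders):
#   collector = make_summary_collector(prepare_summary_dir())
#   model.train(1, dataset, callbacks=[collector])
```

Writing the summary files under /home/ma-user/work/ keeps them in the directory that Step 2 expects for local data.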

Note

To run a MindInsight training job in a development environment, start MindInsight and then the training process.
Only single-node, single-card training is supported.
A running visualization job is not billed separately. When the target notebook instance is stopped, the billing stops.
If the summary file is stored in OBS, OBS storage will be billed separately. After a job is complete, stop the notebook instance and clear OBS data to stop billing.

Creating a MindInsight Visualization Job in a Development Environment

Step 1 Create a Development Environment and Access It Online

Step 2 Upload the Summary Data

Step 3 Start MindInsight

Step 4 View Visualized Data on the Training Dashboard

Step 1 Create a Development Environment and Access It Online

Log in to the ModelArts management console, choose Development Workspace > Notebook, and create a development environment instance for the MindSpore engine. After the instance is created, click Open in the Operation column of the instance to access it online.

The images and resource types supported by MindInsight visualization training jobs are as follows:

MindSpore 1.2.0 (CPU or GPU)
MindSpore 1.5.x or later (Ascend)

Step 2 Upload the Summary Data

Summary data is required for MindInsight visualization in a development environment.

Upload the summary data to the /home/ma-user/work/ directory in a development environment or store it in an OBS parallel file system.

For details about how to upload the summary data to /home/ma-user/work/, see Uploading Files to JupyterLab.
To store the summary data in an OBS parallel file system that is mounted to a notebook instance, upload the summary file generated during model training to the OBS parallel file system and ensure that the OBS parallel file system and ModelArts are in the same region. When MindInsight is started in a notebook instance, the notebook instance automatically reads the summary data from the mounted OBS parallel file system.
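Before starting MindInsight, it can help to confirm that the upload actually placed summary files where you expect them. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and the file-name pattern (the word summary followed by a dot and a numeric timestamp) is an assumption about how MindSpore names summary files, so adjust it if your files are named differently.

```python
import re
from pathlib import Path

# Assumption: summary file names contain "summary." followed by a numeric timestamp.
SUMMARY_NAME = re.compile(r"summary\.\d+")


def find_summary_dirs(base_dir):
    """Map each directory under base_dir that holds summary files to those file names."""
    found = {}
    for path in Path(base_dir).rglob("*"):
        if path.is_file() and SUMMARY_NAME.search(path.name):
            found.setdefault(str(path.parent), []).append(path.name)
    # Sort for stable, readable output.
    return {directory: sorted(names) for directory, names in sorted(found.items())}


if __name__ == "__main__":
    for directory, names in find_summary_dirs("/home/ma-user/work").items():
        print(f"{directory}: {len(names)} summary file(s)")
```

An empty result suggests that the data was uploaded to a different directory or that the training script did not collect summary records.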

Step 3 Start MindInsight

Start MindInsight in JupyterLab using any of the following methods.

Figure 1 Starting MindInsight in JupyterLab

Method 1

Click the corresponding icon (see Figure 1) to go to the JupyterLab development environment. An .ipynb file is created automatically.

Enter the following command in the dialog box:

```
%reload_ext mindinsight
%mindinsight --port {PORT} --summary-base-dir {SUMMARY_BASE_DIR}
```

Parameters:

--port {PORT}: web service port for visualization. The default value is 8080. If port 8080 is already in use, specify another port in the range 1 to 65535.
--summary-base-dir {SUMMARY_BASE_DIR}: path where the summary data is stored in the development environment
- Local path to the development environment: ./work/xxx (relative path) or /home/ma-user/work/xxx (absolute path)
- Path to the OBS parallel file system bucket: obs://xxx/

For example:
```
# If the summary data is stored in /home/ma-user/work/ of a development environment, run the following command:
%mindinsight --summary-base-dir /home/ma-user/work/xxx
```
Or
```
# If the summary data is stored in an OBS parallel file system, run the following command. The development environment then automatically mounts the OBS parallel file system path and reads data from it.
%mindinsight --summary-base-dir obs://xxx/
```
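If you are unsure whether port 8080 is already taken, a small helper can probe a port and assemble the arguments described above. This is a sketch under stated assumptions: the function names are hypothetical, and the probe only checks whether a TCP socket can be bound on 127.0.0.1 inside the notebook instance at that moment, which does not guarantee that MindInsight will obtain the port later.

```python
import socket


def is_port_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port must be in the range 1 to 65535, got {port}")
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # The port is in use, or binding is not permitted
            # (for example, ports below 1024 for a non-root user).
            return False
    return True


def mindinsight_magic_args(summary_base_dir, port=None):
    """Build the argument string that follows %mindinsight."""
    args = []
    if port is not None:
        if not 1 <= port <= 65535:
            raise ValueError(f"port must be in the range 1 to 65535, got {port}")
        args += ["--port", str(port)]
    args += ["--summary-base-dir", summary_base_dir]
    return " ".join(args)


if __name__ == "__main__":
    port = 8080 if is_port_free(8080) else 8081  # 8081 is an arbitrary fallback
    # Paste the printed line into a notebook cell after %reload_ext mindinsight.
    print("%mindinsight " + mindinsight_magic_args("/home/ma-user/work/xxx", port))
```

The same argument builder accepts an OBS path such as obs://xxx/ because it passes the path through unchanged.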

Figure 2 MindInsight page (1)

Method 2

Click the corresponding icon (see Figure 1) to go to the MindInsight page.

Data is read from /home/ma-user/work/ by default.

If there are two projects or more, select the target project to view its logs.

Figure 3 MindInsight page (2)

Method 3

Choose View > Activate Command Palette, enter MindInsight in the search box, and click Create a new MindInsight.
Figure 4 Create a new MindInsight
Enter the path to the summary data or the storage path to the OBS parallel file system, and click CREATE.
- Local path to the development environment: ./summary (relative path) or /home/ma-user/work/summary (absolute path)
- Path to the OBS parallel file system: obs://xxx/
Figure 5 Path to the summary data

Figure 6 MindInsight page (3)

A maximum of 10 MindInsight instances can be started using method 2 or 3.

Method 4

Open Terminal (see Figure 7) and run the following command. The MindInsight UI is not displayed with this method:

```
mindinsight start --summary-base-dir ./summary_dir
```

Figure 7 Opening MindInsight through Terminal
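The terminal command in method 4 can also be issued from a Python script, which is convenient when MindInsight has to be started before a training process in the same script. The following is a minimal sketch: the function names are hypothetical, and it assumes that the mindinsight executable is on the PATH of the notebook instance and that the start command returns after launching the service.

```python
import subprocess


def mindinsight_start_argv(summary_base_dir, port=None):
    """Build the argument list for `mindinsight start`."""
    argv = ["mindinsight", "start", "--summary-base-dir", str(summary_base_dir)]
    if port is not None:
        argv += ["--port", str(int(port))]
    return argv


def start_mindinsight(summary_base_dir, port=None):
    """Run the start command and return whatever it printed to the console."""
    completed = subprocess.run(
        mindinsight_start_argv(summary_base_dir, port),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command exits with an error
    )
    return completed.stdout


if __name__ == "__main__":
    print(start_mindinsight("./summary_dir"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the summary path contains spaces.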

Step 4 View Visualized Data on the Training Dashboard

The training dashboard is a key part of MindInsight visualization. It displays scalars, parameter distributions, computational graphs, dataset graphs, images, and tensors.

For more information, see Viewing Training Dashboard on the MindSpore official website.

Related Operations

To stop a MindInsight instance, use one of the following methods:

Method 1: Enter the following command in an .ipynb file window of JupyterLab. Replace 8080 with the port number that was configured when MindInsight was started (8080 by default):
```
!mindinsight stop --port 8080
```
Method 2: Click the MindInsight instance management icon in JupyterLab. The MindInsight instance management page is displayed, showing all started MindInsight instances. Click SHUT DOWN next to the target instance to stop it.
Figure 8 Stopping an instance
Method 3: Click the button shown in the following figure to close all started MindInsight instances.
Figure 9 Stopping all started MindInsight instances
Method 4 (not recommended): Close the MindInsight window on JupyterLab. This closes only the visualization window; the instance keeps running on the backend.
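Because closing the window leaves the instance running, it can be useful to stop instances by port from a script, for example when several were started on different ports. The sketch below wraps the stop command from method 1. The function names are hypothetical, and treating a zero exit code as success is an assumption about the command-line tool rather than documented behavior.

```python
import subprocess


def mindinsight_stop_argv(port=8080):
    """Build the argument list for `mindinsight stop`."""
    return ["mindinsight", "stop", "--port", str(int(port))]


def stop_instances(ports):
    """Try to stop the instance on each port and report which commands succeeded."""
    results = {}
    for port in ports:
        completed = subprocess.run(
            mindinsight_stop_argv(port), capture_output=True, text=True
        )
        # Assumption: exit code 0 means the stop request was accepted.
        results[port] = completed.returncode == 0
    return results


if __name__ == "__main__":
    print(stop_instances([8080]))
```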