Updated on 2024-06-12 GMT+08:00

MindInsight Visualization Jobs

ModelArts notebook of the new version supports MindInsight visualization jobs. In the development environment, you can train and debug algorithms with small datasets, check algorithm convergence, and detect issues to facilitate debugging.

MindInsight visualizes information such as scalars, images, computational graphs, and model hyperparameters during training. It also provides functions such as training dashboard, model lineage, data lineage, and performance debugging, helping you train and debug models efficiently. MindInsight supports MindSpore training jobs. For more information about MindInsight, see MindSpore official website.

MindSpore allows you to save data into the summary log file and obtain the data on the MindInsight GUI.

Prerequisites

When writing a training script with MindSpore, add the code for collecting the summary record to the script to ensure that summary files are generated in the training result.

For details, see Collecting Summary Record.
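As a sketch of what Collecting Summary Record involves, a training script can build the collector through a small helper like the one below. SummaryCollector is MindSpore's standard callback for recording summary data; the summary_dir value, the collect_freq setting, and the helper name itself are illustrative choices, not part of any ModelArts API.

```python
# Sketch: create a MindSpore SummaryCollector callback so that training writes
# summary files MindInsight can read. summary_dir below is illustrative.
def make_summary_collector(summary_dir: str = "./summary_dir"):
    """Build a SummaryCollector; pass it to Model.train via callbacks=[...]."""
    # Imported lazily so the helper can be defined outside a MindSpore image.
    from mindspore.train.callback import SummaryCollector
    return SummaryCollector(summary_dir=summary_dir, collect_freq=1)

# In the training script (assuming model and ds_train already exist):
# model.train(epoch=1, train_dataset=ds_train,
#             callbacks=[make_summary_collector("/home/ma-user/work/summary")])
```

With the callback attached, the summary files appear under summary_dir and can be uploaded or mounted as described in Step 2.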

Precautions

  • To run a MindInsight training job in a development environment, start MindInsight and then the training process.
  • Only one-card single-node training is supported.

Step 1 Create a Development Environment and Access It Online

On the ModelArts management console, choose DevEnviron > Notebook to access notebook of the new version and create a MindSpore instance. After the instance is created, click Open in the Operation column of the instance to access it online.

The following image and flavors are supported by MindInsight visualization training jobs. Select an image and flavor based on site requirements.
  • MindSpore 1.2.0 (CPU or GPU)

Step 2 Upload the Summary Data

Summary data is required for using MindInsight visualization functions in DevEnviron.

You can upload the summary data to the /home/ma-user/work/ directory in the development environment or store it in the OBS parallel file system.

  • For details about how to upload the summary data to the notebook path /home/ma-user/work/, see Uploading Files to JupyterLab.
  • If you want the notebook development environment to mount the OBS parallel file system directory and read the summary data, upload the summary file generated during model training to the OBS parallel file system. When MindInsight is started in a notebook instance, the notebook instance automatically mounts the OBS parallel file system directory and reads the summary data.
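Before starting MindInsight, it can help to confirm that the directory you plan to pass as the summary path actually contains summary data. The following is a minimal sketch; the directory and file name below are throwaway stand-ins for your real summary path, and has_summary_data is a hypothetical helper.

```python
# Sketch: check that a directory contains summary-like files before starting
# MindInsight. Real MindSpore summary file names contain the string "summary".
import os
import tempfile

def has_summary_data(summary_dir: str) -> bool:
    """Return True if summary_dir exists and holds at least one summary file."""
    if not os.path.isdir(summary_dir):
        return False
    return any("summary" in name for name in os.listdir(summary_dir))

# Demo with a throwaway directory standing in for /home/ma-user/work/xxx:
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "events.out.summary.demo"), "w").close()
print(has_summary_data(demo_dir))  # → True
```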

Step 3 Start MindInsight

Use any of the following methods to start MindInsight in JupyterLab.

Figure 1 Starting MindInsight in JupyterLab

Method 1

  1. Click the icon to go to the JupyterLab development environment. An .ipynb file is automatically created.
  2. Enter the following command in the code cell:
    %reload_ext mindinsight
    %mindinsight --port {PORT} --summary-base-dir {SUMMARY_BASE_DIR} 

    Parameters:

    • --port {PORT}: web service port for visualization, which defaults to 8080. If the default port 8080 is occupied, specify a port ranging from 1 to 65535.
    • --summary-base-dir {SUMMARY_BASE_DIR}: data storage path in the development environment
      • Local path of the development environment: ./work/xxx (relative path) or /home/ma-user/work/xxx (absolute path)
      • Path of the OBS parallel file system bucket: obs://xxx/
    For example:
    # If the summary data is stored in /home/ma-user/work/ of the development environment, run the following command:
    %mindinsight --summary-base-dir /home/ma-user/work/xxx 
    or
    # If the summary data is stored in the OBS parallel file system, run the following command. The development environment automatically mounts the storage path of the OBS parallel file system and reads the data.
    %mindinsight --summary-base-dir obs://xxx/
    Figure 2 MindInsight page (1)
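If the default port 8080 is occupied, you need a free port to pass to --port. One quick way to find one from a notebook cell is a generic socket probe like the following; pick_free_port is a hypothetical helper, not a MindInsight API.

```python
# Sketch: find a free local port in the range MindInsight accepts (1-65535),
# starting from the default 8080.
import socket

def pick_free_port(start: int = 8080, end: int = 65535) -> int:
    """Return the first port in [start, end] that can be bound on localhost."""
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind(("127.0.0.1", port))
                return port
            except OSError:
                continue  # port in use; try the next one
    raise RuntimeError("no free port in range")

print(pick_free_port())
```

The returned value can then be used in the cell command, for example %mindinsight --port {PORT} --summary-base-dir /home/ma-user/work/xxx.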

Method 2

Click the icon to go to the MindInsight page.

The directory /home/ma-user/work/ is read by default.

All project log names are displayed in the Runs area on the left, where you can select the target project to view its logs.

Figure 3 MindInsight page (2)

Method 3

  1. Choose View > Activate Command Palette, enter MindInsight in the search box, and click Create a new MindInsight.
    Figure 4 Create a new MindInsight
  2. Enter the path of the summary data you want to view or the storage path of the OBS parallel file system, and click CREATE.
    • Local path of the development environment: ./summary (relative path) or /home/ma-user/work/summary (absolute path)
    • Path of the OBS parallel file system: obs://xxx/
    Figure 5 Entering the summary data path
    Figure 6 MindInsight page (3)

    A maximum of 10 MindInsight instances can be started using methods 2 and 3.
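The path entered in the CREATE dialog must take one of the forms listed above. A tiny sketch of that check follows; is_valid_summary_path is a hypothetical helper used only to make the accepted path forms concrete.

```python
# Sketch: validate that a summary path matches the forms accepted in the
# CREATE dialog: a relative path, the notebook work directory, or an OBS
# parallel file system path.
def is_valid_summary_path(path: str) -> bool:
    return path.startswith(("./", "/home/ma-user/work/", "obs://"))

print(is_valid_summary_path("obs://xxx/"))   # → True
print(is_valid_summary_path("./summary"))    # → True
print(is_valid_summary_path("C:\\summary"))  # → False
```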

Method 4

Click the icon to open a terminal and run the following command. (In this way, the UI is not displayed.)

mindinsight start --summary-base-dir ./summary_dir
Figure 7 Opening MindInsight through Terminal

Step 4 View Visualized Data on the Training Dashboard

The training dashboard is the core of MindInsight visualization. It supports scalar visualization, parameter distribution visualization, computational graph visualization, dataset graph visualization, image visualization, and tensor visualization.

For more information, see Viewing Dashboard on the MindSpore official website.

Related Operations

To stop a MindInsight instance, perform the following steps:

  • Method 1: Enter the following command in the .ipynb file window of JupyterLab. Replace 8080 with the actual port number for starting MindInsight.
    !mindinsight stop --port 8080
  • Method 2: Click the icon. The MindInsight instance management page is displayed, showing all started MindInsight instances. Click SHUT DOWN next to an instance to stop it.
    Figure 8 Clicking SHUT DOWN to stop an instance
  • Method 3: Click the icon shown in the following figure to stop all started MindInsight instances.
    Figure 9 Stopping all started MindInsight instances
  • Method 4 (not recommended): Close the MindInsight window in JupyterLab. This only closes the visualization window; the instance is still running in the background.
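If several instances are running, the Method 1 command can be looped over their ports from a notebook cell. The following is a hedged sketch: the port list is illustrative, stop_mindinsight is a hypothetical helper, and it skips cleanly when the mindinsight CLI is not on the PATH (it is available in the MindSpore image).

```python
# Sketch: stop MindInsight instances on a list of ports from a notebook cell.
import shutil
import subprocess

def stop_mindinsight(ports):
    if shutil.which("mindinsight") is None:
        print("mindinsight CLI not found; nothing to stop")
        return
    for port in ports:
        # Same effect as running `mindinsight stop --port <port>` manually.
        subprocess.run(["mindinsight", "stop", "--port", str(port)], check=False)

stop_mindinsight([8080, 8081])
```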