Managing Visualization Jobs

By default, the visualization jobs that you manage in ModelArts are TensorBoard jobs.

TensorBoard is a visualization tool that displays the TensorFlow computational graph as it runs, the trend of metrics over time, and the data used in training. Currently, TensorBoard jobs support only training jobs based on the TensorFlow and MXNet engines. For more information about TensorBoard, see the TensorBoard official website.

For a training job that uses TensorFlow or MXNet, you can create a TensorBoard job from the summary file generated during model training.

Prerequisites

For the summary file to be generated in the training output, add the related code to the training script.

  • Using the TensorFlow engine:

    When using the TensorFlow-based MoXing framework, set save_summary_steps to a value greater than 0 and summary_verbosity to a value of 1 or greater in mox.run.

    To display additional metrics, add tensors to log_info in the mox.ModelSpec returned by model_fn. Only rank-0 tensors (scalars) are supported; they are written to the summary file. To write tensors of higher rank to the summary file, use the native TensorFlow tf.summary API in model_fn.

  • Using the MXNet engine:

    Add the following code to the script, replacing 'OBS path' with the OBS path in which to store the summary file:

    batch_end_callbacks.append(mx.contrib.tensorboard.LogMetricsCallback('OBS path'))
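
Before creating a visualization job, it can help to confirm that the training run really produced a summary file. The following is a minimal sketch, not part of ModelArts: it assumes that you have a local copy of the training output directory and that the summary files follow the default TensorBoard naming, in which the file name contains "tfevents" (for example, events.out.tfevents.<timestamp>.<hostname>). The function name find_summary_files is hypothetical.

```python
import os
import tempfile

# Assumption: TensorBoard summary (event) files keep their default names,
# which contain this marker, e.g. "events.out.tfevents.1700000000.host".
EVENT_FILE_MARKER = "tfevents"


def find_summary_files(output_dir):
    """Return the sorted paths of all summary files found under output_dir."""
    matches = []
    for root, _dirs, files in os.walk(output_dir):
        for name in files:
            if EVENT_FILE_MARKER in name:
                matches.append(os.path.join(root, name))
    return sorted(matches)


if __name__ == "__main__":
    # Demo on a throwaway directory that stands in for a local copy
    # of the training output path.
    with tempfile.TemporaryDirectory() as demo_dir:
        event_file = os.path.join(demo_dir, "events.out.tfevents.1700000000.host")
        open(event_file, "w").close()
        found = find_summary_files(demo_dir)
        print("summary files found:", len(found))
```

An empty result suggests that the summary settings described above were not applied in the training script.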

Precautions

  • You are charged for as long as a visualization job is in the Running status. To avoid unnecessary fees, stop visualization jobs that you no longer need, or enable the auto stop function so that a job stops automatically at a specified time.
  • Visualization jobs run on CPU resources by default. The resource type cannot be changed.
  • Ensure that the OBS directory you use is in the same region as ModelArts.

Creating a Visualization Job

  1. Log in to the ModelArts management console. In the left navigation pane, choose Training Jobs. On the displayed page, click the Visualization Jobs tab.
  2. In the upper left corner of the visualization job list, click Create. The Create Visualization Job page is displayed.
  3. Set Billing Mode to Pay-per-use and Job Type to Visualization Job. Enter the visualization job name and description as required, and set the Training Output Path and Auto Stop parameters.
    • Training Output Path: Select the training output path that was specified when the training job was created.
    • Auto Stop: Enable or disable the auto stop function. Because a running visualization job is billed, you can enable auto stop to stop the job automatically at a specified time. The options are 1 hour later, 2 hours later, 4 hours later, 6 hours later, and Custom. If you select Custom, enter an integer from 1 to 24 (in hours) in the text box on the right.
    Figure 1 Creating a visualization job
  4. Click Next.
  5. After confirming the specifications, click Next.

    When the job status in the visualization job list changes to Running, the TensorBoard job has been created. You can click the name of the visualization job to view its details.
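
The Auto Stop rule from step 3 can be sketched as a small check. This is an illustration only, not a ModelArts API: the function name resolve_auto_stop is hypothetical, and the sketch assumes that "N hours later" is counted from the time the job starts.

```python
from datetime import datetime, timedelta

# Preset options offered on the console, in hours.
PRESET_HOURS = (1, 2, 4, 6)
# Range accepted when Custom is selected, in hours.
CUSTOM_MIN_HOURS, CUSTOM_MAX_HOURS = 1, 24


def resolve_auto_stop(hours, start):
    """Return the time at which a job started at `start` would be stopped.

    Assumption: the countdown begins when the job starts.
    Raises ValueError if `hours` is not an integer from 1 to 24.
    """
    if isinstance(hours, bool) or not isinstance(hours, int):
        raise ValueError("Auto Stop must be a whole number of hours")
    if not CUSTOM_MIN_HOURS <= hours <= CUSTOM_MAX_HOURS:
        raise ValueError("Auto Stop must be from 1 to 24 hours")
    return start + timedelta(hours=hours)


if __name__ == "__main__":
    started = datetime(2024, 1, 1, 9, 0)
    print(resolve_auto_stop(4, started))  # 2024-01-01 13:00:00
```

All preset options fall inside the Custom range, so a single range check covers both cases.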

Opening a Visualization Job

In the visualization job list, click the name of the target visualization job. The TensorBoard page is displayed. See Figure 2. Only visualization jobs in the Running status can be opened.

Figure 2 TensorBoard page

Running or Stopping a Visualization Job

  • Stopping a visualization job: When a running visualization job is no longer needed, you can stop it to stop billing. In the visualization job list, click Stop in the Operation column.
  • Running a visualization job: You can rerun a visualization job that is in the Canceled status. In the visualization job list, click Run in the Operation column.
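
The status rules on this page (open or stop a job only while it is Running, run it again only from Canceled) can be summarized as a lookup table. This is a hedged sketch for illustration: the names ALLOWED_OPERATIONS, allowed_operations, and can_perform are hypothetical, and statuses or operations not described on this page are deliberately left out.

```python
# Operations described on this page, keyed by the job status that permits them.
ALLOWED_OPERATIONS = {
    "Running": frozenset({"Open", "Stop"}),
    "Canceled": frozenset({"Run"}),
}


def allowed_operations(status):
    """Return the operations this page describes for the given status."""
    # Statuses not covered by this page map to an empty set.
    return ALLOWED_OPERATIONS.get(status, frozenset())


def can_perform(status, operation):
    """Return True if `operation` is described as available in `status`."""
    return operation in allowed_operations(status)


if __name__ == "__main__":
    print(sorted(allowed_operations("Running")))  # ['Open', 'Stop']
    print(can_perform("Canceled", "Open"))        # False
```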

Deleting a Visualization Job

If a visualization job is no longer needed, you can delete it to release resources. In the visualization job list, click Delete in the Operation column.

A deleted visualization job cannot be recovered. To use it again, you must create a new visualization job. Exercise caution when performing this operation.