Updated on 2024-04-19 GMT+08:00

Selecting Storage in DevEnviron

Storage options differ in performance, usability, and cost, and no single storage medium covers all scenarios. Learn about the application scenarios of in-cloud storage so that you can choose the right one.

Only OBS parallel file systems (PFS) and object storage in the same region can be mounted.

Table 1 In-cloud storage application scenarios

Storage

Application Scenario

Advantage

Disadvantage

EVS

Data and algorithm exploration only in the development environment.

Block storage SSDs feature better overall I/O performance than NFS. The storage capacity can be dynamically expanded up to 4096 GB.

As persistent storage, EVS disks are mounted to /home/ma-user/work. The data in this directory is retained after the instance is stopped. The storage capacity can be expanded online based on demand.

This type of storage can only be used in a single development environment.

PFS

NOTE:

PFS is a whitelist feature. To use this feature, contact Huawei technical support.

PFS buckets mounted as persistent storage for AI development and exploration.

- Storage for datasets. Datasets are directly mounted to notebooks for browsing and data processing and can be directly used during training. For details, see How Do I Upload Data to OBS?

After the instance is running, the OBS parallel file system that carries the datasets is dynamically mounted to notebooks. For details, see Dynamically Mounting an OBS Parallel File System.

- Storage for code. After debugging on a notebook instance, specify the OBS path as the code path for starting training. This makes temporary modifications easy.

- Storage for monitoring training. Mount the storage to the training output path, such as the path to the training logs, so that you can view and check training on the notebook instance in real time. This is especially suitable for analyzing the output of training jobs using TensorBoard or a notebook.

PFS is an optimized high-performance object storage file system with low storage costs and high throughput. It can quickly process high-performance computing (HPC) workloads. PFS mounting is recommended if OBS is used.

NOTE:

Package or split the data to be uploaded into files of 128 MB or 64 MB. Then download the packages and decompress them in local storage for better I/O and throughput performance.

PFS delivers only average performance when small files are frequently read and written, so it is not suitable for large model training or file decompression.

NOTE:

Before mounting PFS storage to a notebook instance, grant ModelArts full read and write permissions on the PFS bucket. The policy is retained even after the notebook instance is deleted.

OBS

NOTE:

OBS is a whitelist feature. To use this feature, contact Huawei technical support.

When uploading or downloading a large amount of data in the development environment, you can use OBS buckets to transfer data.

Low storage cost and high throughput, but only average performance in reading and writing small files. It is a good practice to package or split the data into files of 128 MB or 64 MB. You can then download the packages, decompress them, and use the data locally.

Object storage semantics differ from POSIX semantics. Make sure that you understand the differences before using OBS as a file system.

SFS

Available only in dedicated resource pools. Use SFS storage in informal production scenarios such as exploration and experiments. One SFS device can be mounted to both a development environment and a training environment. In this way, you do not need to download data each time your training job starts. This type of storage is not suitable for heavy I/O training on more than 32 cards.

SFS is implemented as NFS and can be shared between multiple development environments and between development and training environments. This type of storage is preferred for distributed training jobs that are not heavy-duty, especially when you want to avoid downloading data again each time a training job starts.

The performance of the SFS storage is not as good as that of the EVS storage.

Local storage

First choice for heavy-duty training jobs.

High-performance SSDs for the target VM or BMS, featuring high file I/O throughput. For heavy-duty training jobs, store data in the target directory and then start training.

By default, the storage is mounted to the /cache directory. For details about the available space of the /cache directory, see What Are Sizes of the /cache Directories for Different Notebook Specifications in DevEnviron?.

The storage lifecycle is associated with the container lifecycle. Data needs to be downloaded each time the training job starts.
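The packaging advice for PFS and OBS in the table above can be sketched with the Python standard library alone. This is a minimal illustration, not a ModelArts API: the function names `pack_into_shards` and `unpack_shards`, the `shard-NNNNN.tar` naming, and the choice of uncompressed tar archives are all assumptions made for the example. It groups small files into packages of roughly 128 MB before upload and unpacks them into local storage (for example, a directory under /cache) before training. The output directory is assumed to be outside the source directory.

```python
import tarfile
from pathlib import Path

# Target size per package: 128 MB, following the recommendation in the table.
SHARD_BYTES = 128 * 1024 * 1024


def pack_into_shards(src_dir, out_dir, shard_bytes=SHARD_BYTES):
    """Group files under src_dir into uncompressed tar shards of roughly shard_bytes.

    A file larger than shard_bytes is placed in a shard of its own.
    Returns the list of shard paths that were written.
    """
    src = Path(src_dir)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    shards = []
    archive = None
    current_size = 0
    for path in sorted(p for p in src.rglob("*") if p.is_file()):
        size = path.stat().st_size
        # Start a new shard when adding this file would exceed the target size.
        if archive is None or (current_size > 0 and current_size + size > shard_bytes):
            if archive is not None:
                archive.close()
            shard_path = out / f"shard-{len(shards):05d}.tar"
            archive = tarfile.open(shard_path, "w")
            shards.append(shard_path)
            current_size = 0
        archive.add(path, arcname=path.relative_to(src).as_posix())
        current_size += size
    if archive is not None:
        archive.close()
    return shards


def unpack_shards(shard_dir, dest_dir):
    """Extract every shard in shard_dir into dest_dir (for example, under /cache)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for shard in sorted(Path(shard_dir).glob("shard-*.tar")):
        with tarfile.open(shard) as archive:
            archive.extractall(dest)
```

A typical flow under these assumptions would be to run `pack_into_shards` on the machine that holds the raw dataset, upload the resulting shards to OBS or PFS, and call `unpack_shards` on the notebook or training instance after downloading them, so that training reads from local storage rather than from many small remote objects.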

Using the Storage

  1. How do I use EVS in a development environment?

    When creating a notebook instance, select a small-capacity EVS disk. You can expand the disk capacity later as needed. For details, see Dynamically Expanding EVS Disk Capacity.

  2. How do I use an OBS parallel file system in a development environment?

    When training on data in a notebook instance, you can use datasets stored in an OBS parallel file system by mounting the file system to the notebook container. For details, see Dynamically Mounting an OBS Parallel File System.
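
To decide when an EVS disk needs expanding, or whether a dataset fits in local storage, it can help to check how full a mount point is from inside the notebook. The following sketch uses only `shutil.disk_usage` from the Python standard library. The helper names `storage_report` and `needs_expansion` and the 80% threshold are assumptions chosen for illustration, and the figures reported are those of whichever file system backs the given path.

```python
import shutil

GIB = 1024 ** 3


def storage_report(path):
    """Return total and free space (GiB) and the used percentage for the file system at path."""
    usage = shutil.disk_usage(path)
    used_percent = 100 * usage.used / usage.total if usage.total else 0.0
    return {
        "path": path,
        "total_gib": round(usage.total / GIB, 2),
        "free_gib": round(usage.free / GIB, 2),
        "used_percent": round(used_percent, 1),
    }


def needs_expansion(path, threshold_percent=80.0):
    """Return True if usage at path has reached the (assumed) threshold."""
    return storage_report(path)["used_percent"] >= threshold_percent


if __name__ == "__main__":
    # Mount points named in this document: the EVS work directory and local storage.
    for mount in ("/home/ma-user/work", "/cache"):
        try:
            print(storage_report(mount))
        except OSError:
            print(f"{mount} is not available in this environment")
```

If `needs_expansion("/home/ma-user/work")` returns `True`, that would be a reasonable point to expand the EVS disk as described in item 1.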