Updated on 2024-06-12 GMT+08:00

Selecting Storage in DevEnviron

Storage options differ in performance, usability, and cost, and no single storage medium covers all scenarios. Understanding the application scenarios of each in-cloud storage type helps you choose the right one.

Table 1 In-cloud storage application scenarios

Storage

Application Scenario

Advantage

Disadvantage

EVS (block storage)

Data and algorithm exploration that takes place only in the development environment.

Block storage SSDs deliver better overall I/O performance than NFS. The storage capacity can be dynamically expanded up to 4096 GB.

As persistent storage, EVS disks are mounted to /home/ma-user/work. The data in this directory is retained after the instance is stopped. The storage capacity can be expanded online based on demand.

This type of storage can only be used in a single development environment.

PFS

PFS buckets mounted as persistent storage for AI development and exploration.

- Storage for datasets. Mount an OBS parallel file system that stores a dataset to a notebook instance. The file system can then be used directly during training. For details, see

- Storage for code. After debugging on a notebook instance, specify the OBS path as the code path for starting training. This makes temporary modifications easy.

- Storage for checking training. Mount storage to the training output path, such as the path to training logs, so that you can view and check training on the notebook instance in real time. This is especially suitable for analyzing job output using TensorBoard or a notebook.

PFS is an optimized high-performance object storage file system with low storage costs and high throughput. It can quickly process high-performance computing (HPC) workloads. PFS mounting is recommended if OBS is used.

NOTE:

Package or split the data to be uploaded into 128 MB or 64 MB pieces. Then download and decompress the data in local storage for better I/O and throughput performance.

Because PFS delivers only average performance for frequent reads and writes of small files, it is not suitable for large model training or file decompression.

OBS

Notebook instances of the new version cannot be stored in or mounted to OBS buckets.

When uploading or downloading a large amount of data in the development environment, you can use OBS buckets to transfer data.

Low storage cost and high throughput, but only average performance when reading and writing small files. It is a good practice to package or split files into 128 MB or 64 MB pieces. You can then download the packages, decompress them, and use them locally.

Object storage semantics differ from POSIX semantics, so you need to understand the differences before relying on OBS.

Local storage

First choice for heavy-duty training jobs.

High-performance SSDs for the target VM or BMS, featuring high file I/O throughput. For heavy-duty training jobs, store data in the target directory and then start training.

By default, the storage is mounted to the /cache directory. For details about the available space of the /cache directory, see

The storage lifecycle is associated with the container lifecycle. Data needs to be downloaded each time the training job starts.

SFS

Available only in dedicated resource pools. Use SFS storage in informal production scenarios such as exploration and experiments. One SFS device can be mounted to both a development environment and a training environment. In this way, you do not need to download data each time your training job starts. This type of storage is not suitable for heavy I/O training on more than 32 cards.

SFS is implemented as NFS and can be shared between multiple development environments and between development and training environments. This type of storage is preferred for non-heavy-duty distributed training jobs, especially those that should not have to download data again each time a training job starts.

The performance of the SFS storage is not as good as that of the EVS storage.
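
The PFS and OBS rows above recommend splitting data into 128 MB or 64 MB pieces before upload and reassembling it in local storage. The following is a minimal sketch of that idea using only the Python standard library. The function names split_file and join_parts, the .partNNNN naming scheme, and the paths in the usage note are illustrative assumptions, not part of ModelArts or OBS, and the upload and download steps themselves are out of scope here.

```python
from pathlib import Path

CHUNK_SIZE = 128 * 1024 * 1024  # 128 MB, the larger size suggested in the table
BUFFER_SIZE = 1024 * 1024       # copy in 1 MB pieces to keep memory use low


def split_file(src, out_dir, chunk_size=CHUNK_SIZE):
    """Split src into numbered parts of at most chunk_size bytes each.

    The part naming scheme is a hypothetical convention for this sketch.
    """
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    index = 0
    with src.open("rb") as reader:
        while True:
            part_path = out_dir / f"{src.name}.part{index:04d}"
            written = 0
            with part_path.open("wb") as writer:
                while written < chunk_size:
                    data = reader.read(min(BUFFER_SIZE, chunk_size - written))
                    if not data:
                        break
                    writer.write(data)
                    written += len(data)
            if written == 0:
                part_path.unlink()  # source exhausted; drop the empty part
                break
            parts.append(part_path)
            index += 1
    return parts


def join_parts(parts, dest):
    """Concatenate parts, in the given order, into a single file at dest."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with dest.open("wb") as writer:
        for part in parts:
            with Path(part).open("rb") as reader:
                while True:
                    data = reader.read(BUFFER_SIZE)
                    if not data:
                        break
                    writer.write(data)
    return dest
```

For example, under these assumptions you might call split_file("dataset.tar", "upload_parts") before uploading, and, once the parts have been downloaded to local storage, call join_parts(sorted(Path("/cache/parts").iterdir()), "/cache/dataset.tar") before decompressing. Sorting by name works here because the part index is zero-padded.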

Using the Storage

  1. How do I use EVS in a development environment?

    When creating a notebook instance, select EVS as the storage. You can expand the EVS disk capacity of a running notebook instance. For details, see Dynamically Expanding EVS Disk Capacity.
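
To judge when an EVS disk is due for expansion, it can help to check how full the mounted directory is from inside the notebook. The sketch below uses Python's standard shutil.disk_usage for this. The function name usage_report and the 80% warning threshold are assumptions made for illustration, not ModelArts settings; the /home/ma-user/work path in the usage note is the EVS mount point given in Table 1.

```python
import shutil

GIB = 1024 ** 3


def usage_report(path, warn_ratio=0.8):
    """Summarize disk usage for the file system that contains path.

    warn_ratio is a hypothetical threshold: when the used fraction reaches
    it, the report suggests that the disk may need to be expanded.
    """
    usage = shutil.disk_usage(path)
    used_ratio = usage.used / usage.total if usage.total else 0.0
    return {
        "total_gib": usage.total / GIB,
        "used_gib": usage.used / GIB,
        "free_gib": usage.free / GIB,
        "used_ratio": used_ratio,
        "needs_expansion": used_ratio >= warn_ratio,
    }
```

On a notebook instance that uses EVS, calling usage_report("/home/ma-user/work") reports on the EVS disk. The same function can be pointed at /cache to check local storage before copying training data there.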