Updated on 2024-12-26 GMT+08:00

MoXing Framework Functions

MoXing Framework provides basic common components for MoXing. For example, it facilitates access to Huawei Cloud OBS. Importantly, MoXing Framework is decoupled from specific AI engines and can be seamlessly integrated with all major AI engines (including TensorFlow, MXNet, PyTorch, and MindSpore) supported by ModelArts. MoXing Framework allows you to interact with OBS components using the mox.file APIs described in this section.

MoXing primarily serves to streamline reading and downloading data from OBS buckets. It is not suitable for OBS parallel file systems. For production service code, you are advised to call the OBS Python SDK instead. For details, see API Overview of OBS SDK for Python.

Why mox.file

Use Python to open a local file.

with open('/tmp/a.txt', 'r') as f:
  print(f.read())

An OBS path starts with obs://, for example, obs://bucket/XXX.txt. The built-in open function cannot open an OBS file, so the preceding code reports an error if it is given an OBS path.

OBS can be accessed through various tools, such as the SDK, API, OBS console, and OBS Browser. ModelArts mox.file offers a set of APIs that mimic a local file system, enabling easy management of OBS files. For example, you can use the following code to open a file on OBS:

import moxing as mox
with mox.file.File('obs://bucket_name/a.txt', 'r') as f:
  print(f.read())

The following Python code lists a local path:

import os
os.listdir('/tmp/my_dir/')

To list an OBS path, use the following mox.file code:

import moxing as mox
mox.file.list_directory('obs://bucket_name/my_dir/')
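The two pairs of examples above suggest a simple pattern: route obs:// paths to mox.file and all other paths to the Python built-ins. The following is a minimal sketch of that pattern, not part of MoXing. The helper names is_obs_path, open_any, and list_any are hypothetical. Only mox.file.File and mox.file.list_directory come from this section, and MoXing is imported lazily so that local paths still work where MoXing is not installed.

```python
import os


def is_obs_path(path):
    """Return True if the path uses the obs:// scheme."""
    return path.startswith('obs://')


def open_any(path, mode='r'):
    """Open a local or OBS file (hypothetical helper, not a MoXing API)."""
    if is_obs_path(path):
        import moxing as mox  # imported only when an OBS path is used
        return mox.file.File(path, mode)
    return open(path, mode)


def list_any(path):
    """List a local or OBS directory (hypothetical helper, not a MoXing API)."""
    if is_obs_path(path):
        import moxing as mox  # imported only when an OBS path is used
        return mox.file.list_directory(path)
    return os.listdir(path)
```

With these helpers, a script can accept either /tmp/my_dir/ or obs://bucket_name/my_dir/ as input without changing the calling code.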

Importing MoXing Framework

To use MoXing Framework, import the MoXing Framework module at the beginning of your code:

import moxing as mox

Related Notes

After the MoXing module is imported, the standard Python logging module is set to the INFO level and the MoXing version number is printed. You can use the following API to reset the logging level:

import logging

from moxing.framework.util import runtime
runtime.reset_logger(level=logging.WARNING)

To prevent MoXing from printing the version number, set the MOX_SILENT_MODE environment variable to 1 before importing MoXing, for example, using the following Python code:

import os
os.environ['MOX_SILENT_MODE'] = '1'
import moxing as mox
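If the same script must also run where MoXing is not installed (for example, on a local machine), the import can be guarded. The following is a minimal sketch under that assumption. The helper name import_moxing_quietly is hypothetical and not part of MoXing. It only sets MOX_SILENT_MODE when the variable is not already defined, and returns None when the module cannot be imported.

```python
import os


def import_moxing_quietly():
    """Import MoXing with the version banner suppressed.

    Hypothetical helper: returns the moxing module, or None if
    MoXing is not installed in the current environment.
    """
    # Must be set before the import; keep any value the user already chose.
    os.environ.setdefault('MOX_SILENT_MODE', '1')
    try:
        import moxing as mox
    except ImportError:
        return None
    return mox


mox = import_moxing_quietly()
```

Callers can then check whether mox is None before using mox.file.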

Data Downloading Acceleration

You can use MoXing Framework to accelerate data downloading for training jobs created using ModelArts preset images. This is suitable when the number of files ranges from 1 million to 10 million, when the data is a single large file, or when the file size is greater than 20 GB.

  1. Log in to the ModelArts console and choose Model Training > Training Jobs in the navigation pane on the left.
  2. In the upper right corner of the page, click Create Training Job, and set MA_MOXING_FWVER=2.2.8.0aa484aa in Environment Variable to install the latest MoXing Framework. For details about other parameters, see Creating a Training Job. Then, you can use moxing.file.copy_parallel in the training job script to accelerate data downloading.
  3. Set MOX_C_ACCELERATE=0 in Environment Variable to disable data downloading acceleration if it is not needed.
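Inside the training script, the download step from the procedure above might look like the following minimal sketch. The function download_dataset, its copy_fn parameter, and the example paths are hypothetical. Only moxing.file.copy_parallel comes from this section, and it is assumed here to take the source and destination as its first two arguments.

```python
def download_dataset(obs_url, local_dir, copy_fn=None):
    """Copy a directory from OBS to local storage.

    copy_fn is a hypothetical hook that lets the function be exercised
    without MoXing; by default, moxing.file.copy_parallel is used.
    """
    if copy_fn is None:
        import moxing as mox  # available in ModelArts preset images
        copy_fn = mox.file.copy_parallel
    copy_fn(obs_url, local_dir)
    return local_dir


# Example call in a training job (paths are placeholders):
# download_dataset('obs://bucket_name/my_dir/', '/tmp/my_dir/')
```

Keeping the copy function injectable means the same script can be checked locally with a stub before it is submitted as a training job.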