Updated on 2024-10-29 GMT+08:00

Creating an Algorithm

Machine learning derives general rules from a limited volume of data and uses these rules to predict unknown data. To obtain more accurate prediction results, select a proper algorithm to train your model. ModelArts provides a large number of algorithm samples for different scenarios. This section describes algorithm sources and how to create an algorithm.

Algorithm Sources

ModelArts provides the following algorithm sources for model training:

  • Using a subscribed algorithm

    You can directly subscribe to algorithms in ModelArts AI Gallery and use them to build models without writing code.

  • Using a preset image

    To use a custom algorithm, you can use a framework built into ModelArts. ModelArts supports most mainstream AI engines. For details, see Built-in Training Engines. These built-in engines preload some extra Python packages, such as NumPy. You can also use a requirements.txt file in the code directory to install additional dependency packages. For details about how to create a training job using a preset image, see Developing Code for Training Using a Preset Image.

  • Using a preset image with customization

    If you create an algorithm from a preset image but need to modify or add software dependencies on top of it, customize the preset image. In this case, select a preset image and choose Customize from the framework version drop-down list box.

    The only difference from creating an algorithm entirely based on a preset image is that you must also select an image. You can create a custom image based on a preset image.

  • Using a custom image

    The subscribed algorithms and built-in frameworks cover most training scenarios. For the remaining scenarios, ModelArts allows you to create custom images to train models. Create an image based on the ModelArts image specifications, then select your image and configure the (optional) code directory and the boot command to create a training job.

    Custom images can be used to train models in ModelArts only after they are uploaded to Software Repository for Container (SWR). For details, see Creating a Custom Image for Model Training. Customizing an image requires a deep understanding of containers. Use this method only if the subscribed algorithms and custom scripts cannot meet your requirements.

    When you use a custom image to create a training job, the boot command must be executed in the /home/ma-user directory. Otherwise, the training job may run abnormally.

Procedure

Your locally developed algorithms or algorithms developed using other tools can be uploaded to ModelArts for unified management. Creating an algorithm involves the following steps:

  1. Prerequisites
  2. Accessing the Algorithm Creation Page
  3. Setting the boot mode (using a preset image, a preset image with customization, or a custom image)
  4. Configuring Pipelines
  5. Configuring Hyperparameters
  6. Supported Policies
  7. Adding Training Constraints
  8. Previewing the Runtime Environment
  9. Viewing Algorithm Details

Prerequisites

Accessing the Algorithm Creation Page

  1. Log in to the ModelArts console. In the navigation pane, choose Asset Management > Algorithm Management.
  2. On the My algorithm tab, click Create Algorithm. Enter the basic algorithm information, including Name and Description.

Using a Preset Image

Select a preset image to create an algorithm.

Figure 1 Using a preset image to create an algorithm
Set Image, Code Directory, and Boot File based on the algorithm code. Ensure that the framework of the AI image you select is the same as the one you use for editing algorithm code. For example, if TensorFlow is used for editing algorithm code, select a TensorFlow image when you create an algorithm.
Table 1 Parameters

  • Boot Mode

    Select Preset image. Then select a preset image and the version used by the algorithm.

  • Code Directory

    Select an OBS path for storing the algorithm code. The files required for training, such as the training code, dependency installation packages, and pre-generated models, are uploaded to the code directory.

    Do not store training data in the code directory. When the training job starts, the data stored in the code directory is downloaded to the backend, and a large amount of training data may cause the download to fail.

    After you create the training job, ModelArts downloads the code directory and its subdirectories to the training container.

    Take the OBS path obs://obs-bucket/training-test/demo-code as an example. The content in this path is automatically downloaded to ${MA_JOB_DIR}/demo-code in the training container, where demo-code (customizable) is the last-level directory of the OBS path.

    NOTE:
    • Any programming language is supported.
    • The total number of files and folders cannot exceed 1,000.
    • The total size of all files cannot exceed 5 GB.

  • Boot File

    The boot file starts the training job. It must be stored in the code directory and end with .py. ModelArts supports boot files written only in Python.
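As a hedged illustration of the layout described above, a minimal boot file can resolve its sibling files through the MA_JOB_DIR environment variable rather than hard-coded absolute paths. The directory name demo-code mirrors the example OBS path; the file name config.yaml is a hypothetical placeholder.

```python
# Minimal boot-file sketch (train.py). MA_JOB_DIR is the environment
# variable that receives the code-directory download path in the training
# container. "demo-code" mirrors the example OBS path in Table 1 and
# "config.yaml" is a hypothetical file -- replace both with your own names.
import os

def resolve_code_path(*parts):
    """Build a path under the downloaded code directory."""
    job_dir = os.environ.get("MA_JOB_DIR", os.getcwd())
    return os.path.join(job_dir, "demo-code", *parts)

if __name__ == "__main__":
    config_path = resolve_code_path("config.yaml")
    print("Reading config from:", config_path)
```

Resolving paths this way keeps the boot file working even if the OBS bucket or last-level directory name changes between jobs.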

Using a Preset Image with Customization

Select a preset image with customization to create an algorithm.

Figure 2 Creating an algorithm using a preset image with customization
Set Image, Code Directory, and Boot File based on the algorithm code. Ensure that the framework of the AI image you select is the same as the one you use for editing algorithm code. For example, if TensorFlow is used for editing algorithm code, select a TensorFlow image when you create an algorithm.
Table 2 Parameters

  • Boot Mode

    Select Preset image. Then select Customize for the engine version.

  • Image

    Select your image uploaded to SWR. For details about how to create an image, see Creating a Custom Training Image.

  • Code Directory

    Select an OBS path for storing the algorithm code. The files required for training, such as the training code, dependency installation packages, and pre-generated models, are uploaded to the code directory.

    Do not store training data in the code directory. When the training job starts, the data stored in the code directory is downloaded to the backend, and a large amount of training data may cause the download to fail.

    When the training job starts, ModelArts downloads the code directory and its subdirectories to the training container.

    Take the OBS path obs://obs-bucket/training-test/demo-code as an example. The content in this path is automatically downloaded to ${MA_JOB_DIR}/demo-code in the training container, where demo-code (customizable) is the last-level directory of the OBS path.

    NOTE:
    • Any programming language is supported for the training code. The training boot file must be a Python file.
    • The total number of files and folders cannot exceed 1,000.
    • The total size of all files cannot exceed 5 GB.
    • The file depth cannot exceed 32.

  • Boot File

    The boot file starts the training job. It must be stored in the code directory and end with .py. ModelArts supports boot files written only in Python.

Selecting a preset image with customization results in the same background behavior as running a training job directly with that image. For example:

  • The system automatically injects environment variables.
    • PATH=${MA_HOME}/anaconda/bin:${PATH}
    • LD_LIBRARY_PATH=${MA_HOME}/anaconda/lib:${LD_LIBRARY_PATH}
    • PYTHONPATH=${MA_JOB_DIR}:${PYTHONPATH}
  • The selected boot file will be automatically started using Python commands. Ensure that the Python environment is correct. The PATH environment variable is automatically injected. Run the following commands to check the Python version for the training job:
    • export MA_HOME=/home/ma-user; docker run --rm {image} ${MA_HOME}/anaconda/bin/python -V
    • docker run --rm {image} $(which python) -V
  • The system automatically adds hyperparameters associated with the preset image.
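Given the injected variables above, a boot file can log the runtime environment at startup so that a failed job's logs show which interpreter and paths were in effect. This is a plain sketch that only reads the variable names listed in this section; nothing here is ModelArts-specific beyond those names.

```python
# Log the environment variables that ModelArts injects for preset images,
# plus the interpreter actually running the boot file.
import os
import sys

def log_runtime_env():
    """Return a dict of the injected variables (empty string if unset)."""
    names = ["PATH", "LD_LIBRARY_PATH", "PYTHONPATH", "MA_HOME", "MA_JOB_DIR"]
    env = {name: os.environ.get(name, "") for name in names}
    for name, value in env.items():
        print(f"{name}={value}")
    print("python executable:", sys.executable)
    return env

if __name__ == "__main__":
    log_runtime_env()
```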

Using a Custom Image

Select a custom image to create an algorithm.

Figure 3 Creating an algorithm using a custom image
Table 3 Parameters

  • Boot Mode

    Select Custom image.

  • Image

    Select your image uploaded to SWR. For details about how to create an image, see Creating a Custom Training Image.

  • Code Directory

    Select an OBS path for storing the algorithm code. The files required for training, such as the training code, dependency installation packages, and pre-generated models, are uploaded to the code directory. Configure this parameter only if your custom image does not contain the training code.

    Do not store training data in the code directory. When the training job starts, the data stored in the code directory is downloaded to the backend, and a large amount of training data may cause the download to fail.

    When the training job starts, ModelArts downloads the code directory and its subdirectories to the training container.

    Take the OBS path obs://obs-bucket/training-test/demo-code as an example. The content in this path is automatically downloaded to ${MA_JOB_DIR}/demo-code in the training container, where demo-code (customizable) is the last-level directory of the OBS path.

    NOTE:
    • Any programming language is supported for the training code. The training boot file must be a Python file.
    • The total number of files and folders cannot exceed 1,000.
    • The total size of all files cannot exceed 5 GB.
    • The file depth cannot exceed 32.

  • Boot Command

    Mandatory. The command for booting the image. When a training job runs, the boot command is executed automatically after the code directory is downloaded.
    • If the training boot script is a .py file, for example train.py, the boot command is as follows:
      python ${MA_JOB_DIR}/demo-code/train.py
    • If the training boot script is a .sh file, for example main.sh, the boot command is as follows:
      bash ${MA_JOB_DIR}/demo-code/main.sh

    You can use semicolons (;) or double ampersands (&&) to combine multiple commands. demo-code in the commands is the last-level OBS directory where the code is stored. Replace it with your actual directory.

For details about how to use custom images in training, see Using a Custom Image to Create a Training Job.

If your image is fully customized, use the following approaches to start training in a specified Conda environment:

Training jobs do not run in an interactive shell, so you cannot run conda activate to activate a specified Conda environment. Start training in one of the following ways instead.

For example, Conda in your custom image is installed in the /home/ma-user/anaconda3 directory, the Conda environment is python-3.7.10, and the training script is stored in /home/ma-user/modelarts/user-job-dir/code/train.py. Use a specified Conda environment to start training in one of the following ways:

  • Method 1: Configure the correct DEFAULT_CONDA_ENV_NAME and ANACONDA_DIR environment variables for the image.
    ANACONDA_DIR=/home/ma-user/anaconda3
    DEFAULT_CONDA_ENV_NAME=python-3.7.10
    Run the python command to start the training script. The following shows an example:
    python /home/ma-user/modelarts/user-job-dir/code/train.py
  • Method 2: Use the absolute path of Conda environment Python.
    Run the /home/ma-user/anaconda3/envs/python-3.7.10/bin/python command to start the training script. The following shows an example:
    /home/ma-user/anaconda3/envs/python-3.7.10/bin/python /home/ma-user/modelarts/user-job-dir/code/train.py
  • Method 3: Configure the PATH environment variable.
    Add the bin directory of the specified Conda environment to the PATH environment variable, then run the python command to start the training script. The following shows an example:
    export PATH=/home/ma-user/anaconda3/envs/python-3.7.10/bin:$PATH; python /home/ma-user/modelarts/user-job-dir/code/train.py
  • Method 4: Run the conda run -n command.
    Run the /home/ma-user/anaconda3/bin/conda run -n python-3.7.10 command to execute the training. The following shows an example:
    /home/ma-user/anaconda3/bin/conda run -n python-3.7.10 python /home/ma-user/modelarts/user-job-dir/code/train.py

If there is an error indicating that the .so file is unavailable in the $ANACONDA_DIR/envs/$DEFAULT_CONDA_ENV_NAME/lib directory, add the directory to LD_LIBRARY_PATH and place the following command before the preceding boot command:

export LD_LIBRARY_PATH=$ANACONDA_DIR/envs/$DEFAULT_CONDA_ENV_NAME/lib:$LD_LIBRARY_PATH;

For example, the boot command used in Method 1 becomes:

export LD_LIBRARY_PATH=$ANACONDA_DIR/envs/$DEFAULT_CONDA_ENV_NAME/lib:$LD_LIBRARY_PATH; python /home/ma-user/modelarts/user-job-dir/code/train.py

Configuring Pipelines

An algorithm obtains data from an OBS bucket or dataset for model training. The training output is stored in an OBS bucket. The input and output parameters in your algorithm code must be parsed to enable data exchange between ModelArts and OBS. For details about how to develop code for training on ModelArts, see Preparing Model Training Code.

When you use a preset image to create an algorithm, configure the input and output parameters defined in the algorithm code.

  • Input configurations
    Table 4 Input configurations

    • Parameter Name

      Set this parameter based on the data input parameter in your algorithm code. The parameter name must be the same as the training input parameter parsed in your algorithm code. Otherwise, the algorithm code cannot obtain the input data.

      For example, if you use argparse in the algorithm code to parse data_url into the data input, set the data input parameter name to data_url when creating the algorithm.

    • Description

      Customize the description of the input parameter.

    • Obtained from

      Select the source of the input parameter, Hyperparameters (default) or Environment variables.

    • Constraints

      (Optional) Enable this parameter to restrict the input source. You can select a storage path or a ModelArts dataset.

      If you select a ModelArts dataset, set the following parameters:
      • Labeling Type: For details, see Creating a Manual Labeling Job.
      • Data Format: Default, CarbonData, or both. Default indicates the manifest format.
      • Data Segmentation: Available only for image classification, object detection, text classification, and sound classification datasets. The options are Segmented dataset, Dataset not segmented, and Unlimited. For details, see Publishing a Data Version.

    • Add

      Add multiple input data sources as needed by your algorithm.

  • Output configurations
    Table 5 Output configurations

    • Parameter Name

      Set this parameter based on the data output parameter in your algorithm code. The parameter name must be the same as the data output parameter parsed in your algorithm code. Otherwise, the algorithm code cannot obtain the output path.

      For example, if you use argparse in the algorithm code to parse train_url into the data output, set the data output parameter name to train_url when creating the algorithm.

    • Description

      Customize the description of the output parameter.

    • Obtained from

      Select the source of the output parameter, Hyperparameters (default) or Environment variables.

    • Add

      Add multiple output data paths as needed by your algorithm.
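The data_url and train_url examples above can be sketched as a boot-file fragment. The parameter names match the ones named in this section; the empty defaults are placeholders, not ModelArts defaults.

```python
# Parse the training input/output parameters that ModelArts passes to
# the boot file as CLI arguments. The names data_url and train_url must
# match the parameter names configured when creating the algorithm.
import argparse

def parse_io_args(argv=None):
    parser = argparse.ArgumentParser(description="ModelArts I/O parameters")
    parser.add_argument("--data_url", type=str, default="",
                        help="path holding the training input data")
    parser.add_argument("--train_url", type=str, default="",
                        help="path that receives the training output")
    return parser.parse_args(argv)

if __name__ == "__main__":
    args = parse_io_args()
    print("input:", args.data_url)
    print("output:", args.train_url)
```

If the parameter names in the code and in the algorithm configuration differ, the flags ModelArts passes will not match what argparse expects, which is exactly the failure mode the tables above warn about.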

Configuring Hyperparameters

When you create an algorithm, ModelArts Standard allows you to customize hyperparameters so you can view or modify them anytime. Defined hyperparameters are displayed in the boot command and passed to your boot file as CLI parameters.

  1. Add hyperparameters.

    Click Add hyperparameter to manually add hyperparameters.

  2. Edit hyperparameters.

    To ensure data security, do not enter sensitive information, such as plaintext passwords.

    For details, see Table 6.

    Table 6 Hyperparameter parameters

    • Name

      Enter the hyperparameter name. Enter 1 to 64 characters. Only letters, digits, hyphens (-), and underscores (_) are allowed.

    • Type

      Select the data type of the hyperparameter: String, Integer, Float, or Boolean.

    • Default

      Set the default value of the hyperparameter. This value is used for training jobs by default.

    • Restrain

      Click Restrain and set the value range or enumerated values of the hyperparameter in the displayed dialog box.

    • Required

      Select Yes or No.
      • If you select No, you can delete the hyperparameter on the training job creation page when using this algorithm to create a training job.
      • If you select Yes, you cannot delete the hyperparameter on that page.

    • Description

      Enter the description of the hyperparameter. Only letters, digits, spaces, hyphens (-), underscores (_), commas (,), and periods (.) are allowed.
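Because defined hyperparameters are passed to the boot file as CLI parameters, the boot file must convert them to the declared type itself. The sketch below assumes a Float hyperparameter named learning_rate and a Boolean one named shuffle; both names are illustrative examples, not ModelArts defaults. Note that argparse's type=bool does not parse the string "False" as False, so Boolean hyperparameters need an explicit converter.

```python
# Parse typed hyperparameters passed on the command line. The names
# learning_rate and shuffle are hypothetical -- define the same names
# and types when creating the algorithm.
import argparse

def str_to_bool(value):
    """Convert 'true'/'false'-style strings to bool (type=bool would not)."""
    if value.lower() in ("true", "1", "yes"):
        return True
    if value.lower() in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"invalid boolean: {value!r}")

def parse_hyperparams(argv=None):
    parser = argparse.ArgumentParser(description="Training hyperparameters")
    parser.add_argument("--learning_rate", type=float, default=0.001)
    parser.add_argument("--shuffle", type=str_to_bool, default=True)
    return parser.parse_args(argv)

if __name__ == "__main__":
    hp = parse_hyperparams()
    print("learning_rate:", hp.learning_rate, "shuffle:", hp.shuffle)
```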

Supported Policies

Auto search on ModelArts automatically finds the optimal hyperparameters without any code modification. For details about parameter settings, see Creating a Training Job for Automatic Model Tuning.

Only the tensorflow_2.1.0-cuda_10.1-py_3.7-ubuntu_18.04-x86_64 and pytorch_1.8.0-cuda_10.2-py_3.7-ubuntu_18.04-x86_64 images are available for auto search.

Adding Training Constraints

You can add training constraints of the algorithm based on your needs.

  • Resource Type: Select the required resource types.
  • Multicard Training: Choose whether to support multi-card training.
  • Distributed Training: Choose whether to support distributed training.

Previewing the Runtime Environment

When creating an algorithm, click the arrow in the lower right corner of the page to view the paths of the code directory, boot file, and input and output data in the training container.

Viewing Algorithm Details

To view the algorithm information, go to the ModelArts Standard console, choose Asset Management > Algorithm Management, and click the target algorithm to open its details page.

Publishing an Algorithm

Publish your algorithm to AI Gallery to make it available to other users.

To publish an algorithm, go to the ModelArts Standard console, choose Asset Management > Algorithm Management, and click the algorithm to open its details page. Click Publish to proceed, then configure parameters on the Publish in AI Gallery page and click Publish again.
  • If the algorithm is published for the first time, set Publish Mode to Create Asset, enter an asset title, and select the available region.
  • To update a published algorithm, set Publish Mode to Add asset version, select an asset title from the Item Title drop-down list, and enter the offering version.

To publish an asset in AI Gallery for the first time, select I have read and agree to Huawei Cloud AI Gallery Publishers' Agreement and Huawei Cloud AI Gallery Service Agreement.

Go to AI Gallery to view the asset details.

Deleting an Algorithm

To delete your algorithm, choose Asset Management > Algorithm Management. Click Delete in the Operation column. In the displayed dialog box, click OK to confirm the deletion.