Creating an Algorithm
You can upload algorithms developed locally or with other tools to ModelArts for unified management. Creating a custom algorithm involves the following:
- Prerequisites
- Accessing the Algorithm Creation Page
- Setting Basic Parameters
- Setting the Boot Mode
- Configuring Pipelines
- Defining Hyperparameters
- Supported Policies
- Adding Training Constraints
- Runtime Environment Preview
- Follow-up Operations
Prerequisites
- Training data is available, either as a dataset created in ModelArts or as a dataset uploaded to an OBS directory.
- Your training script has been uploaded to the OBS directory. For details about how to develop a training script, see Developing a Custom Script.
- At least one empty folder has been created in OBS for storing the training output.
- Your account is not in arrears. Resources are consumed while training jobs are running.
- The OBS directory you use and ModelArts are in the same region.
Accessing the Algorithm Creation Page
- Log in to the ModelArts management console and click Algorithm Management in the left navigation pane.
- On the My Algorithms page, click Create. The Create Algorithm page is displayed.
Setting the Boot Mode
Select a preset image to create an algorithm.
Parameter | Description |
---|---|
Boot Mode > Preset image | Select a preset image and its version used by the algorithm. To use an old-version image, select Show Old Images. |
Code Directory | OBS path for storing the algorithm code. The files required for training, such as the training code, dependency installation packages, and pre-generated models, are uploaded to the code directory. Do not store training data in the code directory. When the training job starts, the data stored in the code directory is downloaded to the backend, and a large amount of training data may cause the download to fail. After you create the training job, ModelArts downloads the code directory and its subdirectories to the container. Take OBS path obs://obs-bucket/training-test/demo-code as an example. The content in the OBS path is automatically downloaded to ${MA_JOB_DIR}/demo-code in the training container, where demo-code (customizable) is the last-level directory of the OBS path. |
Boot File | The file must be stored in the code directory and end with .py. ModelArts supports only boot files written in Python. The boot file in the code directory is used to start a training job. |
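The code directory mapping described above can be sketched in a few lines of Python. This is only an illustration of the rule "last-level directory of the OBS path under ${MA_JOB_DIR}"; the helper name container_code_dir and the local fallback value /tmp/ma-job are hypothetical and not part of ModelArts.

```python
import os
import posixpath


def container_code_dir(obs_code_dir: str, job_dir: str) -> str:
    """Return the expected location of the code directory in the training container."""
    # Drop a trailing slash so that basename yields the last-level directory name.
    last_level = posixpath.basename(obs_code_dir.rstrip("/"))
    return posixpath.join(job_dir, last_level)


# MA_JOB_DIR is set in the training container. The fallback below is a
# hypothetical placeholder so the sketch also runs on a local machine.
job_dir = os.environ.get("MA_JOB_DIR", "/tmp/ma-job")
print(container_code_dir("obs://obs-bucket/training-test/demo-code", job_dir))
```

Inside a training container, a boot file could use the same approach to locate files that were uploaded next to it, such as a dependency list or a pre-generated model.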
Configuring Pipelines
A preset image-based algorithm obtains data from an OBS bucket or dataset for model training. The training output is stored in an OBS bucket. The input and output parameters in your algorithm code must be parsed to enable data exchange between ModelArts and OBS. For details about how to develop code for training on ModelArts, see Developing a Custom Script.
When you use a preset image to create an algorithm, configure the input and output pipelines.
- Input configurations
Table 2 Input configurations
Parameter | Description |
---|---|
Parameter Name | Set the name based on the data input parameter in your algorithm code. The name must be the same as the training input parameter parsed in your algorithm code. Otherwise, the algorithm code cannot obtain the input data. For example, if you use argparse in the algorithm code to parse data_url into the data input, set the data input parameter to data_url when creating the algorithm. |
Description | Customizable description of the input parameter. |
Obtained from | Source of the input parameter. You can select Hyperparameters (default) or Environment variables. |
Constraints | Whether data is obtained from a storage path or a ModelArts dataset. If you select the ModelArts dataset as the data source, the following constraints are added: Labeling Type (for details, see Creating a Labeling Job); Data Format, which can be Default, CarbonData, or both (Default indicates the manifest format); Data Segmentation, which is available only for image classification, object detection, text classification, and sound classification datasets. Possible values for Data Segmentation are Segmented dataset, Dataset not segmented, and Unlimited. For details, see Publishing a Data Version. |
Add | Multiple data input sources are allowed. |
- Output configurations
Table 3 Output configurations
Parameter | Description |
---|---|
Parameter Name | Set the name based on the data output parameter in your algorithm code. The name must be the same as the training output parameter parsed in your algorithm code. Otherwise, the algorithm code cannot obtain the output path. For example, if you use argparse in the algorithm code to parse train_url into the data output, set the data output parameter to train_url when creating the algorithm. |
Description | Customizable description of the output parameter. |
Obtained from | Source of the output parameter. You can select Hyperparameters (default) or Environment variables. |
Add | Multiple data output paths are allowed. |
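As an illustration of the naming rule for input and output parameters, the following minimal sketch parses data_url and train_url with argparse, matching the examples above. It covers only the default Hyperparameters source. The function name parse_io_args and the sample paths are hypothetical, and the use of parse_known_args is a defensive choice in case extra arguments are passed, not a documented requirement.

```python
import argparse


def parse_io_args(argv=None):
    """Parse the training input and output parameters passed on the command line."""
    parser = argparse.ArgumentParser(description="Boot file argument parsing (sketch)")
    # The option names must match the Parameter Name values configured for
    # the algorithm: data_url for the input and train_url for the output.
    parser.add_argument("--data_url", type=str, required=True,
                        help="path of the training input data")
    parser.add_argument("--train_url", type=str, required=True,
                        help="path for the training output")
    # Ignore arguments this sketch does not know about instead of failing.
    args, _unknown = parser.parse_known_args(argv)
    return args


# Hypothetical paths, passed explicitly so the sketch runs anywhere.
args = parse_io_args(["--data_url", "/tmp/input", "--train_url", "/tmp/output"])
print(args.data_url, args.train_url)
```

In a real boot file, calling parse_io_args() without arguments reads the values from the command line that starts the training job. If the algorithm were created with a different parameter name, the option name in add_argument would have to change with it.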
Defining Hyperparameters
When you use a preset image to create an algorithm, ModelArts allows you to customize hyperparameters so you can view or modify them anytime. After the hyperparameters are defined, they are displayed in the startup command and transferred to your boot file as CLI parameters.
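Because hyperparameters reach the boot file as CLI parameters, the boot file needs to parse them with matching names and types. The sketch below shows one way to do this for the four supported types. The hyperparameter names (optimizer, epochs, learning_rate, use_amp), their defaults, and the accepted spellings of Boolean values are assumptions made for illustration; the source does not specify how Boolean values are written on the command line.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Convert a command-line string such as 'true' or 'False' to a bool."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean value, got {value!r}")


def parse_hyperparameters(argv=None):
    """Parse hypothetical hyperparameters, one for each supported type."""
    parser = argparse.ArgumentParser(description="Hyperparameter parsing (sketch)")
    parser.add_argument("--optimizer", type=str, default="sgd")         # String
    parser.add_argument("--epochs", type=int, default=10)               # Integer
    parser.add_argument("--learning_rate", type=float, default=0.01)    # Float
    parser.add_argument("--use_amp", type=str_to_bool, default=False)   # Boolean
    # Ignore arguments that are not declared here, such as input and output paths.
    args, _unknown = parser.parse_known_args(argv)
    return args


# Explicit sample values so the sketch can be run without a training job.
args = parse_hyperparameters(["--epochs", "5", "--learning_rate=0.1", "--use_amp", "true"])
print(args.optimizer, args.epochs, args.learning_rate, args.use_amp)
```

argparse accepts both the "--name value" and "--name=value" forms, so the sketch does not depend on which form appears in the startup command.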
- Import hyperparameters.
You can click Add hyperparameter to manually add hyperparameters.
- Edit hyperparameters.
For details, see Table 4.
Table 4 Hyperparameters
Parameter | Description |
---|---|
Name | Hyperparameter name. Enter 1 to 64 characters. Only letters, digits, hyphens (-), and underscores (_) are allowed. |
Type | Type of the hyperparameter, which can be String, Integer, Float, or Boolean. |
Default | Default value of the hyperparameter, which is used for training jobs by default. |
Constraints | Click Restrain. Then, set the range of the default value or enumerated value in the dialog box displayed. |
Required | Select Yes or No. If you select No, you can delete the hyperparameter on the training job creation page when using this algorithm to create a training job. If you select Yes, you cannot delete the hyperparameter on the training job creation page when using this algorithm to create a training job. |
Description | Description of the hyperparameter. Only letters, digits, spaces, hyphens (-), underscores (_), commas (,), and periods (.) are allowed. |
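The character rules for the Name and Description fields can be checked locally before entering values in the console. The following sketch is not a ModelArts API. It assumes that "letters" means ASCII letters and that an empty description is acceptable, neither of which is stated above.

```python
import re

# Name: 1 to 64 characters from letters, digits, hyphens, and underscores.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")
# Description: letters, digits, spaces, hyphens, underscores, commas, and periods.
# No length limit is stated, so none is enforced here.
DESCRIPTION_PATTERN = re.compile(r"[A-Za-z0-9 _,.-]*")


def is_valid_name(name: str) -> bool:
    """Check a hyperparameter name against the documented character rules."""
    return NAME_PATTERN.fullmatch(name) is not None


def is_valid_description(text: str) -> bool:
    """Check a hyperparameter description against the documented character rules."""
    return DESCRIPTION_PATTERN.fullmatch(text) is not None


print(is_valid_name("learning_rate"), is_valid_name("learning rate"))
print(is_valid_description("Initial learning rate, for example 0.01."))
```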
Supported Policies
ModelArts supports auto search. Auto search automatically finds the optimal hyperparameters without any code modification. This improves model precision and convergence speed. For details about parameter settings, see Parameters of hyperparameter search.
Only the pytorch_1.8.0-cuda_10.2-py_3.7-ubuntu_18.04-x86_64 and tensorflow_2.1.0-cuda_10.1-py_3.7-ubuntu_18.04-x86_64 images are available for auto search.
Adding Training Constraints
You can add training constraints of the algorithm based on your needs.
- Resource Type: Select the required resource types.
- Multicard Training: Choose whether to support multi-card training.
- Distributed Training: Choose whether to support distributed training.
Runtime Environment Preview
When creating an algorithm, click the arrow in the lower right corner of the page to view the paths of the code directory, boot file, and input and output data in the training container.
Follow-up Operations
After an algorithm is created, use it to create a training job. For details, see Creating a Training Job.