Model Training Workflow
The process of developing an AI model is referred to as modeling, which typically involves two stages:
- Development: This involves preparing and configuring the environment and debugging code to ensure it is ready for deep learning training. It is recommended that you debug code within the ModelArts development environment.
- Experimentation: This stage focuses on refining datasets and tuning hyperparameters. Through multiple rounds of experimentation, you can train a model that meets your performance goals. It is recommended that you conduct these experiments using ModelArts training jobs.
These two stages are not strictly sequential, and you can move back and forth between them. For example, once the code stabilizes in the development stage, the workflow enters the experimentation stage, where you iterate on the model by tuning hyperparameters. Conversely, if you identify a potential training performance optimization during the experimentation stage, you can return to the development stage to optimize your code.
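The experimentation stage can be pictured as a loop over hyperparameter combinations. The following is a minimal sketch in plain Python, not a ModelArts API: `train_and_evaluate`, `run_experiments`, and the toy scoring formula are hypothetical stand-ins for a real training run and its evaluation metric.

```python
import itertools


def train_and_evaluate(learning_rate, batch_size):
    """Hypothetical stand-in for one training run; returns a score to maximize."""
    # Toy objective that peaks at learning_rate=0.01 and batch_size=64.
    return 1.0 - abs(learning_rate - 0.01) * 10 - abs(batch_size - 64) / 1000


def run_experiments(search_space):
    """Try every hyperparameter combination and return the best one with its score."""
    best_params, best_score = None, float("-inf")
    names = list(search_space)
    for values in itertools.product(*(search_space[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_evaluate(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Assumed search space, chosen only for illustration.
search_space = {"learning_rate": [0.1, 0.01, 0.001], "batch_size": [32, 64]}
best_params, best_score = run_experiments(search_space)
print(best_params, best_score)
```

In practice, each call to the stand-in function would correspond to one training job, and a disappointing best score would be the signal to return to the development stage.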
ModelArts provides model training capabilities, allowing you to monitor training progress and continuously tune model parameters. You can also select resource pools of different specifications based on your data requirements for model training.
Follow the guidance below to train models on ModelArts.
| Task | Subtask | Description |
|---|---|---|
| Making preparations | Prepare training code. | The essential elements for model training are training code, a training framework (image), and training data. Training code consists of the boot file or boot command and the training dependency packages. |
| | Prepare a training image. | Model training supports multiple image sources. For details, see Preparing a Model Training Image. |
| | Prepare training data. | In addition to training datasets, training data can also include predictive models. Prepare your data before creating a training job. |
| Creating an algorithm | Create an algorithm. | Before creating a production training job, you must prepare your own algorithm or subscribe to an algorithm from AI Gallery. |
| Creating a production training job | Use basic training features. | |
| | Use advanced training features. | ModelArts supports advanced training features. |
| Viewing training results and logs | View training job details. | During or after a training job, you can view parameter settings, job events, and other details on the training job details page. |
| | View training job logs. | Training logs record the execution process and exception information. You can use these logs to locate issues that occurred during job execution. |
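
To make the idea of a boot file more concrete, here is a minimal sketch of what such an entry script might look like. It is an illustration under assumptions, not the platform's required interface: the argument names `--data_url`, `--train_url`, and `--epochs`, the `summary.json` output file, and the placeholder training loop are all chosen for this example.

```python
import argparse
import json
import os
import tempfile


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Illustrative boot file")
    # Argument names are assumptions for this sketch, not a confirmed platform contract.
    parser.add_argument("--data_url", required=True, help="directory holding training data")
    parser.add_argument("--train_url", required=True, help="directory for training output")
    parser.add_argument("--epochs", type=int, default=1, help="number of training epochs")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    os.makedirs(args.train_url, exist_ok=True)
    input_files = sorted(os.listdir(args.data_url))
    for epoch in range(1, args.epochs + 1):
        # Placeholder for a real training step; printed so it would appear in job logs.
        print(f"epoch {epoch}/{args.epochs}: {len(input_files)} input files")
    summary = {"epochs": args.epochs, "num_files": len(input_files)}
    with open(os.path.join(args.train_url, "summary.json"), "w") as f:
        json.dump(summary, f)
    return summary


# Local dry run with temporary directories standing in for real data and output paths.
with tempfile.TemporaryDirectory() as data_dir, tempfile.TemporaryDirectory() as out_dir:
    open(os.path.join(data_dir, "sample.txt"), "w").close()
    result = main(["--data_url", data_dir, "--train_url", out_dir, "--epochs", "2"])
print(result)
```

A real boot file would typically call `main()` under an `if __name__ == "__main__":` guard and read its arguments from the command line instead of using the dry run shown here.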

Creation methods and their use cases:

| Creation Method | Use Case |
|---|---|
| Preset images | Use this method if you have developed your algorithm locally using a mainstream framework. |
| Custom images | Use this method if your algorithm relies on a non-mainstream framework. You can create an image using your algorithm and use the image to create training jobs. |
| Existing algorithms | Use this method if you want to use algorithms already managed in the Algorithm Management module, including those you created yourself or those subscribed to from AI Gallery. |
| AI Gallery algorithms | Use this method if you want to leverage ready-to-use algorithms. You can subscribe to algorithms from AI Gallery to quickly create training jobs. |
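
The use cases in the table can be read as a simple decision rule. The sketch below encodes one possible reading in Python. The function name `choose_creation_method`, its parameters, and the order in which the conditions are checked are assumptions made for illustration. The table itself does not define a priority between methods.

```python
def choose_creation_method(uses_managed_algorithm, wants_ready_to_use, mainstream_framework):
    """Map a situation to a creation method name from the table (assumed priority order)."""
    if uses_managed_algorithm:
        # Algorithm already managed in Algorithm Management.
        return "Existing algorithms"
    if wants_ready_to_use:
        # Subscribe to a ready-to-use algorithm from AI Gallery.
        return "AI Gallery algorithms"
    if mainstream_framework:
        # Locally developed algorithm on a mainstream framework.
        return "Preset images"
    # Algorithm relies on a non-mainstream framework.
    return "Custom images"


print(choose_creation_method(False, False, True))
print(choose_creation_method(False, False, False))
```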