

Distributed Training

Updated on 2024-06-12 GMT+08:00

ModelArts provides the following capabilities for distributed training:

  • A wide range of built-in images
  • Custom development environments based on the built-in images
  • Tutorials that help you get started with distributed training quickly
  • Distributed training debugging in development tools such as PyCharm, VS Code, and JupyterLab

Constraints

  • In this document, the development environment means the new-version Notebook provided by ModelArts. The old-version Notebook is not covered.
  • If you change the flavor of a notebook instance, only single-node debugging is available. You cannot perform distributed debugging or submit remote training jobs.
  • Only the PyTorch and MindSpore AI frameworks support multi-node distributed debugging. With MindSpore, each node must be equipped with eight cards.
  • Replace the OBS paths in the debugging code with your own OBS paths.
  • The debugging code in this document is written in PyTorch. The process is the same for other AI frameworks. You only need to modify some parameters.
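
As an illustration of the kind of parameters that differ between single-node and multi-node debugging, the following is a minimal sketch that reads the standard PyTorch `env://` rendezvous variables (`MASTER_ADDR`, `MASTER_PORT`, `WORLD_SIZE`, `RANK`) and the `LOCAL_RANK` variable set by `torchrun`. The names `DistConfig` and `read_dist_config` are hypothetical helpers introduced here, and the defaults are assumptions for a single-process run. ModelArts may inject its own environment variables, which are not shown.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DistConfig:
    """Hypothetical container for the settings a distributed run needs."""
    master_addr: str
    master_port: int
    world_size: int   # total number of processes across all nodes
    rank: int         # global index of this process
    local_rank: int   # index of this process on its own node

    @property
    def is_distributed(self) -> bool:
        # A world size of 1 means plain single-process debugging.
        return self.world_size > 1


def read_dist_config(env=None) -> DistConfig:
    """Read the env:// variables, falling back to single-process defaults.

    The defaults are assumptions chosen for local debugging.
    """
    env = os.environ if env is None else env
    cfg = DistConfig(
        master_addr=env.get("MASTER_ADDR", "127.0.0.1"),
        master_port=int(env.get("MASTER_PORT", "29500")),
        world_size=int(env.get("WORLD_SIZE", "1")),
        rank=int(env.get("RANK", "0")),
        local_rank=int(env.get("LOCAL_RANK", "0")),
    )
    # Fail early on an inconsistent launch configuration.
    if not 0 <= cfg.rank < cfg.world_size:
        raise ValueError(
            f"RANK {cfg.rank} is outside the range [0, {cfg.world_size})"
        )
    return cfg


if __name__ == "__main__":
    print(read_dist_config())
```

In a PyTorch script, a call such as `torch.distributed.init_process_group(backend="nccl", init_method="env://")` reads the same `MASTER_ADDR`, `MASTER_PORT`, `WORLD_SIZE`, and `RANK` variables, so a check like this can run before initialization to catch a misconfigured launch. The backend shown is only an example for GPU nodes.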

