Supported Models
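Supported models fall into two groups: large language models (LLMs) and multimodal models. Each group is listed below with its training scenarios, training framework, minimum version, and open-source weight download address.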

Large language models (LLMs)

| Series | Model | Training Scenario | Training Framework | Version | Open-Source Weight File Download Address |
|---|---|---|---|---|---|
| DeepSeek | DeepSeek-R1-671B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | |
| DeepSeek | DeepSeek-V3-671B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | https://huggingface.co/deepseek-ai/DeepSeek-V3-Base/tree/main |
| DeepSeek | DeepSeek-V2-Lite 16B | Pre-training and full-parameter fine-tuning | MindSpeed-LLM | 6.5.906 or later | |
| Qwen2 | Qwen2-0.5B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | |
| Qwen2 | Qwen2-0.5B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.902 or later | |
| Qwen2 | Qwen2-1.5B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | |
| Qwen2 | Qwen2-7B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | |
| Qwen2 | Qwen2-7B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.902 or later | |
| Qwen2 | Qwen2-72B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | |
| Qwen2 | Qwen2-72B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.902 or later | |
| Qwen2.5 | Qwen2.5-0.5B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | |
| Qwen2.5 | Qwen2.5-0.5B | Pre-training and fine-tuning | LLaMA-Factory | | |
| Qwen2.5 | Qwen2.5-1.5B | Reinforcement learning | MindSpeed-RL | 6.5.906 or later | |
| Qwen2.5 | Qwen2.5-7B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | |
| Qwen2.5 | Qwen2.5-7B | Pre-training and fine-tuning | LLaMA-Factory | | |
| Qwen2.5 | Qwen2.5-7B | Reinforcement learning | MindSpeed-RL | 6.5.906 or later | |
| Qwen2.5 | Qwen2.5-14B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | |
| Qwen2.5 | Qwen2.5-14B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.902 or later | |
| Qwen2.5 | Qwen2.5-14B | Reinforcement learning | LLaMA-Factory | 6.5.907 or later | |
| Qwen2.5 | Qwen2.5-32B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | |
| Qwen2.5 | Qwen2.5-32B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.902 or later | |
| Qwen2.5 | Qwen2.5-32B | Reinforcement learning | MindSpeed-RL | 6.5.906 or later | |
| Qwen2.5 | Qwen2.5-32B | Reinforcement learning | VeRL | 6.5.907 or later | |
| Qwen2.5 | Qwen2.5-72B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.902 or later | |
| Qwen2.5 | Qwen2.5-72B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | |
| Qwen2.5 | Qwen2.5-72B | Reinforcement learning | LLaMA-Factory | 6.5.907 or later | |
| Qwen3 | Qwen3-0.6B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.905 or later | |
| Qwen3 | Qwen3-0.6B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.905 or later | |
| Qwen3 | Qwen3-1.7B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.905 or later | |
| Qwen3 | Qwen3-1.7B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.905 or later | |
| Qwen3 | Qwen3-4B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.905 or later | |
| Qwen3 | Qwen3-4B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.905 or later | |
| Qwen3 | Qwen3-4B | Reinforcement learning | VeRL | 6.5.907 or later | |
| Qwen3 | Qwen3-8B | Reinforcement learning | VeRL | 6.5.906 or later | |
| Qwen3 | Qwen3-8B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.905 or later | |
| Qwen3 | Qwen3-8B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.905 or later | |
| Qwen3 | Qwen3-14B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.905 or later | |
| Qwen3 | Qwen3-14B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.905 or later | |
| Qwen3 | Qwen3-32B | Reinforcement learning | VeRL | 6.5.906 or later | |
| Qwen3 | Qwen3-32B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.905 or later | |
| Qwen3 | Qwen3-32B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.905 or later | |
| Qwen3 | Qwen3-30B-A3B | Pre-training and full-parameter fine-tuning | MindSpeed-LLM | 6.5.905 or later | |
| Qwen3 | Qwen3-30B-A3B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.905 or later | |
| Qwen3 | Qwen3-235B-A22B | Pre-training and full-parameter fine-tuning | MindSpeed-LLM | 6.5.905 or later | |
| Qwen3 | Qwen3-235B-A22B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.905 or later | |
| Llama | Llama3.1-8B/70B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct |
| Llama | Llama3.1-8B/70B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.902 or later | |
| Llama | Llama3.2-1B/3B | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | |
| Llama | Llama3.2-1B/3B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.902 or later | |
| GLM | glm-4-9b-chat | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | |
| GLM | glm-4-9b-chat | Pre-training and fine-tuning | LLaMA-Factory | 6.5.902 or later | |
| Mistral AI | Mixtral-8x7B-Instruct-v0.1 | Pre-training and fine-tuning | MindSpeed-LLM | 6.5.902 or later | |


Multimodal models

| Series | Model | Training Scenario | Training Framework | Version | Open-Source Weight File Download Address |
|---|---|---|---|---|---|
| Qwen2 VL | Qwen2-VL-2B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.902 or later | |
| Qwen2 VL | Qwen2-VL-7B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.902 or later | |
| Qwen2 VL | Qwen2-VL-72B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.902 or later | |
| Qwen2.5 VL | Qwen2.5-VL-3B | Reinforcement learning | VeRL | 6.5.906 or later | |
| Qwen2.5 VL | Qwen2.5-VL-3B | Pre-training and fine-tuning | MindSpeed-MM | 6.5.907 or later | |
| Qwen2.5 VL | Qwen2.5-VL-3B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.907 or later | |
| Qwen2.5 VL | Qwen2.5-VL-7B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.905 or later | |
| Qwen2.5 VL | Qwen2.5-VL-7B | Pre-training and fine-tuning | MindSpeed-MM | 6.5.907 or later | |
| Qwen2.5 VL | Qwen2.5-VL-7B | Reinforcement learning | VeRL | 6.5.906 or later | |
| Qwen2.5 VL | Qwen2.5-VL-32B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.906 or later | |
| Qwen2.5 VL | Qwen2.5-VL-32B | Reinforcement learning | VeRL | 6.5.905 or later | |
| Qwen2.5 VL | Qwen2.5-VL-72B | Pre-training and fine-tuning | LLaMA-Factory | 6.5.905 or later | |
| Qwen2.5 VL | Qwen2.5-VL-72B | Reinforcement learning | VeRL | 6.5.906 or later | |
| Gemma | Gemma3-27b | Pre-training and fine-tuning | LLaMA-Factory | 6.5.905 or later | |

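
The "X or later" entries in the Version column can be read as minimum versions. The following Python sketch shows one way to check an installed version against a few rows copied from the tables above. It is an illustration only: the names `SupportEntry`, `SUPPORT_MATRIX`, `parse_version`, and `is_supported` are hypothetical, and the code assumes that version strings are dot-separated integers compared field by field.

```python
from typing import NamedTuple


class SupportEntry(NamedTuple):
    model: str
    scenario: str
    framework: str
    min_version: str  # the "X or later" value from the Version column


# A few rows copied from the tables above; this is not the full matrix.
SUPPORT_MATRIX = [
    SupportEntry("DeepSeek-V3-671B", "Pre-training and fine-tuning", "MindSpeed-LLM", "6.5.902"),
    SupportEntry("Qwen2.5-7B", "Reinforcement learning", "MindSpeed-RL", "6.5.906"),
    SupportEntry("Qwen3-8B", "Reinforcement learning", "VeRL", "6.5.906"),
    SupportEntry("Qwen2.5-VL-7B", "Pre-training and fine-tuning", "MindSpeed-MM", "6.5.907"),
]


def parse_version(version: str) -> tuple:
    """Assumption: versions are dot-separated integers such as '6.5.902'."""
    return tuple(int(part) for part in version.split("."))


def is_supported(model: str, framework: str, installed_version: str) -> bool:
    """Return True if a listed (model, framework) row accepts installed_version."""
    installed = parse_version(installed_version)
    return any(
        entry.model == model
        and entry.framework == framework
        and installed >= parse_version(entry.min_version)
        for entry in SUPPORT_MATRIX
    )
```

Comparing integer tuples rather than raw strings avoids ordering mistakes when a field gains a digit, for example "6.5.1000" compared with "6.5.902".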