Last updated: 2025-08-20 GMT+08:00

Supported Models

The supported models are listed in two tables: Table 1 covers large language models and Table 2 covers multimodal models.

Table 1 Supported large language models and where to obtain their weights

Model and parameter size

Adapted to MindSpeed-LLM

Adapted to Llama-Factory

Adapted to VeRL

Open-source weight download URL

llama3.1-8b

x

https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct

llama3.1-70b

x

https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct

llama3.2-1b

x

https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct

llama3.2-3b

x

https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct

qwen2-0.5b

x

https://huggingface.co/Qwen/Qwen2-0.5B-Instruct

qwen2-1.5b

x

x

https://huggingface.co/Qwen/Qwen2-1.5B-Instruct

qwen2-7b

x

https://huggingface.co/Qwen/Qwen2-7B-Instruct

qwen2-72b

x

https://huggingface.co/Qwen/Qwen2-72B-Instruct

qwen2.5-0.5b

x

https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct

qwen2.5-7b

x

https://huggingface.co/Qwen/Qwen2.5-7B-Instruct

qwen2.5-14b

x

https://huggingface.co/Qwen/Qwen2.5-14B-Instruct

qwen2.5-32b

x

https://huggingface.co/Qwen/Qwen2.5-32B-Instruct

qwen2.5-72b

x

https://huggingface.co/Qwen/Qwen2.5-72B-Instruct

qwen3-0.6b

x

https://huggingface.co/Qwen/Qwen3-0.6B

qwen3-1.7b

x

https://huggingface.co/Qwen/Qwen3-1.7B

qwen3-4b

x

https://huggingface.co/Qwen/Qwen3-4B

qwen3-8b

x

https://huggingface.co/Qwen/Qwen3-8B

qwen3-14b

x

https://huggingface.co/Qwen/Qwen3-14B

qwen3-32b

https://huggingface.co/Qwen/Qwen3-32B

qwen3_moe-30B_A3B

x

https://huggingface.co/Qwen/Qwen3-30B-A3B

qwen3_moe-235B_A22B

x

https://huggingface.co/Qwen/Qwen3-235B-A22B

glm4-9b

x

https://huggingface.co/THUDM/glm-4-9b-chat

mixtral-8x7b

x

x

https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1

DeepSeek-V3

x

x

https://huggingface.co/deepseek-ai/DeepSeek-V3-Base/tree/main

DeepSeek-R1

x

x

https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main

Table 2 Supported multimodal models and where to obtain their weights

Model and parameter size

Adapted to MindSpeed-LLM

Adapted to Llama-Factory

Adapted to VeRL

Open-source weight download URL

qwen2_vl-2b

x

x

https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct/tree/main

qwen2_vl-7b

x

x

https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct/tree/main

qwen2_vl-72b

x

x

https://huggingface.co/Qwen/Qwen2-VL-72B-Instruct

qwen2.5_vl-7b

x

x

https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct

qwen2.5_vl-32b

x

x

https://huggingface.co/Qwen/Qwen2.5-VL-32B-Instruct

qwen2.5_vl-72b

x

x

https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct

internvl2.5-8b

x

x

https://huggingface.co/OpenGVLab/InternVL2_5-8B

internvl2.5-38b

x

x

https://huggingface.co/OpenGVLab/InternVL2_5-38B

internvl2.5-78b

x

x

https://huggingface.co/OpenGVLab/InternVL2_5-78B

gemma3-27b

x

x

https://huggingface.co/google/gemma-3-27b-it
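Some of the weight URLs in the tables point at a repository's landing page and others at its `/tree/main` file view. Download tools usually expect the bare `owner/name` repository id instead. The following is a minimal sketch of extracting that id from a table URL; the helper name `repo_id_from_url` is hypothetical and not part of any tool referenced in this document.

```python
from urllib.parse import urlparse


def repo_id_from_url(url: str) -> str:
    """Return the 'owner/name' repository id from a Hugging Face model URL.

    Hypothetical helper for illustration only.
    """
    # Split the URL path into its non-empty segments.
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a model repository URL: {url}")
    # The first two segments identify the repository; anything after
    # them (such as 'tree/main') only selects a browsing view.
    return "/".join(parts[:2])


if __name__ == "__main__":
    for url in (
        "https://huggingface.co/Qwen/Qwen3-8B",
        "https://huggingface.co/deepseek-ai/DeepSeek-V3-Base/tree/main",
    ):
        print(repo_id_from_url(url))
```

The resulting id could then be passed to a downloader, for example `snapshot_download(repo_id=...)` from the third-party `huggingface_hub` package, which this document does not itself mention. Some repositories may require accepting a license and authenticating before the weights can be downloaded.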

Retired Models

The following models are no longer updated with new releases. To train any of them, refer to the training documentation for version 6.5.901.
  • Llama2/3: llama2-7b/13b/70b, llama3-8b/70b
  • Qwen/Qwen1.5: qwen-7b/14b/72b, qwen1.5-7b/14b/32b/72b
  • Yi: yi-6b, yi-32b
  • BaiChuan2: baichuan2-7b, baichuan2-13b
  • mistral-7b, falcon-11B, MiniCPM-2B, MiniCPM3-4B, glm3-6b
