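Supported models
The supported models are listed in Table 1 (large language models) and Table 2 (multimodal models). In both tables, √ means the model is adapted to the framework and x means it is not.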
Table 1 Large language models

Model and parameter size | Adapted to MindSpeed-LLM | Adapted to Llama-Factory | Adapted to VeRL | Open-source weights URL |
---|---|---|---|---|
llama3.1-8b | √ | √ | x | https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct |
llama3.1-70b | √ | √ | x | https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct |
llama3.2-1b | √ | √ | x | |
llama3.2-3b | √ | √ | x | |
qwen2-0.5b | √ | √ | x | |
qwen2-1.5b | √ | x | x | |
qwen2-7b | √ | √ | x | |
qwen2-72b | √ | √ | x | |
qwen2.5-0.5b | √ | √ | x | |
qwen2.5-7b | √ | √ | x | |
qwen2.5-14b | √ | √ | x | |
qwen2.5-32b | √ | √ | x | |
qwen2.5-72b | √ | √ | x | |
qwen3-0.6b | √ | √ | x | |
qwen3-1.7b | √ | √ | x | |
qwen3-4b | √ | √ | x | |
qwen3-8b | √ | √ | x | |
qwen3-14b | √ | √ | x | |
qwen3-32b | √ | √ | √ | |
qwen3_moe-30B_A3B | √ | √ | x | |
qwen3_moe-235B_A22B | √ | √ | x | |
glm4-9b | √ | √ | x | |
mixtral-8x7b | √ | x | x | |
DeepSeek-V3 | √ | x | x | https://huggingface.co/deepseek-ai/DeepSeek-V3-Base/tree/main |
DeepSeek-R1 | √ | x | x | |

Table 2 Multimodal models

Model and parameter size | Adapted to MindSpeed-LLM | Adapted to Llama-Factory | Adapted to VeRL | Open-source weights URL |
---|---|---|---|---|
qwen2_vl-2b | x | √ | x | |
qwen2_vl-7b | x | √ | x | |
qwen2_vl-72b | x | √ | x | |
qwen2.5_vl-7b | x | √ | x | |
qwen2.5_vl-32b | x | x | √ | |
qwen2.5_vl-72b | x | √ | x | |
internvl2.5-8b | x | √ | x | |
internvl2.5-38b | x | √ | x | |
internvl2.5-78b | x | √ | x | |
gemma3-27b | x | √ | x | |

Discontinued models
- Llama2/3: llama2-7b/13b/70b, llama3-8b/70b
- Qwen/Qwen1.5: qwen-7b/14b/72b, qwen1.5-7b/14b/32b/72b
- Yi: yi-6b, yi-32b
- BaiChuan2: baichuan2-7b, baichuan2-13b
- mistral-7b, falcon-11B, MiniCPM-2B, MiniCPM3-4B, glm3-6b