Supported Models
Table 1 lists the models supported by this solution and where to obtain their weights.
Category | Model | W4A16 quantization (AWQ) | W8A8 (compressed-tensors) | Model weight download URL |
---|---|---|---|---|
LLM | DeepSeek-R1-Distill-Qwen-1.5B | × | × | |
LLM | DeepSeek-R1-Distill-Qwen-7B | × | × | |
LLM | DeepSeek-R1-Distill-Qwen-14B | × | × | |
LLM | DeepSeek-R1-Distill-Qwen-32B | × | × | |
LLM | DeepSeek-R1-Distill-Llama-8B | × | × | |
LLM | DeepSeek-R1-Distill-Llama-70B | × | × | |
LLM | GLM4-9B | × | × | |
LLM | Qwen2-0.5B | √ | √ | |
LLM | Qwen2-1.5B | √ | √ | |
LLM | Qwen2-7B | √ | × | |
LLM | Qwen2-72B | √ | √ | |
LLM | Qwen2-57B-A14B | × | × | |
LLM | Qwen2.5-0.5B | √ | √ | |
LLM | Qwen2.5-1.5B | √ | √ | |
LLM | Qwen2.5-3B | √ | √ | |
LLM | Qwen2.5-7B | √ | × | |
LLM | Qwen2.5-14B | √ | √ | |
LLM | Qwen2.5-32B | √ | √ | |
LLM | Qwen2.5-72B | √ | √ | |
LLM | Qwen3-0.6B | × | × | |
LLM | Qwen3-1.7B | × | × | |
LLM | Qwen3-4B | × | × | |
LLM | Qwen3-8B | × | × | |
LLM | Qwen3-14B | × | × | |
LLM | Qwen3-32B | × | × | |
LLM | Qwen3-30B-A3B | × | × | |
LLM | Qwen3-235B-A22B | × | × | |
LLM | QWQ-32B | × | × | |
Multimodal understanding | Qwen2.5VL-7B | × | × | |
Multimodal understanding | Qwen2.5VL-32B | × | × | |
Multimodal understanding | Qwen2.5VL-72B | √ | × | |
Multimodal understanding | gemma3-27B | × | × | |
Embedding & Rerank | bge-reranker-v2-m3 | × | × | |
Embedding & Rerank | bge-base-en-v1.5 | × | × | |
Embedding & Rerank | bge-base-zh-v1.5 | × | × | |
Embedding & Rerank | bge-large-en-v1.5 | × | × | |
Embedding & Rerank | bge-large-zh-v1.5 | × | × | |
Embedding & Rerank | bge-m3 | × | × | |