Minimum Card Configuration per Model
The recommended training parameters and compute specifications for each model are shown in Table 1; card configurations are currently provided only for the fine-tuning (SFT) and pre-training (PT) stages. A Snt9B node typically provides 8 cards per node, while a Snt9B23 node provides 8 cards = 16*DIE per machine, where 1*DIE is equivalent to 1 Snt9B card; when setting the parallel strategy for actual training on Snt9B23, 2*DIE is the minimum unit (see the sketch below). The configurations here are for reference only: jobs that need fewer than 8 cards are generally still trained on 8 cards, and users can scale the card counts up or down from these configurations.
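As a rough illustration of the card/DIE equivalence, the hypothetical helper below maps a Snt9B card count onto the equivalent Snt9B23 DIE count, rounding up to the 2*DIE minimum parallelism unit. The function name and rounding rule are our own reading of the rules above, not a MindSpeed-LLM API.

```python
# A minimal sketch, assuming 1 Snt9B card == 1 Snt9B23 DIE and a 2*DIE
# minimum parallelism unit on Snt9B23; the helper name is hypothetical.

def snt9b23_die_for(snt9b_cards: int) -> int:
    """Map a Snt9B card count to the equivalent Snt9B23 DIE count."""
    die = snt9b_cards          # 1 card is compute-equivalent to 1 DIE
    if die % 2:                # Snt9B23 parallel strategies use units of 2*DIE,
        die += 1               # so odd counts round up
    return die

assert snt9b23_die_for(1) == 2  # matches llama3.2-1b: 1*Ascend -> 2*Ascend
assert snt9b23_die_for(4) == 4  # matches llama3.1-8b: 4*Ascend stays 4*DIE
```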
* In the table, "-" means not supported. In the card-count columns, 4*Ascend means 4 cards on Snt9B and 4*DIE on Snt9B23, and so on.
Table 1: Minimum card/DIE configuration per model (MindSpeed-LLM)

| Supported model | Training strategy | Sequence length (SEQ_LEN) | Snt9B | Snt9B23 |
| --- | --- | --- | --- | --- |
| llama3.1-8b | full | 4096/8192 | 4*Ascend | 4*Ascend |
| llama3.1-8b | lora | 4096/8192 | 4*Ascend | 4*Ascend |
| llama3.1-70b | full | 4096 | 32*Ascend | 32*Ascend |
| llama3.1-70b | lora | 4096 | 16*Ascend | 16*Ascend |
| llama3.1-70b | full | 8192 | 64*Ascend | 64*Ascend |
| llama3.1-70b | lora | 8192 | 16*Ascend | 16*Ascend |
| llama3.2-1b | full/lora | 4096/8192 | 1*Ascend | 2*Ascend |
| llama3.2-3b | full | 4096/8192 | 2*Ascend | 2*Ascend |
| llama3.2-3b | lora | 4096/8192 | 1*Ascend | 2*Ascend |
| qwen2-0.5b | full/lora | 4096/8192 | 1*Ascend | 2*Ascend |
| qwen2-1.5b | full/lora | 4096/8192 | 1*Ascend | 2*Ascend |
| qwen2-7b | full | 4096 | 4*Ascend | 4*Ascend |
| qwen2-7b | lora | 4096 | 2*Ascend | 2*Ascend |
| qwen2-7b | full | 8192 | 8*Ascend | 8*Ascend |
| qwen2-7b | lora | 8192 | 2*Ascend | 2*Ascend |
| qwen2-72b | full | 4096 | 32*Ascend | 32*Ascend |
| qwen2-72b | lora | 4096 | 16*Ascend | 16*Ascend |
| qwen2-72b | full | 8192 | 64*Ascend | 64*Ascend |
| qwen2-72b | lora | 8192 | 16*Ascend | 16*Ascend |
| qwen2.5-0.5b | full/lora | 4096/8192 | 1*Ascend | 2*Ascend |
| qwen2.5-7b | full | 4096 | 2*Ascend | 2*Ascend |
| qwen2.5-7b | lora | 4096 | 2*Ascend | 2*Ascend |
| qwen2.5-7b | full | 8192 | 2*Ascend | 2*Ascend |
| qwen2.5-7b | lora | 8192 | 2*Ascend | 2*Ascend |
| qwen2.5-14b | full | 4096 | 8*Ascend | 8*Ascend |
| qwen2.5-14b | lora | 4096 | 4*Ascend | 4*Ascend |
| qwen2.5-14b | full | 8192 | 8*Ascend | 8*Ascend |
| qwen2.5-14b | lora | 8192 | 8*Ascend | 8*Ascend |
| qwen2.5-32b | full | 4096 | 16*Ascend | 16*Ascend |
| qwen2.5-32b | lora | 4096 | 16*Ascend | 16*Ascend |
| qwen2.5-32b | full | 8192 | 16*Ascend | 16*Ascend |
| qwen2.5-32b | lora | 8192 | 16*Ascend | 16*Ascend |
| qwen2.5-72b | full | 4096 | 32*Ascend | 32*Ascend |
| qwen2.5-72b | lora | 4096 | 16*Ascend | 16*Ascend |
| qwen2.5-72b | full | 8192 | 64*Ascend | 64*Ascend |
| qwen2.5-72b | lora | 8192 | 16*Ascend | 16*Ascend |
| qwen3-0.6b | full/lora | 4096/8192 | 8*Ascend | 8*Ascend |
| qwen3-1.7b | full/lora | 4096/8192 | 8*Ascend | 8*Ascend |
| qwen3-4b | full/lora | 4096/8192 | 8*Ascend | 8*Ascend |
| qwen3-8b | full/lora | 4096/8192 | 8*Ascend | 8*Ascend |
| qwen3-14b | full/lora | 4096/8192 | 8*Ascend | 8*Ascend |
| qwen3-32b | full | 4096/8192 | 32*Ascend | 32*Ascend |
| qwen3-32b | lora | 4096 | 8*Ascend | 8*Ascend |
| qwen3-32b | lora | 8192 | 16*Ascend | 16*Ascend |
| qwen3_moe-30B_A3B | full | 4096 | 16*Ascend | 16*Ascend |
| qwen3_moe-30B_A3B | full | 8192 | 32*Ascend | 32*Ascend |
| qwen3_moe-30B_A3B | lora | 4096/8192 | 16*Ascend | 16*Ascend |
| qwen3_moe-235B_A22B | full | 4096 | 256*Ascend | 256*Ascend |
| qwen3_moe-235B_A22B | lora | 4096 | 128*Ascend | 128*Ascend |
| glm4-9b | full | 4096/8192 | 8*Ascend | 8*Ascend |
| glm4-9b | lora | 4096/8192 | 2*Ascend | 2*Ascend |
| mixtral-8x7b | full | 4096/8192 | 16*Ascend | 16*Ascend |
| DeepSeek-V3/R1 | full | 4096 | 512*Ascend | 512*Ascend |
| DeepSeek-V3/R1 | lora | 4096 | 64*Ascend | 64*Ascend |
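For checking a planned job against Table 1 in scripts, the table can be transcribed into a plain mapping, as in the hypothetical sketch below. The dictionary and helper are ours, only a few Snt9B rows are shown, and the values are copied from Table 1.

```python
# Hypothetical lookup of minimum Snt9B card counts, transcribed from a few
# rows of Table 1; extend with the remaining rows as needed.
MIN_CARDS_SNT9B = {
    # (model, strategy, seq_len): minimum cards
    ("qwen2.5-7b", "full", 4096): 2,
    ("qwen2.5-72b", "full", 8192): 64,
    ("qwen2.5-72b", "lora", 8192): 16,
    ("glm4-9b", "lora", 4096): 2,
}

def min_cards(model: str, strategy: str, seq_len: int) -> int:
    """Return the minimum Snt9B card count for a combination in the table,
    or raise if the combination is not listed."""
    try:
        return MIN_CARDS_SNT9B[(model, strategy, seq_len)]
    except KeyError:
        raise ValueError(f"unsupported combination: {model}/{strategy}/{seq_len}")

print(min_cards("qwen2.5-72b", "lora", 8192))  # 16
```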

1. When the distributed optimizer is enabled in MindSpeed-LLM, optimizer state is sharded across all machines in the cluster, so the optimal configuration depends on the total card count (see the sketch below).
2. The current benchmark configurations were chosen to balance the minimum runnable card count against optimal performance; in practice, parameters can be adjusted according to cluster size and performance trade-offs.
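To make note 1 concrete, a back-of-the-envelope sketch follows: with a ZeRO-1-style distributed optimizer, optimizer state is split evenly across ranks, so per-card memory falls as the cluster grows. The 12 bytes/parameter figure assumes Adam-style mixed-precision state (fp32 master weights plus two fp32 moments); the numbers and function are our assumptions, not measured MindSpeed-LLM behavior.

```python
# A back-of-the-envelope estimate of per-card optimizer-state memory when the
# distributed optimizer shards state across every card in the cluster.
# Assumed: Adam-style state in mixed precision, i.e. fp32 master weights (4 B)
# plus two fp32 moments (8 B) per parameter = 12 bytes/parameter.
BYTES_PER_PARAM_OPT_STATE = 12

def optimizer_state_gib_per_card(n_params: float, world_size: int) -> float:
    """Optimizer-state bytes per card, in GiB, with even sharding."""
    total = n_params * BYTES_PER_PARAM_OPT_STATE
    return total / world_size / 2**30

# Example: a 72B-parameter model on 32 vs. 64 cards.
print(f"{optimizer_state_gib_per_card(72e9, 32):.1f} GiB/card on 32 cards")
print(f"{optimizer_state_gib_per_card(72e9, 64):.1f} GiB/card on 64 cards")
```

Under these assumptions, doubling the cluster from 32 to 64 cards halves the per-card optimizer-state footprint (about 25 GiB to about 13 GiB for a 72B model), which is why the optimal parallel configuration shifts with card count.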