Updated on 2025-12-11 GMT+08:00

AI Compute Service Capability Map

ModelArts supports the training and inference of the following open-source models using AI Compute Service NPUs.

LLMs

ModelArts now supports several leading LLMs on AI Compute Service NPUs. These adapted models enable both inference and training tasks directly on NPUs.

Table 1 LLM inference capabilities

| Supported Model | Supported Model Parameters | Use Case | Software Technology Stack | Documentation |
|---|---|---|---|---|
| Llama3 | Llama3-8B, Llama3-70B, Llama3.1-8B, Llama3.1-70B, Llama3.2-1B, Llama3.2-3B | Inference | Ascend-vLLM | LLM Inference |
| Qwen2 | Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-72B | Inference | Ascend-vLLM | LLM Inference |
| Qwen2.5 | Qwen2.5-0.5B, Qwen2.5-1.5B, Qwen2.5-3B, Qwen2.5-7B, Qwen2.5-14B, Qwen2.5-32B, Qwen2.5-72B | Inference | Ascend-vLLM | LLM Inference |
| Qwen3 | Qwen3-0.6B, Qwen3-1.7B, Qwen3-4B, Qwen3-8B, Qwen3-14B, Qwen3-30B-A3B, Qwen3-32B, Qwen3-235B-A22B, Qwen3-235B-A22B-Thinking-2507, Qwen3-235B-A22B-Instruct-2507, Qwen3-Coder-480B-A35B, Qwen3-Embedding-0.6B, Qwen3-Embedding-4B, Qwen3-Embedding-8B, Qwen3-Reranker-0.6B, Qwen3-Reranker-4B, Qwen3-Reranker-8B | Inference | Ascend-vLLM | LLM Inference |
| GLM-4 | GLM-4-9B | Inference | Ascend-vLLM | LLM Inference |
| BGE | bge-reranker-v2-m3, bge-base-en-v1.5, bge-base-zh-v1.5, bge-large-en-v1.5, bge-large-zh-v1.5, bge-m3 | Inference | Ascend-vLLM | LLM Inference |
| DeepSeek-R1-Distill | DeepSeek-R1-Distill-Llama-8B, DeepSeek-R1-Distill-Llama-70B, DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-14B, DeepSeek-R1-0528-Qwen3-8B | Inference | Ascend-vLLM | LLM Inference |
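All of the inference rows in Table 1 run on the Ascend-vLLM stack, which, like upstream vLLM, typically serves models through an OpenAI-compatible HTTP API. As a minimal sketch, assuming a model from the table is already deployed and reachable at a placeholder endpoint (the URL, port, and model name below are illustrative, not values from this document), a chat-completions request can be issued with the Python standard library:

```python
import json
from urllib import request

# Hypothetical address of a model deployed on AI Compute Service NPUs;
# replace with the real inference endpoint and deployed model name.
ENDPOINT = "http://127.0.0.1:8080/v1/chat/completions"

payload = {
    "model": "Qwen2.5-7B",  # one of the adapted models listed in Table 1
    "messages": [
        {"role": "user", "content": "Briefly explain what an NPU is."}
    ],
    "max_tokens": 128,
    "temperature": 0.7,
}

body = json.dumps(payload).encode("utf-8")
req = request.Request(
    ENDPOINT,
    data=body,
    headers={"Content-Type": "application/json"},
)

# Uncomment once a live endpoint is available:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request shape works for any of the chat models above; embedding and reranker models (for example, the Qwen3-Embedding and bge series) instead use the corresponding `/v1/embeddings`-style routes of the OpenAI-compatible API.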

Table 2 LLM training capabilities

| Supported Model | Supported Model Parameters | Use Case | Documentation |
|---|---|---|---|
| DeepSeek | DeepSeek-R1-671B, DeepSeek-V3-671B, DeepSeek-V2-Lite 16B | Pre-training and fine-tuning | LLM Training |
| Llama | Llama3.1-8B/70B, Llama3.2-1B/3B | Pre-training and fine-tuning | LLM Training |
| Qwen2 | Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-72B | Pre-training and fine-tuning | LLM Training |
| Qwen2.5 | Qwen2.5-0.5B, Qwen2.5-1.5B, Qwen2.5-7B, Qwen2.5-14B, Qwen2.5-32B, Qwen2.5-72B | Pre-training and fine-tuning | LLM Training |
| Qwen3 | Qwen3-0.6B, Qwen3-1.7B, Qwen3-4B, Qwen3-8B, Qwen3-14B, Qwen3-32B, Qwen3-30B-A3B, Qwen3-235B-A22B | Pre-training and fine-tuning | LLM Training |
| GLM-4 | GLM-4-9B-Chat | Pre-training and fine-tuning | LLM Training |
| Mistral AI | Mixtral-8x7B-Instruct-v0.1 | Pre-training and fine-tuning | LLM Training |

Multimodal Models

ModelArts now supports several leading multimodal models on AI Compute Service NPUs. These adapted models enable inference tasks directly on NPUs.

Table 3 Multimodal model inference based on the Ascend-vLLM framework

| Supported Model | Supported Model Parameters | Use Case | Software Technology Stack | Documentation |
|---|---|---|---|---|
| Qwen2-VL | Qwen2-VL-2B, Qwen2-VL-7B, Qwen2-VL-72B | Inference | Ascend-vLLM | LLM Inference |
| Qwen2.5-VL | Qwen2.5-VL-2B, Qwen2.5-VL-7B, Qwen2.5-VL-72B | Inference | Ascend-vLLM | LLM Inference |
| InternVL | InternVL2.5-26B, InternVL2-llama3-76B-AWQ, InternVL3-8B, InternVL3-14B, InternVL3-38B, InternVL3-78B | Inference | Ascend-vLLM | LLM Inference |
| Gemma | Gemma-3-27B | Inference | Ascend-vLLM | LLM Inference |

Image Generation Models

ModelArts now supports several leading AIGC image generation models on AI Compute Service NPUs. These adapted models enable inference tasks directly on NPUs.

Table 4 Text-to-image models

| Model | Use Case | Software Technology Stack | Documentation |
|---|---|---|---|
| Stable Diffusion XL (SDXL) | Diffusers inference, ComfyUI inference | PyTorch | Adapting Stable Diffusion for NPU Inference with Diffusers/ComfyUI and Lite Server (6.5.907); Stable Diffusion XL Inference Guide Based on ModelArts Notebook (6.5.907) |
| Stable Diffusion 1.5 (SD1.5) | Diffusers inference, ComfyUI inference | PyTorch | Adapting Stable Diffusion for NPU Inference with Diffusers/ComfyUI and Lite Server (6.5.907) |
| Stable Diffusion 3.5 (SD3.5) | Diffusers inference, ComfyUI inference | PyTorch | Adapting Stable Diffusion for NPU Inference with Diffusers/ComfyUI and Lite Server (6.5.907) |
| HUNYUAN | Diffusers inference | PyTorch | Adapting Stable Diffusion for NPU Inference with Diffusers/ComfyUI and Lite Server (6.5.907) |

Video Generation Models

Table 5 Video generation models

| Model | Use Case | Software Technology Stack | Documentation |
|---|---|---|---|
| Wan series | Inference, Training | PyTorch | Video Generation Model Training and Inference |