AI Compute Service Capability Map
ModelArts supports the training and inference of the following open-source models using AI Compute Service NPUs.
LLMs
ModelArts now supports several leading LLMs on AI Compute Service NPUs. These adapted models enable both inference and training tasks directly on NPUs.
| Supported Model | Supported Model Parameter | Use Case | Software Technology Stack | Documentation |
|---|---|---|---|---|
| Llama3 | Llama3-8B, Llama3-70B, Llama3.1-8B, Llama3.1-70B, Llama3.2-1B, Llama3.2-3B | Inference | Ascend-vLLM | |
| Qwen2 | Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-72B | Inference | Ascend-vLLM | |
| Qwen2.5 | Qwen2.5-0.5B, Qwen2.5-1.5B, Qwen2.5-3B, Qwen2.5-7B, Qwen2.5-14B, Qwen2.5-32B, Qwen2.5-72B | Inference | Ascend-vLLM | |
| Qwen3 | Qwen3-0.6B, Qwen3-1.7B, Qwen3-4B, Qwen3-8B, Qwen3-14B, Qwen3-30B-A3B, Qwen3-32B, Qwen3-235B-A22B, Qwen3-235B-A22B-Thinking-2507, Qwen3-235B-A22B-Instruct-2507, Qwen3-Coder-480B-A35B, Qwen3-Embedding-0.6B, Qwen3-Embedding-4B, Qwen3-Embedding-8B, Qwen3-Reranker-0.6B, Qwen3-Reranker-4B, Qwen3-Reranker-8B | Inference | Ascend-vLLM | |
| GLM-4 | GLM-4-9B | Inference | Ascend-vLLM | |
| BGE | bge-reranker-v2-m3, bge-base-en-v1.5, bge-base-zh-v1.5, bge-large-en-v1.5, bge-large-zh-v1.5, bge-m3 | Inference | Ascend-vLLM | |
| DeepSeek-R1-Distill | DeepSeek-R1-Distill-Llama-8B, DeepSeek-R1-Distill-Llama-70B, DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-14B, DeepSeek-R1-0528-Qwen3-8B | Inference | Ascend-vLLM | |
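Ascend-vLLM serves these models through vLLM's OpenAI-compatible HTTP API. As a minimal sketch of a client call (the served model name "Qwen2.5-7B" and the endpoint details in the note below are assumptions that depend on your deployment):

```python
import json

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> str:
    """Build an OpenAI-compatible /v1/chat/completions payload as JSON.

    `model` must match the served model name of your Ascend-vLLM
    deployment; "Qwen2.5-7B" below is an assumed example.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload)

body = build_chat_request("Qwen2.5-7B", "Introduce yourself in one sentence.")
print(body)
```

The resulting body would be POSTed with `Content-Type: application/json` to `http://<server-ip>:<port>/v1/chat/completions`, the path convention of the OpenAI API that vLLM implements.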
The following models are adapted for pre-training and fine-tuning:

| Supported Model | Supported Model Parameter | Use Case | Documentation |
|---|---|---|---|
| DeepSeek | DeepSeek-R1-671B, DeepSeek-V3-671B, DeepSeek-V2-Lite 16B | Pre-training and fine-tuning | |
| Llama | Llama3.1-8B/70B, Llama3.2-1B/3B | Pre-training and fine-tuning | |
| Qwen2 | Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-72B | Pre-training and fine-tuning | |
| Qwen2.5 | Qwen2.5-0.5B, Qwen2.5-1.5B, Qwen2.5-7B, Qwen2.5-14B, Qwen2.5-32B, Qwen2.5-72B | Pre-training and fine-tuning | |
| Qwen3 | Qwen3-0.6B, Qwen3-1.7B, Qwen3-4B, Qwen3-8B, Qwen3-14B, Qwen3-32B, Qwen3-30B-A3B, Qwen3-235B-A22B | Pre-training and fine-tuning | |
| GLM-4 | GLM-4-9B-Chat | Pre-training and fine-tuning | |
| Mistral AI | Mixtral-8x7B-Instruct-v0.1 | Pre-training and fine-tuning | |
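When choosing a parameter size from the lists above, a rough memory estimate helps: weights alone need about 2 bytes per parameter in BF16/FP16, and full fine-tuning with Adam in mixed precision is commonly estimated at roughly 16 bytes per parameter (weights, gradients, and FP32 optimizer states). A hedged back-of-the-envelope sketch, not an official sizing formula:

```python
def weight_memory_gib(param_billions: float, bytes_per_param: int = 2) -> float:
    """Rough memory for model weights alone (BF16/FP16 = 2 bytes/param)."""
    return param_billions * 1e9 * bytes_per_param / 2**30

def full_finetune_memory_gib(param_billions: float) -> float:
    """Common rule of thumb: ~16 bytes/param for full fine-tuning with
    Adam in mixed precision (weights + gradients + FP32 optimizer states)."""
    return param_billions * 1e9 * 16 / 2**30

print(round(weight_memory_gib(7), 1))        # ~13.0 GiB of weights for a 7B model
print(round(full_finetune_memory_gib(7), 1)) # ~104.3 GiB for full fine-tuning
```

Real usage also includes activations and KV cache, so these numbers are lower bounds; parallelism across NPUs divides the per-device share.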
Multimodal Models
ModelArts now supports several leading multimodal models on AI Compute Service NPUs. These adapted models enable inference tasks directly on NPUs.
| Supported Model | Supported Model Parameter | Use Case | Software Technology Stack | Documentation |
|---|---|---|---|---|
| Qwen2-VL | Qwen2-VL-2B, Qwen2-VL-7B, Qwen2-VL-72B | Inference | Ascend-vLLM | |
| Qwen2.5-VL | Qwen2.5-VL-2B, Qwen2.5-VL-7B, Qwen2.5-VL-72B | Inference | Ascend-vLLM | |
| InternVL | InternVL2.5-26B, InternVL2-Llama3-76B-AWQ, InternVL3-8B, InternVL3-14B, InternVL3-38B, InternVL3-78B | Inference | Ascend-vLLM | |
| Gemma | Gemma-3-27B | Inference | Ascend-vLLM | |
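The multimodal models above accept mixed text-and-image input. With vLLM's OpenAI-compatible API, this is expressed as a "content parts" user message; a minimal sketch (the image URL is a placeholder, and the exact message shape depends on your serving setup):

```python
def build_vision_message(prompt: str, image_url: str) -> dict:
    """One user turn combining text and an image, using the OpenAI
    chat-completions content-parts format that vLLM also accepts."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

msg = build_vision_message("Describe this image.", "https://example.com/cat.png")
print(msg["content"][1]["type"])  # → image_url
```

The message slots into the `messages` list of a standard chat-completions request body.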
Image Generation Models
ModelArts now supports several leading AIGC image generation models on AI Compute Service NPUs. These adapted models enable inference tasks directly on NPUs.
| Model | Use Case | Software Technology Stack | Documentation |
|---|---|---|---|
| Stable Diffusion XL (SDXL) | Diffusers inference, ComfyUI inference | PyTorch | Adapting Stable Diffusion for NPU Inference with Diffusers/ComfyUI and Lite Server (6.5.907); Stable Diffusion XL Inference Guide Based on ModelArts Notebook (6.5.907) |
| Stable Diffusion 1.5 (SD1.5) | Diffusers inference, ComfyUI inference | PyTorch | |
| Stable Diffusion 3.5 (SD3.5) | Diffusers inference, ComfyUI inference | PyTorch | |
| Hunyuan | Diffusers inference | PyTorch | |
Video Generation Models
| Model | Use Case | Software Technology Stack | Documentation |
|---|---|---|---|
| Wan series | Inference, Training | PyTorch | |