Updated on 2025-12-11 GMT+08:00

AI Compute Service Capability Map

ModelArts supports the training and inference of the following open-source models using AI Compute Service NPUs.

LLMs

ModelArts now supports several leading LLMs on AI Compute Service NPUs. These adapted models enable both inference and training tasks directly on NPUs.

Table 1 LLM inference capabilities

| Supported Model | Supported Model Parameters | Use Case | Software Technology Stack | Documentation |
|---|---|---|---|---|
| Llama3 | Llama3-8B, Llama3-70B, Llama3.1-8B, Llama3.1-70B, Llama3.2-1B, Llama3.2-3B | Inference | Ascend-vLLM | LLM Inference |
| Qwen2 | Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-72B | Inference | Ascend-vLLM | LLM Inference |
| Qwen2.5 | Qwen2.5-0.5B, Qwen2.5-1.5B, Qwen2.5-3B, Qwen2.5-7B, Qwen2.5-14B, Qwen2.5-32B, Qwen2.5-72B | Inference | Ascend-vLLM | LLM Inference |
| Qwen3 | Qwen3-0.6B, Qwen3-1.7B, Qwen3-4B, Qwen3-8B, Qwen3-14B, Qwen3-30B-A3B, Qwen3-32B, Qwen3-235B-A22B, Qwen3-235B-A22B-Thinking-2507, Qwen3-235B-A22B-Instruct-2507, Qwen3-Coder-480B-A35B, Qwen3-Embedding-0.6B, Qwen3-Embedding-4B, Qwen3-Embedding-8B, Qwen3-Reranker-0.6B, Qwen3-Reranker-4B, Qwen3-Reranker-8B | Inference | Ascend-vLLM | LLM Inference |
| GLM-4 | GLM-4-9B | Inference | Ascend-vLLM | LLM Inference |
| BGE | bge-reranker-v2-m3, bge-base-en-v1.5, bge-base-zh-v1.5, bge-large-en-v1.5, bge-large-zh-v1.5, bge-m3 | Inference | Ascend-vLLM | LLM Inference |
| DeepSeek-R1-Distill | DeepSeek-R1-Distill-Llama-8B, DeepSeek-R1-Distill-Llama-70B, DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-14B, DeepSeek-R1-0528-Qwen3-8B | Inference | Ascend-vLLM | LLM Inference |
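All of the inference rows in Table 1 run on the Ascend-vLLM stack, which, like upstream vLLM, typically serves models through an OpenAI-compatible HTTP API. As a minimal sketch, assuming a model from the table is already deployed and reachable at a placeholder endpoint (the URL, port, and model name below are illustrative, not values from this document), a chat-completions request can be issued with the Python standard library:

```python
import json
from urllib import request

# Hypothetical address of a model deployed on AI Compute Service NPUs;
# replace with the real inference endpoint and deployed model name.
ENDPOINT = "http://127.0.0.1:8080/v1/chat/completions"

payload = {
    "model": "Qwen2.5-7B",  # one of the adapted models listed in Table 1
    "messages": [
        {"role": "user", "content": "Briefly explain what an NPU is."}
    ],
    "max_tokens": 128,
    "temperature": 0.7,
}

body = json.dumps(payload).encode("utf-8")
req = request.Request(
    ENDPOINT,
    data=body,
    headers={"Content-Type": "application/json"},
)

# Uncomment once a live endpoint is available:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request shape works for any of the chat models above; embedding and reranker models (for example, the Qwen3-Embedding and bge series) instead use the corresponding `/v1/embeddings`-style routes of the OpenAI-compatible API.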

Table 2 LLM training capabilities

| Supported Model | Supported Model Parameters | Use Case | Documentation |
|---|---|---|---|
| DeepSeek | DeepSeek-R1-671B, DeepSeek-V3-671B, DeepSeek-V2-Lite 16B | Pre-training and fine-tuning | LLM Training |
| Llama | Llama3.1-8B/70B, Llama3.2-1B/3B | Pre-training and fine-tuning | LLM Training |
| Qwen2 | Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-72B | Pre-training and fine-tuning | LLM Training |
| Qwen2.5 | Qwen2.5-0.5B, Qwen2.5-1.5B, Qwen2.5-7B, Qwen2.5-14B, Qwen2.5-32B, Qwen2.5-72B | Pre-training and fine-tuning | LLM Training |
| Qwen3 | Qwen3-0.6B, Qwen3-1.7B, Qwen3-4B, Qwen3-8B, Qwen3-14B, Qwen3-32B, Qwen3-30B-A3B, Qwen3-235B-A22B | Pre-training and fine-tuning | LLM Training |
| GLM-4 | GLM-4-9B-Chat | Pre-training and fine-tuning | LLM Training |
| Mistral AI | Mixtral-8x7B-Instruct-v0.1 | Pre-training and fine-tuning | LLM Training |

Multimodal Models

ModelArts now supports several leading multimodal models on AI Compute Service NPUs. These adapted models enable inference tasks directly on NPUs.

Table 3 Multimodal model inference based on the Ascend-vLLM framework

| Supported Model | Supported Model Parameters | Use Case | Software Technology Stack | Documentation |
|---|---|---|---|---|
| Qwen2-VL | Qwen2-VL-2B, Qwen2-VL-7B, Qwen2-VL-72B | Inference | Ascend-vLLM | LLM Inference |
| Qwen2.5-VL | Qwen2.5-VL-2B, Qwen2.5-VL-7B, Qwen2.5-VL-72B | Inference | Ascend-vLLM | LLM Inference |
| InternVL | InternVL2.5-26B, InternVL2-llama3-76B-AWQ, InternVL3-8B, InternVL3-14B, InternVL3-38B, InternVL3-78B | Inference | Ascend-vLLM | LLM Inference |
| Gemma | Gemma-3-27B | Inference | Ascend-vLLM | LLM Inference |

Image Generation Models

ModelArts now supports several leading AIGC image generation models on AI Compute Service NPUs. These adapted models enable inference tasks directly on NPUs.

Table 4 Text-to-image models

| Model | Use Case | Software Technology Stack | Documentation |
|---|---|---|---|
| Stable Diffusion XL (SDXL) | Diffusers inference, ComfyUI inference | PyTorch | Adapting Stable Diffusion for NPU Inference with Diffusers/ComfyUI and Lite Server (6.5.907); Stable Diffusion XL Inference Guide Based on ModelArts Notebook (6.5.907) |
| Stable Diffusion 1.5 (SD1.5) | Diffusers inference, ComfyUI inference | PyTorch | Adapting Stable Diffusion for NPU Inference with Diffusers/ComfyUI and Lite Server (6.5.907) |
| Stable Diffusion 3.5 (SD3.5) | Diffusers inference, ComfyUI inference | PyTorch | Adapting Stable Diffusion for NPU Inference with Diffusers/ComfyUI and Lite Server (6.5.907) |
| HUNYUAN | Diffusers inference | PyTorch | Adapting Stable Diffusion for NPU Inference with Diffusers/ComfyUI and Lite Server (6.5.907) |

Video Generation Models

Table 5 Video generation models

| Model | Use Case | Software Technology Stack | Documentation |
|---|---|---|---|
| Wan series | Inference, Training | PyTorch | Video Generation Model Training and Inference |