Third-Party Large Models
Specifications of Third-Party Large Models
ModelArts Studio currently focuses on the NLP field and offers a selection of popular open-source NLP models from third-party providers.
For example, DeepSeek V3, released on December 26, 2024, is a Mixture-of-Experts (MoE) language model with 671B parameters; its updated version, DeepSeek-V3-0324, has been reported to outperform GPT-4.5 on mathematical and coding evaluation benchmarks. DeepSeek R1, which shares the DeepSeek V3 architecture, was officially open-sourced on January 20, 2025. As an outstanding representative of models with strong reasoning capabilities, DeepSeek R1 has attracted great attention: it matches or exceeds top closed-source models such as GPT-4o and OpenAI o1 on core tasks such as mathematical reasoning and code generation, and it is recognized as a leading LLM in the industry. DeepSeek has since open-sourced updated versions of both models, DeepSeek-V3-0324 and DeepSeek-R1-0528, which offer enhanced capabilities and have also been integrated into ModelArts Studio.
In addition to the DeepSeek models, ModelArts Studio integrates the Qwen series. It supports the Qwen3 models (Qwen3-8B/14B/30B-A3B/32B/235B-A22B), DeepSeek-R1-Distill-Qwen-32B, DeepSeek-R1-Distill-Llama-70B/8B, Qwen2.5-72B, QwQ-32B, and Qwen2.5-VL-32B.
ModelArts Studio provides third-party NLP models in different specifications to suit different scenarios and requirements. The following table lists the supported models; choose the most suitable one based on your development and application requirements.
| Supported Region | Model Name | Maximum Context Length | Description |
|---|---|---|---|
| CN-Hong Kong | DeepSeek-V3-32K-0.0.1 | 32K | Released in March 2025. Supports inference with a 32K-token context length. Deploying the model requires 16 inference units; 32K-token inference supports up to 256 concurrent calls. |
| CN-Hong Kong | DeepSeek-V3-32K-0.0.2 | 32K | Released in June 2025. Supports inference with a 32K-token context length. Deploying the model requires 16 inference units; 32K-token inference supports up to 256 concurrent calls. The base model of this version is the open-source DeepSeek-V3-0324. |
| CN-Hong Kong | DeepSeek-R1-32K-0.0.1 | 32K | Released in March 2025. Supports inference with a 32K-token context length. Deploying the model requires 16 inference units; 32K-token inference supports up to 256 concurrent calls. |
| CN-Hong Kong | DeepSeek-R1-32K-0.0.2 | 32K | Released in June 2025. Supports inference with a 32K-token context length. Deploying the model requires 16 inference units; 32K-token inference supports up to 256 concurrent calls. The base model of this version is the open-source DeepSeek-R1-0528. |
| CN-Hong Kong | DeepSeek-R1-Distill-Qwen-32B | 32K | Fine-tuned from the open-source model Qwen2.5-32B using data generated by DeepSeek-R1. |
| CN-Hong Kong | DeepSeek-R1-Distill-Llama-70B | 32K | Fine-tuned from the open-source model Llama-3.3-70B-Instruct using data generated by DeepSeek-R1. |
| CN-Hong Kong | DeepSeek-R1-Distill-Llama-8B | 32K | Fine-tuned from the open-source model Llama-3.1-8B using data generated by DeepSeek-R1. |
| CN-Hong Kong | Qwen3-235B-A22B | 32K | Uniquely supports seamless switching between thinking mode and non-thinking mode within a single dialogue (see the usage sketch after this table). Its reasoning capability significantly outperforms QwQ, and its general capability far exceeds that of Qwen2.5-72B-Instruct, achieving SOTA performance among models of the same scale in the industry. |
| CN-Hong Kong | Qwen3-32B | 32K | Uniquely supports seamless switching between thinking mode and non-thinking mode within a single dialogue. Its reasoning capability significantly outperforms QwQ, and its general capability far exceeds that of Qwen2.5-32B-Instruct, achieving SOTA performance among models of the same scale in the industry. |
| CN-Hong Kong | Qwen3-30B-A3B | 32K | Uniquely supports seamless switching between thinking mode and non-thinking mode within a single dialogue. Its reasoning capability significantly outperforms QwQ, and its general capability far exceeds that of Qwen2.5-32B-Instruct, achieving SOTA performance among models of the same scale in the industry. |
| CN-Hong Kong | Qwen3-14B | 32K | Uniquely supports seamless switching between thinking mode and non-thinking mode within a single dialogue. Its reasoning capability reaches the SOTA level among models of the same scale in the industry, and its general capability significantly surpasses that of Qwen2.5-14B. |
| CN-Hong Kong | Qwen3-8B | 32K | Uniquely supports seamless switching between thinking mode and non-thinking mode within a single dialogue. Its reasoning capability reaches the SOTA level among models of the same scale in the industry, and its general capability significantly surpasses that of Qwen2.5-7B. |
| CN-Hong Kong | Qwen2.5-72B | 32K | Compared with Qwen2, Qwen2.5 has acquired significantly more knowledge and offers greatly improved coding and mathematics capabilities. It also achieves significant improvements in instruction following, long-text generation, understanding structured data (e.g., tables), and generating structured output, especially JSON. |
| CN-Hong Kong | Qwen2.5-VL-32B | 32K | Provides image recognition, precise visual grounding, text recognition and understanding, document parsing, and video comprehension capabilities. |
| CN-Hong Kong | QwQ-32B | 32K | A QwQ reasoning model trained from Qwen2.5-32B, with reasoning capability greatly improved through reinforcement learning. Its core math and code metrics (AIME 24/25, LiveCodeBench) and some general metrics (IFEval, LiveBench, etc.) reach the level of the full DeepSeek-R1 and significantly surpass those of DeepSeek-R1-Distill-Qwen-32B, which is also based on Qwen2.5-32B. |
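The thinking-mode switching that the Qwen3 rows describe is controlled per turn by the model's documented `/think` and `/no_think` soft switches. The following is a minimal sketch, assuming the deployed service exposes an OpenAI-compatible chat completions endpoint that passes these switches through to the model; the base URL, API key, and deployed model name are placeholder assumptions, not real values.

```python
from openai import OpenAI

# Placeholder endpoint and key: substitute the API address and credentials
# issued for your deployed ModelArts Studio service.
client = OpenAI(
    base_url="https://your-maas-endpoint.example.com/v1",  # assumption
    api_key="YOUR_API_KEY",
)

# Turn 1: append Qwen3's "/think" soft switch to request thinking mode.
messages = [
    {"role": "user",
     "content": "Prove that the sum of two even numbers is even. /think"},
]
resp = client.chat.completions.create(model="Qwen3-32B", messages=messages)
print(resp.choices[0].message.content)

# Turn 2: switch the same dialogue to non-thinking mode with "/no_think".
messages.append({"role": "assistant", "content": resp.choices[0].message.content})
messages.append({"role": "user",
                 "content": "Now state the result in one sentence. /no_think"})
resp = client.chat.completions.create(model="Qwen3-32B", messages=messages)
print(resp.choices[0].message.content)
```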
Platform Operations Supported by Third-Party Large Models
| Model Name | Model Evaluation | Real-Time Inference | Model Commissioning in Experience Center |
|---|---|---|---|
| DeepSeek-V3-32K-0.0.1 | √ | √ | √ |
| DeepSeek-V3-32K-0.0.2 | √ | √ | √ |
| DeepSeek-R1-32K-0.0.1 | √ | √ | √ |
| DeepSeek-R1-32K-0.0.2 | √ | √ | √ |
| DeepSeek-R1-Distill-Qwen-32B | √ | √ | √ |
| DeepSeek-R1-Distill-Llama-70B | √ | √ | √ |
| DeepSeek-R1-Distill-Llama-8B | √ | √ | √ |
| Qwen3-235B-A22B | √ | √ | √ |
| Qwen3-32B | √ | √ | √ |
| Qwen3-30B-A3B | √ | √ | √ |
| Qwen3-14B | √ | √ | √ |
| Qwen3-8B | √ | √ | √ |
| Qwen2.5-72B | √ | √ | √ |
| Qwen2.5-VL-32B | √ | √ | √ |
| QwQ-32B | √ | √ | √ |
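For the models marked above as supporting real-time inference, requests are typically sent to an OpenAI-compatible chat completions API. The sketch below streams a response from a reasoning model and separates intermediate reasoning from the final answer; it assumes the serving stack follows the DeepSeek-style convention of returning reasoning in a separate `reasoning_content` delta field, which not every deployment provides, so the field is read defensively. Endpoint, key, and model name are placeholders.

```python
from openai import OpenAI

# Placeholder endpoint and key for a deployed real-time inference service.
client = OpenAI(
    base_url="https://your-maas-endpoint.example.com/v1",  # assumption
    api_key="YOUR_API_KEY",
)

stream = client.chat.completions.create(
    model="DeepSeek-R1-32K-0.0.2",
    messages=[{"role": "user", "content": "What is 17 * 24?"}],
    stream=True,
)
for chunk in stream:
    if not chunk.choices:          # some stacks emit keep-alive chunks
        continue
    delta = chunk.choices[0].delta
    # `reasoning_content` is a DeepSeek-style extension, not part of the
    # base OpenAI schema, so fall back gracefully when it is absent.
    reasoning = getattr(delta, "reasoning_content", None)
    if reasoning:
        print(reasoning, end="")      # intermediate reasoning tokens
    elif delta.content:
        print(delta.content, end="")  # final answer tokens
print()
```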
Dependency of Third-Party Large Models on Resource Pools
| Model Name | Cloud-based Deployment |
|---|---|
| DeepSeek-V3-32K-0.0.1 | Supported. 16 inference units are required to deploy the model. |
| DeepSeek-V3-32K-0.0.2 | Supported. 16 inference units are required to deploy the model. |
| DeepSeek-R1-32K-0.0.1 | Supported. 16 inference units are required to deploy the model. |
| DeepSeek-R1-32K-0.0.2 | Supported. 16 inference units are required to deploy the model. |
| DeepSeek-R1-Distill-Qwen-32B | Supported. Two inference units are required to deploy the model. |
| DeepSeek-R1-Distill-Llama-70B | Supported. Four inference units are required to deploy the model. |
| DeepSeek-R1-Distill-Llama-8B | Supported. One inference unit is required to deploy the model. |
| Qwen3-235B-A22B | Supported. 16 inference units are required to deploy the model. |
| Qwen3-32B | Supported. Four inference units are required to deploy the model. |
| Qwen3-30B-A3B | Supported. Two inference units are required to deploy the model. |
| Qwen3-14B | Supported. One inference unit is required to deploy the model. |
| Qwen3-8B | Supported. One inference unit is required to deploy the model. |
| Qwen2.5-72B | Supported. Four inference units are required to deploy the model. |
| Qwen2.5-VL-32B | Supported. Four inference units are required to deploy the model. |
| QwQ-32B | Supported. Four inference units are required to deploy the model. |
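Because each model consumes a fixed number of inference units, sizing a shared resource pool reduces to summing the per-model requirements listed above. A minimal sketch (the unit counts are taken from the table; treating pool capacity as a simple sum is an assumption, since actual scheduling is defined by the platform):

```python
# Inference-unit requirements copied from the table above.
UNITS_REQUIRED = {
    "DeepSeek-V3-32K-0.0.2": 16,
    "DeepSeek-R1-32K-0.0.2": 16,
    "DeepSeek-R1-Distill-Qwen-32B": 2,
    "DeepSeek-R1-Distill-Llama-70B": 4,
    "DeepSeek-R1-Distill-Llama-8B": 1,
    "Qwen3-235B-A22B": 16,
    "Qwen3-32B": 4,
    "Qwen3-30B-A3B": 2,
    "Qwen3-14B": 1,
    "Qwen3-8B": 1,
    "Qwen2.5-72B": 4,
    "Qwen2.5-VL-32B": 4,
    "QwQ-32B": 4,
}

# A hypothetical deployment plan: total units the pool must provide.
plan = ["DeepSeek-R1-32K-0.0.2", "Qwen3-32B", "Qwen3-8B"]
total = sum(UNITS_REQUIRED[m] for m in plan)
print(f"Inference units needed: {total}")  # 16 + 4 + 1 = 21
```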