Updated on 2025-02-07 GMT+08:00

Standard Resource Management

When using ModelArts for AI development, you have two options for resource pools:

Dedicated resource pool: A dedicated resource pool provides exclusive compute resources that are not shared with other users, which gives you more control over them. On the ModelArts Standard development platform, you can use a Standard dedicated resource pool for training jobs, model deployment, and development environments. You must purchase and create a dedicated resource pool before you can use it.

Public resource pool: A public resource pool is a large-scale shared computing cluster. It allocates resources according to each job's parameters and keeps jobs isolated from one another. You can use the public resource pool provided by ModelArts for training jobs, model deployment, and development environment instances. It is billed by usage and is available immediately; you do not need to create it.

Differences between the dedicated resource pool and the public resource pool:
  • Isolation: A dedicated resource pool gives you independent computing clusters and networks that are physically isolated from those of other users. The public resource pool provides only logical isolation, so a dedicated resource pool offers stronger isolation and security.
  • Queuing: The resources in a dedicated resource pool are exclusively yours, so jobs are not queued as long as the pool has enough free resources. The public resource pool is shared, so jobs may be queued at any time.
  • Network access: A dedicated resource pool can be connected to your own network, so jobs running in the pool can access storage and other resources in that network. For example, if you create a training job in a dedicated resource pool with network connectivity, the job can read data in SFS during training.
  • Runtime environment: A dedicated resource pool lets you customize the runtime environment of its physical nodes, for example by upgrading GPU or Ascend drivers yourself. The public resource pool does not support this.
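
The differences above can be turned into a simple selection rule. The following Python sketch is illustrative only: `WorkloadNeeds` and `recommend_pool` are hypothetical names, not part of any ModelArts SDK or API, and the rule assumes that any single requirement for isolation, network access, driver control, or queue-free scheduling is enough to justify a dedicated resource pool.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WorkloadNeeds:
    """Hypothetical description of what a workload requires from a resource pool."""
    needs_physical_isolation: bool = False  # separate cluster and network per user
    needs_own_network_access: bool = False  # e.g. reading data in SFS during training
    needs_custom_driver: bool = False       # e.g. upgrading GPU/Ascend drivers yourself
    tolerates_queuing: bool = True          # jobs may wait for shared resources


def recommend_pool(needs: WorkloadNeeds) -> Tuple[str, List[str]]:
    """Return ("dedicated" | "public", reasons) based on the differences listed above."""
    reasons = []
    if needs.needs_physical_isolation:
        reasons.append("physical isolation between users")
    if needs.needs_own_network_access:
        reasons.append("access to your own network")
    if needs.needs_custom_driver:
        reasons.append("custom node runtime environment")
    if not needs.tolerates_queuing:
        reasons.append("no queuing when pool resources are sufficient")

    # Any requirement that only a dedicated pool can meet decides the outcome.
    pool = "dedicated" if reasons else "public"
    return pool, reasons


if __name__ == "__main__":
    pool, reasons = recommend_pool(WorkloadNeeds(needs_own_network_access=True))
    print(pool, reasons)
```

A workload with no special requirements falls through to the public resource pool, which needs no setup and is billed by usage.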

What capabilities does a dedicated resource pool have?

The new version of the dedicated resource pool improves on the previous version in both technology and product design. The main enhancements are:

  • Unified dedicated resource pool type: Dedicated resource pools are no longer split into training pools and inference pools. If your workloads allow it, you can run both training and inference in the same dedicated resource pool. You can also enable or disable support for specific job types in a pool through its job type settings.
  • Self-service network connectivity for dedicated pools: You can create and manage the network associated with a dedicated resource pool on the ModelArts management console. If jobs running in a dedicated resource pool need to access resources in your VPC, interconnect the VPC with the dedicated resource pool network.
  • Improved cluster information: The redesigned dedicated resource pool details page provides more comprehensive cluster information, including job, node, and resource monitoring. This helps you track the cluster status promptly and plan resource usage better.
  • Self-service management of cluster GPU/NPU drivers: Driver requirements differ from user to user. On the dedicated resource pool list page of the new version, you can select the accelerator card driver yourself and apply it either as an immediate change or as a smooth upgrade, depending on your service needs.
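
To make the unified pool type concrete, the sketch below models a single pool whose job type settings decide which workloads it accepts. It is a conceptual illustration under stated assumptions, not the ModelArts API: the `DedicatedPool` class, its methods, and the job type labels `"training"`, `"inference"`, and `"dev_environment"` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set

# Hypothetical job type labels; the real console settings may use other names.
JOB_TYPES = frozenset({"training", "inference", "dev_environment"})


@dataclass
class DedicatedPool:
    """Hypothetical model of one unified dedicated resource pool."""
    name: str
    # By default the pool accepts every job type.
    enabled_job_types: Set[str] = field(default_factory=lambda: set(JOB_TYPES))

    def set_job_type(self, job_type: str, enabled: bool) -> None:
        """Enable or disable one job type, mirroring a job type setting."""
        if job_type not in JOB_TYPES:
            raise ValueError(f"unknown job type: {job_type}")
        if enabled:
            self.enabled_job_types.add(job_type)
        else:
            self.enabled_job_types.discard(job_type)

    def accepts(self, job_type: str) -> bool:
        """Return True if jobs of this type may run in the pool."""
        return job_type in self.enabled_job_types


if __name__ == "__main__":
    pool = DedicatedPool("demo-pool")
    pool.set_job_type("inference", False)  # reserve the pool for other workloads
    print(pool.accepts("training"), pool.accepts("inference"))
```

The point of the model is that training and inference share one pool object, and only the per-type setting changes.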