Updated on 2022-09-23 GMT+08:00

Resource Pools

Overview

Both public and dedicated resource pools are available for you to select when ModelArts is used for full-process AI development.

  • Public resource pools: provide large-scale public computing clusters, which are allocated based on job parameter settings. Resources are isolated by job.
  • Dedicated resource pools: provide dedicated compute resources, which can be used for notebook instances, training jobs, and model deployment. The resources provided in a dedicated resource pool are exclusive, featuring higher resource efficiency than a public resource pool.

    To use a dedicated resource pool, create it first and then select it during AI development. For details about dedicated resource pools, see the following:

    Dedicated Resource Pool

    Creating a Dedicated Resource Pool

    Resizing a Dedicated Resource Pool

    Deleting a Dedicated Resource Pool

Dedicated Resource Pool

  • Dedicated resource pools can be used by notebook instances and training jobs, and for service deployment.
  • Dedicated resource pools are classified into two types: Dedicated for Development/Training and Dedicated for Service Deployment. Pools dedicated for development/training can be used only for notebook instances and training jobs. Pools dedicated for service deployment can be used only for deploying AI applications.
  • Only running dedicated resource pools are available. If a dedicated resource pool is unavailable or abnormal, rectify the fault before using it.
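
The rules above can be summarized as a small lookup. The following is a minimal Python sketch, not a ModelArts API: the names POOL_WORKLOADS and can_run and the workload labels are hypothetical and chosen only for illustration. The status value "Running" is the one shown on the console after a pool is created.

```python
# Illustrative only: these names are hypothetical and not part of any ModelArts SDK.
POOL_WORKLOADS = {
    "Dedicated for Development/Training": {"notebook instance", "training job"},
    "Dedicated for Service Deployment": {"service deployment"},
}


def can_run(pool_type: str, pool_status: str, workload: str) -> bool:
    """Return True if the workload may be placed on a pool of this type and status."""
    # Only running dedicated resource pools are available.
    if pool_status != "Running":
        return False
    # Each pool type accepts only its own kinds of workload.
    return workload in POOL_WORKLOADS.get(pool_type, set())


print(can_run("Dedicated for Development/Training", "Running", "training job"))
print(can_run("Dedicated for Service Deployment", "Running", "training job"))
```

The first call is allowed and the second is rejected, because a pool dedicated for service deployment cannot run training jobs.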

Creating a Dedicated Resource Pool

  1. Log in to the ModelArts management console and choose Dedicated Resource Pools on the left.
  2. On the Dedicated Resource Pools page, select a resource pool type.
  3. Click Create in the upper left corner. The page for creating a dedicated resource pool is displayed.
  4. Configure parameters by referring to Table 1 or Table 2.
    Table 1 Parameters of a resource pool dedicated for development/training

    Parameter | Description
    Resource Type | This parameter is not editable.
    Name | Name of the dedicated resource pool. Only letters, digits, hyphens (-), and underscores (_) are allowed.
    Description | Brief description of the dedicated resource pool.
    Node Flavor | CPU or GPU.
    Specifications | Node specifications. GPUs offer better performance, and CPUs are more cost-effective. If a flavor is sold out, you can purchase it only after the resources are released by other users in the resource pool.
    Nodes | Number of nodes in the dedicated resource pool. A larger number of nodes leads to better compute performance.

    Table 2 Parameters of a resource pool dedicated for service deployment

    Parameter | Description
    Resource Type | Dedicated for Service Deployment by default, which cannot be changed.
    Name | Name of the dedicated resource pool. Enter 4 to 24 characters, starting with a lowercase letter and not ending with a hyphen (-). Only lowercase letters, digits, and hyphens (-) are allowed.
    Description | Brief description of the dedicated resource pool.
    Custom Network Configuration | Allows you to customize network configurations. If you enable Custom Network Configuration, your instances will run in the specified network and be accessible to other cloud service instances in that network. In this case, configure VPC, Subnet, and Security Group. If no network is available, go to the VPC management console and create one. If you do not enable Custom Network Configuration, ModelArts automatically allocates a dedicated network to each user, and all users are isolated from each other.
    Node Flavor | CPU or GPU.
    Specifications | Node specifications.
    Nodes | Number of nodes in the dedicated resource pool. A larger number of nodes leads to better compute performance.

  5. After confirming the configurations, create the dedicated resource pool. After a dedicated resource pool is created, its status changes to Running.
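
The naming rules in Table 1 and Table 2 can be checked before you fill in the form. The following is a minimal Python sketch under stated assumptions: "letters" is read as ASCII letters, no length limit is enforced for development/training pools because none is stated, and the function names are hypothetical. It is not how the console itself validates names.

```python
import re

# Table 1 rule: only letters, digits, hyphens (-), and underscores (_).
# Assumption: "letters" means ASCII letters; no length limit is stated, so none is enforced.
DEV_TRAIN_NAME = re.compile(r"[A-Za-z0-9_-]+")

# Table 2 rule: 4 to 24 characters, starting with a lowercase letter, not ending
# with a hyphen, and containing only lowercase letters, digits, and hyphens.
# One leading letter + 2 to 22 middle characters + one final non-hyphen character.
DEPLOY_NAME = re.compile(r"[a-z][a-z0-9-]{2,22}[a-z0-9]")


def is_valid_dev_train_name(name: str) -> bool:
    """Check a name for a pool dedicated for development/training."""
    return DEV_TRAIN_NAME.fullmatch(name) is not None


def is_valid_deploy_name(name: str) -> bool:
    """Check a name for a pool dedicated for service deployment."""
    return DEPLOY_NAME.fullmatch(name) is not None


for candidate in ["pool-01", "Pool_01", "pool-", "abc"]:
    print(candidate, is_valid_dev_train_name(candidate), is_valid_deploy_name(candidate))
```

Note that the two pool types accept different names: "Pool_01" passes the development/training rule but fails the service deployment rule because of the uppercase letter and the underscore.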

Resizing a Dedicated Resource Pool

After a dedicated resource pool is created, you can resize it by increasing or decreasing the number of nodes to better suit your service needs.

To resize a dedicated resource pool, do as follows:

  1. Switch to the dedicated resource pool list, locate the row containing the target dedicated resource pool, and click Scale in the Operation column.
  2. On the Resize Dedicated Resource Pool page, increase or decrease the number of nodes based on your service requirements.
    • During capacity expansion, add nodes based on service requirements.
    • During capacity reduction, delete the target nodes. To delete a node, disable it in the Operation column of Node List.

      Before deleting a node from a resource pool dedicated for service deployment, ensure that there are no running instances on the node. Otherwise, the deployed services will be interrupted. If you are not sure whether there is any instance running on a node to be deleted, submit a consultation service ticket.

  3. Click Submit. Then, you will be redirected to the dedicated resource pool management page.

    After submitting a request to decrease the capacity of a dedicated resource pool, do not submit the request again, because capacity reduction takes some time to complete. Repeated deletion requests may cause the resizing to fail.

    After submitting a request to decrease the capacity of a dedicated resource pool, view the event on the Events tab of the resource pool details page. "Begin to delete resource node %s" indicates that the node deletion starts. "Resource node %s deleted" indicates that the node has been deleted on the backend.
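
If you copy event messages from the Events tab, the two messages above are enough to tell which node deletions are still in progress. The following is a minimal Python sketch under stated assumptions: events are plain strings in chronological order, node names contain no whitespace, and the function name pending_deletions is hypothetical. It does not call any ModelArts API.

```python
import re

# Event templates quoted in the text; %s stands for the node name.
BEGIN_DELETE = re.compile(r"Begin to delete resource node (\S+)")
DELETED = re.compile(r"Resource node (\S+) deleted")


def pending_deletions(events: list[str]) -> set[str]:
    """Return the nodes whose deletion has started but not yet finished."""
    pending: set[str] = set()
    for message in events:
        started = BEGIN_DELETE.search(message)
        finished = DELETED.search(message)
        if started:
            pending.add(started.group(1))
        elif finished:
            pending.discard(finished.group(1))
    return pending


# Hypothetical sample events; the node names are made up for illustration.
events = [
    "Begin to delete resource node node-a",
    "Begin to delete resource node node-b",
    "Resource node node-a deleted",
]
print(pending_deletions(events))  # {'node-b'}
```

An empty result suggests that all requested deletions have finished, so a further resize request would not overlap with one still in progress.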

Deleting a Dedicated Resource Pool

To release resources, delete a dedicated resource pool that is no longer needed.

After a dedicated resource pool is deleted, it cannot be recovered, and the training jobs, notebook instances, real-time services, and batch services that are deployed in the resource pool will become unavailable.

  1. Go to the dedicated resource pool management page and click Delete in the Operation column.
  2. In the dialog box that is displayed, enter DELETE and click OK.