Updated on 2025-05-28 GMT+08:00

Viewing Details About a Standard Dedicated Resource Pool

Resource Pool Details Page

  • Log in to the ModelArts console. In the navigation pane on the left, choose Dedicated Resource Pools > Elastic Clusters.
  • You can search for the resource pools by name, ID, resource pool status, node status, resource pool type, and creation time.
  • In the resource pool list, click a resource pool to go to its details page and view its information.
    • If you have multiple ModelArts Standard resource pools, use the switch control in the upper left corner of a pool's details page to move between them.
    • For a pay-per-use standard resource pool: On the details page, click More in the upper right corner to scale the pool in or out, delete it, or perform other operations. The available operations vary by resource pool.
    • In the Network area of Basic Information, click the number of associated resource pools to view them. The number of available IP addresses on the network is also shown there.
    • In the extended information area, you can view the monitoring information, jobs, nodes, specifications, tags, and events. For details, see the following sections.

Viewing Jobs in a Resource Pool

On the resource pool details page, click Jobs. You can view all jobs running in the resource pool. If a job is queuing, you can view its queuing position.

Only training jobs can be viewed.

Viewing Resource Pool Events

On the resource pool details page, click Events. You can view all events of the resource pool. The cause of an event is PoolStatusChange or PoolResourcesStatusChange.

In the event list, click the filter icon to the right of Event Type to filter events.

  • When creation of a resource pool starts or the pool becomes abnormal, the resource pool status changes and the change is recorded as an event.
  • When the number of nodes that are available, abnormal, being created, or being deleted changes, the node status of the resource pool changes and the change is recorded as an event.
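If event records are processed outside the console, the two causes above can be used to separate them. The following is a minimal sketch only: the record layout (a dictionary with `cause` and `message` fields) and the sample messages are hypothetical and are not taken from any ModelArts API. Only the two cause names come from this page.

```python
from collections import Counter

# The two event causes named on this page.
POOL_STATUS_CHANGE = "PoolStatusChange"
POOL_RESOURCES_STATUS_CHANGE = "PoolResourcesStatusChange"


def filter_events(events, cause):
    """Return only the events whose 'cause' field equals the given cause."""
    return [e for e in events if e.get("cause") == cause]


# Hypothetical event records; field names and messages are illustrative only.
events = [
    {"cause": POOL_STATUS_CHANGE, "message": "pool creation started"},
    {"cause": POOL_RESOURCES_STATUS_CHANGE, "message": "available nodes: 2 -> 3"},
    {"cause": POOL_STATUS_CHANGE, "message": "pool became abnormal"},
]

pool_events = filter_events(events, POOL_STATUS_CHANGE)
counts = Counter(e["cause"] for e in events)
print(len(pool_events), dict(counts))
```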

Viewing Resource Pool Nodes

On the resource pool details page, click Nodes. You can view all nodes in the resource pool.

Some resources are reserved for cluster components. Therefore, CPUs (Available/Total) does not indicate the physical resources on the node; it shows only the resources that services can use. CPU cores are metered in millicores, and 1,000 millicores equal 1 physical core.
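The millicore conversion can be sketched as follows. The function names and the sample values are hypothetical, chosen only to illustrate the 1,000:1 ratio stated above; they are not read from a real node.

```python
MILLICORES_PER_CORE = 1000  # from the text: 1,000 millicores equal 1 physical core


def millicores_to_cores(millicores: int) -> float:
    """Convert a CPU amount in millicores to physical cores."""
    return millicores / MILLICORES_PER_CORE


def format_cpu(available_m: int, total_m: int) -> str:
    """Render an Available/Total pair of millicore values in cores."""
    return f"{millicores_to_cores(available_m):g}/{millicores_to_cores(total_m):g} cores"


# Hypothetical node: 3,500 millicores free out of 7,900 usable by services.
print(format_cpu(3500, 7900))
```

A total such as 7,900 millicores on a node with 8 physical cores would be consistent with the note above that part of the node is reserved for cluster components.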

In the node search box, you can search for nodes by node name, node status, HA redundancy, batch, driver version, driver status, IP address, and resource tag.

You can export the node information of a standard resource pool to an Excel file. Select the nodes, click Export > Export All Data to XLSX or Export > Export Part Data to XLSX above the node list, and then open the exported Excel file from the browser's download list.

On the node list page, click the settings icon to customize the information displayed in the node list.

Viewing Resource Pool Specifications

On the resource pool details page, click Specifications. You can view the resource specifications used by the resource pool and the number of resources corresponding to the specifications, and adjust the container engine space.

Viewing Resource Pool Monitoring Information

On the resource pool details page, click Monitoring. Resource usage for the pool is displayed, including used CPUs, memory usage, and available disk capacity. If AI accelerators are used in the resource pool, GPU and NPU monitoring information is also displayed.

Table 1 Monitoring metrics

| Parameter | Description | Unit | Value Range |
|---|---|---|---|
| CPU usage | CPU usage of a monitored object | % | 0%–100% |
| Memory usage | Percentage of the used physical memory to the total physical memory | % | 0%–100% |
| Used GPUs | Percentage of the used GPUs to the total GPUs | % | 0%–100% |
| Used GPU memory | Percentage of the used GPU memory to the total GPU memory | % | 0%–100% |
| Used NPUs | Percentage of the used NPUs to the total NPUs | % | 0%–100% |
| Used NPU Memory | Percentage of the used NPU memory to the total NPU memory | % | 0%–100% |
| Available Disk Capacity (GB) | Available disk capacity of a monitored object | MB | ≥0 |
| Disk Capacity (GB) | Total disk capacity of a monitored object | MB | ≥0 |
| Disk Usage | Disk usage of a monitored object | % | 0%–100% |
| GPU/NPU Fragments | Fragments are generated during resource scheduling: some cards are idle but cannot be used by multi-card tasks. The number of fragments differs by the number of cards a task needs, depends on how occupied cards are distributed, and changes over time. Only the current status is shown. | N/A | N/A |
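To illustrate why fragments depend on the task size, here is a minimal sketch under a stated assumption: a task needs all of its cards on a single node, so on each node the idle cards left over after packing as many such tasks as possible count as fragments. This definition, the function name, and the sample pool are hypothetical. The console may compute the metric differently.

```python
def fragment_cards(idle_per_node, cards_per_task):
    """Count idle cards that cannot host a task needing `cards_per_task` cards on one node.

    Assumption (not from the source): all cards of a task must sit on the same node,
    so the remainder of idle cards on each node is unusable for that task size.
    """
    if cards_per_task < 1:
        raise ValueError("cards_per_task must be at least 1")
    return sum(idle % cards_per_task for idle in idle_per_node)


# Hypothetical pool: idle cards currently left on each of four 8-card nodes.
idle_per_node = [1, 3, 0, 6]

for size in (1, 2, 4, 8):
    print(size, fragment_cards(idle_per_node, size))
```

With the same 10 idle cards, none are fragments for 1-card tasks, while all 10 are fragments for 8-card tasks, which matches the statement that fragments vary with the number of cards a task requires.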