Updated on 2025-08-20 GMT+08:00

Managing Lite Cluster Nodes

Nodes are fundamental components of a container cluster. On the resource pool details page, click the Nodes tab to replace, delete, reset, or renew nodes. When you hover over a node name, the resource ID is displayed. You can use the resource ID to query bills or billing information of yearly/monthly resources in the Billing Center.

Deleting, Unsubscribing from, or Releasing a Node

  • To delete a single node from a pay-per-use resource pool, locate the target node, click Delete in the Operation column, enter DELETE in the displayed dialog box, and click OK.

    To delete nodes in batches, select the target nodes and click Delete above the node list, enter DELETE in the displayed dialog box, and click OK.

  • For a yearly/monthly resource pool whose resources are not expired, click Unsubscribe in the Operation column. You can unsubscribe from nodes in batches.
  • For a yearly/monthly resource pool whose resources are expired (in the grace period), click Release in the Operation column. Nodes in the grace period cannot be released in batches.

    If the Delete button is available for a yearly/monthly node, the node is an inventory node. In this case, click Delete to remove it.

  • Before deleting, unsubscribing from, or releasing a node, ensure that there are no running jobs on this node. Otherwise, the jobs will be interrupted.
  • Delete, unsubscribe from, or release abnormal nodes in a resource pool and add new ones for substitution.
  • If there is only one node, it cannot be deleted, unsubscribed from, or released.
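
The rules above can be summarized as a small decision function. The following is a minimal sketch, not a ModelArts API: the Node fields and the removal_action helper are hypothetical names used only to illustrate which console operation applies to a node.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical fields for illustration; not a ModelArts data model.
    name: str
    billing_mode: str        # "pay-per-use" or "yearly/monthly"
    expired: bool = False    # True when the node is in the grace period
    inventory: bool = False  # True for a yearly/monthly inventory node


def removal_action(node: Node, pool_size: int) -> str:
    """Return the console operation that applies to the node."""
    if pool_size <= 1:
        # The only node in a resource pool cannot be removed.
        raise ValueError("the only node in a resource pool cannot be removed")
    if node.billing_mode == "pay-per-use" or node.inventory:
        return "Delete"
    if node.expired:
        return "Release"
    return "Unsubscribe"


if __name__ == "__main__":
    print(removal_action(Node("node-a", "pay-per-use"), pool_size=3))
    print(removal_action(Node("node-b", "yearly/monthly", expired=True), pool_size=3))
```

The sketch does not check for running jobs. As noted above, confirm that no jobs are running on the node before you remove it.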

Enabling/Disabling the Deletion Lock

To prevent nodes from being deleted or unsubscribed from by mistake, you can enable the deletion lock. Once the lock is enabled, the nodes cannot be deleted or unsubscribed from until the lock is disabled.

  • The deletion lock can be enabled only for the nodes in the resource pool.
  • The deletion lock restricts only node deletion and unsubscription. Other operations, such as node replacement, node restart, and node reset, remain available. Note that a resource pool can still be deleted even if it contains nodes with the deletion lock enabled.
  • Enabling deletion lock: Locate the target node and choose More > Enable Deletion Lock in the Operation column. In the displayed dialog box, confirm the information, enter YES in the text box, and click OK.

    To enable deletion lock for multiple nodes in batches, select the target nodes and choose More > Enable Deletion Lock above the node list.

  • Disabling deletion lock: Locate the target node and choose More > Disable Deletion Lock in the Operation column. In the displayed dialog box, confirm the information, enter YES in the text box, and click OK.

    To disable deletion lock for multiple nodes in batches, select the target nodes and choose More > Disable Deletion Lock above the node list.
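
The scope of the deletion lock can be expressed as a simple check. This is an illustrative sketch only: the operation names and the is_allowed helper are hypothetical and do not correspond to a ModelArts API.

```python
# Operations that the deletion lock restricts, per the description above.
# The lowercase operation names are hypothetical labels for this sketch.
LOCK_RESTRICTED = {"delete", "unsubscribe"}


def is_allowed(operation: str, deletion_lock_enabled: bool) -> bool:
    """Return True if the deletion lock does not block the operation."""
    return not (deletion_lock_enabled and operation in LOCK_RESTRICTED)


if __name__ == "__main__":
    for op in ("delete", "unsubscribe", "replace", "restart", "reset"):
        print(op, is_allowed(op, deletion_lock_enabled=True))
```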

Querying Plug-in Component

On the resource pool details page, choose Node Management from the navigation pane to view the plug-in usage of the current node.

To view the instance usage of the plug-in, locate the target node and choose More > Query Plug-in Component in the Operation column.

Figure 1 Instances

Renewing a Subscription, Enabling Auto-Renewal, or Modifying Auto-Renewal

For yearly/monthly nodes, you can renew them, enable auto-renewal, and modify auto-renewal in the Nodes tab. You can also perform batch operations on nodes.

The fees generated by auto-renewal will be deducted from your account balance. For details, see Auto-Renewal.

Resetting a Node

On the Nodes tab, locate the node you want to reset and click Reset in the Operation column. To reset multiple nodes, select their check boxes and choose More > Reset above the node list.

Configure the parameters listed in Table 1.

Table 1 Parameters

| Name | Description |
| --- | --- |
| Operating System | Choose a supported OS from the drop-down list. |
| Configuration Mode | Select a configuration mode for resetting the node. By node percentage: the maximum percentage of nodes that can be reset at a time. By instance quantity: the maximum number of nodes that can be reset at a time. |
| Driver Version | Specify the driver version of the nodes to reset from the drop-down list. |
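
To illustrate how the two Configuration Mode options in Table 1 limit a batch, the following sketch computes the largest number of nodes that would be reset at a time. It is not the console's implementation: the function name, the mode strings, rounding up for percentages, and the minimum of one node are all assumptions made for this example.

```python
def max_nodes_per_batch(total_nodes: int, mode: str, value: int) -> int:
    """Return the assumed maximum number of nodes reset at a time.

    mode is "percentage" (value from 1 to 100) or "quantity" (value >= 1).
    """
    if total_nodes < 1:
        raise ValueError("total_nodes must be at least 1")
    if mode == "percentage":
        if not 1 <= value <= 100:
            raise ValueError("percentage must be between 1 and 100")
        # Assumption: round up, using integer arithmetic to avoid float error.
        limit = (total_nodes * value + 99) // 100
    elif mode == "quantity":
        if value < 1:
            raise ValueError("quantity must be at least 1")
        limit = value
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Never exceed the number of selected nodes; always allow at least one.
    return max(1, min(limit, total_nodes))


if __name__ == "__main__":
    print(max_nodes_per_batch(10, "percentage", 25))
    print(max_nodes_per_batch(8, "quantity", 3))
```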

Check the node reset records on the Records page. While a node is being reset, its status is Resetting. After the reset is complete, the node status changes to Available. Resetting a node is free of charge.

  • Resetting a node affects the services running on it. During the reset, the local disk and the Kubernetes labels on the node are cleared. Proceed with caution when performing this operation.
  • Only nodes in the Available state can be reset.
  • A single node can be in only one reset task at a time. Multiple reset tasks cannot be delivered to the same node at a time.
  • If there are any nodes in the Replacing state in the operation records, nodes in the resource pool cannot be reset.
  • When the driver of a resource pool is being upgraded, nodes in this resource pool cannot be reset.
  • For GPU and NPU specifications, the node driver may be upgraded after the node is reset. Wait until the upgrade is complete.
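
The reset constraints listed above can be checked together before a reset is submitted. The sketch below is illustrative only: the function and its parameters are hypothetical names, and the inputs are assumed to come from what you observe on the console rather than from a documented API.

```python
from typing import List


def reset_blockers(
    node_status: str,
    node_in_reset_task: bool,
    pool_has_replacing_node: bool,
    pool_driver_upgrading: bool,
) -> List[str]:
    """Return the reasons a node cannot be reset; empty if it can."""
    blockers = []
    if node_status != "Available":
        blockers.append("node is not in the Available state")
    if node_in_reset_task:
        blockers.append("node is already part of a reset task")
    if pool_has_replacing_node:
        blockers.append("a node in the resource pool is in the Replacing state")
    if pool_driver_upgrading:
        blockers.append("the resource pool driver is being upgraded")
    return blockers


if __name__ == "__main__":
    print(reset_blockers("Available", False, False, False))
    print(reset_blockers("Resetting", True, False, False))
```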

Authorizing O&M on the Event Center Page

To view the faulty nodes reported by the ModelArts O&M platform, log in to the ModelArts console. In the navigation pane on the left, choose Event Center. The planned events of the faulty nodes are displayed, including the basic information, event type, event status, and event description. You can either redeploy the nodes or authorize Huawei technical support to perform O&M operations.

  • Authorization conditions

    Table 2 lists the event types and event statuses for which authorization operations can be performed on a faulty node.

    Table 2 Authorization conditions

    | Event Type | Event Status | Authorization Operations |
    | --- | --- | --- |
    | System maintenance | To be authorized | Authorization and redeployment |
    | Local disk recovery | To be authorized | Authorization and redeployment. After the local disk is recovered, you can restore the partition by resetting the node. WARNING: After authorization, recovering the local disk will cause local disk data loss. Therefore, migrate services and back up data before authorization. |
    | Restarting a node | To be authorized | Authorization |
    | O&M authorization | To be authorized | Authorization |
    | Supernode maintenance | To be authorized | Authorization |
    | Supernode redeployment | To be authorized | Redeployment. Redeployment of supernodes must be performed within physical supernodes. When supernodes are sold out, redeployment is not supported, and the authorization button becomes unavailable. |
    | Supernode local disk recovery | To be authorized | Authorization. WARNING: After authorization, recovering the local disk will cause local disk data loss. Therefore, migrate services and back up data before authorization. |

  • Authorization

    If the faulty nodes meet the requirements listed in Table 2, you can authorize Huawei technical support to perform O&M on the faulty nodes.

    To do so, log in to the ModelArts console. In the navigation pane on the left, choose Event Center. Locate the target node and click Authorize in the Operation column. In the displayed dialog box, click OK.

    If the planned event does not meet the requirements listed in Table 2, the Authorize button becomes unavailable.

    After the O&M, Huawei technical support will disable the authorization. No further operations are required.

  • Redeployment

    If the faulty nodes meet the redeployment requirements listed in Table 2, you can authorize Huawei technical support to redeploy the faulty nodes.

    After the O&M, Huawei technical support will disable the authorization. No further operations are required.

    Redeploying nodes can restore them quickly, but local disk data will be lost. Therefore, migrate services and back up data before redeployment.

    1. To redeploy a node, log in to the ModelArts console. In the navigation pane on the left, choose Event Center under Resource Management, locate the node, and click Redeploy in the Operation column.

      If the planned event does not meet the requirements listed in Table 2, the Redeploy button becomes unavailable.

    2. Decide whether to select Forcible redeployment, enter YES in the text box, and click OK.

      Redeployment depends on the node status. If the node is unavailable, redeployment cannot be completed. However, you can select Forcible redeployment to forcibly redeploy the node.

      Forcible redeployment resets the node, deleting all data on both its local and cloud disks. Exercise caution when performing this operation.
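
The conditions in Table 2 amount to a lookup from event type to the operations that can be performed. The following sketch encodes that lookup. It is an illustration, not console code: the enabled_buttons helper is a hypothetical name, and the sketch does not model the case where supernodes are sold out.

```python
# Operations available per event type, transcribed from Table 2.
ALLOWED_OPERATIONS = {
    "System maintenance": {"Authorize", "Redeploy"},
    "Local disk recovery": {"Authorize", "Redeploy"},
    "Restarting a node": {"Authorize"},
    "O&M authorization": {"Authorize"},
    "Supernode maintenance": {"Authorize"},
    "Supernode redeployment": {"Redeploy"},
    "Supernode local disk recovery": {"Authorize"},
}


def enabled_buttons(event_type: str, event_status: str) -> set:
    """Return the buttons expected to be available for a planned event."""
    # Every row in Table 2 requires the "To be authorized" status.
    if event_status != "To be authorized":
        return set()
    return set(ALLOWED_OPERATIONS.get(event_type, ()))


if __name__ == "__main__":
    print(sorted(enabled_buttons("System maintenance", "To be authorized")))
    print(sorted(enabled_buttons("Supernode redeployment", "To be authorized")))
```

An empty result corresponds to the case where both the Authorize and Redeploy buttons are unavailable.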

Restarting a Node

Locate the target node and choose More > Reboot in the Operation column. You can also select node names and click Reboot above the node list to restart nodes in batches. Restarting a node will affect running services.

Adding, Editing, or Deleting Resource Tags

Use resource tags for easy billing management.

To edit the resource tags of a single node, locate the target node and choose More > Edit Resource Tag in the Operation column.

You can also select node names and choose More > Add/Edit Resource Tag or Delete Resource Tag above the node list to manage tags in batches.

Figure 2 Adding, editing, or deleting resource tags

Exporting Node Data

You can export the node information of a Lite resource pool as an Excel file.

Select the target nodes and choose Export > Export All Data to XLSX or Export > Export Part Data to XLSX above the node list. Then open the downloaded file from your browser to view the exported Excel file.

Upgrading a Driver

You can upgrade the driver version of a single node in a Lite resource pool or upgrade the driver versions of multiple nodes in batches. For details, see Upgrading the Driver of a Lite Cluster Resource Pool Node.

Searching for a Node

In the search box on the node management page, you can search for nodes by node name, status, batch, driver version, driver status, IP address, node pool, or resource tag.

Specifying Node Information to Display

On the node management page, click the icon in the upper right corner to customize the information displayed in the node list.