Managing Lite Cluster Nodes
Nodes are fundamental components of a container cluster. On the resource pool details page, click the Nodes tab to replace, delete, or reset nodes.
- Deleting, unsubscribing from, or releasing a node
- For a pay-per-use resource pool, click Delete in the Operation column.
To delete nodes in batches, select the check boxes next to the node names, and click Delete.
- For a yearly/monthly resource pool whose resources are not expired, click Unsubscribe in the Operation column.
- For a yearly/monthly resource pool whose resources are expired (in the grace period), click Release in the Operation column.
If the Delete button is available for a yearly/monthly node, the node is an inventory node. In this case, click Delete.
- Before deleting, unsubscribing from, or releasing a node, ensure that there are no running jobs on this node. Otherwise, the jobs will be interrupted.
- Delete, unsubscribe from, or release abnormal nodes in a resource pool and add new ones for substitution.
- If there is only one node, it cannot be deleted, unsubscribed from, or released.
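The choice between Delete, Unsubscribe, and Release described above can be summarized as a small decision function. This is only an illustrative sketch: `BillingMode`, `removal_action`, and their parameters are hypothetical names, not part of any ModelArts API, and the inventory-node case (where Delete is offered for a yearly/monthly node) is not modeled.

```python
from enum import Enum


class BillingMode(Enum):
    # Hypothetical labels for the two billing modes named in this section.
    PAY_PER_USE = "pay-per-use"
    YEARLY_MONTHLY = "yearly/monthly"


def removal_action(billing_mode: BillingMode, expired: bool, node_count: int) -> str:
    """Return the Operation-column button that applies when removing one node."""
    # The only node in a resource pool cannot be deleted, unsubscribed from, or released.
    if node_count <= 1:
        raise ValueError("the only node in a resource pool cannot be removed")
    if billing_mode is BillingMode.PAY_PER_USE:
        return "Delete"
    # Yearly/monthly: Release if expired (grace period), otherwise Unsubscribe.
    return "Release" if expired else "Unsubscribe"
```

For example, `removal_action(BillingMode.YEARLY_MONTHLY, expired=True, node_count=4)` selects Release, matching the grace-period rule above.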
- Replacing a node
On the Nodes tab, locate the node to be replaced, and click Replace in the Operation column. No fee is charged for this operation.
Check the node replacement records on the Records page. Running indicates that the node is being replaced. After the replacement, you can check the new node in the node list.
A replacement can take up to 24 hours. If no suitable resource is found before the replacement times out, the status changes to Failed. Hover over the status to check the failure cause.
- The number of replacements per day cannot exceed 20% of the total nodes in the resource pool. The number of nodes to be replaced cannot exceed 5% of the total nodes in the resource pool.
- Ensure that there are idle node resources. Otherwise, the replacement may fail.
- If there are any nodes in the Resetting state in the operation records, nodes in the resource pool cannot be replaced.
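The replacement constraints above can be expressed as a quick pre-check. The sketch below applies the stated percentages literally and makes two assumptions that the source does not confirm: the 5% limit is read as applying to the nodes in a single replacement request, and the 20% limit as applying to all nodes replaced on the same day, including the current request. How the service rounds these limits for small resource pools is not stated, so no rounding is applied. `can_replace` and its parameters are hypothetical names.

```python
def can_replace(total_nodes: int, requested: int, replaced_today: int,
                resetting_in_progress: bool) -> bool:
    """Check a replacement request against the documented limits (literal reading)."""
    if total_nodes < 1 or requested < 1 or replaced_today < 0:
        raise ValueError("node counts must be positive")
    # Nodes in the Resetting state block all replacements in the pool.
    if resetting_in_progress:
        return False
    # Assumption: nodes in this request may not exceed 5% of the pool.
    within_request_limit = requested * 100 <= total_nodes * 5
    # Assumption: nodes replaced today, including this request, may not exceed 20%.
    within_daily_limit = (replaced_today + requested) * 100 <= total_nodes * 20
    return within_request_limit and within_daily_limit
```

Integer arithmetic is used instead of floating-point percentages so that boundary cases such as exactly 5% compare exactly. The idle-resource requirement is not modeled because it depends on capacity that only the service can see.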
- Resetting a node
On the Nodes tab, locate the node you want to reset and click Reset in the Operation column. To reset multiple nodes at once, select them and click Reset.
Configure the parameters described in Table 1.
Table 1 Parameters

| Name | Description |
| --- | --- |
| Operating System | Select an OS from the drop-down list box. |
| Configuration Mode | Select a configuration mode for resetting the node. By node percentage: the maximum ratio of nodes that can be reset if there are multiple nodes in the reset task. By node quantity: the maximum number of nodes that can be reset if there are multiple nodes in the reset task. |
Check the node reset records on the Records page, as shown in Figure 2. While a node is being reset, its status is Resetting. After the reset is complete, the node status changes to Available, as shown in Figure 3. No fee is charged for resetting a node.
- Resetting a node will impact the operation of related services. During the reset process, the local disk and the Kubernetes tag on the node will be cleared. Proceed with caution when performing this operation.
- Only nodes in the Available state can be reset.
- A node can belong to only one reset task at a time. Multiple reset tasks cannot be delivered to the same node simultaneously.
- If there are any nodes in the Replacing state in the operation records, nodes in the resource pool cannot be reset.
- When the driver of a resource pool is being upgraded, nodes in this resource pool cannot be reset.
- For GPU and NPU specifications, the node driver may be upgraded after the node is reset. Wait until the upgrade is complete.
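The reset preconditions listed above can be collected into a single pre-check that reports every reason a reset would be refused. This is a minimal sketch under stated assumptions: the `Node` class, the `reset_blockers` function, and the boolean inputs are hypothetical and stand in for information you would read from the Nodes tab and the Records page. It is not a ModelArts API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    # Hypothetical view of one node as shown on the Nodes tab.
    name: str
    status: str                  # for example "Available", "Resetting", "Replacing"
    in_reset_task: bool = False  # True if a reset task already includes this node


def reset_blockers(node: Node, pool_has_replacing_record: bool,
                   driver_upgrading: bool) -> List[str]:
    """Return the documented reasons why this node cannot be reset (empty if none)."""
    reasons = []
    if node.status != "Available":
        reasons.append("only nodes in the Available state can be reset")
    if node.in_reset_task:
        reasons.append("the node is already part of a reset task")
    if pool_has_replacing_record:
        reasons.append("a node in the pool is in the Replacing state")
    if driver_upgrading:
        reasons.append("the resource pool driver is being upgraded")
    return reasons
```

Returning a list rather than a single boolean makes it easier to show an operator every blocking condition at once. An empty list means none of the documented restrictions applies.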
- Authorizing the O&M operations
During fault locating and performance diagnosis, you need to authorize certain O&M operations. To do so, go to the resource pool details page, click the Nodes tab, locate the target node, and click More > Authorize in the Operation column. In the displayed dialog box, click OK.
Figure 4 Authorization
Normally, the Authorize button is unavailable. It becomes available after Huawei technical support applies for O&M.
After the O&M is complete, Huawei technical support disables the authorization. You do not need to perform any further operations.