huawei-npu
Introduction
huawei-npu supports and manages Huawei NPUs in containers.
After this add-on is installed, you can create NPU nodes to enable quick, efficient inference and image recognition.
Prerequisites
- You have added the accelerator/huawei-npu label to the node where huawei-npu is to be installed. The label value can be empty.
- To run this add-on on an Ascend Snt9 device, install Volcano first.
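The label prerequisite can be met with the standard kubectl label command. The following is a minimal sketch, assuming kubectl is already configured for the on-premises cluster. The node name npu-node-1 and the helper label_command are hypothetical and used only for illustration. The helper prints the command for review instead of running it.

```shell
# Print the kubectl command that adds the accelerator/huawei-npu label
# with an empty value to the given node. Nothing is executed here.
label_command() {
  printf 'kubectl label node %s accelerator/huawei-npu=\n' "$1"
}

# npu-node-1 is a placeholder; take real names from `kubectl get nodes`.
label_command "npu-node-1"
```

Once the node name is correct, run the printed command directly on a host that has access to the cluster.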
Constraints
- This add-on can only be installed in on-premises clusters v1.28 or later.
- Only Arm and Huawei Cloud EulerOS 2.0 are supported.
- Only Ascend Snt9 NPUs are supported.
- Ascend Snt9 devices require Volcano, and each container can be scheduled with only 1, 2, 4, or 8 NPUs.
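The per-container NPU count constraint can be checked before a workload is submitted. The following sketch is only an illustration of that rule. The function name is_valid_npu_count is hypothetical and not part of the add-on.

```shell
# Return 0 if the requested per-container NPU count is 1, 2, 4, or 8.
is_valid_npu_count() {
  case "$1" in
    1|2|4|8) return 0 ;;
    *) return 1 ;;
  esac
}

# Try a few sample values.
for n in 1 3 8; do
  if is_valid_npu_count "$n"; then
    echo "$n NPUs: allowed"
  else
    echo "$n NPUs: not allowed"
  fi
done
```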
Installing the Add-on
- Log in to the UCS console and choose Fleets. Then, click the cluster name to access the cluster console. In the navigation pane, choose Add-ons. On the displayed page, locate huawei-npu and click Install.
- Configure the NPU parameters. The default values suit most scenarios, so you are advised to keep them.
- Click Install.
Figure 1 Installing huawei-npu
- Before installing huawei-npu, ensure that Volcano has been installed.
- After the NPU driver is installed on a node, restart that node for the driver to take effect. For details about how to check whether the driver is installed, see How to Check Whether the NPU Driver Has Been Installed on a Node.
- Uninstalling this add-on does not automatically delete the installed NPU driver. You need to manually uninstall the NPU driver to delete related resources.
Upgrading the Add-on
- Log in to the UCS console and choose Fleets. Click the cluster name to access the cluster console. In the navigation pane, choose Add-ons.
- Locate huawei-npu in Add-ons Installed. If "New version available" is displayed next to the version label, click Upgrade.
- Configure basic information and select the version.
- Click Upgrade.
Uninstalling the Add-on
- Log in to the UCS console and choose Fleets. Click the cluster name to access the cluster console. In the navigation pane, choose Add-ons.
- Locate huawei-npu in Add-ons Installed and click Uninstall.
- In the displayed dialog box, click Yes.
Installing an Ascend NPU Driver
Ensure that the Ascend NPU has been allocated to a node and confirm the device model. Then download the driver from the Ascend official community and install it by following the installation guide.
After the installation is complete, run the following command to check all chips in the /dev directory of the node:
ls -l /dev/davinci*
Run the following command to check whether the driver is loaded:
npu-smi info
If the command displays NPU information, the driver has been loaded successfully. Otherwise, the driver failed to load. In that case, contact Huawei technical support.
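The two checks above can be combined into one script to run on the node. This is a sketch under stated assumptions: the helper count_davinci_devices is hypothetical, and the script only reports what it finds without interpreting the npu-smi output.

```shell
# Count davinci device nodes in a directory (defaults to /dev).
count_davinci_devices() {
  dev_dir="${1:-/dev}"
  count=0
  for f in "$dev_dir"/davinci*; do
    # An unmatched glob stays literal, so confirm the path exists.
    if [ -e "$f" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}

echo "davinci device nodes: $(count_davinci_devices)"

# Call npu-smi only if it is on PATH, so the script still completes
# on nodes where the driver has not been installed.
if command -v npu-smi >/dev/null 2>&1; then
  npu-smi info || echo "npu-smi failed: the driver may not be loaded"
else
  echo "npu-smi not found: the driver may not be installed"
fi
```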
How to Check Whether the NPU Driver Has Been Installed on a Node
After the driver is installed on a node, restart that node for the driver to take effect. Until the node is restarted, the driver does not take effect and NPU resources are unavailable. To check whether the driver is installed, perform the following operations:
Log in to the UCS console and choose Fleets. Then, click the cluster name to access the cluster console. In the navigation pane, choose Add-ons. On the displayed page, click the add-on name to view the add-on instance list. Confirm that each instance is in the Running state.
If the node is restarted before the NPU driver is installed, the driver installation may fail, and a message is displayed on the Nodes page indicating that the Ascend driver is not ready. In this case, uninstall the NPU driver from the node and restart the node to reinstall the NPU driver. After confirming that the driver is installed, restart the node.
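The instance check can also be approximated from the command line by inspecting the add-on pods with kubectl get pods. The following sketch makes several assumptions: the helper all_pods_running is hypothetical, the pod names in the sample are invented, and the namespace where the add-on instances run is not stated in this document, so it is left for you to supply.

```shell
# Read `kubectl get pods` text output on stdin. Succeed only if at least
# one pod is listed and the STATUS column of every pod is "Running".
all_pods_running() {
  awk 'BEGIN { total = 0; bad = 0 }
       NR > 1 { total++; if ($3 != "Running") bad++ }
       END { if (total > 0 && bad == 0) exit 0; exit 1 }'
}

# Sample input with an invented pod name, in the default column layout.
all_pods_running <<'EOF' && echo "all instances Running"
NAME               READY   STATUS    RESTARTS   AGE
huawei-npu-abcde   1/1     Running   0          5m
EOF
```

On a real cluster, pipe the output of kubectl get pods for the add-on's namespace into all_pods_running in place of the sample input.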