What Do I Do If the Ascend AI Accelerator Card (NPU) Is Abnormal?
Symptom
The NPU-enabled application fails to be delivered or cannot run.
Solution
If the NPU-enabled application fails to be created:
An application that requests NPU resources must be deployed on a node that has the Ascend AI accelerator card enabled. If the Ascend AI accelerator card was not enabled when the node was registered, deploying such an application on that node fails, and the system displays a message indicating that the application failed to be created.
As shown in the following figure, select an Ascend AI accelerator card based on the model when registering an edge node.
For a node with an Ascend AI accelerator card enabled, you can view the AI accelerator card information and check the list of healthy chips on the node details page.
If the running status of the NPU-enabled application is abnormal:
- Ensure that the number of Ascend AI accelerator cards requested by the application does not exceed the number of healthy chips on the node. Otherwise, application scheduling fails.
Check the number of Ascend cards on the Upgrade tab page of the containerized application details page:
- Log in to the edge node and check whether the NPU plug-in container is running normally.
docker ps -a | grep npu-plugin
- If the container status is abnormal, restart the container. Replace $containerID with the container ID returned by the preceding command.
docker restart $containerID
- If the fault persists, go to What Do I Do If an Application Fails to Be Delivered to an Edge Node? and What Do I Do If a Containerized Application Fails to Be Started on an Edge Node?.
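The plug-in check and restart steps above can be combined into one small script. This is a minimal sketch, not an official tool: the function names list_npu_plugin_ids and restart_abnormal_npu_plugins are hypothetical, and it assumes that every line of the docker ps -a output containing npu-plugin belongs to the NPU plug-in, as in the grep command above.

```shell
#!/bin/sh
# Sketch only: restart NPU plug-in containers that are not running.
# Assumption: any "docker ps -a" line containing "npu-plugin" is the plug-in.

# Print the container ID (first column) of every matching container.
list_npu_plugin_ids() {
    docker ps -a | awk '/npu-plugin/ { print $1 }'
}

# Restart each matching container whose State.Running field is not "true".
restart_abnormal_npu_plugins() {
    for container_id in $(list_npu_plugin_ids); do
        running=$(docker inspect -f '{{.State.Running}}' "$container_id")
        if [ "$running" != "true" ]; then
            echo "restarting $container_id"
            docker restart "$container_id"
        fi
    done
}
```

After defining the functions on the edge node, run restart_abnormal_npu_plugins, and then run docker ps -a | grep npu-plugin again to confirm that the container status has recovered.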