What Can I Do If an NPU Driver Fails to Be Upgraded?
An NPU driver upgrade can fail because of version incompatibility or environment configuration issues. This section describes common error scenarios and their solutions so that you can troubleshoot based on the error code or message.
Scenario 1: Error -8001 Reported When the NPU Driver Information Is Obtained
- Command executed:
npu-smi info
- Error message:
dcmi module initialize failed. ret is -8001
- Solution: Download the matching driver package from the documentation center in advance, and then run the ./{product name}-npu-driver_x.x.x_linux-{arch}.run --upgrade command to upgrade the driver. The following is an example:
./A300-3010-npu-driver_24.1.rc3.1_linux-aarch64.run --upgrade
If information similar to the following is displayed, the driver has been upgraded:
Verifying archive integrity... 100% SHA256 checksums are OK. All good.
Uncompressing ASCEND DRIVER RUN PACKAGE 100%
[Driver] [2025-04-17 09:39:16] [INFO]Start time: 2025-04-17 09:39:16
[Driver] [2025-04-17 09:39:16] [INFO]LogFile: /var/log/ascend_seclog/ascend_install.log
[Driver] [2025-04-17 09:39:16] [INFO]OperationLogFile: /var/log/ascend_seclog/operation.log
[Driver] [2025-04-17 09:39:16] [INFO]base version is 24.1.rc3.1.
[Driver] [2025-04-17 09:39:16] [WARNING]Do not power off or restart the system during the installation/upgrade
[Driver] [2025-04-17 09:39:16] [INFO]set username and usergroup, HwHiAiUser:HwHiAiUser
[Driver] [2025-04-17 09:39:18] [INFO]device hot reset start
/usr/local/Ascend/driver/script/device_hot_reset.sh: line 269: echo: write error: No such device
[Driver] [2025-04-17 09:39:23] [INFO]driver install type: Direct
[Driver] [2025-04-17 09:39:23] [INFO]upgradePercentage:10%
[Driver] [2025-04-17 09:39:29] [INFO]upgradePercentage:30%
[Driver] [2025-04-17 09:39:29] [INFO]upgradePercentage:40%
[Driver] [2025-04-17 09:39:30] [INFO]upgradePercentage:90%
[Driver] [2025-04-17 09:39:33] [INFO]upgradePercentage:100%
[Driver] [2025-04-17 09:39:34] [INFO]Driver package upgraded finished!
[Driver] [2025-04-17 09:39:34] [WARNING]Kernel modules can not be removed, reboot needed for installation/upgrade to take effect!
[Driver] [2025-04-17 09:39:34] [INFO]End time: 2025-04-17 09:39:34
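When the driver package is run from a script, its output can be checked automatically instead of by eye. The following is a minimal shell sketch, not part of the driver package, that reads the package output from standard input and reports the outcome. The function name check_driver_log and the labels it prints are hypothetical. The matched strings are taken only from the sample logs in this section, so other driver versions may word them differently.

```shell
# Sketch: classify the output of the driver .run package read from stdin.
# Prints the first error code (and returns 1) if an ERR_NO entry is present,
# REBOOT_NEEDED if the log asks for a reboot, OK if it only reports success,
# and UNKNOWN (return 2) if none of the known strings is found.
check_driver_log() {
    _log=$(cat)
    if printf '%s\n' "$_log" | grep -q 'ERR_NO:'; then
        # For example: ERR_NO:0x0094
        printf '%s\n' "$_log" | grep -o 'ERR_NO:0x[0-9A-Fa-f]*' | head -n 1
        return 1
    elif printf '%s\n' "$_log" | grep -qi 'reboot needed'; then
        echo "REBOOT_NEEDED"
    elif printf '%s\n' "$_log" | grep -Eq 'upgraded finished|installed successfully'; then
        echo "OK"
    else
        echo "UNKNOWN"
        return 2
    fi
}

# Possible usage (run as a user allowed to upgrade the driver):
#   ./A300-3010-npu-driver_24.1.rc3.1_linux-aarch64.run --upgrade 2>&1 | check_driver_log
```

A result of REBOOT_NEEDED corresponds to the case described in Scenario 4.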
Scenario 2: Error 0x0094 Reported After the upgrade Command Is Executed
- Command executed:
./A300-3010-npu-driver_24.1.rc3.1_linux-x86_64.run --upgrade
- Error message:
Verifying archive integrity... 100% SHA256 checksums are OK. All good.
Uncompressing ASCEND DRIVER RUN PACKAGE 100%
[Driver] [2025-04-17 09:43:09] [ERROR]ERR_NO:0x0094;ERR_DES:Operation failed, An all-in-one RUN package is found in /usr/local/HiAI, which needs to be uninstalled before proceeding.
- Solution: Manually uninstall the existing driver, and then run the upgrade command again. The following is an example of the uninstall command:
./A300-3010-npu-driver_24.1.rc3.1_linux-x86_64.run --uninstall
If information similar to the following is displayed, the driver has been uninstalled:
Verifying archive integrity... 100% SHA256 checksums are OK. All good.
Uncompressing ASCEND DRIVER RUN PACKAGE 100%
[Driver] [2025-04-22 15:18:20] [INFO]Start time: 2025-04-22 15:18:20
[Driver] [2025-04-22 15:18:20] [INFO]LogFile: /var/log/ascend_seclog/ascend_install.log
[Driver] [2025-04-22 15:18:20] [INFO]OperationLogFile: /var/log/ascend_seclog/operation.log
[Driver] [2025-04-22 15:18:20] [INFO]base version is 24.1.rc3.1.
[Driver] [2025-04-22 15:18:45] [INFO]Driver package uninstalled successfully! Reboot needed for uninstallation to take effect!
[Driver] [2025-04-22 15:18:45] [INFO]End time: 2025-04-22 15:18:45
Scenario 3: Error 0x0091 Reported After the install Command Is Executed
- Command executed:
./A300-3010-npu-driver_24.1.rc3.1_linux-aarch64.run --install-for-all --full
- Error message:
Verifying archive integrity... 100% SHA256 checksums are OK. All good.
Uncompressing ASCEND DRIVER RUN PACKAGE 100%
[Driver] [2025-04-17 09:18:42] [INFO]Start time: 2025-04-17 09:18:42
[Driver] [2025-04-17 09:18:42] [INFO]LogFile: /var/log/ascend_seclog/ascend_install.log
[Driver] [2025-04-17 09:18:42] [INFO]OperationLogFile: /var/log/ascend_seclog/operation.log
[Driver] [2025-04-17 09:18:42] [WARNING]Do not power off or restart the system during the installation/upgrade
[Driver] [2025-04-17 09:18:42] [ERROR]ERR_NO:0x0091;ERR_DES:HwHiAiUser not exists! Please add HwHiAiUser
[Driver] [2025-04-17 09:18:42] [INFO]End time: 2025-04-17 09:18:42
- Solution: Manually add the HwHiAiUser account and run the install command again. HwHiAiUser is the default system account of the Ascend NPU AI acceleration platform and is used to securely manage and run NPU-related services and applications. To create the account, run the following command:
adduser HwHiAiUser
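A script can check whether the account already exists before trying to create it. The following is a minimal sketch, assuming the standard id and adduser utilities are available on the node. The function name ensure_account is hypothetical, and the group, home directory, and login shell are left at the system defaults because this section does not specify them.

```shell
# Sketch: create an account only if it is missing.
# Creating an account requires root privileges.
ensure_account() {
    account=$1
    if id -u "$account" >/dev/null 2>&1; then
        # id succeeds only when the account is known to the system
        echo "$account already exists"
    else
        adduser "$account" && echo "$account created"
    fi
}

# Possible usage before re-running the install command:
#   ensure_account HwHiAiUser
```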
Scenario 4: reboot Displayed After the install Command Is Executed
- Command executed:
./A300-3010-npu-driver_24.1.rc3.1_linux-aarch64.run --install-for-all --full
- Messages displayed:
[Driver] [2025-04-17 10:09:30] [WARNING]Kernel modules can not be removed, reboot needed for installation/upgrade to take effect!
[Driver] [2025-04-17 10:18:47] [INFO]Driver package installed successfully! Reboot needed for installation/upgrade to take effect!
- Solution: The driver has been installed, but it takes effect only after the node is restarted. Restart the node. To avoid affecting services running on the node, you are advised to drain the node before restarting it. For details, see Draining a Node.
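The drain-and-restart sequence can be sketched as follows, assuming the node belongs to a Kubernetes cluster that you manage with kubectl. This is an assumption: the section itself only refers to Draining a Node and does not name a tool. The function name print_reboot_plan and the node name in the usage line are hypothetical. The function only prints the commands so that they can be reviewed before anything is run.

```shell
# Sketch: print the commands for draining, restarting, and restoring a node.
# Nothing is executed; review the plan and run each step where noted.
print_reboot_plan() {
    node=$1
    if [ -z "$node" ]; then
        echo "usage: print_reboot_plan <node-name>" >&2
        return 1
    fi
    # Step 1 (from a machine with kubectl access): evict pods from the node
    echo "kubectl drain $node --ignore-daemonsets --delete-emptydir-data"
    # Step 2 (on the node itself): restart so that the new driver takes effect
    echo "reboot"
    # Step 3 (from a machine with kubectl access): make the node schedulable again
    echo "kubectl uncordon $node"
    # Step 4 (on the node itself): confirm that the driver information can be obtained
    echo "npu-smi info"
}

# Possible usage:
#   print_reboot_plan my-npu-node
```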