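Updated: 2025-07-18 GMT+08:00
How Do I Resolve an NPU Driver Upgrade Failure?
Commands may fail during an NPU driver upgrade because of version incompatibility, environment configuration, or similar issues. This article lists typical errors that occur during an NPU driver upgrade and their solutions, so that you can resolve a problem quickly based on the error code or error description you see.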
Error scenario 1: error -8001 when querying NPU driver information
- Command:
npu-smi info
- Error message:
dcmi module initialize failed. ret is -8001
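- Solution: Download the matching driver package from the documentation center in advance, and then upgrade the driver with the ./{product name}-npu-driver_x.x.x_linux-{arch}.run --upgrade command. For example: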
./A300-3010-npu-driver_24.1.rc3.1_linux-aarch64.run --upgrade
If the following output is displayed, the upgrade succeeded.
Verifying archive integrity... 100% SHA256 checksums are OK. All good.
Uncompressing ASCEND DRIVER RUN PACKAGE 100%
[Driver] [2025-04-17 09:39:16] [INFO]Start time: 2025-04-17 09:39:16
[Driver] [2025-04-17 09:39:16] [INFO]LogFile: /var/log/ascend_seclog/ascend_install.log
[Driver] [2025-04-17 09:39:16] [INFO]OperationLogFile: /var/log/ascend_seclog/operation.log
[Driver] [2025-04-17 09:39:16] [INFO]base version is 24.1.rc3.1.
[Driver] [2025-04-17 09:39:16] [WARNING]Do not power off or restart the system during the installation/upgrade
[Driver] [2025-04-17 09:39:16] [INFO]set username and usergroup, HwHiAiUser:HwHiAiUser
[Driver] [2025-04-17 09:39:18] [INFO]device hot reset start
/usr/local/Ascend/driver/script/device_hot_reset.sh: line 269: echo: write error: No such device
[Driver] [2025-04-17 09:39:23] [INFO]driver install type: Direct
[Driver] [2025-04-17 09:39:23] [INFO]upgradePercentage:10%
[Driver] [2025-04-17 09:39:29] [INFO]upgradePercentage:30%
[Driver] [2025-04-17 09:39:29] [INFO]upgradePercentage:40%
[Driver] [2025-04-17 09:39:30] [INFO]upgradePercentage:90%
[Driver] [2025-04-17 09:39:33] [INFO]upgradePercentage:100%
[Driver] [2025-04-17 09:39:34] [INFO]Driver package upgraded finished!
[Driver] [2025-04-17 09:39:34] [WARNING]Kernel modules can not be removed, reboot needed for installation/upgrade to take effect!
[Driver] [2025-04-17 09:39:34] [INFO]End time: 2025-04-17 09:39:3
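The check and the package file name from this scenario can be scripted. The following is a minimal sketch, not an official tool: the function names is_dcmi_init_failure and driver_package_name are hypothetical, and the sketch assumes that the {arch} part of the file name matches the output of uname -m, as it does for the aarch64 and x86_64 packages shown in this article.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of the driver package.

# Return 0 if the given npu-smi output contains the -8001 DCMI initialization failure.
is_dcmi_init_failure() {
    printf '%s\n' "$1" | grep -qF 'dcmi module initialize failed. ret is -8001'
}

# Build the run-package file name from product name, version, and architecture.
# Assumption: {arch} equals the output of `uname -m` when no third argument is given.
driver_package_name() {
    product=$1
    version=$2
    arch=${3:-$(uname -m)}
    printf '%s-npu-driver_%s_linux-%s.run\n' "$product" "$version" "$arch"
}

# Example usage (commented out so that nothing runs by accident):
# output=$(npu-smi info 2>&1)
# if is_dcmi_init_failure "$output"; then
#     ./"$(driver_package_name A300-3010 24.1.rc3.1)" --upgrade
# fi
```

Passing the architecture explicitly as the third argument avoids relying on uname -m when the package is prepared on a different machine.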
Error scenario 2: error 0x0094 when running the upgrade command
- Command:
./A300-3010-npu-driver_24.1.rc3.1_linux-x86_64.run --upgrade
- Error message:
Verifying archive integrity... 100% SHA256 checksums are OK. All good.
Uncompressing ASCEND DRIVER RUN PACKAGE 100%
[Driver] [2025-04-17 09:43:09] [ERROR]ERR_NO:0x0094;ERR_DES:Operation failed, An all-in-one RUN package is found in /usr/local/HiAI, which needs to be uninstalled before proceeding.
- Solution: Manually uninstall the driver, and then run the upgrade command again. For example:
./A300-3010-npu-driver_24.1.rc3.1_linux-x86_64.run --uninstall
If the following output is displayed, the uninstallation succeeded.
Verifying archive integrity... 100% SHA256 checksums are OK. All good.
Uncompressing ASCEND DRIVER RUN PACKAGE 100%
[Driver] [2025-04-22 15:18:20] [INFO]Start time: 2025-04-22 15:18:20
[Driver] [2025-04-22 15:18:20] [INFO]LogFile: /var/log/ascend_seclog/ascend_install.log
[Driver] [2025-04-22 15:18:20] [INFO]OperationLogFile: /var/log/ascend_seclog/operation.log
[Driver] [2025-04-22 15:18:20] [INFO]base version is 24.1.rc3.1.
[Driver] [2025-04-22 15:18:45] [INFO]Driver package uninstalled successfully! Reboot needed for uninstallation to take effect!
[Driver] [2025-04-22 15:18:45] [INFO]End time: 2025-04-22 15:18:45
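Because the installer reports failures in the form ERR_NO:0x....;ERR_DES:..., the error code can be pulled out of the output and matched against the scenarios in this article. The sketch below is illustrative only: extract_err_no and suggest_action are hypothetical helper names, and the mapping covers just the two codes documented here.

```shell
#!/bin/sh
# Hypothetical helper: print the first ERR_NO code found in installer output read from stdin.
extract_err_no() {
    sed -n 's/.*ERR_NO:\(0x[0-9A-Fa-f]*\);.*/\1/p' | head -n 1
}

# Map the error codes covered in this article to the suggested next step.
suggest_action() {
    case "$1" in
        0x0094) echo "uninstall the existing package with --uninstall, then rerun --upgrade" ;;
        0x0091) echo "create the HwHiAiUser account, then rerun the install command" ;;
        "")     echo "no ERR_NO found" ;;
        *)      echo "unknown error code: $1" ;;
    esac
}

# Example usage (commented out so that nothing runs by accident):
# pkg=./A300-3010-npu-driver_24.1.rc3.1_linux-x86_64.run
# code=$("$pkg" --upgrade 2>&1 | extract_err_no)
# suggest_action "$code"
```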
Error scenario 3: error 0x0091 when running the install command
- Command:
./A300-3010-npu-driver_24.1.rc3.1_linux-aarch64.run --install-for-all --full
- Error message:
Verifying archive integrity... 100% SHA256 checksums are OK. All good.
Uncompressing ASCEND DRIVER RUN PACKAGE 100%
[Driver] [2025-04-17 09:18:42] [INFO]Start time: 2025-04-17 09:18:42
[Driver] [2025-04-17 09:18:42] [INFO]LogFile: /var/log/ascend_seclog/ascend_install.log
[Driver] [2025-04-17 09:18:42] [INFO]OperationLogFile: /var/log/ascend_seclog/operation.log
[Driver] [2025-04-17 09:18:42] [WARNING]Do not power off or restart the system during the installation/upgrade
[Driver] [2025-04-17 09:18:42] [ERROR]ERR_NO:0x0091;ERR_DES:HwHiAiUser not exists! Please add HwHiAiUser
[Driver] [2025-04-17 09:18:42] [INFO]End time: 2025-04-17 09:18:42
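- Solution: Manually add an account named HwHiAiUser, and then run the install command again. Run the following command to create the HwHiAiUser account. It is the default system account of the Ascend NPU AI acceleration platform and is used to securely manage and run NPU-related services and applications.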
adduser HwHiAiUser
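To avoid running adduser when the account is already present, the check can be done first. This is a minimal sketch under stated assumptions: user_exists and ensure_user are hypothetical helper names, the script runs with root privileges, and adduser behaves as in the command above (on some distributions it prompts interactively).

```shell
#!/bin/sh
# Return 0 if the named account exists on this system.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Create the account only when it is missing (requires root privileges).
ensure_user() {
    if user_exists "$1"; then
        echo "$1 already exists"
    else
        adduser "$1"
    fi
}

# Example usage (commented out; run as root before retrying the install command):
# ensure_user HwHiAiUser
```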
Error scenario 4: reboot prompt when running the install command
- Command:
./A300-3010-npu-driver_24.1.rc3.1_linux-aarch64.run --install-for-all --full
- Either of the following two messages may be displayed:
[Driver] [2025-04-17 10:09:30] [WARNING]Kernel modules can not be removed, reboot needed for installation/upgrade to take effect!
[Driver] [2025-04-17 10:18:47] [INFO]Driver package installed successfully! Reboot needed for installation/upgrade to take effect!
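- Solution: The driver has been installed successfully at this point, but it takes effect only after the node is restarted. Before restarting the node, drain it so that workloads currently running on it are not affected. For details, see Node Draining.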
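Both messages in this scenario contain the phrase "reboot needed" (in different capitalization), so a case-insensitive match on the installer output is enough to detect them. The sketch below is illustrative: reboot_needed is a hypothetical helper name, and the kubectl drain line in the comments assumes that the node belongs to a Kubernetes cluster, which this article does not state explicitly.

```shell
#!/bin/sh
# Return 0 if the installer output read from stdin says that a reboot is needed.
reboot_needed() {
    grep -qi 'reboot needed'
}

# Example usage (commented out so that nothing runs by accident):
# output=$(./A300-3010-npu-driver_24.1.rc3.1_linux-aarch64.run --install-for-all --full 2>&1)
# if printf '%s\n' "$output" | reboot_needed; then
#     echo "Driver installed. Drain the node, then reboot it."
#     # Assumption: on a Kubernetes cluster, draining might look like
#     # kubectl drain <node-name> --ignore-daemonsets
# fi
```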