Why Does the NVIDIA Kernel Crash on a GPU-accelerated ECS?
Symptom
A GPU-accelerated ECS crashed while running. After the ECS was restarted, no NVIDIA driver stack logs were found.
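One way to confirm this symptom is to scan the kernel log of the previous boot for NVIDIA driver messages. The following is a minimal sketch, assuming a Linux ECS with systemd and persistent journaling enabled so that `journalctl -k -b -1` can return the previous boot's kernel log. The helper names and the marker strings `NVRM` and `nvidia` are illustrative assumptions, not part of any official tool.

```python
import shutil
import subprocess

# Assumption: NVIDIA kernel messages contain one of these substrings.
NVIDIA_MARKERS = ("nvrm", "nvidia")


def nvidia_lines(log_text):
    """Return the log lines that mention the NVIDIA driver (case-insensitive)."""
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line.lower() for marker in NVIDIA_MARKERS)
    ]


def previous_boot_kernel_log():
    """Read the previous boot's kernel log via journalctl, or return ''."""
    if shutil.which("journalctl") is None:
        return ""
    result = subprocess.run(
        ["journalctl", "-k", "-b", "-1", "--no-pager"],
        capture_output=True,
        text=True,
        check=False,
    )
    # A non-zero exit code usually means no earlier boot was recorded.
    return result.stdout if result.returncode == 0 else ""


if __name__ == "__main__":
    hits = nvidia_lines(previous_boot_kernel_log())
    if hits:
        print("\n".join(hits))
    else:
        print("No NVIDIA driver messages found for the previous boot.")
```

If the script reports no messages, that matches the symptom described above. If it does print driver messages, they can be attached when contacting support.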
Possible Causes
The ECS kernel crashed because of a bug in the official NVIDIA driver.
Solutions
- Method 1: Restart the ECS.
- Method 2: Update the driver version.
If the problem persists after the ECS is restarted, download the latest NVIDIA driver from the official NVIDIA website.
- Go to the official NVIDIA driver download page at https://www.nvidia.cn/Download/index.aspx?lang=en.
Figure 2 Driver download page
- Enter the product information and click Search.
Figure 3 Latest driver version download page
The Release Highlights tab lists the changes and resolved issues in this version. Use this information to decide whether to upgrade.
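Before deciding, it can help to compare the driver version currently installed on the ECS with the version shown on the download page. The following is a minimal sketch, assuming `nvidia-smi` is installed and on the PATH. The function names are hypothetical, and the version to compare against is passed on the command line because it must be copied from the download page.

```python
import shutil
import subprocess
import sys


def parse_version(text):
    """Turn a dotted version string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.strip().split("."))


def installed_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # nvidia-smi prints one line per GPU; the first line is enough here.
    return result.stdout.strip().splitlines()[0].strip()


def needs_upgrade(installed, latest):
    """Return True if the installed version is older than the latest version."""
    return parse_version(installed) < parse_version(latest)


if __name__ == "__main__" and len(sys.argv) > 1:
    latest = sys.argv[1]  # version string copied from the download page
    current = installed_driver_version()
    if current is None:
        print("Could not read the installed driver version.")
    elif needs_upgrade(current, latest):
        print(f"Installed {current} is older than {latest}.")
    else:
        print(f"Installed {current} is not older than {latest}.")
```

Run the script with the version from the download page as its only argument. The comparison is purely numeric, so it only indicates whether a newer driver exists, not whether that driver resolves the crash.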