Updated on 2025-07-30 GMT+08:00

What Can I Do If an Xid Error Is Displayed in the Message Log When a GPU-accelerated ECS Is Faulty?

Possible Causes

XID    Description

32     Invalid or corrupted push buffer stream.

74     NVLINK error. The GPU hardware is faulty and must be taken offline for repair.

79     GPU has fallen off the bus. The connection between the GPU and the bus is lost, and the GPU must be taken offline for repair.

For details, see https://docs.nvidia.com/deploy/xid-errors/index.html.

Solution

  1. Run the dmesg | grep -i xid command to check whether there are Xid errors.

  2. If Xid errors are found, stop the services, migrate the services, collect fault information by referring to Fault Information Collection, and contact technical support.
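The check in step 1 can also be scripted. The following is a minimal sketch, not an official tool: it assumes kernel log lines of the form "NVRM: Xid (PCI:0000:3b:00): 79, ...", which is how the NVIDIA driver typically reports Xid events, and it looks up only the three Xid codes described in this article. The names find_xid_errors, read_dmesg, and XID_DESCRIPTIONS are hypothetical and used for illustration only.

```python
import re
import subprocess

# Descriptions taken from the table in this article.
# Any other Xid code is reported as "not listed in this article".
XID_DESCRIPTIONS = {
    32: "Invalid or corrupted push buffer stream",
    74: "NVLINK error (GPU hardware fault, take offline for repair)",
    79: "GPU has fallen off the bus (take offline for repair)",
}

# Assumed log line shape: "NVRM: Xid (PCI:0000:3b:00): 79, ..."
XID_PATTERN = re.compile(r"NVRM: Xid \(([^)]+)\): (\d+)")


def find_xid_errors(log_text):
    """Return a list of (device, xid, description) tuples found in log_text."""
    results = []
    for line in log_text.splitlines():
        match = XID_PATTERN.search(line)
        if match:
            device = match.group(1)
            code = int(match.group(2))
            description = XID_DESCRIPTIONS.get(code, "not listed in this article")
            results.append((device, code, description))
    return results


def read_dmesg():
    """Return the kernel log as text (may require root privileges on some systems)."""
    completed = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=False
    )
    return completed.stdout
```

For example, calling find_xid_errors(read_dmesg()) on the faulty ECS returns one entry per Xid line in the kernel log. If the list is not empty, proceed to step 2.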