CPU Usage of DataNodes Occasionally Approaches 100%, Causing Node Loss
Symptom
The CPU usage of DataNodes occasionally approaches 100%. When this happens, the affected nodes may be lost: SSH connections to them become slow or fail.
Cause Analysis
- The DataNode logs contain a large number of write failure entries.
Figure 2 DataNode write failure log
- A large number of files are written in a short time, causing insufficient DataNode memory.
Figure 3 Insufficient DataNode memory
Solution
- Check the DataNode memory configuration and confirm that the server has enough free memory to allow an increase.
- Increase the DataNode memory, then restart the DataNode for the change to take effect.
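The first check above can be sketched in Python. This is a minimal illustration, not part of the product: it assumes a Linux server that exposes `MemAvailable` in `/proc/meminfo`, and the function names, the heap sizes, and the 2048 MB reserve are all hypothetical values chosen for the example. Read the real current heap size from your DataNode configuration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: kB}, skipping unitless lines."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines shaped like "<name>: <number> kB".
        if sep and len(parts) == 2 and parts[0].isdigit() and parts[1] == "kB":
            values[name.strip()] = int(parts[0])
    return values


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the memory statistics of the local server."""
    with open(path) as f:
        return parse_meminfo(f.read())


def can_raise_heap(meminfo, current_heap_mb, new_heap_mb, reserve_mb=2048):
    """Return True if free memory covers the heap increase plus a reserve.

    reserve_mb is an assumed safety margin for the OS and other services.
    """
    available_mb = meminfo["MemAvailable"] // 1024
    increase_mb = new_heap_mb - current_heap_mb
    return available_mb - increase_mb >= reserve_mb
```

On the DataNode host, `can_raise_heap(read_meminfo(), current_heap_mb=4096, new_heap_mb=8192)` would report whether raising a 4 GB heap to 8 GB still leaves the assumed reserve free. If it returns `False`, the server needs more free memory before the DataNode memory is increased.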