
Alarms Indicating Heartbeat Interruptions Between Nodes Are Frequently Generated in the MRS Cluster

Updated on 2024-12-18 GMT+08:00

Symptom

The MRS cluster frequently reports alarms indicating that heartbeats between the active and standby Manager nodes, or between the active and standby DBService nodes, are interrupted, or that a node is faulty. As a result, Hive is occasionally unavailable, which affects upper-layer services.

Cause Analysis

  1. Each alarm coincides with a VM restart. The alarm is generated because the VM is restarted.

  2. OS analysis shows that the VM restarts because the node has no available memory. The memory exhaustion triggers oom-killer. When oom-killer is invoked, the process enters the disk sleep state, and the VM then restarts.

  3. A check of the processes that occupy the memory shows that they are all normal service processes.
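
The checks in steps 2 and 3 can be sketched in Python as follows. This is a minimal illustration, not part of any MRS tooling: the log path `/var/log/messages` is an assumption that varies by OS, the matched phrases are the ones the Linux kernel commonly logs for OOM kills, and the function names are hypothetical.

```python
import re
from pathlib import Path

# Kernel log phrases that commonly accompany an OOM kill on Linux.
OOM_PATTERN = re.compile(r"invoked oom-killer|Out of memory: Kill", re.IGNORECASE)


def find_oom_events(log_text):
    """Return the log lines that mention the OOM killer."""
    return [line for line in log_text.splitlines() if OOM_PATTERN.search(line)]


def parse_status(status_text):
    """Return (process name, VmRSS in kB) from /proc/<pid>/status text."""
    name, rss_kb = "", 0
    for line in status_text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            rss_kb = int(line.split()[1])
    # Kernel threads have no VmRSS line, so rss_kb stays 0 for them.
    return name, rss_kb


def top_memory_processes(limit=10, proc_root="/proc"):
    """List (rss_kb, pid, name) tuples, largest resident memory first."""
    root = Path(proc_root)
    if not root.is_dir():
        return []  # not a Linux system, or /proc is not mounted
    usage = []
    for entry in root.iterdir():
        if not entry.name.isdigit():
            continue  # only numeric directories are processes
        try:
            name, rss_kb = parse_status((entry / "status").read_text())
        except OSError:
            continue  # the process exited or is not readable
        usage.append((rss_kb, int(entry.name), name))
    return sorted(usage, reverse=True)[:limit]


if __name__ == "__main__":
    log_path = Path("/var/log/messages")  # assumed location; varies by OS
    try:
        for line in find_oom_events(log_path.read_text(errors="replace")):
            print(line)
    except OSError:
        print(f"Cannot read {log_path}")
    for rss_kb, pid, name in top_memory_processes():
        print(f"{rss_kb:>12} kB  pid={pid}  {name}")
```

If the largest entries are all expected service processes, as in this case, the memory pressure comes from the normal workload rather than from a runaway process.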

Conclusion: The VM memory cannot meet service requirements.
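
To see how close a node is to this condition, the available memory can be read from `/proc/meminfo`. The sketch below assumes a Linux kernel that reports `MemAvailable`. The 10% threshold and the function names are hypothetical and should be adapted to the workload.

```python
from pathlib import Path


def parse_meminfo(meminfo_text):
    """Parse /proc/meminfo text into a dict of field name -> number (kB for most fields)."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def available_ratio(fields):
    """Fraction of total memory that the kernel estimates is still available."""
    return fields["MemAvailable"] / fields["MemTotal"]


def is_memory_low(fields, threshold=0.10):
    """True if available memory is below the threshold (hypothetical 10% default)."""
    return available_ratio(fields) < threshold


if __name__ == "__main__":
    meminfo_path = Path("/proc/meminfo")
    if meminfo_path.is_file():
        info = parse_meminfo(meminfo_path.read_text())
        if "MemAvailable" in info and "MemTotal" in info:
            print(f"Available: {available_ratio(info):.1%}, low: {is_memory_low(info)}")
```

A node that stays below the threshold while running only required services is a candidate for memory expansion.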

Procedure

  • You are advised to expand the node memory.
  • You are advised to disable unnecessary services.