Why Is the Flink Job Abnormal Due to Heartbeat Timeout Between JobManager and TaskManager?

Updated on 2023-05-19 GMT+08:00

Symptom

The heartbeat between JobManager and TaskManager timed out, and the Flink job became abnormal as a result.

Figure 1 Error information

Possible Causes

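  1. The network is intermittently disconnected or the cluster load is high, so heartbeat messages are delayed or lost. Check the network status and the cluster load.
  2. Full GC occurs frequently. Long GC pauses can prevent a TaskManager from responding to heartbeats in time, and frequent Full GC may indicate a memory leak in the job code.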
    Figure 2 Full GC
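
To complement the GC view in Figure 2, the following is a minimal sketch of how GC time can be sampled from inside a JVM with the standard java.lang.management API. It is not part of DLI or Flink: the class name GcPressureCheck, the one-second window, and the 0.3 threshold are illustrative assumptions, and the code only measures the JVM it runs in, so it would have to be called from the job's own code to say anything about a TaskManager.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not a DLI or Flink API.
public class GcPressureCheck {

    /** Total time in ms that this JVM has spent in GC so far, summed over all collectors. */
    public static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report a time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Share of a wall-clock window spent in GC, clamped to the range [0, 1]. */
    public static double gcFraction(long gcMillisInWindow, long windowMillis) {
        if (windowMillis <= 0) {
            throw new IllegalArgumentException("windowMillis must be positive");
        }
        double fraction = (double) gcMillisInWindow / windowMillis;
        return Math.max(0.0, Math.min(1.0, fraction));
    }

    /** Assumed rule of thumb: flag the window if GC took more than the given share of it. */
    public static boolean underGcPressure(long gcMillisInWindow, long windowMillis, double threshold) {
        return gcFraction(gcMillisInWindow, windowMillis) > threshold;
    }

    public static void main(String[] args) throws InterruptedException {
        long windowMillis = 1_000; // assumed sampling window
        long before = totalGcTimeMillis();
        Thread.sleep(windowMillis);
        long spent = totalGcTimeMillis() - before;
        System.out.println("GC time in window (ms): " + spent);
        System.out.println("Under GC pressure: " + underGcPressure(spent, windowMillis, 0.3));
    }
}
```

A GC share that stays high across many windows suggests that the heap is too small for the workload or that objects are being retained unintentionally, both of which are worth checking in the job code.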

Handling Procedure

  • If Full GC occurs frequently, check the job code for memory leaks.
  • Allocate more resources to each TaskManager.
  • Contact technical support to modify the cluster heartbeat configuration.
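
The relationship between pause length and heartbeat timeout can be illustrated with a small sketch. It uses a simplified model in which a TaskManager that is paused (for example, by a Full GC) cannot answer heartbeats at all. The timeout constant is the open-source Flink default for heartbeat.timeout (50000 ms) and is only an assumption here, because the value configured on a DLI cluster may differ. The class and method names are hypothetical.

```java
// Hypothetical illustration, not a DLI or Flink API.
public class HeartbeatPauseCheck {

    // Assumption: open-source Flink default for heartbeat.timeout, in ms.
    // The value on a given DLI cluster may be different.
    public static final long ASSUMED_HEARTBEAT_TIMEOUT_MS = 50_000L;

    /**
     * Simplified model: a process that is paused for at least the timeout
     * cannot answer any heartbeat in that period and is treated as lost.
     */
    public static boolean pauseExceedsTimeout(long pauseMs, long timeoutMs) {
        if (pauseMs < 0 || timeoutMs <= 0) {
            throw new IllegalArgumentException("pauseMs must be >= 0 and timeoutMs > 0");
        }
        return pauseMs >= timeoutMs;
    }

    public static void main(String[] args) {
        long shortPauseMs = 8_000L;  // example pause, well under the timeout
        long longPauseMs = 65_000L;  // example pause, longer than the timeout
        System.out.println(pauseExceedsTimeout(shortPauseMs, ASSUMED_HEARTBEAT_TIMEOUT_MS)); // false
        System.out.println(pauseExceedsTimeout(longPauseMs, ASSUMED_HEARTBEAT_TIMEOUT_MS));  // true
    }
}
```

Under this model, reducing pause length (by fixing leaks or giving the TaskManager more memory) addresses the cause, whereas raising the heartbeat timeout only tolerates longer pauses.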