Why Do Executors Fail to Be Removed After the NodeManager Is Stopped?
Question
If a NodeManager is stopped while dynamic executor allocation is enabled, the executors on that node are not removed from the driver UI after their idle timeout expires.
Answer
By the time the ResourceManager detects that the NodeManager has stopped, the Spark driver has already marked the executors on that node for termination because their idle timeout expired. However, because the NodeManager is no longer running, those executors cannot be terminated properly.
Consequently, the driver never receives the LOST event for these executors. They remain in the driver's executor list and continue to appear on the driver UI.
This behavior is expected when the YARN NodeManager is stopped. Once the NodeManager starts again, the executors will be removed.
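For reference, the scenario above involves Spark's dynamic allocation properties. The following is a minimal sketch, not a recommended configuration: the property names are standard Spark settings, but the values, the application name my_app.py, and the helper build_spark_submit are assumptions made for illustration only.

```python
import shlex

# Standard Spark property names; the values are illustrative assumptions.
DYNAMIC_ALLOCATION_CONF = {
    # Lets the driver request and release executors at run time.
    "spark.dynamicAllocation.enabled": "true",
    # External shuffle service, commonly enabled together with dynamic allocation on YARN.
    "spark.shuffle.service.enabled": "true",
    # Idle time after which the driver marks an executor for removal.
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
}


def build_spark_submit(app, conf):
    """Hypothetical helper: build a spark-submit argument list for YARN."""
    args = ["spark-submit", "--master", "yarn"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


command = build_spark_submit("my_app.py", DYNAMIC_ALLOCATION_CONF)
print(shlex.join(command))
```

With settings like these, an executor that stays idle longer than the configured timeout is marked for removal by the driver, which is the point at which a stopped NodeManager prevents the removal from completing.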