Why Does Yarn Not Release the Blacklist Even All Nodes Are Added to the Blacklist?
Question
Why does Yarn not release the blacklist even all nodes are added to the blacklist?
Answer
In Yarn, when the number of application nodes added to the blacklist by ApplicationMaster (AM) reaches a certain proportion (the default value is 33% of the total number of nodes), the AM automatically releases the blacklist. In this way, all available nodes are added to the blacklist and tasks can obtain node resources.
Assume that there are 8 nodes in a cluster and they are divided in to pool A and pool B by NodeLabel. There are two nodes in pool B. A user submits a task App1 to pool B, but there is not sufficient HDFS space and App1 fails to run. As a result, two nodes in pool B are added to the blacklist by the AM of App1. According to the preceding principles, 2 is less than the 33% of 8. Therefore, Yarn does not release the blacklist, and App1 cannot obtain resources and keeps running. Even if the node that is added to the blacklisted is recovered, App1 still cannot obtain resources.
The preceding principles do not apply to resource pool scenarios. You can change the value of yarn.resourcemanager.am-scheduling.node-blacklisting-disable-threshold to (nodes number of the pool / total nodes ) x 33% to solve this problem. The path is Client installation path/Yarn/config/yarn-site.xml.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.