Configuring Spark Dynamic Resource Scheduling in YARN Mode
Scenario
Resources are a key factor in Spark execution efficiency. If a long-running service (such as the JDBCServer) holds multiple executors that have no tasks to run while other applications are short of resources, those resources are wasted and poorly scheduled.
Dynamic resource scheduling adds or removes an application's executors in real time based on the task load, so that resources are allocated to applications dynamically.
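The add and remove behavior can be illustrated with a small model. The sketch below is a simplification rather than Spark's actual scheduler code: it assumes that each request round asks for twice as many executors as the previous one (1, 2, 4, ...), as open-source Spark describes its ramp-up, and that an executor becomes removable once it has been idle longer than a timeout. The function names `ramp_up` and `executors_to_remove` are hypothetical and exist only for this illustration.

```python
from __future__ import annotations


def ramp_up(needed: int, current: int = 0, max_executors: int = 2048) -> list[int]:
    """Return the total executor count after each request round.

    Simplified model: the batch size doubles every round and the total is
    capped by both the number needed and the configured maximum.
    """
    target = min(needed, max_executors)
    totals = []
    batch = 1
    while current < target:
        current = min(current + batch, target)
        totals.append(current)
        batch *= 2
    return totals


def executors_to_remove(
    idle_seconds: dict[str, float],
    idle_timeout: float = 60.0,
    min_executors: int = 0,
) -> list[str]:
    """Pick executors idle longer than the timeout, keeping at least min_executors."""
    expired = sorted(name for name, idle in idle_seconds.items() if idle > idle_timeout)
    removable = max(len(idle_seconds) - min_executors, 0)
    return expired[:removable]


if __name__ == "__main__":
    print(ramp_up(needed=20))  # [1, 3, 7, 15, 20]
    print(executors_to_remove({"exec-1": 90, "exec-2": 10, "exec-3": 61}))  # ['exec-1', 'exec-3']
```

In this model a backlog that needs 20 executors is covered in five request rounds, and the two executors idle for more than 60 seconds are released while the busy one is kept.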
Procedure
- Configure the external shuffle service.
- Log in to FusionInsight Manager and choose Cluster > Name of the desired cluster > Service > Spark2x > Configuration > All Configurations. Enter spark.dynamicAllocation.enabled in the search box and set the parameter to true to enable dynamic resource scheduling. The following optional configuration items tune its behavior.
Configuration Item | Description | Default Value |
---|---|---|
spark.dynamicAllocation.minExecutors | Indicates the minimum number of executors. | 0 |
spark.dynamicAllocation.initialExecutors | Indicates the initial number of executors. | 0 |
spark.dynamicAllocation.maxExecutors | Indicates the maximum number of executors. | 2048 |
spark.dynamicAllocation.schedulerBacklogTimeout | Indicates how long tasks must remain backlogged before the first request for additional executors. | 1s |
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout | Indicates the interval between the second and later executor requests while the task backlog persists. | 1s |
spark.dynamicAllocation.executorIdleTimeout | Indicates how long an executor without cached blocks can stay idle before it is removed. | 60s |
spark.dynamicAllocation.cachedExecutorIdleTimeout | Indicates how long an executor with cached blocks can stay idle before it is removed. | |
The external shuffle service must be configured before using the dynamic resource scheduling function.
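For an application submitted from a client rather than configured through FusionInsight Manager, the same settings could in principle be passed as `--conf` options to `spark-submit`. The sketch below only assembles the command line and does not run Spark. It rests on several assumptions: `spark.shuffle.service.enabled` is the open-source Spark switch for the external shuffle service and is not named on this page, whether client-side overrides take effect depends on the cluster setup, and the helper names `check_executor_bounds` and `to_submit_args` are hypothetical.

```python
from typing import Dict, List

# Defaults copied from the table above. cachedExecutorIdleTimeout is omitted
# because this page does not state its default value.
# Assumption: spark.shuffle.service.enabled is the open-source Spark property
# that turns on the external shuffle service.
DYNAMIC_ALLOCATION_CONF: Dict[str, str] = {
    "spark.shuffle.service.enabled": "true",
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "0",
    "spark.dynamicAllocation.initialExecutors": "0",
    "spark.dynamicAllocation.maxExecutors": "2048",
    "spark.dynamicAllocation.schedulerBacklogTimeout": "1s",
    "spark.dynamicAllocation.sustainedSchedulerBacklogTimeout": "1s",
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
}


def check_executor_bounds(conf: Dict[str, str]) -> None:
    """Sanity check: the minimum executor count must not exceed the maximum."""
    low = int(conf["spark.dynamicAllocation.minExecutors"])
    high = int(conf["spark.dynamicAllocation.maxExecutors"])
    if low > high:
        raise ValueError(f"minExecutors ({low}) exceeds maxExecutors ({high})")


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Turn a property map into a flat list of --conf key=value arguments."""
    check_executor_bounds(conf)
    args: List[str] = []
    for key, value in conf.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    command = ["spark-submit", "--master", "yarn"] + to_submit_args(DYNAMIC_ALLOCATION_CONF)
    print(" ".join(command))
```

Keeping the properties in one map makes it easy to adjust a single value, such as a lower `maxExecutors` for a shared queue, and to reject an inconsistent combination before anything is submitted.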