Configuring Dynamic Resource Scheduling in YARN Mode
Scenario
Resources are a key factor affecting Spark execution efficiency. If a long-running service (for example, JDBCServer) holds multiple executors while it has no tasks to run, those resources are wasted even as other applications run short of them, which is improper scheduling.
Dynamic resource scheduling adds or removes executors in real time based on each application's task load. In this way, resources are dynamically redistributed among applications.
Procedure
- You need to configure the external shuffle service first. For details, see Using the External Shuffle Service to Improve Performance.
- In the spark-defaults.conf file, set the spark.dynamicAllocation.enabled configuration item to true to enable dynamic resource scheduling. This function is disabled by default. A sample configuration is shown after Table 1.
- Table 1 lists the optional configuration items.
Table 1 Parameters for dynamic resource scheduling

| Configuration Item | Description | Default Value |
|---|---|---|
| spark.dynamicAllocation.minExecutors | Minimum number of executors | 0 |
| spark.dynamicAllocation.initialExecutors | Initial number of executors | spark.dynamicAllocation.minExecutors |
| spark.dynamicAllocation.maxExecutors | Maximum number of executors | Integer.MAX_VALUE |
| spark.dynamicAllocation.schedulerBacklogTimeout | Timeout before the first request for new executors after tasks become backlogged | 1s |
| spark.dynamicAllocation.sustainedSchedulerBacklogTimeout | Timeout for the second and subsequent executor requests | spark.dynamicAllocation.schedulerBacklogTimeout |
| spark.dynamicAllocation.executorIdleTimeout | Idle timeout after which an ordinary executor is removed | 60s |
| spark.dynamicAllocation.cachedExecutorIdleTimeout | Idle timeout after which an executor with cached blocks is removed | Integer.MAX_VALUE |
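The following spark-defaults.conf fragment is a minimal sketch that enables dynamic resource scheduling together with the external shuffle service. The numeric values are illustrative, not recommendations; tune them to your workload.

```
# Enable dynamic resource scheduling (disabled by default).
spark.dynamicAllocation.enabled                   true

# The external shuffle service must be enabled so that shuffle files
# survive executor removal (see the notes below).
spark.shuffle.service.enabled                     true

# Illustrative tuning: keep between 2 and 50 executors, starting with 5.
spark.dynamicAllocation.minExecutors              2
spark.dynamicAllocation.initialExecutors          5
spark.dynamicAllocation.maxExecutors              50

# Illustrative tuning: request executors after tasks have been backlogged
# for 1s, and release ordinary idle executors after 120s.
spark.dynamicAllocation.schedulerBacklogTimeout   1s
spark.dynamicAllocation.executorIdleTimeout       120s
```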
- The external shuffle service must be configured before dynamic resource scheduling is enabled; otherwise, shuffle files are lost when an executor is killed.
- If spark.executor.instances or the --num-executors option specifies the number of executors, dynamic resource allocation does not take effect even if it is enabled, as shown in the submission example after these notes.
- After dynamic resource scheduling is enabled, a task may be assigned to an executor that is about to be removed, causing the task to fail. After the same task fails four times (configurable with the spark.task.maxFailures parameter), the job fails. In practice, it is unlikely that a task is repeatedly assigned to executors being removed, and the probability of job failure can be further reduced by increasing the value of spark.task.maxFailures.
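To illustrate the note about --num-executors, a submission that relies on dynamic allocation omits that option (and spark.executor.instances) and passes the relevant settings explicitly. The class and JAR names below are placeholders:

```
# Hypothetical submission: no --num-executors and no spark.executor.instances,
# so dynamic allocation can grow and shrink the executor set.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.maxExecutors=50 \
  --class com.example.MyApp \
  my-app.jar
```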