Why Does a MapReduce Task Stay Unchanged for a Long Time?
Question
Why does a MapReduce job make no progress for a long time?
Answer
This happens when the job has insufficient memory. When memory is low, the time the job takes to copy the map output increases significantly.
To reduce the waiting time, increase the heap memory.
Tune the task configuration based on the number of mappers and the amount of data each mapper processes. Adjust the following parameters in the client installation path/Yarn/config/mapred-site.xml file according to the size of the input data:
- mapreduce.reduce.memory.mb
- mapreduce.reduce.java.opts
Example: If the data size is 5 GB with 10 mappers, the ideal heap memory is 1.5 GB. Increase the heap memory size as the data size increases.
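As a rough sketch of the 5 GB, 10-mapper example, the two parameters could be set in mapred-site.xml as shown here. The 1.5 GB heap comes from the example; the 2048 MB container size is an assumption, chosen only because the container must be larger than the JVM heap it hosts. Adjust both values for your cluster.

```xml
<!-- mapred-site.xml (fragment): illustrative values only -->
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <!-- Reducer container size in MB. 2048 is an assumed value that
       leaves headroom above the 1.5 GB heap set below. -->
  <value>2048</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <!-- Reducer JVM options: 1.5 GB maximum heap, as in the example. -->
  <value>-Xmx1536m</value>
</property>
```

If mapreduce.reduce.java.opts already contains other JVM options, keep them and change only the -Xmx value.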