"GC overhead" Is Displayed on the Client When Tasks Are Submitted Using the Hadoop Jar Command
Symptom
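When a user submits a task on the client using the hadoop jar command, the client returns a memory overflow error whose message contains "GC overhead" (typically java.lang.OutOfMemoryError: GC overhead limit exceeded).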
Cause Analysis
The error stack shows that the client runs out of memory while reading HDFS files during task submission. This usually happens when the task needs to read a large number of small files and the client memory is too small to handle them.
Solution
- Check whether the MapReduce tasks being started need to read a large number of HDFS files. If they do, reduce the number of files by merging the small files in advance or by using CombineFileInputFormat.
- Increase the memory available when the hadoop command is run. This memory is set on the client. In the Client installation directory/HDFS/component_env file, change the value of -Xmx in CLIENT_GC_OPTS to a larger value, for example, 512 MB. Then run the source component_env command for the modification to take effect.
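The memory change can be sketched as follows. This is a minimal illustration, not the exact procedure for any particular client. It works on a temporary stand-in file, and the sample line `export CLIENT_GC_OPTS="-Xmx256M"` is an assumption about how the setting looks; the real component_env file may contain other options on that line.

```shell
# Create a stand-in for component_env (hypothetical content; the real file may differ).
env_file="$(mktemp)"
printf 'export CLIENT_GC_OPTS="-Xmx256M"\n' > "$env_file"

# Raise the heap limit to 512 MB, touching only the CLIENT_GC_OPTS line.
# A .bak copy of the file is kept next to it as a backup.
sed -i.bak '/CLIENT_GC_OPTS/ s/-Xmx[0-9]*[A-Za-z]*/-Xmx512M/' "$env_file"

# Reload the file in the current shell ("." is the portable form of "source").
. "$env_file"
echo "$CLIENT_GC_OPTS"
```

On a real client, the same edit would target the component_env file in the HDFS directory under the client installation directory, and the hadoop jar command would then be rerun in the same shell. To gauge whether the first item applies, `hdfs dfs -count` followed by the job's input path reports the directory count, file count, and content size of that path.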