Task Execution Fails Because the Input File Number Exceeds the Threshold
Symptom
When Hive performs a query operation, error message "Job Submission failed with exception 'java.lang.RuntimeException(input file number exceeded the limits in the conf;input file num is: 2380435,max heap memory is: 16892035072,the limit conf is: 500000/4)'" is displayed. The value in the error message varies depending on the actual situation. The error details are as follows:
ERROR : Job Submission failed with exception 'java.lang.RuntimeException(input file numbers exceeded the limits in the conf; input file num is: 2380435 , max heap memory is: 16892035072 , the limit conf is: 500000/4)' java.lang.RuntimeException: input file numbers exceeded the limits in the conf; input file num is: 2380435 , max heap memory is: 16892035072 , the limit conf is: 500000/4 at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.checkFileNum(ExecDriver.java:545) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:430) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:158) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:101) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1965) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1723) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1475) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1283) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1278) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:167) at org.apache.hive.service.cli.operation.SQLOperation.access$200(SQLOperation.java:75) at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:245) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710) at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:258) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=1)
Cause Analysis
MRS uses the ratio of maximum files to the maximum HiveServer heap memory to determine the number of input files allowed in a MapReduce job submission. Default value 500000/4 indicates that each 4 GB of heap memory allows a maximum of 500,000 input files. An error occurs if the number of input files exceeds this limit.
Solution
- Log in to FusionInsight Manager and choose Services > Hive > Configurations > All Configurations.
- Search for hive.mapreduce.input.files2memory and set it to a proper value based on the actual memory and task.
- Save the configuration and restart the affected services or instances.
- If the fault persists, adjust the GC parameter of the HiveServer based on service requirements.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot