Updated: 2023-05-11 GMT+08:00

MapReduce Job Fails with an ApplicationMaster Physical Memory Overflow Exception

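Question

An HBase bulkload job has 210,000 map tasks and 10,000 reduce tasks. The MapReduce job fails, and the ApplicationMaster reports a physical memory overflow exception with the following diagnostics: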

For more detailed output, check the application tracking page:https://bigdata-55:8090/cluster/app/application_1449841777199_0003 
Then click on links to logs of each attempt.
Diagnostics: Container [pid=21557,containerID=container_1449841777199_0003_02_000001]  is running beyond physical memory limits 
Current usage: 1.0 GB of 1 GB physical memory used; 3.6 GB of 5 GB virtual memory used. Killing container.
Dump of the process-tree for container_1449841777199_0003_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 21584 21557 21557 21557 (java) 12342 1627 3871748096 271331 ${BIGDATA_HOME}/jdk1.8.0_51//bin/java 
-Djava.io.tmpdir=/srv/BigData/hadoop/data1/nm/localdir/usercache/hbase/appcache/application_1449841777199_0003/container_1449841777199_0003_02_000001/tmp -Dlog4j.configuration=container-log4j.properties 
-Dyarn.app.container.log.dir=/srv/BigData/hadoop/data1/nm/containerlogs/application_1449841777199_0003/container_1449841777199_0003_02_000001 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
-Dhadoop.root.logfile=syslog -Xmx784m org.apache.hadoop.mapreduce.v2.app.MRAppMaster
|- 21557 21547 21557 21557 (bash) 0 0 13074432 368 /bin/bash -c ${BIGDATA_HOME}/jdk1.8.0_51//bin/java
-Djava.io.tmpdir=/srv/BigData/hadoop/data1/nm/localdir/usercache/hbase/appcache/application_1449841777199_0003/container_1449841777199_0003_02_000001/tmp -Dlog4j.configuration=container-log4j.properties 
-Dyarn.app.container.log.dir=/srv/BigData/hadoop/data1/nm/containerlogs/application_1449841777199_0003/container_1449841777199_0003_02_000001 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
-Dhadoop.root.logfile=syslog -Xmx784m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/srv/BigData/hadoop/data1/nm/containerlogs/application_1449841777199_0003/container_1449841777199_0003_02_000001/stdout 
2>/srv/BigData/hadoop/data1/nm/containerlogs/application_1449841777199_0003/container_1449841777199_0003_02_000001/stderr
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt. Failing the application.

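Answer

This is a capacity specification issue. The root cause of the failure is that the ApplicationMaster ran out of memory: its container reached the physical memory limit ("1.0 GB of 1 GB physical memory used" in the diagnostics above, with the JVM started with -Xmx784m), so the NodeManager killed it.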

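Solution:

Increase the ApplicationMaster memory by tuning the following parameters in the client configuration file “<client installation path>/Yarn/config/mapred-site.xml”:

  • “yarn.app.mapreduce.am.resource.mb”
  • “yarn.app.mapreduce.am.command-opts”: the recommended -Xmx value in this parameter is 0.8 × “yarn.app.mapreduce.am.resource.mb”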

Reference specification:

With the following ApplicationMaster configuration, up to 24,000 concurrent containers can be supported.

  • “yarn.app.mapreduce.am.resource.mb” = 2048
  • “yarn.app.mapreduce.am.command-opts”: -Xmx=1638m in this parameter
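
As a minimal sketch, the reference values above could be written in mapred-site.xml roughly as follows. The heap size follows the 0.8 rule (0.8 × 2048 = 1638.4, rounded down to 1638). This fragment assumes the command-opts value holds only the -Xmx option. If the existing value in your cluster carries other JVM options, keep them and change only -Xmx.

```xml
<!-- Sketch only: entries placed inside the <configuration> element of mapred-site.xml -->
<property>
  <!-- Physical memory requested for the ApplicationMaster container, in MB -->
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>2048</value>
</property>
<property>
  <!-- JVM options for the ApplicationMaster; -Xmx is about 0.8 * 2048 MB -->
  <!-- Assumption: no other JVM options are needed here; preserve any that already exist -->
  <name>yarn.app.mapreduce.am.command-opts</name>
  <value>-Xmx1638m</value>
</property>
```

Because the file is a client-side configuration, the change applies to jobs submitted from that client afterwards. Jobs with a larger number of tasks than the reference specification may need proportionally higher values.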