Updated on 2024-10-08 GMT+08:00

AM Optimization for Big MapReduce Tasks

Scenario

A large job containing 100,000 Map tasks fails. The query result shows that the ApplicationMaster (AM) responds slowly and eventually times out.

As the number of tasks grows, the AM has to manage more objects and therefore needs more memory. The default AM heap size is 1 GB, which can be too small for a job of this size.

Procedure

You can improve AM performance by increasing the memory available to the AM through the following parameters.

Navigation path for setting parameters:

Set the following parameters in the mapred-site.xml configuration file on the client. The mapred-site.xml file is in the conf directory of the client installation path, for example, /opt/client/Yarn/config.

Parameter: yarn.app.mapreduce.am.resource.mb
Description: Memory requested for the AM container. This value must be greater than the heap size (-Xmx) specified by yarn.app.mapreduce.am.command-opts. Unit: MB
Default Value: 1536

Parameter: yarn.app.mapreduce.am.command-opts
Description: JVM startup options passed to the MapReduce ApplicationMaster.
Default Value:
  • For versions earlier than MRS 3.x: -Xmx1024m -XX:CMSFullGCsBeforeCompaction=1 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -verbose:gc
  • MRS 3.x or later: -Xmx1024m -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -verbose:gc -Djava.security.krb5.conf=${KRB5_CONFIG} -Dhadoop.home.dir=${BIGDATA_HOME}/FusionInsight_HD_xxx/install/FusionInsight-Hadoop-xxx/hadoop
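
As a rough illustration of how the two parameters fit together, the following mapred-site.xml fragment raises the AM heap to 2 GB and the AM container memory to 3 GB. The values 2048 and 3072 are assumptions chosen for the example, not recommendations from this document; size them according to the job and the memory available in the cluster. Only the two relevant properties are shown, and the JVM options are the default for versions earlier than MRS 3.x with only -Xmx changed.

```xml
<!-- mapred-site.xml on the client: illustrative values only (assumed, not tuned) -->
<configuration>
  <!-- Container memory for the AM, in MB; must stay greater than the -Xmx heap below -->
  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>3072</value>
  </property>
  <!-- JVM options for the AM; only -Xmx differs from the pre-MRS 3.x default -->
  <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx2048m -XX:CMSFullGCsBeforeCompaction=1 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -verbose:gc</value>
  </property>
</configuration>
```

For MRS 3.x or later, the same idea would apply: keep that version's default options unchanged and adjust only the -Xmx value, while keeping yarn.app.mapreduce.am.resource.mb above it.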