Updated on 2022-08-12 GMT+08:00

Configuring the Log Archiving and Clearing Mechanism

Scenario

Job and task logs are generated during execution of a MapReduce application.

  • Job logs are generated by the MRApplicationMaster. They record the start time and running time of each job and task, Counter values, and other information. After HistoryServer parses the job logs, you can use them to view job execution details.
  • A task log records the log information generated by each task running in a container. By default, task logs are stored only on the local disk of each NodeManager. After the log aggregation function is enabled, the NodeManager merges local task logs and writes them into HDFS after job execution completes.

MapReduce job logs are stored on HDFS, and so are task logs when the log aggregation function is enabled. If a cluster runs a large number of computation tasks and no mechanism is configured to periodically archive and delete log files, the log files occupy a large amount of HDFS storage space and increase the cluster load.

Log archiving is implemented by Hadoop Archives. The number of concurrent archive tasks (that is, the number of Map tasks) started by Hadoop Archives depends on the total size of the log files to be archived. The formula is as follows: Number of concurrent archive tasks = Total size of log files to be archived / Size of archive files.
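The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from Hadoop Archives: the function name and the parameter names are hypothetical, and rounding up to a whole task is an assumption, because the formula itself does not say how a fractional result is handled.

```python
def concurrent_archive_tasks(total_log_bytes: int, archive_file_bytes: int) -> int:
    """Estimate the number of concurrent archive (Map) tasks.

    Applies: total size of log files to be archived / size of archive files.
    Assumption: a fractional result is rounded up to a whole task.
    """
    if archive_file_bytes <= 0:
        raise ValueError("archive_file_bytes must be positive")
    if total_log_bytes <= 0:
        return 0  # nothing to archive
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return (total_log_bytes + archive_file_bytes - 1) // archive_file_bytes


GIB = 1024 ** 3
# 10 GiB of logs with a 2 GiB archive file size gives five tasks.
tasks = concurrent_archive_tasks(10 * GIB, 2 * GIB)
```

With these inputs, doubling the total log size doubles the number of Map tasks, which is why a backlog of unarchived logs leads to a heavier archiving run.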

Configuration

Go to the All Configurations page of the MapReduce service. For details, see Modifying Cluster Service Configuration Parameters.

Enter a parameter name in the search box. In addition, configure the following parameters in the mapred-site.xml file in the Client installation directory/HDFS/hadoop/etc/hadoop/ directory on the MapReduce client node:

Table 1 Parameter description

| Parameter | Description | Default Value |
| --- | --- | --- |
| mapreduce.jobhistory.cleaner.enable | Whether to enable the job log file deletion function. | true |
| mapreduce.jobhistory.cleaner.interval-ms | Interval, in milliseconds, at which the log file cleanup starts. A cleanup deletes only log files that have been retained longer than the time specified by mapreduce.jobhistory.max-age-ms. | 86,400,000 ms (1 day) |
| mapreduce.jobhistory.max-age-ms | Retention period of log files, in milliseconds. Log files that have been retained longer than this period are deleted. | 1,296,000,000 ms (15 days) |
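How the three parameters in Table 1 interact can be modeled with a short Python sketch. It only restates the documented rule and is not the HistoryServer implementation: the function name is hypothetical, and measuring a file's age from a single timestamp in milliseconds is an assumption made for illustration.

```python
# Default values taken from Table 1.
DEFAULT_CLEANER_ENABLE = True
DEFAULT_CLEANER_INTERVAL_MS = 86_400_000   # 1 day
DEFAULT_MAX_AGE_MS = 1_296_000_000         # 15 days


def is_deletable(file_time_ms: int, now_ms: int,
                 max_age_ms: int = DEFAULT_MAX_AGE_MS,
                 cleaner_enabled: bool = DEFAULT_CLEANER_ENABLE) -> bool:
    """Return True if a cleanup run at now_ms may delete the job log file.

    Assumption: the file's age is now_ms minus file_time_ms.
    """
    if not cleaner_enabled:
        return False  # mapreduce.jobhistory.cleaner.enable = false
    # Only files retained longer than max-age-ms can be deleted.
    return now_ms - file_time_ms > max_age_ms


DAY_MS = 86_400_000
now = 100 * DAY_MS
old_file = is_deletable(now - 16 * DAY_MS, now)     # retained 16 days
recent_file = is_deletable(now - 14 * DAY_MS, now)  # retained 14 days
```

Because the cleanup runs once per interval, a file can remain on HDFS for up to one interval beyond the maximum age before it is actually removed.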

You can configure the following parameters in the yarn-site.xml file on the ResourceManager, NodeManager, and MapReduce HistoryServer nodes. The yarn.nodemanager.remote-app-log-dir and yarn.nodemanager.remote-app-log-archive-dir parameters need to be configured on the Yarn client, and the configurations of the ResourceManager, NodeManager, and MapReduce HistoryServer nodes must be the same as those on the Yarn client.

Table 2 Parameter description

| Parameter | Description | Default Value |
| --- | --- | --- |
| yarn.nodemanager.remote-app-log-dir | HDFS path to which the MapReduce job logs are aggregated. | /tmp/logs |
| yarn.nodemanager.remote-app-log-archive-dir | HDFS path in which the MapReduce job logs are archived. | /tmp/archived |
| yarn.log-aggregation.archive.files.minimum | Minimum number of MapReduce job log files required to trigger archiving. The archiving task starts when the number of files in the yarn.nodemanager.remote-app-log-dir folder is greater than or equal to the value of this parameter. This parameter applies to MRS 3.x. | 5,000 |
| yarn.log-aggregation.archive-check-interval-seconds | Interval, in seconds, at which MapReduce job log archiving is checked. Log files are archived only when the number of log files reaches the value of yarn.log-aggregation.archive.files.minimum. The archiving function is disabled when this parameter is set to 0 or -1. This parameter applies to MRS 3.x. | -1 |
| yarn.log-aggregation.retain-seconds | Retention period, in seconds, of the MapReduce job logs on HDFS. The value -1 indicates that log files are stored permanently. | 1,296,000 (15 days) |
| yarn.log-aggregation.retain-check-interval-seconds | Interval, in seconds, at which the MapReduce job log deletion task checks for expired logs. If this parameter is set to -1, the check interval is one tenth of the log retention period. | 86,400 (1 day) |
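The special values described in Table 2 can be summarized in a small Python sketch. It restates the documented rules only and is not taken from the Yarn source code: the function names are hypothetical, and using integer division for "one tenth" is an assumption.

```python
from typing import Optional


def archiving_enabled(archive_check_interval_seconds: int) -> bool:
    """Archiving is disabled when the check interval is 0 or -1."""
    return archive_check_interval_seconds not in (0, -1)


def should_archive(file_count: int, files_minimum: int = 5000,
                   archive_check_interval_seconds: int = -1) -> bool:
    """Archiving starts only if it is enabled and enough log files exist."""
    return (archiving_enabled(archive_check_interval_seconds)
            and file_count >= files_minimum)


def effective_retain_check_interval(retain_seconds: int,
                                    check_interval_seconds: int) -> Optional[int]:
    """Return the deletion check interval in seconds.

    Returns None when retain_seconds is -1, because logs are then stored
    permanently and there is nothing to check.
    """
    if retain_seconds == -1:
        return None
    if check_interval_seconds == -1:
        # One tenth of the retention period (integer division is an assumption).
        return retain_seconds // 10
    return check_interval_seconds


# With the defaults in Table 2, archiving is off and logs are checked daily.
default_archive = should_archive(10_000)
default_check = effective_retain_check_interval(1_296_000, 86_400)
derived_check = effective_retain_check_interval(1_296_000, -1)
```

Note that with the default value -1 for yarn.log-aggregation.archive-check-interval-seconds, no archiving takes place regardless of how many log files accumulate, so the interval must be set to a positive value to turn archiving on.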