
Updated on 2024-09-18 GMT+08:00


Disk Space of a Node Is Used Up Due to Oversized Aggregated Logs of Yarn

Issue

The disk usage of the cluster is high.

Symptom

On the host management page of Manager, the disk usage is too high.
Only a few tasks are running on the Yarn web UI.
After the hdfs dfs -du -h / command is executed on the master node of the cluster, the command output shows that the following files consume a large amount of disk space.
The log aggregation configuration parameters of the Mapreduce service are as follows.
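To see which HDFS directories consume the most space, the output of the check above can be ranked by size. The following is a minimal sketch, not a procedure from this guide: rank_du is a hypothetical helper name, and the sketch assumes the command is run without -h so that the first column is a plain byte count that can be sorted numerically.

```shell
# rank_du: hypothetical helper that sorts `hdfs dfs -du <path>` output
# by its first column (size in bytes), largest first.
# $1: number of entries to keep (defaults to 10).
rank_du() {
    sort -k1,1nr | head -n "${1:-10}"
}

# Run the check only where the hdfs client is installed.
if command -v hdfs >/dev/null 2>&1; then
    # No -h here: human-readable sizes such as "1.2 G" do not sort numerically.
    hdfs dfs -du / | rank_du 5
fi
```

Once the largest directory is known, running hdfs dfs -du -h on it shows its subdirectories in human-readable units.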

Cause Analysis

Jobs are submitted too frequently, and the retention period of aggregated log files (yarn.log-aggregation.retain-seconds) is set to 1296000 seconds, that is, aggregated logs are retained for 15 days. As a result, the logs accumulate faster than they are deleted, and the disk space is exhausted.
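The retention value is expressed in seconds, so it helps to convert it to days when reading or choosing a setting. The following sketch uses a hypothetical helper, retain_days, and plain shell integer arithmetic (86400 seconds per day).

```shell
# retain_days: hypothetical helper that converts a
# yarn.log-aggregation.retain-seconds value to whole days.
retain_days() {
    echo $(( $1 / 86400 ))   # 86400 = 24 * 60 * 60 seconds per day
}

retain_days 1296000   # prints 15
retain_days 259200    # prints 3
```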

Procedure

Log in to Manager and go to the Mapreduce service configuration page.
- MRS Manager: Log in to MRS Manager, choose Services > Mapreduce > Service Configuration, and select All from the Type drop-down list.
- FusionInsight Manager: Log in to FusionInsight Manager and choose Cluster > Services > Mapreduce. On the Mapreduce page, choose Configurations > All Configurations.
Search for the yarn.log-aggregation.retain-seconds parameter and decrease its value based on site requirements, for example, to 259200. In this case, the aggregated logs are retained for three days, and the disk space is automatically released after the retention period expires.
Click Save Configuration. If a dialog box is displayed, deselect Restart the affected services or instances.
During off-peak hours, restart the services whose configuration has expired. The restart interrupts upper-layer services and affects cluster management, maintenance, and services.
1. Log in to Manager.
2. Restart the MapReduce and Yarn services.
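As a rough guide for choosing the retention value, the steady-state size of aggregated logs can be estimated as the daily log volume multiplied by the retention period in days. The sketch below is an illustration under stated assumptions: estimate_gb is a hypothetical helper, the daily volume of 20 GB is an invented figure for the example (measure your own cluster), and the result does not account for HDFS replication.

```shell
# estimate_gb: hypothetical helper for a back-of-the-envelope estimate.
# $1: aggregated log volume per day in GB (assumed, measure on your cluster)
# $2: yarn.log-aggregation.retain-seconds value
estimate_gb() {
    daily_gb=$1
    retain_seconds=$2
    echo $(( daily_gb * retain_seconds / 86400 ))
}

estimate_gb 20 1296000   # 15-day retention: prints 300
estimate_gb 20 259200    # 3-day retention: prints 60
```

With the same job load, shortening the retention period from 15 days to 3 days reduces the estimated footprint to one fifth.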