Inconsistency Between df and du Command Output on the Core Node
Issue
The capacity displayed in the df command output on the Core node is inconsistent with that displayed in the du command output.
Symptom
Running the df and du commands on the Core node reports different disk usage values.
The usage of the /srv/BigData/hadoop/data1/ directory reported by the df -h command differs greatly from that reported by the du -sh /srv/BigData/hadoop/data1/ command. The difference is greater than 10 GB.
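The gap can be measured with the two commands side by side. A minimal sketch (it defaults to /tmp for illustration; on the Core node pass /srv/BigData/hadoop/data1/ as the argument):

```shell
# Compare filesystem-level usage (df) with directory-tree usage (du)
# for the same directory. A large, persistent gap suggests space held
# by deleted-but-open files.
dir="${1:-/tmp}"
df -h "$dir"    # block usage of the filesystem containing $dir
du -sh "$dir"   # sum of files currently reachable under $dir
```

df reports allocation at the filesystem level, so it includes space held by open file handles; du only walks the directory tree, so deleted files are invisible to it.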
Cause Analysis
The output of the lsof | grep deleted command shows that a large number of log files in the directory are in the deleted state, that is, removed from the directory tree but still held open by a process.
When a Spark task runs for a long time, some of its containers keep running and continuously generate logs. The Spark executor uses the log4j rolling-file mechanism to write logs to the stdout file, and the container process also monitors that same file, so two processes hold the file open at the same time. When log4j rolls the logs according to its configuration, it deletes the oldest log file, but the other process still holds the file handle, leaving the file in the deleted state. The space occupied by such files is still counted by df, which measures filesystem block usage, but not by du, which only sums files reachable in the directory tree. This causes the discrepancy.
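The deleted-but-open state described above can be reproduced in miniature, independent of Spark. In this sketch, holding a file open on descriptor 3 stands in for the second process's handle, and rm stands in for the log rotation:

```shell
# Minimal demonstration of a "deleted but still open" file.
tmpdir=$(mktemp -d)
dd if=/dev/zero of="$tmpdir/stdout" bs=1M count=10 2>/dev/null
exec 3<"$tmpdir/stdout"   # a second process keeps the handle open
rm "$tmpdir/stdout"       # log rotation deletes the oldest file
# du no longer sees the file, but its 10 MB stay allocated
# (and counted by df) until the handle on fd 3 is closed.
du_kb=$(du -s "$tmpdir" | awk '{print $1}')
echo "du now reports ${du_kb} KB for the directory"
exec 3<&-                 # closing the handle frees the space
rm -rf "$tmpdir"
```

While the handle is open, lsof lists the file with the suffix (deleted), which is what the lsof | grep deleted check above relies on.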
Procedure
- Open the log configuration file. By default, the file is located at <Client address>/Spark/spark/conf/log4j-executor.properties.
- Change the name of the log output file.
For example, change log4j.appender.sparklog.File = ${spark.yarn.app.container.log.dir}/stdout to log4j.appender.sparklog.File = ${spark.yarn.app.container.log.dir}/stdout.log.
- Save the configuration and exit.
- Submit the tasks again.
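The steps above amount to a one-line rename in log4j-executor.properties. A hedged sketch of the edit, demonstrated on a temporary copy rather than the real client file (on a real client, apply it to the path given in step 1):

```shell
# Sketch: append ".log" to the sparklog appender's output file so the
# rolled log no longer collides with the container-monitored stdout file.
conf=$(mktemp)
echo 'log4j.appender.sparklog.File = ${spark.yarn.app.container.log.dir}/stdout' > "$conf"
sed -i 's|/stdout$|/stdout.log|' "$conf"
renamed=$(cat "$conf")
echo "$renamed"
rm -f "$conf"
```

After the change, the executor writes to stdout.log while the container keeps monitoring stdout, so rotation no longer deletes a file another process holds open.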