Inconsistency Between df and du Command Output on the Core Node
Symptom
The df and du commands report different disk usage values for the core node.
The usage of the /srv/BigData/hadoop/data1/ directory reported by df -h is more than 10 GB different from the usage reported by du -sh /srv/BigData/hadoop/data1/.
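The comparison can also be scripted. The following is a minimal Python sketch, not part of the product tooling: the helper names du_bytes, df_used_bytes, and usage_gap are hypothetical, and the result is only meaningful when the given path is the mount point of its own file system (as a data disk directory usually is).

```python
import os
import shutil


def du_bytes(path):
    """Sum the allocated bytes under path, roughly like `du -s`."""
    seen = set()  # (device, inode) pairs, so hard links are counted once
    total = 0

    def add(entry):
        nonlocal total
        try:
            st = os.lstat(entry)
        except OSError:
            return  # entry vanished or is unreadable; skip it
        key = (st.st_dev, st.st_ino)
        if key not in seen:
            seen.add(key)
            total += st.st_blocks * 512  # st_blocks is in 512-byte units

    add(path)
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            add(os.path.join(root, name))
    return total


def df_used_bytes(path):
    """Used bytes of the file system that holds path, like the Used column of `df`."""
    return shutil.disk_usage(path).used


def usage_gap(path):
    """Bytes counted by the file system but not reachable through the directory tree."""
    return df_used_bytes(path) - du_bytes(path)
```

For example, usage_gap("/srv/BigData/hadoop/data1/") returning a value above 10 * 1024**3 would correspond to the symptom described here. Reserved blocks and file system metadata also contribute to the gap, so a small positive value is normal.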
Cause Analysis
The output of the lsof | grep deleted command shows that a large number of log files in the directory are in the deleted state.
When a Spark task runs for a long time, some of its containers keep running and continuously generate logs. The Spark executor uses the log4j log rolling function to write logs to the stdout file. The container also monitors this file, so two processes have the file open at the same time. When one process rolls the log according to the configuration, the oldest log file is deleted, but the other process still holds its file handle. The file therefore remains in the deleted state: du no longer counts it because it has no directory entry, but df still counts its space until the handle is closed.
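The mechanism can be reproduced in isolation. The sketch below is an illustration under assumptions, not Spark or YARN code: the function name deleted_but_open_demo is hypothetical, and a single process plays both roles (the one that deletes the file and the one that keeps the handle).

```python
import os
import tempfile


def deleted_but_open_demo(directory, size=1024 * 1024):
    """Unlink a file while it is still open and report what the handle sees.

    Returns (link_count, size_in_bytes, path_still_exists).
    """
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        handle.write(b"x" * size)
        handle.flush()
        os.unlink(path)                  # log rolling deletes the file
        st = os.fstat(handle.fileno())   # the other holder still has the handle
        return st.st_nlink, st.st_size, os.path.exists(path)
    # Leaving the with-block closes the handle; only then is the space freed.
```

On a local POSIX file system the link count is 0 and the path no longer exists, yet the full size is still reported through the open handle. This is the state that lsof labels as deleted.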
Procedure
Change the name of the output file for Spark executor logs.
- Open the log configuration file. By default, the configuration file is located in <Client installation directory>/Spark/spark/conf/log4j-executor.properties.
- Change the name of the log output file.
For example:
log4j.appender.sparklog.File = ${spark.yarn.app.container.log.dir}/stdout
is changed to
log4j.appender.sparklog.File = ${spark.yarn.app.container.log.dir}/stdout.log
- Save the configuration and exit.
- Submit the task again.
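If the same change has to be applied on several clients, the edit can be scripted. The following Python sketch is one possible approach, not an official tool: the function name rename_executor_log_target and its parameters are hypothetical, and it assumes the appender is named sparklog and its File value ends in /stdout, as in the example above.

```python
import re


def rename_executor_log_target(properties_text, appender="sparklog",
                               new_name="stdout.log"):
    """Return properties_text with the appender's File target renamed.

    Only a File value that ends in '/stdout' is changed, so running the
    function again on already updated text leaves it as it is.
    """
    pattern = re.compile(
        r"^([ \t]*log4j\.appender\." + re.escape(appender)
        + r"\.File[ \t]*=[ \t]*.*/)stdout[ \t]*$",
        re.MULTILINE,
    )
    # A function is used as the replacement so new_name is inserted literally.
    return pattern.sub(lambda match: match.group(1) + new_name, properties_text)
```

A caller would read log4j-executor.properties, pass its content to the function, and write the result back after keeping a backup of the original file.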