What Should I Do If a Large Number of Directories Whose Names Start with blockmgr- or spark- Exist in the /tmp Directory on the Client Installation Node?
Question
After the system runs for a long time, there are many directories whose names start with blockmgr- or spark- in the /tmp directory on the node where the client is installed.
Answer
While a Spark task is running, the driver creates a local temporary directory whose name starts with spark- to store service JAR packages and configuration files. The driver also creates a local temporary directory whose name starts with blockmgr- to store block data. Both directories are automatically deleted when the Spark application finishes.
The path under which the two directories are created is taken first from the environment variable SPARK_LOCAL_DIRS. If that environment variable is not configured, the value of spark.local.dir is used. If neither the environment variable nor spark.local.dir is configured, the value of java.io.tmpdir is used. On the client, spark.local.dir is set to /tmp by default, so the directories are created in /tmp by default.
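The lookup order described above can be sketched as follows. This is only an illustration of the precedence, not Spark's actual implementation. The function name resolve_local_dir and its parameters are hypothetical, and the two dictionaries stand in for the process environment and the client's Spark configuration.

```python
def resolve_local_dir(env, spark_conf, java_tmpdir="/tmp"):
    """Return the parent path for the spark- and blockmgr- directories.

    env         -- mapping of environment variables (hypothetical stand-in)
    spark_conf  -- mapping of Spark configuration items (hypothetical stand-in)
    java_tmpdir -- value of java.io.tmpdir, assumed to be /tmp here
    """
    # 1. The SPARK_LOCAL_DIRS environment variable takes precedence.
    value = env.get("SPARK_LOCAL_DIRS")
    if value:
        return value
    # 2. Otherwise the spark.local.dir configuration item is used.
    value = spark_conf.get("spark.local.dir")
    if value:
        return value
    # 3. If neither is configured, java.io.tmpdir is the fallback.
    return java_tmpdir


# Example: no environment variable, client default for spark.local.dir.
print(resolve_local_dir({}, {"spark.local.dir": "/tmp"}))
```

With the client default in place, the second step already yields /tmp, which is why the leftover directories accumulate there.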
In some special cases, the driver process does not exit normally, for example, when the process is ended by the kill -9 command or when the Java virtual machine crashes. In these cases, the directories cannot be deleted and remain in the system.
Currently, only driver processes in yarn-client mode and local mode may encounter this problem. In yarn-cluster mode, the temporary directory of a process in the container is set to the temporary directory of the container, and that directory is cleared automatically when the container exits. Therefore, this problem does not occur in yarn-cluster mode.
Solution
In Linux, you can configure automatic clearing of the /tmp temporary directory. Alternatively, you can change the value of spark.local.dir in the spark-defaults.conf configuration file on the client so that it points to a dedicated directory, and then configure a clearing mechanism for that directory.
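One possible clearing mechanism is a small script that is run periodically and removes leftover directories that have not been modified for a long time. The following is a minimal sketch, not a product-provided tool. The function names, the age threshold, and the dry_run switch are assumptions made for illustration. The modification time of the top-level directory is used as a rough indicator only, so the threshold should be longer than the longest-running Spark application on the node to avoid deleting directories that are still in use.

```python
import os
import shutil
import time

# Name prefixes of the temporary directories created by the driver.
PREFIXES = ("blockmgr-", "spark-")


def find_stale_dirs(base_dir, max_age_seconds, now=None):
    """List matching directories in base_dir older than max_age_seconds."""
    now = time.time() if now is None else now
    stale = []
    with os.scandir(base_dir) as entries:
        for entry in entries:
            # Only consider real directories with one of the known prefixes.
            if not entry.name.startswith(PREFIXES):
                continue
            if not entry.is_dir(follow_symlinks=False):
                continue
            age = now - entry.stat(follow_symlinks=False).st_mtime
            if age > max_age_seconds:
                stale.append(entry.path)
    return sorted(stale)


def remove_stale_dirs(base_dir, max_age_seconds, dry_run=True):
    """Delete stale directories; with dry_run=True only report them."""
    targets = find_stale_dirs(base_dir, max_age_seconds)
    if not dry_run:
        for path in targets:
            shutil.rmtree(path, ignore_errors=True)
    return targets
```

For example, calling remove_stale_dirs("/tmp", 7 * 24 * 3600) lists the directories that have been untouched for more than seven days without deleting anything. Passing dry_run=False performs the deletion after the list has been reviewed.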