Overview of HDFS File System Directories
This section describes the HDFS directory structure, as shown in the following tables.
| Path | Type | Function | Whether the Directory Can Be Deleted | Deletion Consequence |
|---|---|---|---|---|
| /tmp/spark/sparkhive-scratch | Fixed directory | Stores temporary files of metastore sessions in Spark JDBCServer. | No | Tasks fail to run. |
| /tmp/sparkhive-scratch | Fixed directory | Stores temporary files of metastore sessions executed using the Spark CLI. | No | Tasks fail to run. |
| /tmp/carbon/ | Fixed directory | Stores abnormal CarbonData data generated during data import. | Yes | The abnormal data is lost. |
| /tmp/Loader-${Job name}_${MR job ID} | Temporary directory | Stores region information for Loader HBase bulk load jobs. The data is automatically deleted after the job is complete. | No | Loader HBase bulk load jobs fail to run. |
| /tmp/logs | Fixed directory | Stores collected MR task logs. | Yes | MR task logs are lost. |
| /tmp/archived | Fixed directory | Archives MR task logs on HDFS. | Yes | MR task logs are lost. |
| /tmp/hadoop-yarn/staging | Fixed directory | Stores run logs, summary information, and configuration attributes of jobs run by ApplicationMaster. | No | Services do not run properly. |
| /tmp/hadoop-yarn/staging/history/done_intermediate | Fixed directory | Stores temporary files from the /tmp/hadoop-yarn/staging directory after all tasks are executed. | No | MR task logs are lost. |
| /tmp/hadoop-yarn/staging/history/done | Fixed directory | A periodic scanning thread moves log files from done_intermediate to this directory. | No | MR task logs are lost. |
| /tmp/mr-history | Fixed directory | Stores pre-loaded history record files. | No | Historical MR task log data is lost. |
| /tmp/hive | Fixed directory | Stores Hive temporary files. | No | Hive tasks fail to run. |
| /tmp/hive-scratch | Fixed directory | Stores temporary data (such as session information) generated while Hive is running. | No | The current task fails to run. |
| /user/{user}/.sparkStaging | Fixed directory | Stores temporary files of the Spark JDBCServer application. | No | Executors fail to start. |
| /user/spark/jars | Fixed directory | Stores the runtime dependency packages of Spark executors. | No | Executors fail to start. |
| /user/loader<br>/user/loader/etl_dirty_data_dir<br>/user/loader/etl_hbase_putlist_tmp<br>/user/loader/etl_hbase_tmp | Fixed directory | Stores dirty data of Loader jobs and data of HBase jobs. | No | HBase jobs fail to run, or dirty data is lost. |
| /user/mapred | Fixed directory | Stores Hadoop-related files. | No | Yarn fails to start. |
| /user/hive | Fixed directory | Stores Hive-related data by default, including the Spark lib package that Hive depends on and the default table data storage path. | No | User data is lost. |
| /user/omm-bulkload | Temporary directory | Temporarily stores files for the HBase batch import tool. | No | HBase batch import tasks fail. |
| /user/hbase | Temporary directory | Temporarily stores files for the HBase batch import tool. | No | HBase batch import tasks fail. |
| /sparkJobHistory | Fixed directory | Stores Spark event log data. | No | The History Server service becomes unavailable and tasks fail to run. |
| /flume | Fixed directory | Stores data that Flume collects and writes to HDFS. | No | Flume does not run properly. |
| /mr-history/tmp | Fixed directory | Stores logs generated by MapReduce jobs. | Yes | Log information is lost. |
| /mr-history/done | Fixed directory | Stores logs managed by the MR JobHistory Server. | Yes | Log information is lost. |
| /tenant | Created when a tenant is added | Directory of a tenant in HDFS. By default, the system automatically creates a folder in the /tenant directory based on the tenant name; for example, the default HDFS storage directory for tenant ta1 is /tenant/ta1. When a tenant is created for the first time, the system creates the /tenant directory under the HDFS root directory. The storage path can be customized. | No | The tenant account becomes unavailable. |
| /apps{1~5}/ | Fixed directory | Stores the Hive package used by WebHCat. | No | WebHCat tasks fail to run. |
| /hbase | Fixed directory | Stores HBase data. | No | HBase user data is lost. |
| /hbaseFileStream | Fixed directory | Stores HFS files. | No | HFS files are lost and cannot be restored. |
| /ats/active | Fixed directory | Stores the timeline data of running applications. | No | Tez tasks fail to run after the directory is deleted. |
| /ats/done | Fixed directory | Stores the timeline data of completed applications. | No | The directory is automatically re-created after it is deleted. |
| /flink | Fixed directory | Stores checkpoint task data. | No | Tasks fail to run after the directory is deleted. |

| Path | Type | Function | Whether the Directory Can Be Deleted | Deletion Consequence |
|---|---|---|---|---|
| /tmp/spark2x/sparkhive-scratch | Fixed directory | Stores temporary files of metastore sessions in Spark2x JDBCServer. | No | Tasks fail to run. |
| /tmp/sparkhive-scratch | Fixed directory | Stores temporary files of metastore sessions executed in CLI mode using the Spark2x CLI. | No | Tasks fail to run. |
| /tmp/logs/ | Fixed directory | Stores container log files. | Yes | Container log files cannot be viewed. |
| /tmp/carbon/ | Fixed directory | Stores abnormal CarbonData data generated during data import. | Yes | The abnormal data is lost. |
| /tmp/Loader-${Job name}_${MR job ID} | Temporary directory | Stores region information for Loader HBase bulk load jobs. The data is automatically deleted after the job is complete. | No | Loader HBase bulk load jobs fail to run. |
| /tmp/hadoop-omm/yarn/system/rmstore | Fixed directory | Stores ResourceManager running information. | Yes | Status information is lost after ResourceManager restarts. |
| /tmp/archived | Fixed directory | Archives MR task logs on HDFS. | Yes | MR task logs are lost. |
| /tmp/hadoop-yarn/staging | Fixed directory | Stores run logs, summary information, and configuration attributes of jobs run by ApplicationMaster. | No | Services do not run properly. |
| /tmp/hadoop-yarn/staging/history/done_intermediate | Fixed directory | Stores temporary files from the /tmp/hadoop-yarn/staging directory after all tasks are executed. | No | MR task logs are lost. |
| /tmp/hadoop-yarn/staging/history/done | Fixed directory | A periodic scanning thread moves log files from done_intermediate to this directory. | No | MR task logs are lost. |
| /tmp/mr-history | Fixed directory | Stores pre-loaded history record files. | No | Historical MR task log data is lost. |
| /tmp/hive-scratch | Fixed directory | Stores temporary data (such as session information) generated while Hive is running. | No | The current task fails to run. |
| /user/{user}/.sparkStaging | Fixed directory | Stores temporary files of the Spark JDBCServer application. | No | Executors fail to start. |
| /user/spark2x/jars | Fixed directory | Stores the runtime dependency packages of Spark2x executors. | No | Executors fail to start. |
| /user/loader<br>/user/loader/etl_dirty_data_dir<br>/user/loader/etl_hbase_putlist_tmp<br>/user/loader/etl_hbase_tmp | Fixed directory | Stores dirty data of Loader jobs and data of HBase jobs. | No | HBase jobs fail to run, or dirty data is lost. |
| /user/oozie | Fixed directory | Stores the dependency libraries required for Oozie to run, which must be uploaded manually. | No | Oozie scheduling fails. |
| /user/mapred/hadoop-mapreduce-3.1.1.tar.gz | Fixed file | Stores the JAR files used by the MR distributed cache. | No | The MR distributed cache function is unavailable. |
| /user/hive | Fixed directory | Stores Hive-related data by default, including the Spark lib package that Hive depends on and the default table data storage path. | No | User data is lost. |
| /user/omm-bulkload | Temporary directory | Temporarily stores files for the HBase batch import tool. | No | HBase batch import tasks fail. |
| /user/hbase | Temporary directory | Temporarily stores files for the HBase batch import tool. | No | HBase batch import tasks fail. |
| /spark2xJobHistory2x | Fixed directory | Stores Spark2x event log data. | No | The History Server service becomes unavailable and tasks fail to run. |
| /flume | Fixed directory | Stores data that Flume collects and writes to HDFS. | No | Flume does not run properly. |
| /mr-history/tmp | Fixed directory | Stores logs generated by MapReduce jobs. | Yes | Log information is lost. |
| /mr-history/done | Fixed directory | Stores logs managed by the MR JobHistory Server. | Yes | Log information is lost. |
| /tenant | Created when a tenant is added | Directory of a tenant in HDFS. By default, the system automatically creates a folder in the /tenant directory based on the tenant name; for example, the default HDFS storage directory for tenant ta1 is /tenant/ta1. When a tenant is created for the first time, the system creates the /tenant directory under the HDFS root directory. The storage path can be customized. | No | The tenant account becomes unavailable. |
| /apps{1~5}/ | Fixed directory | Stores the Hive package used by WebHCat. | No | WebHCat tasks fail to run. |
| /hbase | Fixed directory | Stores HBase data. | No | HBase user data is lost. |
| /hbaseFileStream | Fixed directory | Stores HFS files. | No | HFS files are lost and cannot be restored. |
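Before cleaning up any of the directories marked "Yes" above, it can help to confirm that the paths exist and check how much data they hold. The sketch below is a minimal, hypothetical helper built on the standard Hadoop FileSystem API; the class name and the sampled paths are illustrative, and it assumes it runs on a client node whose classpath includes the cluster's HDFS configuration (core-site.xml and hdfs-site.xml). It only reads metadata and never deletes anything.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.ContentSummary;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsDirCheck {
    public static void main(String[] args) throws Exception {
        // Load the HDFS client configuration from the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // A few paths taken from the tables above; adjust to your cluster.
        String[] dirs = {"/tmp/logs", "/tmp/archived", "/mr-history/done", "/hbase"};
        for (String dir : dirs) {
            Path p = new Path(dir);
            if (!fs.exists(p)) {
                System.out.println(dir + " does not exist");
                continue;
            }
            // Report file count, directory count, and total size in bytes.
            ContentSummary cs = fs.getContentSummary(p);
            System.out.printf("%s: %d files, %d dirs, %d bytes%n",
                    dir, cs.getFileCount(), cs.getDirectoryCount(), cs.getLength());
        }
        fs.close();
    }
}
```

The same information is available from the command line with `hdfs dfs -ls <path>` and `hdfs dfs -du -s <path>`.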