Updated on 2024-11-29 GMT+08:00

Overview of HDFS File System Directories

This section describes the directory structure in HDFS, as shown in the following table.

Table 1 Directory structure of the HDFS file system

Path

Type

Function

Can Be Deleted

Deletion Consequence

/tmp/logs/

Fixed directory

Stores container log files.

Yes

Container log files cannot be viewed.

/tmp/carbon/

Fixed directory

Stores abnormal data detected during CarbonData data import.

Yes

The abnormal data is lost.

/tmp/Loader-${Job name}_${MR job ID}

Temporary directory

Stores region information for Loader HBase bulkload jobs. The data is deleted automatically after the job completes.

No

The Loader HBase bulkload job fails.

/tmp/hadoop-omm/yarn/system/rmstore

Fixed directory

Stores ResourceManager runtime information.

Yes

Status information is lost after ResourceManager is restarted.

/tmp/archived

Fixed directory

Archives the MR task logs on HDFS.

Yes

MR task logs are lost.

/tmp/hadoop-yarn/staging

Fixed directory

Stores the run logs, summary information, and configuration attributes of jobs run by ApplicationMaster.

No

Services are running improperly.

/tmp/hadoop-yarn/staging/history/done_intermediate

Fixed directory

Stores temporary files in the /tmp/hadoop-yarn/staging directory after all tasks are executed.

No

MR task logs are lost.

/tmp/hadoop-yarn/staging/history/done

Fixed directory

Stores log files that a periodic scanning thread moves from the done_intermediate directory to the done directory.

No

MR task logs are lost.

/tmp/mr-history

Fixed directory

Stores preloaded historical record files.

No

Historical MR task log data is lost.

/tmp/solr

Temporary directory

Stores the Solr temporary index data.

No

Solr batch HDFS indexing tasks fail.

/tmp/hive-scratch

Fixed directory

Stores temporary data (such as session information) generated while Hive is running.

No

The current task fails.

/user/{user}/.sparkStaging

Fixed directory

Stores temporary files of the SparkJDBCServer application.

No

The executor fails to start.

/user/spark/jars

Fixed directory

Stores the dependency packages for running the Spark executor.

No

The executor fails to start.

/user/loader

Fixed directory

Stores dirty data of Loader jobs and data of HBase jobs.

No

The HBase job fails to execute, or dirty data is lost.

/user/loader/etl_dirty_data_dir

/user/loader/etl_hbase_putlist_tmp

/user/loader/etl_hbase_tmp

/user/oozie

Fixed directory

Stores the dependency libraries required to run Oozie. The libraries must be uploaded manually.

No

Oozie scheduling fails.

/user/mapred/hadoop-mapreduce-3.1.1.tar.gz

Fixed file

Stores JAR files used by the distributed MR cache.

No

The MR distributed cache function is unavailable.

/user/solr

Fixed directory

Stores Solr historical data.

No

Historical Solr data is lost.

/user/hive

Fixed directory

Stores Hive-related data by default, including the Spark lib packages that Hive depends on and the default table data storage path.

No

User data is lost.

/user/omm-bulkload

Temporary directory

Stores HBase batch import tools temporarily.

No

HBase batch import tasks fail.

/user/hbase

Temporary directory

Stores HBase batch import tools temporarily.

No

HBase batch import tasks fail.

/sparkJobHistory

Fixed directory

Stores Spark eventlog data.

No

The History Server service becomes unavailable, and tasks fail to run.

/flume

Fixed directory

Stores data collected by Flume from HDFS.

No

Flume runs improperly.

/mr-history/tmp

Fixed directory

Stores logs generated by MapReduce jobs.

Yes

Log information is lost.

/mr-history/done

Fixed directory

Stores logs managed by MR JobHistory Server.

Yes

Log information is lost.

/tenant

Created when a tenant is added.

Directory of a tenant in HDFS. When a tenant is created for the first time, the system creates the /tenant directory in the HDFS root directory. By default, the system then automatically creates a folder in the /tenant directory based on the tenant name. For example, the default HDFS storage directory for tenant ta1 is /tenant/ta1. You can customize the storage path.

No

The tenant account is unavailable.

/solr

Fixed directory

Stores Solr data.

No

Solr runs improperly.

/apps{1~5}/

Fixed directory

Stores the Hive package used by WebHCat.

No

WebHCat tasks fail.

/hbase

Fixed directory

Stores HBase data.

No

HBase user data is lost.

/hbaseFileStream

Fixed directory

Stores HFS files.

No

HFS files are lost and cannot be restored.
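
The deletability column in Table 1 can be encoded as a lookup that a cleanup script consults before it removes anything. The following is a minimal sketch, not part of the product: the names DELETABLE, table_entry_for, and can_delete are hypothetical, only a subset of the table is included, and treating paths that are not covered by the table as not deletable is an assumption made here for safety.

```python
from typing import Optional

# Subset of Table 1: path -> value of the "can be deleted" column.
# Hypothetical helper data; extend it with the remaining rows as needed.
DELETABLE = {
    "/tmp/logs": True,
    "/tmp/carbon": True,
    "/tmp/hadoop-omm/yarn/system/rmstore": True,
    "/tmp/archived": True,
    "/tmp/hadoop-yarn/staging": False,
    "/tmp/hive-scratch": False,
    "/user/hive": False,
    "/mr-history/tmp": True,
    "/mr-history/done": True,
    "/tenant": False,
    "/hbase": False,
}


def _normalize(path: str) -> str:
    """Return the path with one leading slash and no trailing slash."""
    return "/" + path.strip("/")


def table_entry_for(path: str) -> Optional[str]:
    """Return the longest table entry that equals or contains the path."""
    p = _normalize(path)
    best = None
    for entry in DELETABLE:
        if p == entry or p.startswith(entry + "/"):
            if best is None or len(entry) > len(best):
                best = entry
    return best


def can_delete(path: str) -> bool:
    """Return True only if the covering table entry is marked deletable.

    Paths not covered by any entry (including parents such as /tmp) are
    rejected. This conservative default is an assumption of this sketch.
    """
    entry = table_entry_for(path)
    return entry is not None and DELETABLE[entry]
```

A cleanup script could call can_delete() on each candidate path and skip those that return False before running a removal command such as hdfs dfs -rm -r.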