Storage Resource
Overview
HDFS is the distributed file storage service of a big data cluster. It stores all user data of the upper-layer applications, including data written to HBase tables or Hive tables.
Directories are used as the basic unit of HDFS storage resource allocation. HDFS supports the conventional hierarchical file structure. Users can create directories and create, delete, move, or rename files in directories. Tenants can obtain storage resources by specifying directories in the HDFS file system.
Scheduling Mechanism
The data in an HDFS directory can be stored on nodes with specified labels or on disks of specified hardware types. For example:
- When both real-time query and data analysis tasks are running in one cluster, the real-time query tasks are deployed on some nodes; therefore, the queried data must be stored on these nodes.
- Based on actual service requirements, key data needs to be stored on nodes with high reliability.
Administrators can flexibly configure HDFS data storage policies based on actual service requirements and data features to store data on specified nodes.
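As a minimal sketch of the disk-type case, the snippet below builds the Apache HDFS `hdfs storagepolicies -setStoragePolicy` command line for a directory. It only assembles the command and does not run it. The directory `/tenant/realtime_query` and the helper name `storage_policy_command` are hypothetical, and the policy set lists the built-in Apache HDFS policy names, which may differ from what a given cluster supports. Placement by node label is distribution-specific and is not shown here.

```python
import shlex
from typing import List

# Built-in storage policy names in Apache HDFS (assumption: the cluster
# exposes these; check with `hdfs storagepolicies -listPolicies`).
KNOWN_POLICIES = {"HOT", "WARM", "COLD", "ALL_SSD", "ONE_SSD", "LAZY_PERSIST"}


def storage_policy_command(path: str, policy: str) -> List[str]:
    """Return the argument list that binds a storage policy to a directory."""
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute: %r" % path)
    if policy not in KNOWN_POLICIES:
        raise ValueError("unknown storage policy: %r" % policy)
    return [
        "hdfs", "storagepolicies", "-setStoragePolicy",
        "-path", path,
        "-policy", policy,
    ]


# Hypothetical directory holding data for real-time queries, kept on SSD.
cmd = storage_policy_command("/tenant/realtime_query", "ALL_SSD")
print(shlex.join(cmd))
```

A policy set on a directory applies to files written afterwards. Blocks that already exist are not relocated until a tool such as the HDFS mover is run.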
For a tenant, storage resources are the HDFS resources that the tenant occupies. Storage resource scheduling is implemented by storing the data of specified directories in the storage paths that the tenant configures, which ensures data isolation between tenants.
Users can add or delete HDFS storage directories of tenants and set the file quantity quota and storage capacity quota of directories to manage storage resources.
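The two quotas map onto standard Apache HDFS administration commands, sketched below under stated assumptions. The snippet only builds the command lines for creating a tenant directory and setting its quotas, and does not run them. The directory `/tenant/ta1`, the limits, and the helper names `quota_commands` and `raw_bytes_needed` are hypothetical. In Apache HDFS, the name quota counts both files and directories under the path, and the space quota is charged in replicated bytes, so the second helper estimates how much quota a file consumes.

```python
from typing import List


def quota_commands(directory: str, max_names: int, max_bytes: int) -> List[List[str]]:
    """Return the commands that create a directory and set both quotas on it."""
    if not directory.startswith("/"):
        raise ValueError("HDFS path must be absolute: %r" % directory)
    if max_names < 1 or max_bytes < 1:
        raise ValueError("quotas must be positive integers")
    return [
        # Create the tenant directory (and any missing parents).
        ["hdfs", "dfs", "-mkdir", "-p", directory],
        # Name quota: limit on the number of files and directories.
        ["hdfs", "dfsadmin", "-setQuota", str(max_names), directory],
        # Space quota: limit on bytes consumed, counting every replica.
        ["hdfs", "dfsadmin", "-setSpaceQuota", str(max_bytes), directory],
    ]


def raw_bytes_needed(logical_bytes: int, replication: int = 3) -> int:
    """Space quota consumed by a file of the given size and replication factor."""
    return logical_bytes * replication


# Hypothetical tenant directory: at most 1000 names and 10 GiB of raw space.
for command in quota_commands("/tenant/ta1", 1000, 10 * 1024**3):
    print(" ".join(command))
```

With the default replication factor of 3, a 1 GiB file consumes 3 GiB of the space quota, so a capacity quota is best sized in raw bytes rather than logical data volume.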