Introduction to HDFS
The Hadoop Distributed File System (HDFS) is a distributed file system with high fault tolerance. HDFS provides high-throughput data access and is suited to processing large data sets.
HDFS applies to the following application scenarios:
- Massive data processing (at the TB or PB scale and beyond).
- Scenarios that require high throughput.
- Scenarios that require high reliability.
- Scenarios that require good scalability.
Introduction to HDFS Interfaces
HDFS applications can be developed in Java. For details about the API, see Java API Introduction.
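As a minimal sketch of this development pattern, the following program writes a file to HDFS and reads it back through the Hadoop FileSystem Java API. The file path /tmp/example.txt is an illustrative assumption; the cluster address is taken from the core-site.xml and hdfs-site.xml files on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.nio.charset.StandardCharsets;

public class HdfsExample {
    public static void main(String[] args) throws Exception {
        // Loads core-site.xml/hdfs-site.xml from the classpath;
        // the NameNode address comes from fs.defaultFS.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path path = new Path("/tmp/example.txt"); // illustrative path

        // Write a file (overwrite if it already exists).
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
        }

        // Read the file back.
        try (FSDataInputStream in = fs.open(path)) {
            byte[] buf = new byte[(int) fs.getFileStatus(path).getLen()];
            in.readFully(buf);
            System.out.println(new String(buf, StandardCharsets.UTF_8));
        }

        fs.close();
    }
}
```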
Basic Concepts
- Colocation
Colocation stores associated data, or data that is to be associated, on the same storage node. HDFS Colocation places files that need to be associated on the same DataNode so that the data can be read from that node during associated operations, which greatly reduces network bandwidth consumption.
- Client
HDFS can be accessed through the Java application programming interface (API), C API, shell, HTTP REST API, and web user interface (WebUI). For details, see HDFS Common API Introduction and HDFS Shell Command Introduction.
- Java API
Provides a Java application interface for HDFS. This guide describes how to use the Java API to develop HDFS applications.
- C API
Provides a C application interface for HDFS. This guide describes how to use the C API to develop HDFS applications.
- Shell
Provides a command-line interface for file operations on HDFS, for example, hdfs dfs -ls / to list the root directory.
- HTTP REST API
Additional interfaces besides the shell, Java API, and C API. You can use these interfaces to monitor HDFS status.
- WebUI
Provides a web-based visual interface for viewing the status of the HDFS cluster.
- keytab file
The keytab file is a key file that stores user information. Applications use it for security authentication on FusionInsight MRS, as shown in the sketch after this list.
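As a hedged illustration of how a keytab file is typically used on a secured (Kerberos) cluster, the sketch below logs in through Hadoop's UserGroupInformation before making any HDFS call. The principal name develop_user@HADOOP.COM and the keytab path /opt/client/user.keytab are hypothetical placeholders; use the values issued for your cluster user.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.UserGroupInformation;

public class KeytabLoginExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Enable Kerberos authentication for the Hadoop client.
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);

        // Hypothetical principal and keytab path; replace with the
        // credentials issued for your cluster user.
        UserGroupInformation.loginUserFromKeytab(
                "develop_user@HADOOP.COM", "/opt/client/user.keytab");

        // After login, HDFS calls run as the authenticated user.
        try (FileSystem fs = FileSystem.get(conf)) {
            System.out.println("Home directory: " + fs.getHomeDirectory());
        }
    }
}
```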