HDFS Application Development Overview
Introduction to HDFS
The Hadoop Distributed File System (HDFS) is a distributed file system with high fault tolerance. HDFS provides high-throughput data access and is suitable for processing large data sets.
HDFS applies to the following application scenarios:
- Massive data processing (TB- or PB-scale data and above).
- Scenarios that require high throughput.
- Scenarios that require high reliability.
- Scenarios that require good scalability.
Introduction to HDFS Interface
HDFS applications can be developed in Java. For the API reference, see HDFS Java APIs.
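As a brief illustration of Java-based development, the following is a minimal sketch that writes a file to HDFS and reads it back using the standard Hadoop FileSystem API. The NameNode address `hdfs://namenode-host:9000` and the path `/tmp/example.txt` are placeholder assumptions; a running HDFS cluster and the Hadoop client libraries on the classpath are assumed.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Minimal sketch: write a file to HDFS, then read it back.
// "hdfs://namenode-host:9000" is a placeholder for your NameNode address.
public class HdfsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode-host:9000");
        FileSystem fs = FileSystem.get(conf);

        // Create (overwrite) a file and write some bytes to it.
        Path path = new Path("/tmp/example.txt");
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
        }

        // Open the file and print its first line.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))) {
            System.out.println(reader.readLine());
        }
        fs.close();
    }
}
```

The same FileSystem API also exposes operations such as `mkdirs`, `delete`, and `listStatus` for directory management.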
HDFS Basic Concepts
- Colocation
Colocation stores associated data, or data that is to be associated, on the same storage node. HDFS Colocation places files that need to be associated on the same DataNode, so that data can be read from a single node during associated operations. This greatly reduces network bandwidth consumption.
- Client
HDFS can be accessed through the Java application programming interface (API), the C API, the shell, the HTTP REST API, and the web user interface (WebUI).
For details, see Common API Introduction and HDFS Shell Command Introduction.
- Java API
Provides a Java application interface for HDFS. This guide describes how to use the Java API to develop HDFS applications.
- C API
Provides a C application interface for HDFS. This guide describes how to use the C API to develop HDFS applications.
- Shell
Provides shell commands for performing operations on HDFS directly, such as viewing, uploading, and deleting files and directories.
- HTTP REST API
REST interfaces provided in addition to the shell, Java API, and C API. You can use these interfaces to monitor HDFS status.
- WebUI
Provides a visual web interface for viewing HDFS status.
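To make the shell and HTTP REST interfaces above concrete, the following is a minimal sketch of typical interactions. The host name, port, and paths are placeholder assumptions, and a running HDFS cluster with the client installed is assumed.

```shell
# List the HDFS root directory (shell interface)
hdfs dfs -ls /

# Upload a local file and print its contents
hdfs dfs -put localfile.txt /tmp/localfile.txt
hdfs dfs -cat /tmp/localfile.txt

# Query directory status over the HTTP REST API (WebHDFS).
# 9870 is the default NameNode HTTP port in Hadoop 3.x (50070 in 2.x);
# "namenode-host" is a placeholder for your NameNode address.
curl -i "http://namenode-host:9870/webhdfs/v1/tmp?op=LISTSTATUS"
```

The WebHDFS response is JSON, which makes the REST interface convenient for status monitoring from scripts and external tools.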