Updated on 2022-09-14 GMT+08:00

Introduction to HDFS

HDFS

Hadoop distribute file system (HDFS) is a distributed file system with high fault tolerance running on universal hardware. HDFS supports data access with high throughput and applies to processing of large-scale data sets.

HDFS applies to the following application scenarios:

  • Massive data processing (higher than the TB or PB level)
  • Scenarios that require high throughput
  • Scenarios that require high reliability
  • Scenarios that require excellent scalability

HDFS APIs

HDFS applications can be developed using Java. For details about APIs, see Java APIs.