Updated on 2022-07-11 GMT+08:00

Application Development Overview

HBase Introduction

HBase is a column-oriented, scalable, distributed storage system that offers high reliability and high performance. It is designed to overcome the limitations that relational databases face when processing massive volumes of data.

HBase is suited to the following scenarios:

  • Massive data processing (at the TB level, PB level, or higher).
  • Scenarios that require high throughput.
  • Scenarios that require efficient random reads of massive data.
  • Scenarios that require good scalability.
  • Scenarios in which structured and unstructured data are processed concurrently.
  • Scenarios that do not require the Atomicity, Consistency, Isolation, Durability (ACID) guarantees of traditional relational databases.

HBase tables provide the following features:

  • Large: A single table can contain a hundred million rows and one million columns.
  • Column-oriented: Storage and permission control are implemented per column (family), and each column (family) is retrieved independently.
  • Sparse: Null columns do not occupy storage space, so a table can be sparse.
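
The column-oriented and sparse properties can be illustrated with a toy in-memory model. The following is only a sketch and does not reflect how HBase is implemented or how its client API looks: the class name SparseTableSketch and all of its methods are hypothetical, and it assumes Java 9 or later. Each row stores only the cells that were actually written, grouped by column family, so an absent column occupies no space and one family can be read without touching the others.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a sparse, column-family-oriented table (hypothetical; not HBase code). */
public class SparseTableSketch {
    // row key -> column family -> column qualifier -> value.
    // Only cells that were written exist in the maps.
    private final Map<String, Map<String, Map<String, String>>> rows = new TreeMap<>();

    /** Writes one cell, creating the row and family entries on demand. */
    public void put(String row, String family, String qualifier, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family, f -> new TreeMap<>())
            .put(qualifier, value);
    }

    /** Returns the cell value, or null when the cell was never written. */
    public String get(String row, String family, String qualifier) {
        Map<String, Map<String, String>> families = rows.get(row);
        if (families == null) {
            return null;
        }
        Map<String, String> columns = families.get(family);
        return columns == null ? null : columns.get(qualifier);
    }

    /** Retrieves a single column family of a row, independently of the other families. */
    public Map<String, String> getFamily(String row, String family) {
        Map<String, Map<String, String>> families = rows.get(row);
        if (families == null) {
            return Map.of();
        }
        return families.getOrDefault(family, Map.of());
    }

    /** Counts stored cells; columns that were never written contribute nothing. */
    public int storedCells() {
        int count = 0;
        for (Map<String, Map<String, String>> families : rows.values()) {
            for (Map<String, String> columns : families.values()) {
                count += columns.size();
            }
        }
        return count;
    }

    public static void main(String[] args) {
        SparseTableSketch table = new SparseTableSketch();
        table.put("user1", "info", "name", "Alice");
        table.put("user1", "info", "email", "alice@example.com");
        table.put("user2", "info", "name", "Bob");
        table.put("user2", "stats", "logins", "7");

        System.out.println("Stored cells: " + table.storedCells());
        System.out.println("user2 info:email = " + table.get("user2", "info", "email"));
        System.out.println("user2 stats family = " + table.getFamily("user2", "stats"));
    }
}
```

In this sketch, user2 has no email column and user1 has no stats family, yet neither gap consumes any storage. A fixed-schema relational row would typically carry a NULL placeholder for such columns.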

Interface Type Introduction

Java is recommended for HBase application development because HBase itself is developed in Java, and Java is a concise, universal, and easy-to-understand language.

HBase uses the same interfaces as Apache HBase.

Table 1 describes the functions that HBase provides through these interfaces.

Table 1 Functions provided by HBase interfaces

Function             | Description
Data CRUD function   | Data creating, retrieving, updating, and deleting
Advanced feature     | Filter, secondary index, and coprocessor
Management function  | Table management and cluster management