Updated on 2025-04-14 GMT+08:00

HBase Application Development Overview

HBase Introduction

HBase is a column-oriented, scalable, distributed storage system that provides high reliability and high performance. It is designed to overcome the limitations of relational databases when processing massive amounts of data.

HBase is suitable for the following application scenarios:

  • Massive data processing (TB or PB level and above).
  • Scenarios that require high throughput.
  • Scenarios that require efficient random reads of massive data.
  • Scenarios that require good scalability.
  • Scenarios in which structured and unstructured data are processed concurrently.
  • Scenarios that do not require the Atomicity, Consistency, Isolation, Durability (ACID) properties supported by traditional relational databases.

HBase tables provide the following features:

  • Large: A single table can contain over a hundred million rows and over a million columns.
  • Column-oriented: Storage and permission control are implemented by column (family), and each column (family) is retrieved independently.
  • Sparse: Null columns do not occupy storage space, so a table can be sparse.
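
The column-oriented and sparse properties can be pictured as a sorted map of maps. The following is a minimal sketch of that idea in plain Java. It is a toy in-memory model, not the HBase client API, and the class and method names (SparseTable, put, get, getFamily, storedCellCount) are hypothetical names chosen for illustration.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy in-memory model of a sparse, column-family-oriented table.
// This is NOT the HBase client API; all names here are hypothetical.
class SparseTable {
    // row key -> column family -> column qualifier -> value, each level sorted by key.
    private final NavigableMap<String, NavigableMap<String, NavigableMap<String, String>>> rows =
            new TreeMap<>();

    // Writes one cell. Only cells that are written ever get an entry.
    void put(String row, String family, String qualifier, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family, f -> new TreeMap<>())
            .put(qualifier, value);
    }

    // Returns a copy of the cells of one column family in a row (empty if none).
    NavigableMap<String, String> getFamily(String row, String family) {
        NavigableMap<String, NavigableMap<String, String>> families = rows.get(row);
        if (families == null || !families.containsKey(family)) {
            return new TreeMap<>();
        }
        return new TreeMap<>(families.get(family));
    }

    // Returns one cell value, or null if that cell was never written.
    String get(String row, String family, String qualifier) {
        return getFamily(row, family).get(qualifier);
    }

    // Counts the cells that actually occupy storage in this model.
    int storedCellCount() {
        int count = 0;
        for (NavigableMap<String, NavigableMap<String, String>> families : rows.values()) {
            for (NavigableMap<String, String> cells : families.values()) {
                count += cells.size();
            }
        }
        return count;
    }
}
```

In this model, a row that never writes a given column simply has no entry for it, so an absent cell costs nothing, and getFamily reads one column family without touching the others.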

HBase Interface Type Introduction

Java is recommended for HBase application development because HBase itself is developed in Java, and Java is a concise, general-purpose, and easy-to-understand language.

HBase uses the same interfaces as Apache HBase.

Table 1 describes the functions that HBase provides through these interfaces.

Table 1 Functions provided by HBase interfaces

Function              Description
Data CRUD function    Creating, retrieving, updating, and deleting data.
Advanced feature      Filters, secondary indexes, and coprocessors.
Management function   Table management and cluster management.
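
Among the advanced features in Table 1, a filter can be thought of as a predicate that is evaluated for each row while a scan walks the table, so that only matching rows are returned. The following is a minimal sketch of that idea in plain Java. It is a toy model, not the HBase client API. The names FilterSketch, rowPrefix, valueEquals, and scan are hypothetical, and the table is assumed to be a simple sorted map from row key to a single value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.function.BiPredicate;

// Toy model of scan-time filtering. NOT the HBase client API; names are hypothetical.
class FilterSketch {
    // Keeps rows whose key starts with the given prefix.
    static BiPredicate<String, String> rowPrefix(String prefix) {
        return (rowKey, value) -> rowKey.startsWith(prefix);
    }

    // Keeps rows whose value equals the expected value.
    static BiPredicate<String, String> valueEquals(String expected) {
        return (rowKey, value) -> expected.equals(value);
    }

    // Walks the table in row-key order and returns the keys of rows the filter accepts.
    static List<String> scan(NavigableMap<String, String> table,
                             BiPredicate<String, String> filter) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> row : table.entrySet()) {
            if (filter.test(row.getKey(), row.getValue())) {
                matches.add(row.getKey());
            }
        }
        return matches;
    }
}
```

Filters can be combined, for example scan(table, rowPrefix("user#").and(valueEquals("active"))) returns only the rows that pass both conditions. In HBase itself, filters are attached to scan or get requests and evaluated on the server side, which is where the efficiency gain comes from; this sketch only mirrors the selection logic.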

Common Concepts

  • Filter

    Filters help users process HBase table data more efficiently. Users can use the filters predefined in HBase or implement custom filters.

  • Coprocessor

    Coprocessors enable users to perform region-level operations and provide functions similar to those of triggers in relational database management systems (RDBMSs).

  • keytab file

    A keytab file is a key file that stores user authentication information. In security mode, applications use the keytab file to authenticate with HBase when calling its APIs.

  • Client

    Users can access the server from the client through the Java API, HBase Shell, or WebUI to read and write HBase tables. In this document, the HBase client refers to the HBase client installation package. For details, see HBase External Interfaces.
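
The trigger-like behavior described under Coprocessor can be sketched as hooks that run before and after a write. The following plain Java example is a toy model under stated assumptions, not the HBase coprocessor API. PutObserver, ToyRegion, and ReverseIndexObserver are hypothetical names, and the "region" is reduced to a single sorted map. The example observer rejects empty values and maintains a value-to-row reverse index, which is similar in spirit to how a secondary index could be kept up to date.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical hook interface, loosely modeled on the idea of observer coprocessors.
interface PutObserver {
    // Called before a write is applied; return false to reject the write.
    default boolean prePut(String row, String value) { return true; }

    // Called after a write has been applied.
    default void postPut(String row, String value) { }
}

// Toy stand-in for a region: a sorted map plus registered observers.
class ToyRegion {
    private final Map<String, String> data = new TreeMap<>();
    private final List<PutObserver> observers = new ArrayList<>();

    void register(PutObserver observer) {
        observers.add(observer);
    }

    // Runs all pre-hooks, applies the write, then runs all post-hooks.
    boolean put(String row, String value) {
        for (PutObserver observer : observers) {
            if (!observer.prePut(row, value)) {
                return false; // a hook vetoed the write, so nothing is stored
            }
        }
        data.put(row, value);
        for (PutObserver observer : observers) {
            observer.postPut(row, value);
        }
        return true;
    }

    String get(String row) {
        return data.get(row);
    }
}

// Example observer: validates writes and keeps a value -> row reverse index.
class ReverseIndexObserver implements PutObserver {
    final Map<String, String> index = new TreeMap<>();

    @Override
    public boolean prePut(String row, String value) {
        return value != null && !value.isEmpty();
    }

    @Override
    public void postPut(String row, String value) {
        index.put(value, row);
    }
}
```

Because the hooks run inside put, callers do not need to know that validation or index maintenance exists, which is the same separation that triggers provide in an RDBMS.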