Updated on 2024-04-29 GMT+08:00

HBase Cluster Management

Introduction to HBase

HBase is a column-oriented, distributed cloud storage system that offers high reliability, high performance, and elastic scalability. It is suited to storing massive amounts of data and to distributed computing. You can use HBase to build a storage system that holds terabytes to petabytes of data, filter and analyze that data easily, and get responses in milliseconds, so that you can extract value from data quickly.

HBase applies to the following scenarios:

  • Mass data storage

    HBase applies to TB- or even PB-level data storage and provides dynamic scaling capabilities so that you can adjust cluster resources to meet specific performance or capacity requirements.

  • Real-time query

    The columnar and key-value storage models suit ad hoc queries of enterprise user details. Low-latency point queries based on the primary key return results within seconds or even milliseconds, which facilitates real-time data analysis.
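
To illustrate why primary-key point queries are fast, the following is a minimal Python sketch of a store that keeps rows sorted by row key, as HBase does. It is a toy in-memory model, not the HBase client API. The class name `SortedRowStore`, its methods, and the sample row keys and values are assumptions made up for this illustration.

```python
import bisect


class SortedRowStore:
    """Toy in-memory model of a table whose rows are kept sorted by row key.

    Hypothetical illustration only; this is not the HBase client API.
    """

    def __init__(self):
        self._keys = []   # row keys, kept in lexicographic order
        self._rows = {}   # row key -> {column: value}

    def put(self, row_key, columns):
        """Insert or update a row, keeping the key list sorted."""
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key].update(columns)

    def get(self, row_key):
        """Point query: look up exactly one row by its key."""
        return self._rows.get(row_key)

    def scan_prefix(self, prefix):
        """Range query: return all rows whose key starts with the prefix."""
        start = bisect.bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break  # keys are sorted, so no later key can match
            result.append((key, self._rows[key]))
        return result


# Sample data is invented for the example.
store = SortedRowStore()
store.put("user001#2024-04-01", {"info:city": "Shenzhen"})
store.put("user001#2024-04-02", {"info:city": "Beijing"})
store.put("user002#2024-04-01", {"info:city": "Shanghai"})

print(store.get("user002#2024-04-01"))
print([key for key, _ in store.scan_prefix("user001#")])
```

Because the keys are sorted, a point query touches a single row and a prefix scan reads only a contiguous range, which is the property that keeps row-key lookups fast as the table grows.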

For details about HBase architecture and principles, visit https://hbase.apache.org/book.html.

Currently, CloudTable HBase does not have a security authentication mechanism. If HBase with an authentication mechanism is required, you are advised to use the HBase component in the MapReduce Service (MRS).

HBase Cluster Management Functions

CloudTable is a distributed, scalable key-value data storage service provided by Huawei Cloud. The web-based CloudTable console provides the following HBase cluster management functions:

  • Creating a cluster: You can create a cluster on the CloudTable console. Billing is based on the number of compute units you select when creating the cluster and on the actual storage capacity used. Advanced features are optional: you choose and install them independently, and they are billed separately. If your balance is insufficient for fee deduction, you are notified to renew. Cluster resources are frozen during the retention period and unfrozen after you renew. CloudTable helps reduce costs by separating compute from storage and by adjusting compute resources dynamically.
  • Expanding the cluster capacity: Compute units of a cluster can be increased.

    Increasing compute units: You can dynamically increase the number of compute units based on service requirements to ensure read and write performance. The cluster performs load balancing automatically to ensure service continuity and smooth capacity expansion. Additional compute units incur extra fees.

  • Managing a cluster: You can manage a created cluster.
    • Metric monitoring: The system collects monitoring data while the cluster is running, reports the data to Cloud Eye (CES), and displays the cluster status in charts. When a metric becomes abnormal, a notification is sent so that users and administrators can handle the problem promptly.
    • Deleting a cluster: You can delete a cluster that is no longer needed. This is a high-risk operation. Deleting a cluster may cause data loss. Therefore, before deleting a cluster, ensure that no service is running and all data has been saved.
    • Restarting a cluster: You need to restart a cluster if its HBase parameters have been modified or if the system has slowed down after running for a long time. Restarting may cause data loss in running services. Before restarting a cluster, ensure that no service is running and all data has been saved.
    • Querying alarms: If either the system or a cluster is faulty, CloudTable will collect fault information and report it to the network management system. Maintenance personnel will then be able to locate the faults.
    • Querying logs: Cluster, job, and configuration operations are recorded, helping locate faults in case of cluster operating exceptions.
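
The notification behavior described under "Metric monitoring" can be pictured as a threshold rule evaluated against recent metric samples. The following Python sketch is an illustration under assumptions, not the Cloud Eye API: the `AlarmRule` class, the `should_notify` function, the metric name, the threshold, and the "consecutive samples" policy are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AlarmRule:
    """Hypothetical threshold rule; not the Cloud Eye API."""
    metric: str
    threshold: float
    consecutive: int  # samples in a row that must exceed the threshold (>= 1)


def should_notify(rule, samples):
    """Return True if the latest `rule.consecutive` samples all exceed the threshold."""
    if rule.consecutive < 1 or len(samples) < rule.consecutive:
        return False
    recent = samples[-rule.consecutive:]
    return all(value > rule.threshold for value in recent)


# Metric name and values are invented for the example.
rule = AlarmRule(metric="cpu_usage_percent", threshold=80.0, consecutive=3)

sustained = should_notify(rule, [40.0, 85.0, 90.0, 95.0])    # last three exceed 80
interrupted = should_notify(rule, [85.0, 90.0, 60.0, 95.0])  # a dip breaks the streak
print(sustained, interrupted)
```

Requiring several consecutive abnormal samples is one common way to avoid notifications for short spikes; whether a given rule should use it depends on the metric.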

Advantages

  • Native HBase APIs: CloudTable HBase is compatible with native HBase APIs. Its architecture separates compute from storage to provide high availability and enhanced reliability, and its kernel is deeply optimized.
  • Ease of use: Secondary indexes are supported to meet non-primary key query requirements.
  • Low costs: Cold and hot data can be segregated to fulfill the needs of data archiving and the storage of historical data with infrequent access, thereby minimizing storage expenses.
  • Stability and reliability: CloudTable HBase provides stable and reliable performance through hotspot diagnosis and self-healing mechanisms.
  • Visualized monitoring and O&M: CloudTable HBase offers visualized monitoring and user-defined alarm rules, simplifying system operation and maintenance.
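
As a rough illustration of why secondary indexes help with non-primary-key queries, the sketch below maintains an index from a column value back to row keys alongside the main table. It is a toy in-memory model under assumed names (`IndexedTable`, `put`, `find_by`) with invented sample data. It does not show how CloudTable implements indexes and is not the HBase API.

```python
from collections import defaultdict


class IndexedTable:
    """Toy table with one secondary index on a chosen column.

    Hypothetical illustration; not how CloudTable implements indexes.
    """

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}                 # row key -> {column: value}
        self.index = defaultdict(set)  # column value -> set of row keys

    def put(self, row_key, columns):
        """Write a row and keep the secondary index in sync."""
        old = self.rows.get(row_key, {})
        new = {**old, **columns}
        old_value = old.get(self.indexed_column)
        new_value = new.get(self.indexed_column)
        if old_value != new_value:
            if old_value is not None:
                self.index[old_value].discard(row_key)
            if new_value is not None:
                self.index[new_value].add(row_key)
        self.rows[row_key] = new

    def find_by(self, value):
        """Query by the indexed column without scanning every row."""
        return sorted(self.index.get(value, ()))


table = IndexedTable(indexed_column="info:city")
table.put("user001", {"info:city": "Shenzhen"})
table.put("user002", {"info:city": "Beijing"})
table.put("user003", {"info:city": "Shenzhen"})
table.put("user002", {"info:city": "Shenzhen"})  # update moves the index entry

print(table.find_by("Shenzhen"))
print(table.find_by("Beijing"))
```

Without the index, answering "which rows have this city" would require scanning every row; with it, the lookup goes straight to the matching row keys, at the cost of extra work on each write.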