Updated on 2024-10-09 GMT+08:00

Configuring Separate Storage for HBase Cold and Hot Data

In big data storage scenarios, HBase table data such as order data or monitoring data grows over time. As a business develops, this data can become large in volume yet rarely accessed, so companies may want to move it to more cost-effective storage to reduce costs.

HBase separates cold data from hot data and stores them on different media. Cold data is stored in OBS and hot data is stored in HDFS, reducing storage costs.

This function is supported only by MRS 3.3.0 or later.

  • Read IOPS is lower for data stored in OBS, so OBS is suitable only for data that is queried infrequently.
  • Do not use OBS to serve a large number of concurrent read requests. Exceptions may occur if you do.

Principles

HBase supports separate cold and hot storage of data in the same table. After you configure the time boundary between hot and cold data, HBase determines whether data is hot or cold based on the data timestamp (in milliseconds) and that boundary. New data is written to hot storage and is gradually moved to cold storage over time. You can change the time boundary as needed, and data can be moved from cold storage to hot storage or vice versa.

Figure 1 HBase cold and hot separation principle
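
The principle above can be illustrated with a minimal sketch. It is not HBase's actual implementation: the function name classify, the assumption that the boundary is expressed in seconds, and the rule that data becomes cold once its age exceeds the boundary are all illustrative assumptions.

```python
import time


def classify(cell_timestamp_ms, boundary_seconds, now_ms=None):
    """Return "hot" or "cold" for a cell (hypothetical helper, not an HBase API).

    Assumptions: the boundary is given in seconds, and a cell is cold
    once its age is greater than the boundary.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    age_ms = now_ms - cell_timestamp_ms
    return "cold" if age_ms > boundary_seconds * 1000 else "hot"


now = 1_700_000_000_000             # fixed "current time" in ms for the example
one_hour_old = now - 3_600_000      # written one hour ago
two_days_old = now - 2 * 86_400_000 # written two days ago

print(classify(one_hour_old, 86_400, now))      # boundary of one day
print(classify(two_days_old, 86_400, now))      # boundary of one day
print(classify(two_days_old, 3 * 86_400, now))  # boundary raised to three days
```

With a one-day boundary, the first cell is classified as hot and the second as cold. Raising the boundary to three days classifies the two-day-old cell as hot again, which corresponds to data moving from cold storage back to hot storage when the boundary changes.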

Configuring Separate Storage for HBase Cold and Hot Data

You can modify the HBase configuration on FusionInsight Manager to enable the cold and hot data separation feature so that cold data can be stored in OBS and hot data can be stored in HDFS.

  1. Interconnect the Guardian service with OBS. For details, see the "Interconnecting the Guardian Service with OBS" section.
  2. Log in to FusionInsight Manager, choose Cluster > Services > HBase, and click Configuration. Search for and modify the following parameters:

    • fs.coldFS: Name of the OBS file system that stores cold data, in the format obs://OBS parallel file system name.
    • hbase.fs.hot.cold.enabled: The default value is false. Set this parameter to true.
    • fs.obs.buffer.dir: Set this parameter to the directory of the locally mounted data disk, for example, /srv/BigData/data1/tmp/HBase/obs.

  3. Click Save.
  4. Click Dashboard and click More > Restart Service to restart the HBase service. After the service is restarted, hot-cold data separation is enabled.
  5. Set the cold and hot boundary for the table data. For details, see Cold-Hot Separation Commands.
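
Before saving the parameters in step 2, it can help to sanity-check the values. The following is a minimal sketch only: check_cold_hot_config is a hypothetical helper, not part of HBase or FusionInsight Manager, and the file system name example-pfs is a placeholder. The checks encode only what the steps above state (an obs:// prefix, the value true, and a local directory path).

```python
def check_cold_hot_config(params):
    """Return a list of problems found in the three parameters from step 2.

    Hypothetical helper for illustration; it does not contact the cluster.
    """
    problems = []

    cold_fs = params.get("fs.coldFS", "")
    if not cold_fs.startswith("obs://") or len(cold_fs) <= len("obs://"):
        problems.append("fs.coldFS must look like obs://<file system name>")

    # The default is false, so a missing value counts as disabled.
    if params.get("hbase.fs.hot.cold.enabled", "false").lower() != "true":
        problems.append("hbase.fs.hot.cold.enabled must be set to true")

    buffer_dir = params.get("fs.obs.buffer.dir", "")
    if not buffer_dir.startswith("/"):
        problems.append("fs.obs.buffer.dir must be an absolute local path")

    return problems


example = {
    "fs.coldFS": "obs://example-pfs",  # placeholder file system name
    "hbase.fs.hot.cold.enabled": "true",
    "fs.obs.buffer.dir": "/srv/BigData/data1/tmp/HBase/obs",
}
print(check_cold_hot_config(example))  # prints []
```

An empty list means the three values have the expected shape. It does not confirm that the OBS file system exists or that the Guardian service is interconnected.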