Updated on 2025-10-29 GMT+08:00

Designing a Primary Key

Data access efficiency in HBase depends on a proper partitioning policy. HBase sorts data across partitions in lexicographic order of the primary key, so a poorly designed primary key can cause data hotspots, that is, a large amount of data concentrating on a few nodes and degrading system performance. Primary key prefixes can distribute data evenly to avoid hotspots and facilitate data access.

Scenarios

If you use auto-increment integers as primary keys in relational databases and directly apply the same primary key design to GeminiDB HBase API, data may become unevenly distributed across partitions as data volumes increase.

Assume that the primary key grows to cover integers 0 to 1999999 as the data volume increases. If only partition keys [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] are set when you pre-partition a GeminiDB HBase instance, the partition covering the lexicographic range [1, 2) (partition 2) will store far more data than the others: all rows with primary keys 1000000-1999999 (prefix 1) fall into it, making it a hotspot partition.

Although GeminiDB HBase API automatically splits a partition when its data volume reaches a threshold, the split takes time and cannot meet the performance requirements of real-time, high-throughput requests. Instead of directly reusing the auto-increment integer primary keys of a relational database, you are advised to distribute data with a prefix and create pre-partitions. This avoids hotspot partitions and provides more stable performance for high-throughput requests.

Solution

If your ID is a continuously increasing long integer and is used directly as the HBase row key, all new data will be written to the same region server, causing serious hotspot issues. To distribute write traffic evenly across regions, add a bucket prefix or salt prefix before the row key.

First, add a bucket prefix before the actual ID. The bucket prefix is obtained by performing a modulo operation on the ID. For example, divide the ID by 10 and take the remainder to obtain a bucket prefix ranging from 0 to 9. Each piece of data is written with its own bucket prefix, so consecutive IDs are stored in different regions, achieving even distribution. When the HBase table is created, it is pre-partitioned based on the bucket prefix. For example, the table can be divided into 10 or 100 regions, each corresponding to one bucket prefix.
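
For reference, the following sketch shows one way to create such a pre-split table through the standard HBase 2.x Java client. The table name demo_table, the column family cf, and the connection settings are placeholders; split keys 1-9 yield 10 regions, one per bucket prefix 0-9.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitTableExample {
    public static void main(String[] args) throws Exception {
        // Connection parameters (endpoints, authentication) are assumed to be configured elsewhere.
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            // Split keys "1".."9" produce 10 regions: (-inf, 1), [1, 2), ..., [9, +inf),
            // one region per bucket prefix 0-9.
            byte[][] splitKeys = new byte[9][];
            for (int i = 1; i <= 9; i++) {
                splitKeys[i - 1] = Bytes.toBytes(String.valueOf(i));
            }
            admin.createTable(
                TableDescriptorBuilder.newBuilder(TableName.valueOf("demo_table"))
                    .setColumnFamily(ColumnFamilyDescriptorBuilder.of("cf"))
                    .build(),
                splitKeys);
        }
    }
}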

You can design the primary key in the format [Bucket prefix]|[ID] or [Bucket prefix]|[Reversed ID], where the bucket prefix is the remainder of the ID divided by 10. For example, the remainder of 1420008 divided by 10 is 8, so the new primary key is 8|1420008. The data then falls into partition 8 instead of partition 1. When you query primary key 1420008, pass 8|1420008 for the single-point query.
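
As a minimal sketch (class and method names are illustrative only), the prefix and composite key can be built as follows:

public class RowKeyUtil {
    // Build a salted row key in the format [Bucket prefix]|[ID], assuming 10 buckets.
    public static String buildRowKey(long id) {
        long bucket = id % 10;      // e.g., 1420008 % 10 = 8
        return bucket + "|" + id;   // e.g., "8|1420008"
    }

    public static void main(String[] args) {
        System.out.println(buildRowKey(1420008L)); // prints 8|1420008
    }
}

Reusing the same method on the query side ensures that reads and writes always derive identical keys.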

To recap, directly using a continuously increasing long integer ID as the HBase row key concentrates all new writes on one region server, creating a write hotspot and a performance bottleneck. Adding a bucket prefix (salt) before the row key scatters the data so that write traffic is evenly distributed across regions. The detailed design is as follows:

  1. Generation logic of a bucket prefix

    A bucket prefix is generated by a modulo operation on the original ID, so the prefix value range is bounded and evenly distributed. If data is divided into 10 buckets, the remainder of the original ID divided by 10 is an integer from 0 to 9. For example, if the original ID is 1420008, the remainder of 1420008 divided by 10 is 8, which is the bucket prefix. When each piece of data is written, the prefix is added before the original ID to form the row key. For example, the new row key of 1420008 is 8|1420008.

  2. Evenly distributing traffic in pre-partitions

    You need to plan pre-partitions based on the bucket prefix during HBase table creation.

    • If 10 buckets (prefixes 0–9) are used, create 10 regions in advance. Each region corresponds to a bucket prefix, for example, prefix 0 in region 1, prefix 1 in region 2, and prefix 9 in region 10.
    • As the data volume grows, you can create 100 buckets (prefixes 0–99) and 100 regions.

      In this way, row keys with different prefixes are accurately allocated to their corresponding regions (for example, 8|1420008 is allocated to the region with prefix 8), and write traffic is evenly distributed across regions.

  3. Row key format and query adaptation
    • Recommended row key format

      Use

      [BucketPrefix]|[ID]

      or

      [BucketPrefix]|[Reversed ID]

      The vertical bar (|) is a separator that distinguishes the prefix from the original ID.
      • To further scatter adjacent auto-increment IDs within the same prefix, you can reverse the original ID (for example, reverse 1420008 to 8000241) to form 8|8000241, preventing local hotspots inside a bucket.
      • If you only need single-point queries, use [BucketPrefix]|[Original ID] (for example, 8|1420008).
    • Query adaptation rules

      When querying by the original ID (for example, 1420008), calculate the bucket prefix (8) with the same modulo logic, and then use the complete composite row key (8|1420008) for the single-point query. This locates the data in its region directly and avoids a full table scan, as shown in the sketch after this list.
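
The following sketch puts the reversed-ID key construction and the query adaptation together, assuming the standard HBase Java client with the placeholder table name demo_table; class and method names are illustrative only.

import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class SaltedKeyLookup {
    // Bucket prefix derived from the original ID (10 buckets).
    static long bucketOf(long id) {
        return id % 10;
    }

    // Composite row key [Bucket prefix]|[Reversed ID], e.g. 1420008 -> "8|8000241".
    static String rowKeyOf(long id) {
        String reversed = new StringBuilder(Long.toString(id)).reverse().toString();
        return bucketOf(id) + "|" + reversed;
    }

    // Single-point query: recompute the same composite key from the original ID, then Get.
    static Result lookup(Connection conn, long id) throws IOException {
        try (Table table = conn.getTable(TableName.valueOf("demo_table"))) {
            return table.get(new Get(Bytes.toBytes(rowKeyOf(id))));
        }
    }
}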

The combination of bucket prefixes and pre-partitions can avoid HBase hotspots caused by auto-increment IDs, ensuring even traffic distribution while maintaining I/O efficiency. This method is suitable for high-throughput requests and large data volumes.

To distribute data across more pre-partitions, set the pre-partition keys to [001, 002, ..., 999] to generate 1,000 pre-partitions. Divide the primary key by 1,000, pad the remainder to three digits (for example, 8 becomes 008), and add it before the primary key.
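
A sketch of this 1,000-bucket variant (illustrative names again), where zero-padding the remainder to three digits keeps the lexicographic order aligned with the three-digit pre-partition keys:

public class RowKeyUtil1000 {
    // 1,000-bucket variant of the salted row key.
    public static String buildRowKey(long id) {
        String prefix = String.format("%03d", id % 1000); // e.g., 1420008 % 1000 = 8 -> "008"
        return prefix + "|" + id;                         // e.g., "008|1420008"
    }

    public static void main(String[] args) {
        System.out.println(buildRowKey(1420008L)); // prints 008|1420008
    }
}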