Hash Partitioning

Hash partitioning is a method of mapping data to partitions based on the built-in hash algorithm of GaussDB and the partition key. If the partition key value range has no data skew, the hash algorithm can evenly distribute data rows among partitions, ensuring that the sizes of partitions are roughly the same. This is an ideal method for achieving even data distribution among partitions.

Hash partitioning is also an easy-to-use alternative to range partitioning, especially when the data to be partitioned is not historical data or there is no obvious partition key available for partitioning. The following is an example:

gaussdb=# CREATE TABLE bmsql_order_line (
    ol_w_id          INTEGER   NOT NULL,
    ol_d_id          INTEGER   NOT NULL,
    ol_o_id          INTEGER   NOT NULL,
    ol_number        INTEGER   NOT NULL,
    ol_i_id          INTEGER   NOT NULL,
    ol_delivery_d    TIMESTAMP,
    ol_amount        DECIMAL(6,2),
    ol_supply_w_id   INTEGER,
    ol_quantity      INTEGER,
    ol_dist_info     CHAR(24)
)
-- Define 100 partitions.
PARTITION BY HASH(ol_d_id)
(
    PARTITION p0,
    PARTITION p1,
    PARTITION p2,
    …
    PARTITION p99
);
-- Drop the table.
gaussdb=# DROP TABLE bmsql_order_line;

In the preceding example, the ol_d_id column of the bmsql_order_line table serves as the basis for partitioning. The ol_d_id column is an identifier attribute column. It does not have the time characteristics and cannot distinguish data in a specific dimension. In this case, using the hash partitioning policy to partition the table is an ideal choice.

Compared with other partitioning types, hash partitioning requires only the partition key and the number of partitions to be specified. Before creating a hash partition, ensure that the partition key does not have serious data skew (that is, one or more values are highly repeated). Hash partitioning can ensure that data in each partition is evenly distributed, greatly improving the usability of partitioned tables.

Parent topic: Partitioning Policy

Previous topic: Range Partitioning

Next topic: List Partitioning