Help Center/ Data Warehouse Service / Best Practices/ Data Development/ Using Bitmap Indexes to Reduce Space and Write Costs in Point Queries
Updated on 2025-10-10 GMT+08:00

Using Bitmap Indexes to Reduce Space and Write Costs in Point Queries

Scenarios

Point queries on high-selectivity columns such as primary keys, order numbers, and user IDs are common in real-world applications. Indexing these columns is the standard approach to optimize query performance

However, in large data volume, wide table structure, or batch import scenarios, the conventional indexing mechanism gradually exposes the following problems:

  • The index maintenance is costly and the space usage is high.
  • The import or update performance is limited, and the batch write overhead is costly.
  • For columns with many duplicate values, there is a low index hit ratio.

To resolve these problems, DWS provides lightweight column-level bitmap indexes for HStore tables. Such indexes speed up queries and minimize space and write costs. They are especially suitable for optimizing the performance of point queries on columns.

In HStore tables, bitmap indexes, dictionary encoding, and Bloom filters are used based on data distribution in CUs to construct lightweight index structures and compression solutions. Combined with the column-store execution mechanism of DWS, bitmap indexes can effectively improve point queries and significantly reduce the overall I/O cost while maintaining extremely low space overhead.

Bitmap indexes are especially suitable for high-frequency point queries, high-repetition columns s, and mixed hot and cold columns. They improve performance while effectively controlling storage costs.

Constraints

  • This function is implemented based on the hybrid row-column storage and is supported only by 9.1.1.100 and later versions.
  • For details about other constraints on hybrid row-column storage, see Constraints.

Syntax Reference

To use a bitmap index, specify the bitmap column (using the bitmap_columns parameter) when creating a table. In this way, the bitmap index mapping can be generated.

Create a hybrid row-column store table and specify the bitmap column. For more information, see CREATE TABLE. (Only 9.1.1.100 and later versions support this function.)

1
2
3
4
5
6
7
8
9
CREATE TABLE < table name> (
   <Column definition>
)
WITH (
orientation = column,          -- Use the column storage architecture.
enable_hstore_opt = on,         -- Enable hstore_opt.
    storage_mode = 'mix'           --Create a hybrid row-column store table.
bitmap_columns = 'a'            --Specify the bitmap column (for example, column a).
);

Examples

  1. Use the client to connect to DWS and create a data table.

    Wait for about 1 minute because there is a large amount of data.
    1
    2
    3
    4
    5
    6
    7
    8
    DROP TABLE IF EXISTS data;
    CREATE TABLE data(a INT, b BIGINT, c VARCHAR(10), d VARCHAR(10));
    INSERT INTO data values(generate_series(1,60000),1,'asdfasdf','gergqer');
    INSERT INTO data select * from data;
    INSERT INTO data select * from data;
    INSERT INTO data select * from data;
    INSERT INTO data select * from data;
    INSERT INTO data select * from data;
    

  2. Create a simple hybrid row-column store table.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    DROP TABLE IF EXISTS mixmode_test1;
    CREATE TABLE mixmode_test1 (
        a INT,
        b BIGINT,
        c VARCHAR(10),
        d VARCHAR(10)
    )
    WITH (
        orientation = column,
        enable_hstore_opt = on,
        storage_mode = 'mix',
    bitmap_columns = 'a' --Specify column a as the bitmap column.
    );
    

  3. Create a B-tree index.

    1
    CREATE INDEX idx_data_a_btree ON mixmode_test1(a);
    

  4. Import data to the hybrid row-column store table.

    1
    INSERT INTO mixmode_test1 SELECT * FROM data;
    

  5. Use index scan to search for the record where a = 42 and print the execution plan.

    1
    2
    3
    4
    SET enable_seqscan = off; -- Disable sequential scan.
    SET enable_indexscan = on; -- Use index scan.
    SET enable_fast_query_shipping = off; -- The optimizer uses the distributed framework, that is, the execution plan is generated on a CN and then sent to the DNs for execution.
    EXPLAIN PERFORMANCE SELECT * FROM mixmode_test1 WHERE a = 42;
    

    According to the execution plan results, the executor selects index scan.

  6. Use sequential scan to search for records where a = 42 and print the execution plan.

    1
    2
    3
    4
    5
    SET enable_seqscan = on; -- Enable sequential scan.
    SET enable_indexscan = off; -- Disable index scan.
    SET enable_bitmapscan = off; -- Disable bitmap scan.
    SET enable_fast_query_shipping = off; -- The optimizer uses the distributed framework, that is, the execution plan is generated on a CN and then sent to the DNs for execution.
    EXPLAIN PERFORMANCE SELECT * FROM mixmode_test1 WHERE a = 42;
    

    According to the execution plan result, sequential scan is selected in the bitmap_column point query execution plan. The auto index filters out most data.