Using Hybrid Row-Column Storage

Scenarios

In real-world service scenarios, a table must efficiently handle both high-concurrency point queries and updates (for example, for orders, bills, and logs) and support fast batch queries and aggregation for analytical workloads (for example, statistical reports and trend analysis). Traditional row storage and column storage have their own advantages:

Row storage is suitable for real-time scenarios with frequent point queries and writes, for example, querying order details by order number.
Column storage is suitable for batch analysis queries, such as calculating the total transaction amount within a period of time.

Hybrid row-column storage provided by DWS stores both row- and column-format data in one table. The data of the two formats is independently maintained and synchronously updated. The query optimizer selects the optimal access mode based on the actual query path.

Its core advantages are as follows:

No need to split tables or copy data: One table supports both detailed query and batch analysis.
Automatic selection of the optimal path: Row storage is used for point queries, and column storage is used for querying aggregation fields No additional development is required.
Optimal performance: OLAP query efficiency is ensured without sacrificing OLTP performance.
Compatible with original row-store table design: You can quickly enjoy the performance benefits of column-store and row-store optimizations without adjusting the application structure.
Unified management and reduced O&M costs: One physical table covers two scenarios, simplifying data synchronization, backup, and permission control.

This section describes how to design and use hybrid row-column storage structures based on typical scenarios and performance comparison, helping you process data efficiently.

Comparison of Row Storage, Column Storage, Hybrid Row-Column Storage

**Table 1** Comparison of table storage modes
Dimension	Row Storage (orientation='row')	Column Storage (HStore Opt tables)	Hybrid Row-Column Storage (storage_mode='mix')
Storage architecture	Native row storage (not based on column storage)	HStore Opt column storage	Coexistence of the row storage and HStore Opt column storage
Point query (primary key)	Excellent (5)	Poor (2)	Good (4)
Batch import performance	Poor (2)	Excellent (5)	Good (4)
Real-time import performance	Excellent (5)	Excellent (5)	Excellent (5)
Aggregation/Analysis performance	Very Poor (1)	Excellent (5)	Excellent (5)
Occupied Space	High (1)	Very low (5)	Normal (3)
Space bloat	Excellent (5)	Excellent (5)	Good (4)
DWS column-store special optimization	Not supported	Supported	Supported

Constraints

Only clusters of version 9.1.1.100 or later support this function.
HStore tables are used, that is, enable_hstore_opt is set to on.
This function is unavailable for V3 tables with decoupled storage and compute.
The enable_light_update option of HStore tables cannot be enabled.
Binlog tables and materialized views are not supported.
When a small number of data is copied to a database in real time, the performance of hybrid row-column storage deteriorates by 10% compared with that of the HStore table.
A hybrid row-column store table can be exchanged only with another hybrid row-column store table. After capacity expansion and redistribution, the exchange is not allowed.

Usage Suggestions

In only OLTP scenarios (such as high-frequency point queries and writes), if the space usage is sensitive, hybrid row-column store tables are recommended to balance space usage and point query performance. If the space usage is not sensitive and robust performance is required, row-store tables are recommended.
In only OLAP scenarios (such as statistics analysis and report), HStore tables are recommended.
In both detailed and statistical queries where hot and cold columns are difficult to distinguish, hybrid row-column store tables are recommended to achieve excellent performance in each scenario.
For real-time data import, PBE addBatch is recommended.
You are advised not to modify column definitions unless necessary. If data rewriting is triggered, the row-store part of a hybrid row-column store table will be rewritten, which is time-consuming.

Syntax Reference

Create a hybrid row-column store table. For more information, see CREATE TABLE. (Only 9.1.1.100 and later versions support this function.)

     
          CREATE TABLE < table name> (
   <Column definition>
)
WITH (
orientation = column,          -- Use the column storage architecture.
enable_hstore_opt = on,         -- Enable hstore_opt.
    storage_mode = 'mix'           --Create a hybrid row-column store table.
);

Example:

-- Hybrid row-column storage stores data in both row and column formats. It is suitable for mixed services of detailed analysis and analysis.
CREATE TABLE tbl_mix (
    a INT,
    b TEXT
)
WITH (
    orientation = column,
    enable_hstore_opt = on,
    storage_mode = 'mix'
);

Example of Using Hybrid Row-Column Storage

Create a simple data table.

       
            DROP TABLE IF EXISTS data;
CREATE TABLE data(a INT, b BIGINT, c VARCHAR(10), d VARCHAR(10));
INSERT INTO data values(generate_series(1,100),1,'asdfasdf','gergqer');

Expand the table data to 200,000 records.

       
            INSERT INTO data SELECT * FROM data;
INSERT INTO data SELECT * FROM data;
INSERT INTO data SELECT * FROM data;
INSERT INTO data SELECT * FROM data;
INSERT INTO data SELECT * FROM data;
INSERT INTO data SELECT * FROM data;
INSERT INTO data SELECT * FROM data;
INSERT INTO data SELECT * FROM data;
INSERT INTO data SELECT * FROM data;
INSERT INTO data SELECT * FROM data;
INSERT INTO data SELECT * FROM data;
SELECT COUNT(*) FROM data;

Click to enlarge

Create a simple hybrid row-column store table.

       
            DROP TABLE IF EXISTS rowmode_test1;
CREATE TABLE rowmode_test1 (
    a INT,
    b BIGINT,
    c VARCHAR(10),
    d VARCHAR(10)
)
WITH (
    orientation = column,
    enable_hstore_opt = on,
    storage_mode = 'mix'
);

Import data to the hybrid row-column store table.

       
            INSERT INTO rowmode_test1 SELECT * FROM data;

Parent Topic: Data Development

Previous topic: Converting a Time Series Table to an HStore Table

Next topic: Using Bitmap Indexes to Reduce Space and Write Costs in Point Queries

Feedback

Was this page helpful?

Helpful Not helpful

Provide feedback

Thank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.

The system is busy. Please try again later.

Which of the following issues have you encountered?

Content is inconsistent with the product UI

Unclear descriptions

Lack of examples or code

Incorrect steps

Can't find what I need

Lack of best practices

Feedback (optional)

0/500

Select at least one type of issue, and enter your comments or suggestions.

Enter a maximum of 500 characters.

Submit Cancel

For any further questions, feel free to contact us through the chatbot.

Chatbot