Support and Constraints

A hybrid data warehouse is compatible with all column-store syntax.

**Table 1** Supported syntax
Syntax	Supported
CREATE TABLE	Yes
CREATE TABLE LIKE	Yes
DROP TABLE	Yes
INSERT	Yes
COPY	Yes
SELECT	Yes
TRUNCATE	Yes
EXPLAIN	Yes
ANALYZE	Yes
VACUUM	Yes
ALTER TABLE DROP PARTITION	Yes
ALTER TABLE ADD PARTITION	Yes
ALTER TABLE SET WITH OPTION	Yes
ALTER TABLE DROP COLUMN	Yes
ALTER TABLE ADD COLUMN	Yes
ALTER TABLE ADD NODELIST	Yes
ALTER TABLE CHANGE OWNER	Yes
ALTER TABLE RENAME COLUMN	Yes
ALTER TABLE TRUNCATE PARTITION	Yes
CREATE INDEX	Yes
DROP INDEX	Yes
DELETE	Yes
Other ALTER TABLE syntax	Yes
ALTER INDEX	Yes
MERGE	Yes
SELECT INTO	Yes
UPDATE	Yes
CREATE TABLE AS	Yes
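
As a rough illustration of Table 1, the sketch below creates a table and runs a few of the listed statements against it. The table name, columns, and values are hypothetical. It assumes that enable_hstore_opt (the table attribute named in Table 2) is passed in the WITH clause together with orientation = column. The accepted options may differ by cluster version, so treat this as a sketch rather than a reference.

```sql
-- Hypothetical table used only for illustration.
-- Assumption: enable_hstore_opt is set in the WITH clause at creation time.
CREATE TABLE sales_hstore (
    order_id   int,
    region     varchar(16),
    amount     numeric(12,2),
    created_at timestamp
)
WITH (orientation = column, enable_hstore_opt = on)
DISTRIBUTE BY HASH (order_id);

-- A few of the statements listed as supported in Table 1.
INSERT INTO sales_hstore VALUES (1, 'east', 19.90, '2024-01-01 10:00:00');
UPDATE sales_hstore SET amount = 24.90 WHERE order_id = 1;  -- does not touch the distribution column
DELETE FROM sales_hstore WHERE order_id = 1;
ANALYZE sales_hstore;
```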

Constraints

HStore tables require the following parameter settings. Without them, the performance of HStore tables deteriorates significantly:
In clusters of version 9.1.1.100 and later, the recommended settings are autovacuum_max_workers_hstore=3, autovacuum_max_workers=2, autovacuum_max_workers_col=2, and autovacuum=true.
In version 8.2.1 and later, dirty data can be cleared from column-store indexes. This is especially beneficial when data is frequently updated and imported into the database. Reclaiming index space improves both import and query performance.
In clusters of version 9.1.1.100, dirty page clearing in HStore tables depends on the column-store vacuum and asynchronous sorting mechanisms. By default, a CU with fewer than 1,000 rows is regarded as a small CU. Small CUs on dirty pages are merged through asynchronous sorting, and non-small CUs are rewritten to new files through column-store vacuum. Column-store vacuum is controlled by the colvacuum_threshold_scale_factor parameter.
When using HStore asynchronous sorting, note the following:
DML operations on the data being sorted may be blocked during asynchronous sorting. The maximum blocking granularity is the row threshold for asynchronous sorting. Asynchronous sorting is therefore not recommended for workloads with frequent DML operations.
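
The current values of the parameters named above can be inspected from a SQL session with SHOW. This is a read-only check. How the values are changed depends on the deployment and is not covered here.

```sql
-- Read-only check of the parameters discussed in this section.
-- The values in the comments are the recommendations for version 9.1.1.100 and later.
SHOW autovacuum;                        -- recommended: true (may be displayed as "on")
SHOW autovacuum_max_workers;            -- recommended: 2
SHOW autovacuum_max_workers_col;        -- recommended: 2
SHOW autovacuum_max_workers_hstore;     -- recommended: 3
SHOW colvacuum_threshold_scale_factor;  -- controls column-store vacuum
```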

Differences Between Column-Store Tables and Delta Tables

**Table 2** Differences between the delta tables of HStore and column-store tables
Data Warehouse Type	Column-Store Delta Table	HStore Delta Table	HStore Opt Delta Table
Table structure	Same as that defined for the column-store primary table	Different from that defined for the primary table	Different from that defined for the primary table but same as that defined for the HStore table
Function	Used to temporarily store a small batch of inserted data. After the data size reaches the threshold, the data is merged into the primary table. In this way, data is not inserted directly into the primary table and does not generate a large number of small CUs.	Used to persistently store UPDATE, DELETE, and INSERT information. It is used to restore the memory structure that manages concurrent updates, such as the memory update chain, in the case of a fault.	Used to persistently store UPDATE, DELETE, and INSERT information. It is used to restore the memory structure that manages concurrent updates, such as the memory update chain, in the case of a fault. It is further optimized compared with HStore.
Weakness	If data is not merged in a timely manner, the delta table will grow large and affect query performance. In addition, the table cannot solve lock conflicts during concurrent updates.	The merge operation depends on the background AUTOVACUUM.	The merge operation depends on the background AUTOVACUUM.
Specification differences	Concurrent requests in the same CU are not supported. It is applicable to scenarios where there are not many concurrent updates.	Insertion and update restrictions: MERGE INTO does not support concurrent updates of the same row or repeated updates of the same key. Concurrent UPDATE or DELETE operations on the same row are not supported and cause an error to be reported. Index and query restrictions: Indexes do not support array condition filtering, IN expression filtering, partial indexes, or expression indexes. Indexes cannot be invalidated. Table structure and operation restrictions: Ensure that the tables to be exchanged are HStore tables during partition exchange or relfilenode operations. The distribution column cannot be modified using the UPDATE command. You are advised not to modify the partition column using the UPDATE command. (No error is reported, but the performance is poor.)	Insertion and update restrictions: MERGE INTO does not support concurrent updates of the same row or repeated updates of the same key. Concurrent UPDATE or DELETE operations on the same row are not supported and cause an error to be reported. hstore_opt does not support cross-partition upserts. Index and query restrictions: Bitmap indexes are supported. Global dictionaries are supported. bitmap_columns must be specified during table creation and cannot be modified after being set. The opt version does not support transparent parameter transmission during SMP streaming. In multi-table join queries that require partition pruning, avoid using replicated tables or setting query_dop. Table structure and operation restrictions: Distribution columns and partition columns cannot be modified using UPDATE. The enable_hstore_opt attribute must be set when the table is created and cannot be changed after being set.
Data import suggestions	For optimal data import, query performance, and space utilization, the HStore Opt table is recommended. In scenarios involving micro-batch copying with high performance demands and no data updates, you can choose the HStore table. Similarities between HStore and HStore Opt tables: Importing data using UPDATE performs poorly. You are advised to use UPSERT to import data instead. When using DELETE to import data, use index scanning. The JDBC batch method is recommended. Use MERGE INTO to import data when the data volume exceeds 1 million records per DN and there is no concurrent data. Do not modify or add data in cold partitions.
Point query suggestions	Generally, the HStore Opt table is recommended for point queries. Similarities between HStore and HStore Opt tables: Create a level-2 partition on the column where the equal-value filter condition is most frequently used and distinct values are evenly distributed. Suggestions on using HStore tables for point queries: Index acceleration on indexes other than primary keys may have little effect, so you are advised not to enable it. If the data type is numeric or a string shorter than 16 bytes, Turbo acceleration is recommended. Suggestions on using HStore Opt tables: For equal-value filter columns not in level-2 partitions, if the columns involved in the filter criteria are basically fixed in the query, use the CB-tree index. If the columns change continuously, you are advised to use the GIN index. Do not select more than five index columns. For all string columns involved in equal-value filtering, bitmap indexes can be specified during table creation. The number of columns is not limited, but the columns cannot be modified later. Specify columns that can be filtered by time range as the partition columns. If the number of returned data records exceeds 100,000 per DN, index scanning may not significantly enhance performance. In this case, you are advised to use the GUC parameter enable_seqscan to test the performance and then determine which optimization method to use.
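
One possible way to run the enable_seqscan test mentioned in the point query suggestions is to compare the plans and run times of the same query with sequential scans allowed and then discouraged. The table name and filter below are hypothetical placeholders. EXPLAIN ANALYZE executes the query, so run the test on representative data.

```sql
-- Hypothetical point query; substitute your own table and filter.
-- Run 1: the planner may choose a sequential scan.
SET enable_seqscan = on;
EXPLAIN ANALYZE
SELECT * FROM sales_hstore WHERE region = 'east';

-- Run 2: sequential scans are discouraged, so an index scan is preferred when an index exists.
SET enable_seqscan = off;
EXPLAIN ANALYZE
SELECT * FROM sales_hstore WHERE region = 'east';

-- Restore the session default before continuing.
RESET enable_seqscan;
```

Comparing the reported execution times of the two runs indicates whether index scanning actually helps for that result-set size.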

