Step 3: Selecting Storage and Compression Modes
Selecting a Storage Mode
GaussDB(DWS) supports hybrid row-column storage. You can create row- or column-store tables as needed in your business scenarios.
Generally, if a table contains many columns (called a wide table) and its query involves only a few columns, column storage is recommended. If a table contains only a few columns and a query includes most of the fields, row storage is recommended.
| Storage Mode | Scenario |
|---|---|
| Row storage | Point query returned with a few records. This is a simple index-based query. In this scenario, data is frequently added, modified, and deleted. |
| Column storage | Statistics analysis queries where tables are frequently grouped and joined. Ad hoc queries where query conditions are uncertain and no index can be determined for row-store tables. |
Sample tables used in this tutorial are typical multi-column TPC-DS tables where many statistical analysis queries are performed. Therefore, the column storage mode is recommended.
1 | WITH (ORIENTATION = column)
|
Selecting a Compression Level
In scenarios where I/O is large (much data is read and written) and CPU is sufficient (little data is computed), select a high compression ratio. In scenarios where I/O is small and CPU is insufficient, select a low compression ratio. Based on this principle, you are advised to select different compression ratios and test and compare the results to select the optimal compression ratio as required. Specify a compressions ratio using the COMPRESSION parameter. The supported values are as follows:
- The valid value of column-store tables is YES, NO, LOW, MIDDLE, or HIGH, and the default value is LOW.
- The valid values of row-store tables are YES and NO, and the default is NO.
The service scenarios applicable to each compression level are described in the following table.
| Compression Level | Application Scenario |
|---|---|
| LOW | The system CPU usage is high and the disk storage space is sufficient. |
| MIDDLE | The system CPU usage is moderate and the disk storage space is insufficient. |
| HIGH | The system CPU usage is low and the disk storage space is insufficient. |
No compression ratio is specified in Step 1: Creating an Initial Table and Loading Sample Data, and the low compression ratio is selected by GaussDB(DWS) by default. Specify COMPRESSION to MIDDLE, and compare the result to that when COMPRESSION is set to LOW.
The following is an example of selecting a storage mode and the MIDDLE compression ratio for a table.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | CREATE TABLE store_sales
(
ss_sold_date_sk integer ,
ss_sold_time_sk integer ,
ss_item_sk integer not null,
ss_customer_sk integer ,
ss_cdemo_sk integer ,
ss_hdemo_sk integer ,
ss_addr_sk integer ,
ss_store_sk integer ,
ss_promo_sk integer ,
ss_ticket_number bigint not null,
ss_quantity integer ,
ss_wholesale_cost decimal(7,2) ,
ss_list_price decimal(7,2) ,
ss_sales_price decimal(7,2) ,
ss_ext_discount_amt decimal(7,2) ,
ss_ext_sales_price decimal(7,2) ,
ss_ext_wholesale_cost decimal(7,2) ,
ss_ext_list_price decimal(7,2) ,
ss_ext_tax decimal(7,2) ,
ss_coupon_amt decimal(7,2) ,
ss_net_paid decimal(7,2) ,
ss_net_paid_inc_tax decimal(7,2) ,
ss_net_profit decimal(7,2)
)
WITH (ORIENTATION = column,COMPRESSION=middle);
|
Did this article solve your problem?
Thank you for your score!Your feedback would help us improve the website.