Help Center/ GaussDB(DWS)/ Best Practices/ Table Optimization Practices/ Step 2: Testing System Performance of the Initial Table and Establishing a Baseline
Updated on 2024-03-13 GMT+08:00

Step 2: Testing System Performance of the Initial Table and Establishing a Baseline

Before and after tuning table structures, test and record the following information to compare differences in system performance:

  • Load time
  • Storage space occupied by tables
  • Query performance

The examples in this practice are based on a dws.d2.xlarge cluster consisting of eight nodes. Because system performance is affected by many factors, clusters of the same flavor may have different results.

Model

dws.d2.xlarge VM

CPU

4*CPU E5-2680 v2 @ 2.80GHZ

Memory

32 GB

Network

1 GB

Disk

1.63 TB

Number of Nodes

8

Record the results using the following benchmark table.

Benchmark

Before

After

Loading time (11 tables)

341584 ms

-

Occupied storage space

Store_Sales

-

-

Date_Dim

-

-

Store

-

-

Item

-

-

Time_Dim

-

-

Promotion

-

-

Customer_Demographics

-

-

Customer_Address

-

-

Household_Demographics

-

-

Customer

-

-

Income_Band

-

-

Total storage space

-

-

Query execution time

Query 1

-

-

Query 2

-

-

Query 3

-

-

Total execution time

-

-

Perform the following steps to test the system performance before tuning to establish a benchmark:

  1. Enter the cumulative load time for all the 11 tables in the benchmarks table in the Before column.
  2. Record the storage space usage of each table.

    Determine how much disk space is used for each table using the pg_size_pretty function and record the results in base tables.

    1
    SELECT T_NAME, PG_SIZE_PRETTY(PG_RELATION_SIZE(t_name)) FROM (VALUES('store_sales'),('date_dim'),('store'),('item'),('time_dim'),('promotion'),('customer_demographics'),('customer_address'),('household_demographics'),('customer'),('income_band')) AS names1(t_name);
    

    The following information is displayed:

             t_name         | pg_size_pretty
    ------------------------+----------------
     store_sales            | 42 GB
     date_dim               | 11 MB
     store                  | 232 kB
     item                   | 110 MB
     time_dim               | 11 MB
     promotion              | 256 kB
     customer_demographics  | 171 MB
     customer_address       | 170 MB
     household_demographics | 504 kB
     customer               | 441 MB
     income_band            | 88 kB
    (11 rows)

  3. Test query performance.

    Run the following queries and record the time spent on each query. The execution durations of the same query can be different, depending on the OS cache during execution. You are advised to perform several rounds of tests and select a group with average values.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    \timing on
    SELECT * FROM (SELECT  COUNT(*)  
    FROM store_sales 
        ,household_demographics  
        ,time_dim, store 
    WHERE ss_sold_time_sk = time_dim.t_time_sk    
        AND ss_hdemo_sk = household_demographics.hd_demo_sk  
        AND ss_store_sk = s_store_sk 
        AND time_dim.t_hour = 8 
        AND time_dim.t_minute >= 30 
        AND household_demographics.hd_dep_count = 5 
        AND store.s_store_name = 'ese' 
    ORDER BY COUNT(*) 
     ) LIMIT 100;
    
    SELECT * FROM (SELECT  i_brand_id brand_id, i_brand brand, i_manufact_id, i_manufact,
    SUM(ss_ext_sales_price) ext_price
     FROM date_dim, store_sales, item,customer,customer_address,store
     WHERE d_date_sk = ss_sold_date_sk
       AND ss_item_sk = i_item_sk
       AND i_manager_id=8
       AND d_moy=11
       AND d_year=1999
       AND ss_customer_sk = c_customer_sk 
       AND c_current_addr_sk = ca_address_sk
       AND substr(ca_zip,1,5) <> substr(s_zip,1,5) 
       AND ss_store_sk = s_store_sk 
     GROUP BY i_brand
          ,i_brand_id
          ,i_manufact_id
          ,i_manufact
     ORDER BY ext_price desc
             ,i_brand
             ,i_brand_id
             ,i_manufact_id
             ,i_manufact
     ) LIMIT 100;
    
    SELECT * FROM (SELECT  s_store_name, s_store_id,
            SUM(CASE WHEN (d_day_name='Sunday') THEN ss_sales_price ELSE null END) sun_sales,
            SUM(CASE WHEN (d_day_name='Monday') THEN ss_sales_price ELSE null END) mon_sales,
            SUM(CASE WHEN (d_day_name='Tuesday') THEN ss_sales_price ELSE  null END) tue_sales,
            SUM(CASE WHEN (d_day_name='Wednesday') THEN ss_sales_price ELSE null END) wed_sales,
            SUM(CASE WHEN (d_day_name='Thursday') THEN ss_sales_price ELSE null END) thu_sales,
            SUM(CASE WHEN (d_day_name='Friday') THEN ss_sales_price ELSE null END) fri_sales,
            SUM(CASE WHEN (d_day_name='Saturday') THEN ss_sales_price ELSE null END) sat_sales
     FROM date_dim, store_sales, store
     WHERE d_date_sk = ss_sold_date_sk AND
           s_store_sk = ss_store_sk AND
           s_gmt_offset = -5 AND
           d_year = 2000 
     GROUP BY s_store_name, s_store_id
     ORDER BY s_store_name, s_store_id,sun_sales,mon_sales,tue_sales,wed_sales,thu_sales,fri_sales,sat_sales
      ) LIMIT 100;
    

After the preceding statistics are collected, the benchmark table is as follows:

Benchmark

Before

After

Loading time (11 tables)

341584 ms

-

Occupied storage space

Store_Sales

42 GB

-

Date_Dim

11 MB

-

Store

232 KB

-

Item

110 MB

-

Time_Dim

11 MB

-

Promotion

256 KB

-

Customer_Demographics

171 MB

-

Customer_Address

170 MB

-

Household_Demographics

504 KB

-

Customer

441 MB

-

Income_Band

88 KB

-

Total storage space

42 GB

-

Query execution time

Query 1

14552.05 ms

-

Query 2

27952.36 ms

-

Query 3

17721.15 ms

-

Total execution time

60225.56 ms

-