Help Center/ GaussDB(DWS)/ Performance Tuning/ Optimization Cases/ Case: Adding NOT NULL for JOIN Columns
Updated on 2023-03-30 GMT+08:00

Case: Adding NOT NULL for JOIN Columns

Symptom

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
SELECT
 * 
FROM
( ( SELECT
  STARTTIME STTIME,
  SUM(NVL(PAGE_DELAY_MSEL,0)) PAGE_DELAY_MSEL,
  SUM(NVL(PAGE_SUCCEED_TIMES,0)) PAGE_SUCCEED_TIMES,
  SUM(NVL(FST_PAGE_REQ_NUM,0)) FST_PAGE_REQ_NUM,
  SUM(NVL(PAGE_AVG_SIZE,0)) PAGE_AVG_SIZE,
  SUM(NVL(FST_PAGE_ACK_NUM,0)) FST_PAGE_ACK_NUM,
  SUM(NVL(DATATRANS_DW_DURATION,0)) DATATRANS_DW_DURATION,
  SUM(NVL(PAGE_SR_DELAY_MSEL,0)) PAGE_SR_DELAY_MSEL 
 FROM
  PS.SDR_WEB_BSCRNC_1DAY SDR
  INNER JOIN (SELECT
      BSCRNC_ID,
      BSCRNC_NAME,
      ACCESS_TYPE,
      ACCESS_TYPE_ID 
     FROM
      nethouse.DIM_LOC_BSCRNC 
     GROUP BY
      BSCRNC_ID,
      BSCRNC_NAME,
      ACCESS_TYPE,
      ACCESS_TYPE_ID) DIM 
  ON SDR.BSCRNC_ID = DIM.BSCRNC_ID 
  AND DIM.ACCESS_TYPE_ID IN (0,1,2) 
  INNER JOIN nethouse.DIM_RAT_MAPPING RAT 
  ON (RAT.RAT = SDR.RAT)
 WHERE
  ( (STARTTIME >= 1461340800 
  AND STARTTIME < 1461427200) ) 
  AND RAT.ACCESS_TYPE_ID IN (0,1,2) 
  --and SDR.BSCRNC_ID is not null
 GROUP BY
  STTIME ) ) ;

Figure 1 shows the execution plan.

Figure 1 Adding NOT NULL for JOIN columns (1)

Optimization Analysis

  1. As shown in Figure 1, the sequential scan phase is time consuming.
  2. The JOIN performance is poor because a large number of null values exist in the JOIN column BSCRNC_ID of the PS.SDR_WEB_BSCRNC_1DAY table.

    Therefore, you are advised to manually add NOT NULL for JOIN columns in the statement, as shown below:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    SELECT
     * 
    FROM
    ( ( SELECT
      STARTTIME STTIME,
      SUM(NVL(PAGE_DELAY_MSEL,0)) PAGE_DELAY_MSEL,
      SUM(NVL(PAGE_SUCCEED_TIMES,0)) PAGE_SUCCEED_TIMES,
      SUM(NVL(FST_PAGE_REQ_NUM,0)) FST_PAGE_REQ_NUM,
      SUM(NVL(PAGE_AVG_SIZE,0)) PAGE_AVG_SIZE,
      SUM(NVL(FST_PAGE_ACK_NUM,0)) FST_PAGE_ACK_NUM,
      SUM(NVL(DATATRANS_DW_DURATION,0)) DATATRANS_DW_DURATION,
      SUM(NVL(PAGE_SR_DELAY_MSEL,0)) PAGE_SR_DELAY_MSEL 
     FROM
      PS.SDR_WEB_BSCRNC_1DAY SDR
      INNER JOIN (SELECT
          BSCRNC_ID,
          BSCRNC_NAME,
          ACCESS_TYPE,
          ACCESS_TYPE_ID 
         FROM
          nethouse.DIM_LOC_BSCRNC 
         GROUP BY
          BSCRNC_ID,
          BSCRNC_NAME,
          ACCESS_TYPE,
          ACCESS_TYPE_ID) DIM 
      ON SDR.BSCRNC_ID = DIM.BSCRNC_ID 
      AND DIM.ACCESS_TYPE_ID IN (0,1,2) 
      INNER JOIN nethouse.DIM_RAT_MAPPING RAT 
      ON (RAT.RAT = SDR.RAT)
     WHERE
      ( (STARTTIME >= 1461340800 
      AND STARTTIME < 1461427200) ) 
      AND RAT.ACCESS_TYPE_ID IN (0,1,2) 
      and SDR.BSCRNC_ID is not null
     GROUP BY
      STTIME ) ) A;
    

    The following figure shows the execution plan.

    Figure 2 Adding NOT NULL for JOIN columns (2)