Updated on 2023-10-23 GMT+08:00

Step 4: Importing Data to GaussDB

  1. Run the following statements to create the target table product_info in GaussDB to store imported data:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    openGauss=# DROP TABLE IF EXISTS product_info;
    openGauss=# CREATE TABLE product_info
    (
        product_price                integer        not null,
        product_id                   char(30)       not null,
        product_time                 date           ,
        product_level                char(10)       ,
        product_name                 varchar(200)   ,
        product_type1                varchar(20)    ,
        product_type2                char(10)       ,
        product_monthly_sales_cnt    integer        ,
        product_comment_time         date           ,
        product_comment_num          integer        ,
        product_comment_content      varchar(200)                   
    ) 
    WITH (
    orientation = column,
    compression=middle
    ) 
    DISTRIBUTE BY hash (product_id);
    

  2. (Optional) This step is not required in this example because no index is created in 1. If the target table has indexes, the index information will be incrementally updated during import, affecting data import performance. You are advised to delete the indexes from the target table before the import. You can create the indexes again after the import is complete.

    1. Assume that there is an ordinary index product_idx in the product_id column of the target table product_info. Delete the index in the table.
      1
      openGauss=# DROP INDEX product_idx;
      
    2. After importing the data, create the index again.
      1
      openGauss=# CREATE INDEX product_idx ON product_info(product_id);
      
    3. Set enable_stream_operator to on.
      1
      openGauss=# set enable_stream_operator=on;;
      
    • You can temporarily add the GUC parameter maintenance_work_mem or psort_work_mem to accelerate index recreation.
    • To import foreign tables in parallel, you must enable the stream operator.
    • If enable_stream_operator is set to on, the performance is affected. If there are other SQL statements to be executed in the session, you are advised to set enable_stream_operator to off. If there is no SQL statement to be executed in the session, disconnect the session.

  3. Import data from source data files to the product_info table through the foreign table product_info_ext.

    1
    openGauss=# INSERT INTO product_info SELECT * FROM product_info_ext ;
    
    If information similar to the following is displayed, the data has been imported:
    1
    INSERT 0 20
    

  4. Run SELECT to view the data imported to the target table product_info in GaussDB.

    1
    openGauss=# SELECT count(*) FROM product_info;
    

    If the following information is displayed, the import is successful:

    1
    2
    3
    4
    count 
    -------
         20
    (1 row)