Help Center/ GaussDB(DWS)/ Best Practices/ Import and Export/ Transferring Data Between OBS Buckets and GaussDB(DWS) Clusters
Updated on 2025-07-30 GMT+08:00

Transferring Data Between OBS Buckets and GaussDB(DWS) Clusters

This guide shows how to upload sample data to OBS and access it through an OBS foreign table. You can import OBS data into GaussDB(DWS) or export data from GaussDB(DWS) to OBS.

  • Data can be imported to OBS in TXT, CSV, ORC, PARQUET, CARBONDATA, or JSON formats.
  • Data can be exported from OBS in the following formats:

OBS is an object-based storage service that offers secure, reliable, and cost-effective data storage for various users, websites, enterprises, and developers. You can use OBS Console or OBS Browser to access and manage data stored on OBS from any computer connected to the Internet. For details, see Object Storage Service Documentation.

This practice takes approximately 1 hour. The basic procedure is as follows:

Preparations

Step 1: Prepare OBS Data

  1. Download the data sample file.
  2. Log in to OBS Console and click the dws-demo01 bucket in the bucket list.
  3. Select Objects on the left, click Create Folder, and name the folder obs-dws.
  4. Go to the obs-dws folder, click Upload Object, and upload the sample file downloaded in 1 to the obs-dws folder.
  5. Obtain the OBS endpoint.

    1. Go back to the obs-dwst page and click Overview on the left.
    2. Record the endpoint from the Domain Name Details list, for example, obs.ap-southeast-1.myhuaweicloud.com.

Step 2: Create an OBS Foreign Server and a Foreign Table

A foreign server is a virtual link that helps organize and control external data sources like databases and file systems in a database system or data warehouse. It enables unified access to diverse distributed data and plays a key role in data integration, real-time analysis, and data virtualization.

To access OBS bucket data through a foreign server, set up a foreign OBS server and provide the endpoint, Access Key (AK), and Secret Key (SK) for OBS.

  1. Run the following SQL statement to create a foreign server after connecting to the database.

    Set ADDRESS to the address obtained in 5, and set ACCESS_KEY and SECRET_ACCESS_KEY to the AK and SK obtained in Preparations.
    1
    2
    3
    4
    5
    6
    7
    CREATE SERVER obs_server FOREIGN DATA WRAPPER DFS_FDW 
    OPTIONS ( 
      ADDRESS 'obs.aaaaa.bbbbb.com', 
      ACCESS_KEY 'xxxxxxxxx', 
      SECRET_ACCESS_KEY 'yyyyyyyyyyyyy', 
      TYPE 'OBS'
      );
    

  2. Run the following SQL statement to create a schema named dws_data:

    1
    CREATE SCHEMA dws_data;
    

  3. Switch to the newly created schema and create a foreign table.

    Replace '/obs-demo01/obs-dws/' with your actual OBS path where the data files are stored. Make sure the OBS bucket and the GaussDB(DWS) cluster are in the same region. In this example, the OBS path is the obs-dws folder in the obs-demo01 bucket.

    SERVER obs_server indicates the name of the foreign server created in 1, for example, obs_server.
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    CREATE FOREIGN TABLE dws_data.obs_pq_order
    ( order_idVARCHAR(14)PRIMARY KEY NOT ENFORCED,
    order_channel VARCHAR(32),
    order_timeTIMESTAMP,
    cust_codeVARCHAR(6),
    pay_amountDOUBLE PRECISION,
    real_payDOUBLE PRECISION )
    SERVER obs_server 
    OPTIONS (
    foldername '/obs-demo01/obs-dws/',
    format 'parquet',
    encoding 'utf8' )
    READ ONLY
    DISTRIBUTE BY roundrobin;
    

Step 3: Access and Import OBS Bucket Data to a GaussDB(DWS) Cluster

  1. Use the foreign table created in the preceding steps to directly access data in the OBS bucket.

    1
    SELECT * FROM dws_data.obs_pq_order;
    

  2. Execute the SELECT statement with conditions through a foreign table.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    SELECT COUNT(*) FROM dws_data.obs_pq_order;
    
    SELECT order_id, order_channel, order_time, cust_code FROM dws_data.obs_pq_order;
    
    SELECT TO_CHAR(order_time, 'Month, YYYY') AS order_month, cust_code, COUNT(*) AS order_cnt
    FROM dws_data.obs_pq_order
    WHERE DATE_PART('Year', order_time) = 2023
    GROUP BY TO_CHAR(order_time, 'Month, YYYY'), cust_code
    HAVING COUNT(*) >= 10;
    

  3. Create a local table to import data to a GaussDB(DWS) cluster through an OBS foreign table.

    1
    2
    3
    4
    5
    6
    CREATE TABLE dws_data.dws_monthly_order
    ( order_monthCHAR(8),
    cust_codeVARCHAR(6),
    order_countINT,
    total_pay_amountDOUBLE PRECISION,
    total_real_payDOUBLE PRECISION );
    

  4. Use the OBS foreign table data to calculate the monthly order details of 2023 and import the result to the local table of the GaussDB(DWS) cluster.

    1
    2
    3
    4
    5
    6
    7
    8
    INSERT INTO dws_data.dws_monthly_order
    ( order_month, cust_code, order_count
    , total_pay_amount, total_real_pay )
    SELECT TO_CHAR(order_time, 'MON-YYYY'), cust_code, COUNT(*)
    , SUM(pay_amount), SUM(real_pay)
    FROM dws_data.obs_pq_order
    WHERE DATE_PART('Year', order_time) = 2023
    GROUP BY TO_CHAR(order_time, 'MON-YYYY'), cust_code;
    

  5. Check the table data import status.

    1
    SELECT * FROM dws_data.dws_monthly_order;
    

Step 4: Export Data from a GaussDB(DWS) Table to an OBS Bucket

  1. Create a local table.

    1
    2
    3
    4
    5
    6
    7
    CREATE TABLE dws_data.dws_order
    ( order_idVARCHAR(14)PRIMARY KEY,
    order_channel VARCHAR(32),
    order_timeTIMESTAMP,
    cust_codeVARCHAR(6),
    pay_amountDOUBLE PRECISION,
    real_payDOUBLE PRECISION );
    

  2. Insert three data records.

    1
    2
    3
    4
    INSERT INTO dws_data.dws_order 
    VALUES ('20230627000001', 'webShop', TIMESTAMP '2023-06-27 10:00:00', 'CUST1', 1000, 1000)
    , ('20230627000002', 'webShop', TIMESTAMP '2023-06-27 11:00:00', 'CUST2', 5000, 5000)
    , ('20240309000003', 'webShop', TIMESTAMP '2024-03-09 13:00:00', 'CUST1', 2000, 2000);
    

  3. Create a foreign table for exporting data to an OBS bucket.

    Replace '/obs-demo01/obs-dws/' with your actual OBS bucket path.
     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    CREATE FOREIGN TABLE dws_data.obs_orc_order
    ( order_idVARCHAR(14)PRIMARY KEY NOT ENFORCED,
    order_channel VARCHAR(32),
    order_timeTIMESTAMP,
    cust_codeVARCHAR(6),
    pay_amountDOUBLE PRECISION,
    real_payDOUBLE PRECISION )
    SERVER obs_server 
    OPTIONS (
    foldername '/obs-demo01/obs-dws/',
    format 'ORC',
    encoding 'utf8' )
    WRITE ONLY
    DISTRIBUTE BY roundrobin;
    

  4. Write the local table data to the OBS foreign table.

    1
    2
    3
    INSERT INTO dws_data.obs_orc_order ( order_id, order_channel, order_time, cust_code, pay_amount, real_pay )
    SELECT order_id, order_channel, order_time, cust_code, pay_amount, real_pay
      FROM dws_data.dws_order;
    

  5. Query whether the data import is successful.

    1
    SELECT * FROM dws_data.obs_orc_order;
    

  6. Log in to the OBS console and check whether the data file exists in the path of the OBS bucket.
  7. Check whether extra data can be inserted into the same foreign table.

    1
    2
    3
    INSERT INTO dws_data.obs_orc_order ( order_id, order_channel, order_time, cust_code, pay_amount, real_pay )
    SELECT order_id, order_channel, order_time, cust_code, pay_amount, real_pay
      FROM dws_data.dws_order;
    

    The error "the file path specified in the foreign table is not empty" appears. If you delete the data file in the OBS file path, you can insert data again.

  8. Check whether foreign table data can be updated and deleted.

    1
    2
    UPDATE dws_data.obs_orc_order  SET pay_amount = 3000, real_pay = 3000 WHERE order_id   = '20240309000003';
    DELETE FROM dws_data.obs_orc_order WHERE order_id = '20240309000003';
    

    According to the command output, foreign table data cannot be updated or deleted.