Step 1: Uploading Data to OBS

Updated at: Jul 15, 2020 GMT+08:00

Before importing data from OBS to a cluster, prepare the source data files and upload them to OBS. If the data files are already stored on OBS, perform only step 2 in Uploading Data to OBS.

Preparing Data Files

You can import data files in TEXT, CSV, ORC, or CARBONDATA format to DWS. This tutorial uses data in CSV format as an example. The method is the same for TEXT and FIXED data except that the parameter settings of foreign tables are different. For details, see About Parallel Data Import from OBS.

To demonstrate how to import multiple files, this tutorial uses the following three CSV data files as an example. Generally, the source data files are exported from a database. In this tutorial, the CSV source data files are manually created.

  • Data file product_info0.csv

    The file contains the following data:

    100,XHDK-A,2017-09-01,A,2017 Shirt Women,red,M,328,2017-09-04,715,good!
    205,KDKE-B,2017-09-01,A,2017 T-shirt Women,pink,L,584,2017-09-05,40,very good!
    300,JODL-X,2017-09-01,A,2017 T-shirt men,red,XL,15,2017-09-03,502,Bad.
    310,QQPX-R,2017-09-02,B,2017 jacket women,red,L,411,2017-09-05,436,It's nice.
    150,ABEF-C,2017-09-03,B,2017 Jeans Women,blue,M,123,2017-09-06,120,good.
  • Data file product_info1.csv

    The file contains the following data:

    200,BCQP-E,2017-09-04,B,2017 casual pants men,black,L,997,2017-09-10,301,good quality.
    250,EABE-D,2017-09-10,A,2017 dress women,black,S,841,2017-09-15,299,This dress fits well.
    108,CDXK-F,2017-09-11,A,2017 dress women,red,M,85,2017-09-14,22,It's really amazing to buy.
    450,MMCE-H,2017-09-11,A,2017 jacket women,white,M,114,2017-09-14,22,very good.
    260,OCDA-G,2017-09-12,B,2017 woolen coat women,red,L,2004,2017-09-15,826,Very comfortable.
  • Data file product_info2.csv

    The file contains the following data:

    980,"ZKDS-J",2017-09-13,"B","2017 Women's Cotton Clothing","red","M",112,,,
    98,"FKQB-I",2017-09-15,"B","2017 new shoes men","red","M",4345,2017-09-18,5473
    50,"DMQY-K",2017-09-21,"A","2017 pants men","red","37",28,2017-09-25,58,"good","good","good"
    80,"GKLW-l",2017-09-22,"A","2017 Jeans Men","red","39",58,2017-09-25,72,"Very comfortable."
    30,"HWEC-L",2017-09-23,"A","2017 shoes women","red","M",403,2017-09-26,607,"good!"
    40,"IQPD-M",2017-09-24,"B","2017 new pants Women","red","M",35,2017-09-27,52,"very good."
    50,"LPEC-N",2017-09-25,"B","2017 dress Women","red","M",29,2017-09-28,47,"not good at all."
    60,"NQAB-O",2017-09-26,"B","2017 jacket women","red","S",69,2017-09-29,70,"It's beautiful."
    70,"HWNB-P",2017-09-27,"B","2017 jacket women","red","L",30,2017-09-30,55,"I like it so much"
    80,"JKHU-Q",2017-09-29,"C","2017 T-shirt","red","M",90,2017-10-02,82,"very good."

CSV is short for comma-separated values. A .csv file is a plain-text file, similar to a .txt file, in which each line is a record whose columns are separated by commas (,) or tabs. The column sequence in each record is the same. In Windows, .csv files can be opened in different applications, such as Notepad, Excel, and Notepad++.
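As a rough illustration of how records break into columns, the following sketch parses two records copied from the sample files with Python's standard csv module. The helper name parse_records is hypothetical and is not part of the import procedure itself.

```python
import csv
import io

# Two records copied from the sample data files in this tutorial.
sample = (
    '100,XHDK-A,2017-09-01,A,2017 Shirt Women,red,M,328,2017-09-04,715,good!\n'
    '980,"ZKDS-J",2017-09-13,"B","2017 Women\'s Cotton Clothing","red","M",112,,,\n'
)


def parse_records(text):
    """Split CSV text into records, each a list of column values."""
    return list(csv.reader(io.StringIO(text)))


records = parse_records(sample)
for record in records:
    # Quotation marks around a value are removed; empty columns become "".
    print(len(record), record)
```

Both records yield the same number of columns. The trailing commas in the second record produce empty values rather than missing columns.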

The following describes how to generate a .csv file in Windows:

  1. Create a text file and open it in Notepad++. Copy the sample data into it. Then, check that the total number of rows is correct and that the columns in each row are correctly separated.
  2. Choose Format > Encode in UTF-8 without BOM.
  3. Choose File > Save as.
  4. In the displayed dialog box, enter the file name and click Save.

    To identify the file type, use the file name extension .csv when entering the file name.
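If you prefer to script these steps instead of using Notepad++, a minimal Python sketch might look like the following. It saves text as UTF-8 without a BOM, then reports whether a BOM is present and how many columns each row contains. The function names are hypothetical, and the sketch assumes comma-separated data.

```python
import codecs
import csv


def write_utf8_no_bom(path, text):
    """Save text as UTF-8. Python's "utf-8" codec does not write a BOM."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 byte order mark."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


def field_counts(path):
    """Return the number of columns found in each row of the file."""
    with open(path, encoding="utf-8", newline="") as f:
        return [len(row) for row in csv.reader(f)]
```

The length of the list returned by field_counts is the total number of rows. Rows whose column count differs from the others stand out immediately in that list.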

Uploading Data to OBS

  1. Store the three CSV source data files in the OBS bucket.

    1. Log in to the OBS management console.

      Click Service List and choose Object Storage Service to open the OBS management console.

    2. Create a bucket.

      For details about how to create a bucket, see Creating a Bucket in the Object Storage Service Console Operation Guide.

      For example, create two buckets named mybucket and mybucket02.

    3. Create a folder.

      For details, see Creating a Folder in the Object Storage Service Console Operation Guide.

      For example:

      • Create a folder named input_data in the mybucket OBS bucket.
      • Create a folder named input_data in the mybucket02 OBS bucket.
    4. Upload the files.

      For details, see Uploading an Object in the Object Storage Service Console Operation Guide.

      For example:

      • Upload the following data files to the input_data folder in the mybucket OBS bucket:
      • Upload the following data file to the input_data folder in the mybucket02 OBS bucket:

  2. Grant read permission on the OBS bucket to the user who will import data.

    To import data from OBS to a cluster, the user must have read permission on the OBS buckets where the source data files are located. You can configure the bucket ACL to grant read permission to a specific user.

    For details, see Configuring a Bucket ACL in the Object Storage Service Console Operation Guide.
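The console upload in step 1 could also be scripted. The sketch below is an assumption-laden illustration, not the documented procedure: it assumes a client object from the OBS Python SDK whose putFile(bucketName, objectKey, file_path) method returns a response with a numeric status attribute. Verify this against the SDK reference before relying on it. The helpers object_key and upload_files are hypothetical names.

```python
import os


def object_key(folder, local_path):
    """Build the object key '<folder>/<file name>' for a local file."""
    return f"{folder.strip('/')}/{os.path.basename(local_path)}"


def upload_files(client, bucket, folder, local_paths):
    """Upload each local file into bucket/folder and return the object keys.

    Assumption: `client` behaves like the OBS Python SDK's ObsClient,
    that is, client.putFile(bucket, key, path) returns an object whose
    `status` attribute is an HTTP status code.
    """
    uploaded = []
    for path in local_paths:
        key = object_key(folder, path)
        resp = client.putFile(bucket, key, path)
        if resp.status >= 300:
            raise RuntimeError(f"Upload of {path} failed with status {resp.status}")
        uploaded.append(key)
    return uploaded
```

For example, calling upload_files(client, "mybucket", "input_data", paths) with a list of local file paths would place each file under the input_data folder of the mybucket bucket. Passing the client in as a parameter keeps credentials and endpoint settings out of the helper.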
