Step 1: Creating an Initial Table and Loading Sample Data

Updated on 2024-11-08 GMT+08:00

Supported Regions

Table 1 lists the regions where the sample data used in this practice has been uploaded to OBS, and the corresponding bucket names.

Table 1 Regions and OBS bucket names

Region                              OBS Bucket
CN North-Beijing1                   dws-demo-cn-north-1
CN North-Beijing2                   dws-demo-cn-north-2
CN North-Beijing4                   dws-demo-cn-north-4
CN North-Ulanqab1                   dws-demo-cn-north-9
CN East-Shanghai1                   dws-demo-cn-east-3
CN East-Shanghai2                   dws-demo-cn-east-2
CN South-Guangzhou                  dws-demo-cn-south-1
CN South-Guangzhou-InvitationOnly   dws-demo-cn-south-4
CN-Hong Kong                        dws-demo-ap-southeast-1
AP-Singapore                        dws-demo-ap-southeast-3
AP-Bangkok                          dws-demo-ap-southeast-2
LA-Santiago                         dws-demo-la-south-2
AF-Johannesburg                     dws-demo-af-south-1
LA-Mexico City1                     dws-demo-na-mexico-1
LA-Mexico City2                     dws-demo-la-north-2
RU-Moscow2                          dws-demo-ru-northwest-2
LA-Sao Paulo1                       dws-demo-sa-brazil-1

Create a group of tables without specifying their storage modes, distribution keys, distribution modes, or compression modes. Load sample data into these tables.

  1. (Optional) Create a cluster.

    If a cluster is already available, skip this step. For details about how to create a cluster, see Creating a DWS 2.0 Cluster.

    Then connect to the cluster and test the connection. For details, see Methods of Connecting to a Cluster.

    This practice uses an 8-node cluster as an example, but you can also perform the test on a 4-node cluster. A quick connectivity check is sketched below.
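
    For a quick sanity check after connecting (for example, with the gsql client), run a trivial query. The statements below are a minimal sketch and assume nothing beyond a working session:

    -- Confirm the session is alive and show the server version.
    SELECT version();
    -- Confirm which database the session is connected to.
    SELECT current_database();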

  2. Create an SS test table store_sales.

    NOTE:

    If SS tables already exist in the current database, run the DROP TABLE statement to delete these tables first.

    For example, delete the store_sales table.

    DROP TABLE store_sales;
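
    If you are not sure whether the table exists yet, a guarded variant avoids an error on a fresh database (a minimal sketch, assuming the PostgreSQL-compatible IF EXISTS clause, which GaussDB(DWS) inherits):

    -- Drop the table only if it is present, so the statement never fails.
    DROP TABLE IF EXISTS store_sales;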
    

    Do not configure the storage mode, distribution key, distribution mode, or compression mode when you create this table.

    Run the CREATE TABLE command to create the 11 tables in Figure 3. This section only provides the syntax for creating the store_sales table. To create all tables, copy the syntax in Creating an Initial Table.

    CREATE TABLE store_sales
    (
        ss_sold_date_sk           integer                       ,
        ss_sold_time_sk           integer                       ,
        ss_item_sk                integer               not null,
        ss_customer_sk            integer                       ,
        ss_cdemo_sk               integer                       ,
        ss_hdemo_sk               integer                       ,
        ss_addr_sk                integer                       ,
        ss_store_sk               integer                       ,
        ss_promo_sk               integer                       ,
        ss_ticket_number          bigint               not null,
        ss_quantity               integer                       ,
        ss_wholesale_cost         decimal(7,2)                  ,
        ss_list_price             decimal(7,2)                  ,
        ss_sales_price            decimal(7,2)                  ,
        ss_ext_discount_amt       decimal(7,2)                  ,
        ss_ext_sales_price        decimal(7,2)                  ,
        ss_ext_wholesale_cost     decimal(7,2)                  ,
        ss_ext_list_price         decimal(7,2)                  ,
        ss_ext_tax                decimal(7,2)                  ,
        ss_coupon_amt             decimal(7,2)                  ,
        ss_net_paid               decimal(7,2)                  ,
        ss_net_paid_inc_tax       decimal(7,2)                  ,
        ss_net_profit             decimal(7,2)                  
    );
    

  3. Load sample data into these tables.

    An OBS bucket provides sample data used for this practice. The bucket can be read by all authenticated cloud users. Perform the following operations to load the sample data:

    1. Create a foreign table for each table.

      GaussDB(DWS) uses the foreign data wrappers (FDWs) provided by PostgreSQL to import data in parallel. To use FDWs, create FDW tables first (also called foreign tables). This section only provides the syntax for creating the obs_from_store_sales_001 foreign table corresponding to the store_sales table. To create all foreign tables, copy the syntax in Creating a Foreign Table.

      NOTE:
      • <obs_bucket_name> in the following statement indicates the OBS bucket name. Only some regions are supported. For details about the supported regions and OBS bucket names, see Table 1. GaussDB(DWS) clusters do not support cross-region access to OBS bucket data.
      • The columns of the foreign table must be the same as those of the corresponding ordinary table. In this example, store_sales and obs_from_store_sales_001 must have the same columns.
      • The foreign table syntax obtains the sample data used for this practice from the OBS bucket. To load other sample data, modify SERVER gsmpp_server OPTIONS as needed. For details, see About Parallel Data Import from OBS.
      • Hardcoding the AK/SK or storing them in plaintext is risky. For security, encrypt your AK/SK and store them in a configuration file or in environment variables.
      CREATE FOREIGN TABLE obs_from_store_sales_001
      (
          ss_sold_date_sk           integer                       ,
          ss_sold_time_sk           integer                       ,
          ss_item_sk                integer               not null,
          ss_customer_sk            integer                       ,
          ss_cdemo_sk               integer                       ,
          ss_hdemo_sk               integer                       ,
          ss_addr_sk                integer                       ,
          ss_store_sk               integer                       ,
          ss_promo_sk               integer                       ,
          ss_ticket_number          bigint               not null,
          ss_quantity               integer                       ,
          ss_wholesale_cost         decimal(7,2)                  ,
          ss_list_price             decimal(7,2)                  ,
          ss_sales_price            decimal(7,2)                  ,
          ss_ext_discount_amt       decimal(7,2)                  ,
          ss_ext_sales_price        decimal(7,2)                  ,
          ss_ext_wholesale_cost     decimal(7,2)                  ,
          ss_ext_list_price         decimal(7,2)                  ,
          ss_ext_tax                decimal(7,2)                  ,
          ss_coupon_amt             decimal(7,2)                  ,
          ss_net_paid               decimal(7,2)                  ,
          ss_net_paid_inc_tax       decimal(7,2)                  ,
          ss_net_profit             decimal(7,2)                  
      )
      -- Configure OBS server information and data format details.
      SERVER gsmpp_server
      OPTIONS (
      LOCATION 'obs://<obs_bucket_name>/tpcds/store_sales',
      FORMAT 'text',
      DELIMITER '|',
      ENCODING 'utf8',
      NOESCAPING 'true',
      ACCESS_KEY 'access_key_value_to_be_replaced',
      SECRET_ACCESS_KEY 'secret_access_key_value_to_be_replaced',
      REJECT_LIMIT 'unlimited',
      CHUNKSIZE '64'
      )
      -- Record rows that fail to be imported in the error table err_obs_from_store_sales_001.
      WITH err_obs_from_store_sales_001;
      
    2. Set the ACCESS_KEY and SECRET_ACCESS_KEY parameters in the foreign table creation statement as needed, and run the statement in a client tool to create the foreign table. (One way to supply the keys without hardcoding them is sketched after this step.)

      For the values of ACCESS_KEY and SECRET_ACCESS_KEY, see Creating Access Keys (AK and SK).
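
      To keep the keys out of the SQL text entirely, one option is to inject them at invocation time. The following is a sketch under stated assumptions: it presumes the client supports psql-style variable substitution (gsql is derived from psql), and the names ak, sk, AK_ENV, SK_ENV, and create_foreign_table.sql are all hypothetical:

      -- Shell invocation (shown here as a comment), passing the keys in from
      -- hypothetical environment variables instead of hardcoding them:
      --   gsql -d gaussdb -h <cluster_address> -p 8000 -U dbadmin \
      --        -v ak="$AK_ENV" -v sk="$SK_ENV" -f create_foreign_table.sql
      -- Inside create_foreign_table.sql, replace the plaintext values with
      -- quoted variable references:
      --   ACCESS_KEY :'ak',
      --   SECRET_ACCESS_KEY :'sk',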

    3. Import data.
      Create the insert.sql script containing the following statements, and then execute it (one way to run the script is sketched after it):
      \timing on
      \parallel on 4
      INSERT INTO store_sales SELECT * FROM obs_from_store_sales_001;
      INSERT INTO date_dim SELECT * FROM obs_from_date_dim_001;
      INSERT INTO store SELECT * FROM obs_from_store_001;
      INSERT INTO item SELECT * FROM obs_from_item_001;
      INSERT INTO time_dim SELECT * FROM obs_from_time_dim_001;
      INSERT INTO promotion SELECT * FROM obs_from_promotion_001;
      INSERT INTO customer_demographics SELECT * from obs_from_customer_demographics_001 ;
      INSERT INTO customer_address SELECT * FROM obs_from_customer_address_001 ;
      INSERT INTO household_demographics SELECT * FROM obs_from_household_demographics_001;
      INSERT INTO customer SELECT * FROM obs_from_customer_001;
      INSERT INTO income_band SELECT * FROM obs_from_income_band_001;
      \parallel off
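
      To execute the script, run it from an active gsql session with the \i meta-command, or pass the file to the client with the -f option. A minimal sketch, assuming the script was saved as insert.sql in the current directory:

      -- Run the load script from within the current session.
      \i insert.sql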
      

      The returned result is as follows:

      SET
      Timing is on.
      SET
      Time: 2.831 ms
      Parallel is on with scale 4.
      Parallel is off.
      INSERT 0 402
      Time: 1820.909 ms
      INSERT 0 73049
      Time: 2715.275 ms
      INSERT 0 86400
      Time: 2377.056 ms
      INSERT 0 1000
      Time: 4037.155 ms
      INSERT 0 204000
      Time: 7124.190 ms
      INSERT 0 7200
      Time: 2227.776 ms
      INSERT 0 1920800
      Time: 8672.647 ms
      INSERT 0 20
      Time: 2273.501 ms
      INSERT 0 1000000
      Time: 11430.991 ms
      INSERT 0 1981703
      Time: 20270.750 ms
      INSERT 0 287997024
      Time: 341395.680 ms
      total time: 341584  ms
      
    4. Calculate the total time spent loading data into the 11 tables. The result will be recorded as the loading time in the benchmark table in step 1 of the next section.
    5. Run the following statements to verify that each table was loaded correctly and to check the number of rows loaded into each table (a combined version of these queries is sketched after the row-count table):
      SELECT COUNT(*) FROM store_sales;
      SELECT COUNT(*) FROM date_dim;
      SELECT COUNT(*) FROM store;
      SELECT COUNT(*) FROM item;
      SELECT COUNT(*) FROM time_dim;
      SELECT COUNT(*) FROM promotion;
      SELECT COUNT(*) FROM customer_demographics;
      SELECT COUNT(*) FROM customer_address;
      SELECT COUNT(*) FROM household_demographics;
      SELECT COUNT(*) FROM customer;
      SELECT COUNT(*) FROM income_band;
      

      The number of rows in each SS table is as follows:

      Table Name                  Number of Rows
      store_sales                 287997024
      date_dim                    73049
      store                       402
      item                        204000
      time_dim                    86400
      promotion                   1000
      customer_demographics       1920800
      customer_address            1000000
      household_demographics      7200
      customer                    1981703
      income_band                 20
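
      If you want all the counts in a single result set for easy comparison with the expected values above, the individual queries can be combined with UNION ALL. This optional sketch uses only standard SQL:

      -- One row per table, with its name and row count.
      SELECT 'store_sales' AS table_name, COUNT(*) AS row_count FROM store_sales
      UNION ALL SELECT 'date_dim', COUNT(*) FROM date_dim
      UNION ALL SELECT 'store', COUNT(*) FROM store
      UNION ALL SELECT 'item', COUNT(*) FROM item
      UNION ALL SELECT 'time_dim', COUNT(*) FROM time_dim
      UNION ALL SELECT 'promotion', COUNT(*) FROM promotion
      UNION ALL SELECT 'customer_demographics', COUNT(*) FROM customer_demographics
      UNION ALL SELECT 'customer_address', COUNT(*) FROM customer_address
      UNION ALL SELECT 'household_demographics', COUNT(*) FROM household_demographics
      UNION ALL SELECT 'customer', COUNT(*) FROM customer
      UNION ALL SELECT 'income_band', COUNT(*) FROM income_band;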

  4. Run the ANALYZE command to update statistics.

    ANALYZE;
    

    If ANALYZE is returned, the execution is successful.

    ANALYZE
    

    The ANALYZE statement collects statistics about table content in databases, which will be stored in the PG_STATISTIC system catalog. Then, the query optimizer uses the statistics to work out the most efficient execution plan.

    After executing batch insertions or deletions, you are advised to run the ANALYZE statement on the affected tables or on the entire database to update statistics, as sketched below.
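
    To refresh statistics for a single table rather than the entire database, name the table explicitly. Because GaussDB(DWS) is PostgreSQL-based, the collected statistics can typically also be inspected through the pg_stats view; treat the view query below as an assumption-based sketch:

    -- Collect statistics for one table only.
    ANALYZE store_sales;

    -- Inspect the collected per-column statistics (PostgreSQL-compatible view).
    SELECT tablename, attname, n_distinct
    FROM pg_stats
    WHERE tablename = 'store_sales';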
