
Migrating Data Between GaussDB(DWS) Clusters Using GDS

Updated on 2024-10-29 GMT+08:00

This practice demonstrates how to use the high-concurrency import and export capability of GDS to migrate 15 million rows of data between two GaussDB(DWS) clusters within minutes.

NOTE:
  • This function is supported only by clusters of version 8.1.2 or later.
  • GDS is a high-concurrency import and export tool developed by GaussDB(DWS). For more information, visit GDS Usage Guide.
  • This section describes only the operation practice. For details about GDS interconnection and syntax description, see GDS-based Cross-Cluster Interconnection.

This practice takes about 90 minutes. The cloud services used in this practice are GaussDB(DWS), Elastic Cloud Server (ECS), and Virtual Private Cloud (VPC). The basic process is as follows:

  1. Prerequisites
  2. Step 1: Creating Two GaussDB(DWS) Clusters
  3. Step 2: Preparing Source Data
  4. Step 3: Installing and Starting the GDS Server
  5. Step 4: Implementing Data Interconnection Across GaussDB(DWS) Clusters

Supported Regions

Table 1 lists the regions where the sample OBS data has been uploaded and the corresponding bucket names.

Table 1 Regions and OBS bucket names

    Region                              OBS Bucket
    CN North-Beijing1                   dws-demo-cn-north-1
    CN North-Beijing2                   dws-demo-cn-north-2
    CN North-Beijing4                   dws-demo-cn-north-4
    CN North-Ulanqab1                   dws-demo-cn-north-9
    CN East-Shanghai1                   dws-demo-cn-east-3
    CN East-Shanghai2                   dws-demo-cn-east-2
    CN South-Guangzhou                  dws-demo-cn-south-1
    CN South-Guangzhou-InvitationOnly   dws-demo-cn-south-4
    CN-Hong Kong                        dws-demo-ap-southeast-1
    AP-Singapore                        dws-demo-ap-southeast-3
    AP-Bangkok                          dws-demo-ap-southeast-2
    LA-Santiago                         dws-demo-la-south-2
    AF-Johannesburg                     dws-demo-af-south-1
    LA-Mexico City1                     dws-demo-na-mexico-1
    LA-Mexico City2                     dws-demo-la-north-2
    RU-Moscow2                          dws-demo-ru-northwest-2
    LA-Sao Paulo1                       dws-demo-sa-brazil-1

Constraints

In this practice, the two GaussDB(DWS) clusters and the ECS are deployed in the same region and VPC to ensure network connectivity.

Prerequisites

  • You have obtained the AK and SK of the account.
  • You have created a VPC and subnet. For details, see Creating a VPC.

Step 1: Creating Two GaussDB(DWS) Clusters

Create two GaussDB(DWS) clusters. For details, see Creating a Cluster. You are advised to create the clusters in the CN-Hong Kong region. Name the two clusters dws-demo01 and dws-demo02.

Step 2: Preparing Source Data

  1. On the cluster management page of the GaussDB(DWS) console, locate the row that contains the dws-demo01 cluster and click Login in the Operation column.

    NOTE:

    This practice uses version 8.1.3.x as an example. 8.1.2 and earlier versions do not support this login mode. You can use Data Studio to connect to a cluster. For details, see Using Data Studio to Connect to a Cluster.

  2. After the login is successful, the SQL editor is displayed.
  3. Copy the following SQL statements to the SQL window and click Execute SQL to create the test TPC-H table ORDERS.

    CREATE TABLE ORDERS
    (
      O_ORDERKEY      BIGINT        NOT NULL,
      O_CUSTKEY       BIGINT        NOT NULL,
      O_ORDERSTATUS   CHAR(1)       NOT NULL,
      O_TOTALPRICE    DECIMAL(15,2) NOT NULL,
      O_ORDERDATE     DATE          NOT NULL,
      O_ORDERPRIORITY CHAR(15)      NOT NULL,
      O_CLERK         CHAR(15)      NOT NULL,
      O_SHIPPRIORITY  BIGINT        NOT NULL,
      O_COMMENT       VARCHAR(79)   NOT NULL
    )
    WITH (orientation = column)
    DISTRIBUTE BY HASH(O_ORDERKEY)
    PARTITION BY RANGE(O_ORDERDATE)
    (
      PARTITION O_ORDERDATE_1 VALUES LESS THAN('1993-01-01 00:00:00'),
      PARTITION O_ORDERDATE_2 VALUES LESS THAN('1994-01-01 00:00:00'),
      PARTITION O_ORDERDATE_3 VALUES LESS THAN('1995-01-01 00:00:00'),
      PARTITION O_ORDERDATE_4 VALUES LESS THAN('1996-01-01 00:00:00'),
      PARTITION O_ORDERDATE_5 VALUES LESS THAN('1997-01-01 00:00:00'),
      PARTITION O_ORDERDATE_6 VALUES LESS THAN('1998-01-01 00:00:00'),
      PARTITION O_ORDERDATE_7 VALUES LESS THAN('1999-01-01 00:00:00')
    );
    

  4. Run the SQL statements below to create an OBS foreign table.

    Replace AK and SK with the actual AK and SK of the account. <obs_bucket_name> is obtained from Supported Regions.
    NOTE:

    Hardcoded or plaintext AK/SK is risky. For security, encrypt your AK/SK and store them in the configuration file or environment variables.

    CREATE FOREIGN TABLE ORDERS01
    (
      LIKE orders
    )
    SERVER gsmpp_server
    OPTIONS (
      ENCODING 'utf8',
      LOCATION 'obs://<obs_bucket_name>/tpch/orders.tbl',
      FORMAT 'text',
      DELIMITER '|',
      ACCESS_KEY 'access_key_value_to_be_replaced',
      SECRET_ACCESS_KEY 'secret_access_key_value_to_be_replaced',
      CHUNKSIZE '64',
      IGNORE_EXTRA_DATA 'on'
    );
    

  5. Run the SQL statement below to import data from the OBS foreign table to the source GaussDB(DWS) cluster. The import takes about 2 minutes.

    NOTE:

    If an import error occurs, the AK and SK values in the foreign table are probably incorrect. In this case, run DROP FOREIGN TABLE orders01; to delete the foreign table, create the foreign table again with the correct values, and run the following statement to import the data again.

    INSERT INTO orders SELECT * FROM orders01;
    

  6. Repeat the preceding steps to log in to the destination cluster dws-demo02 and run the following SQL statements to create the target table orders.

    CREATE TABLE ORDERS
    (
      O_ORDERKEY      BIGINT        NOT NULL,
      O_CUSTKEY       BIGINT        NOT NULL,
      O_ORDERSTATUS   CHAR(1)       NOT NULL,
      O_TOTALPRICE    DECIMAL(15,2) NOT NULL,
      O_ORDERDATE     DATE          NOT NULL,
      O_ORDERPRIORITY CHAR(15)      NOT NULL,
      O_CLERK         CHAR(15)      NOT NULL,
      O_SHIPPRIORITY  BIGINT        NOT NULL,
      O_COMMENT       VARCHAR(79)   NOT NULL
    )
    WITH (orientation = column)
    DISTRIBUTE BY HASH(O_ORDERKEY)
    PARTITION BY RANGE(O_ORDERDATE)
    (
      PARTITION O_ORDERDATE_1 VALUES LESS THAN('1993-01-01 00:00:00'),
      PARTITION O_ORDERDATE_2 VALUES LESS THAN('1994-01-01 00:00:00'),
      PARTITION O_ORDERDATE_3 VALUES LESS THAN('1995-01-01 00:00:00'),
      PARTITION O_ORDERDATE_4 VALUES LESS THAN('1996-01-01 00:00:00'),
      PARTITION O_ORDERDATE_5 VALUES LESS THAN('1997-01-01 00:00:00'),
      PARTITION O_ORDERDATE_6 VALUES LESS THAN('1998-01-01 00:00:00'),
      PARTITION O_ORDERDATE_7 VALUES LESS THAN('1999-01-01 00:00:00')
    );
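The note in step 4 above warns against hardcoding AK/SK in the foreign-table DDL. One way to follow that advice is to generate the statement at submission time from environment variables; the sketch below assumes the variable names HW_ACCESS_KEY and HW_SECRET_KEY (placeholders, not an official convention) and prints the resulting SQL for you to pass to your client.

```shell
# Sketch: keep AK/SK out of the SQL text by injecting them from environment
# variables when generating the foreign-table DDL. In practice, load the
# variables from a secure store rather than exporting literals.
export HW_ACCESS_KEY="${HW_ACCESS_KEY:-AK_PLACEHOLDER}"
export HW_SECRET_KEY="${HW_SECRET_KEY:-SK_PLACEHOLDER}"
SQL=$(cat <<EOF
CREATE FOREIGN TABLE ORDERS01 (LIKE orders)
SERVER gsmpp_server
OPTIONS (
  ENCODING 'utf8',
  LOCATION 'obs://<obs_bucket_name>/tpch/orders.tbl',
  FORMAT 'text',
  DELIMITER '|',
  ACCESS_KEY '${HW_ACCESS_KEY}',
  SECRET_ACCESS_KEY '${HW_SECRET_KEY}',
  CHUNKSIZE '64',
  IGNORE_EXTRA_DATA 'on'
);
EOF
)
echo "$SQL"   # pass to your SQL client, or paste into the SQL editor
```

This keeps the credentials out of scripts checked into version control; only the process environment carries them.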
    

Step 3: Installing and Starting the GDS Server

  1. Create an ECS by referring to Purchasing an ECS. Note that the ECS must be created in the same region and VPC as the GaussDB(DWS) clusters. In this example, CentOS 7.6 is used as the ECS image.
  2. Download the GDS package.

    1. Log in to the GaussDB(DWS) console.
    2. In the navigation tree on the left, choose Management > Client Connections.
    3. Select the GDS client of the target version from the drop-down list of CLI Client.

      Select a version based on the cluster version and the OS where the client is installed.

    4. Click Download.

  3. Use the SFTP tool to upload the downloaded client (for example, dws_client_8.2.x_redhat_x64.zip) to the /opt directory of the ECS.
  4. Log in to the ECS as the root user and run the following commands to go to the /opt directory and decompress the client package.

    cd /opt
    unzip dws_client_8.2.x_redhat_x64.zip
    

  5. Create a GDS user and the user group to which the user belongs. This user is used to start GDS and read source data.

    groupadd gdsgrp
    useradd -g gdsgrp gds_user
    

  6. Change the owner of the GDS package directory and source data file directory to the GDS user.

    chown -R gds_user:gdsgrp /opt/gds/bin
    chown -R gds_user:gdsgrp /opt
    

  7. Switch to user gds_user.

    su - gds_user
    

  8. Run the following commands to go to the gds directory and source the environment variables.

    cd /opt/gds/bin
    source gds_env
    

  9. Run the following command to start GDS. You can view the private IP address of the ECS on the ECS console.

    # -d data directory, -p listening IP:port, -H allowed source CIDR, -l log file, -D daemon mode, -t concurrent threads
    /opt/gds/bin/gds -d /opt -p <private IP address of the ECS>:5000 -H 0.0.0.0/0 -l /opt/gds/bin/gds_log.txt -D -t 2
    

  10. Enable the network port between the ECS and GaussDB(DWS).

    The GDS server (ECS in this practice) needs to communicate with GaussDB(DWS). The default security group of the ECS does not allow inbound traffic from GDS port 5000 and GaussDB(DWS) port 8000. Perform the following steps:

    1. Return to the ECS console and click the ECS name to go to the ECS details page.
    2. Click the Security Groups tab and click Manage Rule.
    3. Choose Inbound Rules and click Add Rule. Set Priority to 1, set Protocol & Port to 5000, and click OK.

    4. Repeat the preceding steps to add an inbound rule of 8000.
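After adding the two inbound rules, it can help to confirm that the ports are actually reachable before moving on. The sketch below uses bash's /dev/tcp redirection as a lightweight TCP probe; the IP address is an example placeholder for the ECS (or cluster) you are testing, and the check assumes bash and coreutils timeout are available.

```shell
# Sketch: verify that the GDS (5000) and GaussDB(DWS) (8000) ports are
# reachable from the peer host. 192.168.0.10 is a placeholder IP.
port_open() {
  # $1 = host, $2 = port; succeeds if a TCP connection can be opened
  timeout 3 bash -c "cat < /dev/null > /dev/tcp/$1/$2" 2>/dev/null
}

port_open 192.168.0.10 5000 && echo "GDS port 5000 reachable" || echo "GDS port 5000 blocked"
port_open 192.168.0.10 8000 && echo "DWS port 8000 reachable" || echo "DWS port 8000 blocked"
```

If a port reports blocked, re-check the security group rules and that GDS is running and listening on the expected address.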

Step 4: Implementing Data Interconnection Across GaussDB(DWS) Clusters

  1. Create a server.

    1. Obtain the private IP address of the source GaussDB(DWS) cluster. Specifically, go to the GaussDB(DWS) console, choose Dedicated Clusters > Clusters, and click the source cluster name dws-demo01.
    2. Go to the cluster details page and record the private network IP address.

    3. Switch back to the GaussDB(DWS) console and click Log In in the Operation column of the destination cluster dws-demo02. The SQL window is displayed.

      Run the commands below to create a server.

      In the command, replace Private network IP address of the source GaussDB(DWS) cluster with the IP address recorded in the previous step, Private IP address of the ECS with the IP address shown on the ECS console, and Login password of user dbadmin with the password set when the GaussDB(DWS) cluster was created.

      CREATE SERVER server_remote FOREIGN DATA WRAPPER GC_FDW OPTIONS
      (
        address 'Private network IP address of the source GaussDB(DWS) cluster:8000',
        dbname 'gaussdb',
        username 'dbadmin',
        password 'Login password of user dbadmin',
        syncsrv 'gsfs://Private IP address of the ECS:5000'
      );
      

  2. Create a foreign table for interconnection.

    In the SQL window of the destination cluster dws-demo02, run the following statements to create a foreign table for interconnection:

    CREATE FOREIGN TABLE ft_orders
    (
      O_ORDERKEY      BIGINT,
      O_CUSTKEY       BIGINT,
      O_ORDERSTATUS   CHAR(1),
      O_TOTALPRICE    DECIMAL(15,2),
      O_ORDERDATE     DATE,
      O_ORDERPRIORITY CHAR(15),
      O_CLERK         CHAR(15),
      O_SHIPPRIORITY  BIGINT,
      O_COMMENT       VARCHAR(79)
    )
    SERVER server_remote
    OPTIONS
    (
      schema_name 'public',
      table_name 'orders',
      encoding 'SQL_ASCII'
    );
    

  3. Import all table data.

    In the SQL window, run the following SQL statement to import the full data set from the ft_orders foreign table. The import takes about 1 minute.

    INSERT INTO orders SELECT * FROM ft_orders;
    

    Run the following SQL statement to verify that 15 million rows of data are successfully imported.

    SELECT count(*) FROM orders;
    

  4. Import data based on filter criteria.

    To import only a subset of the rows, add a filter condition to the query. For example, the following statement imports only the orders whose order key is less than 10000000:

    INSERT INTO orders SELECT * FROM ft_orders WHERE o_orderkey < '10000000';
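As a final check, the row counts on both clusters can be compared from the ECS instead of the SQL editor. The sketch below assumes the gsql client is on the PATH and uses psql-style flags (-t tuples only, -A unaligned); the IP addresses, port, and password are placeholders to adapt to your clusters.

```shell
# Sketch: compare row counts of table orders between the source and
# destination clusters after migration. Connection details are placeholders.
count_rows() {
  # $1 = cluster private IP; prints the row count of table orders
  gsql -d gaussdb -h "$1" -p 8000 -U dbadmin -W '<password>' -t -A -c "SELECT count(*) FROM orders;"
}

if command -v gsql >/dev/null 2>&1; then
  SRC=$(count_rows 192.168.0.11)   # dws-demo01 (example IP)
  DST=$(count_rows 192.168.0.12)   # dws-demo02 (example IP)
  [ "$SRC" = "$DST" ] && echo "row counts match: $SRC" || echo "mismatch: $SRC vs $DST" >&2
fi
```

If the counts differ, re-run the import after checking the GDS log file specified with -l for transfer errors.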
    
