Updated on 2022-07-26 GMT+08:00

Installing and Starting GDS

  1. Log in to the GaussDB(DWS) console.
  2. In the navigation tree on the left, click Connections.
  3. Select the GaussDB(DWS) client from the Client drop-down list.

    Select a version based on the cluster version and the OS where the client is installed.

  4. Click Download.
  5. Upload the GDS tool package to the /opt directory on the ECS. This example uses the Euler Kunpeng package.
  6. Go to the /opt directory and decompress the package.

    cd /opt/
    unzip dws_client_8.1.x_euler_kunpeng_x64.zip
    

  7. Create a user group (gdsgrp) and a user (gds_user) that belongs to it. This user starts GDS and must have read permission on the source data file directory.

    groupadd gdsgrp
    useradd -g gdsgrp gds_user
    

  8. Change the owner of the GDS package and the source data file directories (/data1 and /data2 in this example) to gds_user, and change the user group to gdsgrp.

    chown -R gds_user:gdsgrp /opt/
    chown -R gds_user:gdsgrp /data1
    chown -R gds_user:gdsgrp /data2
    

  9. Switch to user gds_user.

    su - gds_user
    

  10. Source the environment script that GDS depends on (applicable only to version 8.1.x).

    cd /opt/gds/bin
    source gds_env
    

  11. Start GDS.

    /opt/gds/bin/gds -d /data1/script/tpch-kit/tpch1000X -p 192.168.0.90:5000 -H 192.168.0.0/24 -l /opt/gds/gds01_log.txt -D # Used in the TPC-H test.
    /opt/gds/bin/gds -d /data2/script/tpch-kit/tpch1000X -p 192.168.0.90:5001 -H 192.168.0.0/24 -l /opt/gds/gds02_log.txt -D # Used in the TPC-H test.
    /opt/gds/bin/gds -d /data1/script/tpcds-kit/tpcds1000X/ -p 192.168.0.90:5002 -H 192.168.0.0/24 -l /opt/gds/gds03_log.txt -D # Used in the TPC-DS test.
    /opt/gds/bin/gds -d /data2/script/tpcds-kit/tpcds1000X/ -p 192.168.0.90:5003 -H 192.168.0.0/24 -l /opt/gds/gds04_log.txt -D # Used in the TPC-DS test.
    
    • Replace the example values in the commands (such as the directories, IP addresses, and ports) with the actual values. If data shards are stored in multiple data disk directories, start one GDS process for each directory.
    • To test TPC-H and TPC-DS data at the same time, start all four GDS processes shown above. To test only TPC-H or only TPC-DS data, start only the corresponding GDS processes.
    • -d dir: directory that stores the data files to be imported.
    • -p ip:port: listening IP address and port for GDS. Replace the IP address with the private network IP address of the ECS so that DWS can communicate with GDS through this IP address. In this example, ports 5000 and 5001 are used for TPC-H, and ports 5002 and 5003 are used for TPC-DS.
    • -H address_string: hosts that are allowed to connect to and use GDS. The value must be in CIDR format. Set this parameter to the internal network segment of the data warehouse cluster, for example, 192.168.0.0/24. The ECS where GDS runs and the data warehouse cluster are in the same VPC and communicate with each other through the internal network.
    • -l log_file: path of the GDS log file, including the directory and the file name.
    • -D: runs GDS in daemon mode. This parameter can be used only on Linux.
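
The rule of starting one GDS process per data directory can be sketched as a small shell helper. This is an illustrative sketch, not part of the GDS package: the function name build_gds_commands and the variables are hypothetical, and the consecutive port and log file numbering simply mirrors the example commands. It uses only the gds parameters described here (-d, -p, -H, -l, -D) and prints the commands instead of running them, so they can be reviewed first.

```shell
# Hypothetical helper: print (do not run) one GDS start command per data directory.
# All values are placeholders copied from the example; replace them with actual values.
GDS_BIN="/opt/gds/bin/gds"      # GDS binary path used in this example
LISTEN_IP="192.168.0.90"        # private network IP address of the ECS
ALLOWED_CIDR="192.168.0.0/24"   # internal network segment of the cluster
BASE_PORT=5000                  # port for the first directory; later ones use +1, +2, ...
LOG_DIR="/opt/gds"              # directory that receives gdsNN_log.txt

# build_gds_commands DIR...
# Prints one command line per directory, with consecutive ports and log file names.
build_gds_commands() {
    local i=0 dir port log
    for dir in "$@"; do
        port=$((BASE_PORT + i))
        log=$(printf '%s/gds%02d_log.txt' "$LOG_DIR" $((i + 1)))
        printf '%s -d %s -p %s:%s -H %s -l %s -D\n' \
            "$GDS_BIN" "$dir" "$LISTEN_IP" "$port" "$ALLOWED_CIDR" "$log"
        i=$((i + 1))
    done
}

# Example: the two TPC-H directories from this procedure.
build_gds_commands \
    /data1/script/tpch-kit/tpch1000X \
    /data2/script/tpch-kit/tpch1000X
```

Run as gds_user after sourcing gds_env, the printed lines can be checked against the example commands and then executed one by one. For TPC-DS, pass the tpcds directories and set BASE_PORT to 5002 so that the ports match the example.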