Updated on 2024-09-06 GMT+08:00

Testing Parallel Query Performance

This section describes how to use the TPC-H test tool to test the performance of 22 parallel queries.

The test instance information is as follows:

  • Instance specifications: 32 vCPUs | 256 GB
  • Kernel version: 2.0.26.1
  • Concurrent threads: 16
  • Data volume: 100 GB

Procedure

  1. Generate test data.

    1. Download the shared source code in the TPC-H test from https://github.com/electrum/tpch-dbgen.
    2. Find the makefile.suite file, modify its contents as follows, and save the modifications:
      CC      = gcc
      # Current values for DATABASE are: INFORMIX, DB2, TDAT (Teradata)
      #                                  SQLSERVER, SYBASE, ORACLE
      # Current values for MACHINE are:  ATT, DOS, HP, IBM, ICL, MVS,
      #                                  SGI, SUN, U2200, VMS, LINUX, WIN32
      # Current values for WORKLOAD are:  TPCH
      DATABASE= SQLSERVER
      MACHINE = LINUX
      WORKLOAD = TPCH
    3. In the root directory of the source code, run the following command to compile and generate the data tool dbgen:

      make -f makefile.suite

    4. Run the following command to generate 100 GB data:

      ./dbgen -s 100

  2. Log in to the target GaussDB(for MySQL) instance, create a database, and run the following command to create a table:

    CREATE TABLE nation  ( N_NATIONKEY  INTEGER NOT NULL,
                           N_NAME       CHAR(25) NOT NULL,
                           N_REGIONKEY  INTEGER NOT NULL,
                           N_COMMENT    VARCHAR(152));
    CREATE TABLE region  ( R_REGIONKEY  INTEGER NOT NULL,
                           R_NAME       CHAR(25) NOT NULL,
                           R_COMMENT    VARCHAR(152));
    CREATE TABLE part    ( P_PARTKEY     INTEGER NOT NULL,
                           P_NAME        VARCHAR(55) NOT NULL,
                           P_MFGR        CHAR(25) NOT NULL,
                           P_BRAND       CHAR(10) NOT NULL,
                           P_TYPE        VARCHAR(25) NOT NULL,
                           P_SIZE        INTEGER NOT NULL,
                           P_CONTAINER   CHAR(10) NOT NULL,
                           P_RETAILPRICE DECIMAL(15,2) NOT NULL,
                           P_COMMENT     VARCHAR(23) NOT NULL );
    CREATE TABLE supplier  ( S_SUPPKEY     INTEGER NOT NULL,
                             S_NAME        CHAR(25) NOT NULL,
                             S_ADDRESS     VARCHAR(40) NOT NULL,
                             S_NATIONKEY   INTEGER NOT NULL,
                             S_PHONE       CHAR(15) NOT NULL,
                             S_ACCTBAL     DECIMAL(15,2) NOT NULL,
                             S_COMMENT     VARCHAR(101) NOT NULL);
    CREATE TABLE partsupp  ( PS_PARTKEY     INTEGER NOT NULL,
                             PS_SUPPKEY     INTEGER NOT NULL,
                             PS_AVAILQTY    INTEGER NOT NULL,
                             PS_SUPPLYCOST  DECIMAL(15,2)  NOT NULL,
                             PS_COMMENT     VARCHAR(199) NOT NULL );
    CREATE TABLE customer  ( C_CUSTKEY     INTEGER NOT NULL,
                             C_NAME        VARCHAR(25) NOT NULL,
                             C_ADDRESS     VARCHAR(40) NOT NULL,
                             C_NATIONKEY   INTEGER NOT NULL,
                             C_PHONE       CHAR(15) NOT NULL,
                             C_ACCTBAL     DECIMAL(15,2)   NOT NULL,
                             C_MKTSEGMENT  CHAR(10) NOT NULL,
                             C_COMMENT     VARCHAR(117) NOT NULL);
    CREATE TABLE orders  ( O_ORDERKEY       INTEGER NOT NULL,
                           O_CUSTKEY        INTEGER NOT NULL,
                           O_ORDERSTATUS    CHAR(1) NOT NULL,
                           O_TOTALPRICE     DECIMAL(15,2) NOT NULL,
                           O_ORDERDATE      DATE NOT NULL,
                           O_ORDERPRIORITY  CHAR(15) NOT NULL,
                           O_CLERK          CHAR(15) NOT NULL,
                           O_SHIPPRIORITY   INTEGER NOT NULL,
                           O_COMMENT        VARCHAR(79) NOT NULL);
    CREATE TABLE lineitem ( L_ORDERKEY    INTEGER NOT NULL,
                            L_PARTKEY     INTEGER NOT NULL,
                            L_SUPPKEY     INTEGER NOT NULL,
                            L_LINENUMBER  INTEGER NOT NULL,
                            L_QUANTITY    DECIMAL(15,2) NOT NULL,
                            L_EXTENDEDPRICE  DECIMAL(15,2) NOT NULL,
                            L_DISCOUNT    DECIMAL(15,2) NOT NULL,
                            L_TAX         DECIMAL(15,2) NOT NULL,
                            L_RETURNFLAG  CHAR(1) NOT NULL,
                            L_LINESTATUS  CHAR(1) NOT NULL,
                            L_SHIPDATE    DATE NOT NULL,
                            L_COMMITDATE  DATE NOT NULL,
                            L_RECEIPTDATE DATE NOT NULL,
                            L_SHIPINSTRUCT CHAR(25) NOT NULL,
                            L_SHIPMODE     CHAR(10) NOT NULL,
                            L_COMMENT      VARCHAR(44) NOT NULL);

  3. Run the following command to import the generated data to the table:

    load data INFILE '/path/customer.tbl' INTO TABLE customer FIELDS TERMINATED BY '|';
    load data INFILE '/path/region.tbl' INTO TABLE region FIELDS TERMINATED BY '|';
    load data INFILE '/path/nation.tbl' INTO TABLE nation FIELDS TERMINATED BY '|';
    load data INFILE '/path/supplier.tbl' INTO TABLE supplier FIELDS TERMINATED BY '|';
    load data INFILE '/path/part.tbl' INTO TABLE part FIELDS TERMINATED BY '|';
    load data INFILE '/path/partsupp.tbl' INTO TABLE partsupp FIELDS TERMINATED BY '|';
    load data INFILE '/path/orders.tbl' INTO TABLE orders FIELDS TERMINATED BY '|';
    load data INFILE '/path/lineitem.tbl' INTO TABLE lineitem FIELDS TERMINATED BY '|';

  4. Create an index for the table.

    alter table region add primary key (r_regionkey);
    alter table nation add primary key (n_nationkey);
    alter table part add primary key (p_partkey);
    alter table supplier add primary key (s_suppkey);
    alter table partsupp add primary key (ps_partkey,ps_suppkey);
    alter table customer add primary key (c_custkey);
    alter table lineitem add primary key (l_orderkey,l_linenumber);
    alter table orders add primary key (o_orderkey);

  5. Obtain 22 query statements from https://github.com/dragansah/tpch-dbgen/tree/master/tpch-queries and perform corresponding operations.

Test Results

Based on 16-thread parallel execution, the performance of 17 query statements is greatly improved. The query speed of all statements is improved by more than 10 times on average. The following figure shows the TPC-H performance test results.

Figure 1 Test results