All Products
Search
Document Center

PolarDB:TPC-H test

Last Updated:Dec 28, 2023

This topic describes the design, process, and results of the TPC-H test for PolarDB-X.

Background information

TPC-H is a benchmark that is widely used to evaluate the analytic query capabilities of databases. It is developed and released by Transaction Processing Performance Council (TPC). In the TPC-H benchmark, 22 complex SQL queries are performed on eight tables. Most queries contain joins on several tables, subqueries, and GROUP BY clauses.

Note

The TPC-H performance tests described in this topic are implemented based on the TPC-H benchmark test but cannot meet all requirements of the TPC-H benchmark test. Therefore, the test results described in this topic cannot be compared with the published results of the TPC-H benchmark test.

Test design

  • Amount of test data

    The test results in this topic are obtained based on 100 GB of data, which indicates that the scalar factor is 100. The following list describes the amount of data in each major table:

    • The LINEITEM table contains 600 million rows of data.

    • The ORDERS table contains 150 million rows of data.

    • The PART_SUPP table contains 80 million rows of data.

  • Instance specifications for the TPC-H test

    Node specifications

    Number of nodes

    Size of data set

    8 vCPUs, 64 GB of memory

    6

    100 GB

  • ECS instance type for stress testing

    ecs.g7.4xlarge (16 vCPUs, 64 GB of memory, and a disk larger than 200 GB)

Procedure

  1. Create an Elastic Compute Service (ECS) instance for stress testing.

    Create an ECS instance that is used to prepare data and perform stress testing. The disk size of the ECS instance must be larger than 200 GB. The data sets in the CSV format generated by the tools are stored in the disk of the ECS instance.

    Note

    The ECS instance must be deployed in a virtual private cloud (VPC). Record the name and the ID of the VPC for future use. You must deploy all database instances that are described in subsequent steps in this VPC.

  2. Create a PolarDB-X instance for stress testing.

    1. Create a PolarDB-X instance. For more information about how to create a PolarDB-X instance, see Create a PolarDB-X instance.

      Note

      Make sure that the ECS instance and the PolarDB-X instance are in the same VPC.

    2. Create a database to be tested in the instance. In this example, a database named tpch_100g is created. For more information about how to create a database, see Create a database.

      CREATE DATABASE tpch_100g MODE = 'auto';
    3. Execute the following statements to create tables in the tpch_100g database:

      CREATE TABLE `customer` (
        `c_custkey` int(11) NOT NULL,
        `c_name` varchar(25) NOT NULL,
        `c_address` varchar(40) NOT NULL,
        `c_nationkey` int(11) NOT NULL,
        `c_phone` varchar(15) NOT NULL,
        `c_acctbal` decimal(15,2) NOT NULL,
        `c_mktsegment` varchar(10) NOT NULL,
        `c_comment` varchar(117) NOT NULL,
        PRIMARY KEY (`c_custkey`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY KEY(`c_custkey`)
        PARTITIONS 192;
      
      CREATE TABLE `lineitem` (
        `l_orderkey` bigint(20) NOT NULL,
        `l_partkey` int(11) NOT NULL,
        `l_suppkey` int(11) NOT NULL,
        `l_linenumber` bigint(20) NOT NULL,
        `l_quantity` decimal(15,2) NOT NULL,
        `l_extendedprice` decimal(15,2) NOT NULL,
        `l_discount` decimal(15,2) NOT NULL,
        `l_tax` decimal(15,2) NOT NULL,
        `l_returnflag` varchar(1) NOT NULL,
        `l_linestatus` varchar(1) NOT NULL,
        `l_shipdate` date NOT NULL,
        `l_commitdate` date NOT NULL,
        `l_receiptdate` date NOT NULL,
        `l_shipinstruct` varchar(25) NOT NULL,
        `l_shipmode` varchar(10) NOT NULL,
        `l_comment` varchar(44) NOT NULL,
        KEY `IDX_LINEITEM_PARTKEY` (`l_partkey`),
       KEY `IDX_SUPPKEY` (`l_suppkey`),
        KEY `IDX_LINEITEM_SHIPDATE` (`l_shipdate`),
        PRIMARY KEY (`l_orderkey`,`l_linenumber`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY KEY(`l_orderkey`)
        PARTITIONS 192;
      
      CREATE TABLE `orders` (
        `o_orderkey` bigint(20) NOT NULL,
        `o_custkey` int(11) NOT NULL,
        `o_orderstatus` varchar(1) NOT NULL,
        `o_totalprice` decimal(15,2) NOT NULL,
        `o_orderdate` date NOT NULL,
        `o_orderpriority` varchar(15) NOT NULL,
        `o_clerk` varchar(15) NOT NULL,
        `o_shippriority` bigint(20) NOT NULL,
        `o_comment` varchar(79) NOT NULL,
        PRIMARY KEY (`O_ORDERKEY`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY KEY(`o_orderkey`)
        PARTITIONS 192;
      
      CREATE TABLE `part` (
        `p_partkey` int(11) NOT NULL,
        `p_name` varchar(55) NOT NULL,
        `p_mfgr` varchar(25) NOT NULL,
        `p_brand` varchar(10) NOT NULL,
        `p_type` varchar(25) NOT NULL,
        `p_size` int(11) NOT NULL,
        `p_container` varchar(10) NOT NULL,
        `p_retailprice` decimal(15,2) NOT NULL,
        `p_comment` varchar(23) NOT NULL,
        PRIMARY KEY (`p_partkey`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY KEY(`p_partkey`)
        PARTITIONS 192;
      
      CREATE TABLE `partsupp` (
        `ps_partkey` int(11) NOT NULL,
        `ps_suppkey` int(11) NOT NULL,
        `ps_availqty` int(11) NOT NULL,
        `ps_supplycost` decimal(15,2) NOT NULL,
        `ps_comment` varchar(199) NOT NULL,
        KEY `IDX_PARTSUPP_SUPPKEY` (`PS_SUPPKEY`),
        PRIMARY KEY (`ps_partkey`,`ps_suppkey`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY KEY(`ps_partkey`)
        PARTITIONS 192;
      
      CREATE TABLE `supplier` (
        `s_suppkey` int(11) NOT NULL,
        `s_name` varchar(25) NOT NULL,
        `s_address` varchar(40) NOT NULL,
        `s_nationkey` int(11) NOT NULL,
        `s_phone` varchar(15) NOT NULL,
        `s_acctbal` decimal(15,2) NOT NULL,
        `s_comment` varchar(101) NOT NULL,
        PRIMARY KEY (`s_suppkey`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY KEY(`s_suppkey`)
       PARTITIONS 192;
      
      CREATE TABLE `nation` (
        `n_nationkey` int(11) NOT NULL,
        `n_name` varchar(25) NOT NULL,
        `n_regionkey` int(11) NOT NULL,
        `n_comment` varchar(152) DEFAULT NULL,
        PRIMARY KEY (`n_nationkey`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 broadcast;
      
      CREATE TABLE `region` (
        `r_regionkey` int(11) NOT NULL,
        `r_name` varchar(25) NOT NULL,
        `r_comment` varchar(152) DEFAULT NULL,
        PRIMARY KEY (`r_regionkey`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 broadcast;
  3. Adjust instance parameters.

    Note

    To achieve optimal performance in stress testing, adjust some parameters of the compute node of the PolarDB-X instance.

    1. Change the values of the XPROTO_MAX_DN_CONCURRENT and XPROTO_MAX_DN_WAIT_CONNECTION parameters to 4000. For more information, see Parameter settings.

    2. Connect to the PolarDB-X instance by using a command-line client. Then, execute the following SQL statements in the same session to disable logging and CPU statistic sampling:

      set GLOBAL RECORD_SQL = false;
      set GLOBAL MPP_METRIC_LEVEL = 0;
      set GLOBAL ENABLE_CPU_PROFILE = false;
      set GLOBAL ENABLE_SORT_AGG=false;
      set GLOBAL MPP_PARALLELISM=192;
      set GLOBAL ENABLE_MASTER_MPP=true;
      set GLOBAL GROUP_PARALLELISM=8;
  4. Prepare data for stress testing.

    1. Run the following command to download and decompress the tpchData.tar.gz script:

      tar xzvf tpchData.tar.gz
      cd tpchData/
      vi params.conf

      Specify the connection information about the PolarDB-X instance in the configuration file param.conf.

      #!/bin/bash
      
      ### remote generating directory
      export remoteGenDir=./
      
      ### target path
      export targetPath=../tpch/tpchRaw
      
      ### cores per worker, default value is 1
      export coresPerWorker=`cat /proc/cpuinfo| grep "processor"| wc -l`
      
      ### threads per worker, default value is 1
      export threadsPerWorker=`cat /proc/cpuinfo| grep "processor"| wc -l`
      #export threadsPerWorker=1
      
      export hint=""
      
      
      export insertMysql="mysql -h{HOST} -P{PORT} -u{USER} -p{PASSWORD} -Ac --local-infile tpch_100g -e"

      You must specify the following parameters in the configuration file:

      • {HOST}: the host of the PolarDB-X instance.

      • {PORT}: the port number used to access the PolarDB-X instance

      • {USER}: the username used to log on to the PolarDB-X instance.

      • {PASSWORD}: the password used to log on to the PolarDB-X instance.

      To more efficiently generate data, set the threadsPerWorker parameter in the script to a greater value, such as the number of vCPUs on the ECS instance that is used for stress testing.

    2. Run the script to generate 100 GB of data by using multiple processes.

      cd datagen
      sh generateTPCH.sh 100

      You can view the generated data in the tpch/tpchRaw/SF100/ directory.

      ls ../tpch/tpchRaw/SF100/
      customer  lineitem  nation  orders  part  partsupp  region  supplier
    3. Import the data to the PolarDB-X instance.

      cd ../loadTpch
      sh loadTpch.sh 100
    4. Verify data integrity.

      Connect to the PolarDB-X instance by using a command-line client and check whether the amount of data stored in each table meets your expectation.

      MySQL [tpch_100g]> select (select count(*) from customer) as customer_cnt,
       (select count(*)  from lineitem) as lineitem_cnt,
       (select count(*)  from nation) as nation_cnt,
       (select count(*)  from orders) as order_cnt,
       (select count(*) from part) as part_cnt,
       (select count(*) from partsupp) as partsupp_cnt,
       (select count(*) from region) as region_cnt,
       (select count(*) from supplier) as supplier_cnt;
      +--------------+--------------+------------+-----------+----------+--------------+------------+--------------+
      | customer_cnt | lineitem_cnt | nation_cnt | order_cnt | part_cnt | partsupp_cnt | region_cnt | supplier_cnt |
      +--------------+--------------+------------+-----------+----------+--------------+------------+--------------+
      |     15000000 |    600037902 |         25 | 150000000 | 20000000 |     80000000 |          5 |      1000000 |
      +--------------+--------------+------------+-----------+----------+--------------+------------+--------------+
    5. Collect statistics.

      Connect to the PolarDB-X instance by using a command-line client and run the analyze table command to collect the statistics of each table.

      analyze table customer;
      analyze table lineitem;
      analyze table nation;
      analyze table orders;
      analyze table part;
      analyze table partsupp;
      analyze table region;
      analyze table supplier;
  5. Perform stress testing.

    1. Download and decompress the test script package tpch-queries.tar.gz.

      tar xzvf tpch-queries.tar.gz
    2. Run the script to perform queries and record the time taken to run each query.

      cd tpch-queries
      'time' -f "%e" sh all_query.sh {HOST} {USER} {PASSWORD} {DB} {PORT}

Test results (5.4.15-16704996)

The following figure shows the test results. Unit: seconds.

456789

Q1

48.45

Q2

2.06

Q3

14.49

Q4

3.35

Q5

9.11

Q6

10.32

Q7

29.77

Q8

10.82

Q9

52.06

Q10

6.47

Q11

9.76

Q12

11.88

Q13

6.69

Q14

2.86

Q15

6.8

Q16

1.92

Q17

2.12

Q18

13.35

Q19

4.26

Q20

13.66

Q21

18.69

Q22

3.22

Total duration

280.21

Test results (5.4.15-16715927)

The following figure shows the test results. Unit: seconds.

456789

Q1

49.62

Q2

2.08

Q3

14.67

Q4

3.36

Q5

9.26

Q6

10.48

Q7

30.34

Q8

10.94

Q9

50.84

Q10

6.37

Q11

9.27

Q12

12.17

Q13

7.06

Q14

2.9

Q15

6.76

Q16

2.01

Q17

2.1

Q18

13.7

Q19

4.05

Q20

13.62

Q21

18.42

Q22

3.33

Total duration

280.21

Test results (5.4.16-16717637)

The following figure shows the test results. Unit: seconds.

456789

Q1

49.37

Q2

2.18

Q3

14.82

Q4

3.34

Q5

9.4

Q6

10.21

Q7

30.77

Q8

10.99

Q9

51.17

Q10

6.58

Q11

9.91

Q12

11.97

Q13

7.97

Q14

3.06

Q15

6.89

Q16

2.08

Q17

2.02

Q18

13.83

Q19

4.3

Q20

13.6

Q21

18.78

Q22

3.31

Total duration

286.55

Test results (5.4.17-16835173)

The following figure shows the test results. Unit: seconds.

456789

Q1

47.37

Q2

1.84

Q3

15.31

Q4

3.33

Q5

9.17

Q6

9.58

Q7

30.32

Q8

10.95

Q9

52.17

Q10

7.04

Q11

9.43

Q12

12.11

Q13

7.41

Q14

3.08

Q15

6.87

Q16

2.04

Q17

2.08

Q18

14.06

Q19

4.43

Q20

13

Q21

18.9

Q22

3.39

Total duration

283.88

Test results (5.4.18-16977056)

The following figure shows the test results. Unit: seconds.

UUUUUUUUU.png

Q1

50.63

Q2

3.15

Q3

15.36

Q4

3.62

Q5

10.06

Q6

10.03

Q7

30.51

Q8

10.91

Q9

49.74

Q10

8.05

Q11

4.1

Q12

12.2

Q13

5.45

Q14

2.68

Q15

7.02

Q16

2.18

Q17

2.16

Q18

14.2

Q19

3.66

Q20

14.07

Q21

19.04

Q22

3.46

Total duration

282.28