This topic describes the design, process, and results of the TPC-H test for PolarDB-X.
Background information
TPC-H is a benchmark that is widely used to evaluate the analytic query capabilities of databases. It is developed and released by the Transaction Processing Performance Council (TPC). In the TPC-H benchmark, 22 complex SQL queries are performed on eight tables. Most queries contain joins on several tables, subqueries, and GROUP BY clauses.
The TPC-H performance tests described in this topic are derived from the TPC-H benchmark but do not meet all of its requirements. Therefore, the test results described in this topic are not comparable with published TPC-H benchmark results.
Test design
Amount of test data
The test results in this topic are obtained based on 100 GB of data, which corresponds to a scale factor of 100. The following list describes the amount of data in the major tables:
The LINEITEM table contains 600 million rows of data.
The ORDERS table contains 150 million rows of data.
The PARTSUPP table contains 80 million rows of data.
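These row counts follow directly from the scale factor (SF): the TPC-H specification defines a fixed base cardinality per table that is multiplied by SF. The sketch below recomputes the counts above from those base cardinalities (LINEITEM is approximate, because it averages about four rows per order, so the generated count varies slightly, as the data-verification step later shows).

```python
# Base table cardinalities at SF = 1 per the TPC-H specification.
SF = 100
base_rows = {
    "supplier": 10_000,
    "part": 200_000,
    "partsupp": 800_000,
    "customer": 150_000,
    "orders": 1_500_000,
    "lineitem": 6_000_000,  # approximate; ~4 rows per order on average
}

for table, rows in base_rows.items():
    print(f"{table}: {rows * SF:,} rows")
# NATION (25 rows) and REGION (5 rows) are fixed regardless of SF.
```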
Instance specifications for the TPC-H test
Node specifications | Number of nodes | Size of data set |
8 vCPUs, 64 GB of memory | 6 | 100 GB |
ECS instance type for stress testing: ecs.g7.4xlarge (16 vCPUs, 64 GB of memory, and a disk larger than 200 GB)
Procedure
Create an Elastic Compute Service (ECS) instance for stress testing.
Create an ECS instance that is used to prepare data and perform stress testing. The disk size of the ECS instance must be larger than 200 GB, because the CSV data sets generated by the test tools are stored on its disk.
Note: The ECS instance must be deployed in a virtual private cloud (VPC). Record the name and the ID of the VPC for future use. You must deploy all database instances that are described in subsequent steps in this VPC.
Create a PolarDB-X instance for stress testing.
Create a PolarDB-X instance. For more information, see Create a PolarDB-X instance.
Note: Make sure that the ECS instance and the PolarDB-X instance are in the same VPC.
Create a database to be tested in the instance. In this example, a database named tpch_100g is created. For more information about how to create a database, see Create a database.
CREATE DATABASE tpch_100g MODE = 'auto';
Execute the following statements to create tables in the tpch_100g database:
CREATE TABLE `customer` (
  `c_custkey` int(11) NOT NULL,
  `c_name` varchar(25) NOT NULL,
  `c_address` varchar(40) NOT NULL,
  `c_nationkey` int(11) NOT NULL,
  `c_phone` varchar(15) NOT NULL,
  `c_acctbal` decimal(15,2) NOT NULL,
  `c_mktsegment` varchar(10) NOT NULL,
  `c_comment` varchar(117) NOT NULL,
  PRIMARY KEY (`c_custkey`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY KEY(`c_custkey`) PARTITIONS 192;

CREATE TABLE `lineitem` (
  `l_orderkey` bigint(20) NOT NULL,
  `l_partkey` int(11) NOT NULL,
  `l_suppkey` int(11) NOT NULL,
  `l_linenumber` bigint(20) NOT NULL,
  `l_quantity` decimal(15,2) NOT NULL,
  `l_extendedprice` decimal(15,2) NOT NULL,
  `l_discount` decimal(15,2) NOT NULL,
  `l_tax` decimal(15,2) NOT NULL,
  `l_returnflag` varchar(1) NOT NULL,
  `l_linestatus` varchar(1) NOT NULL,
  `l_shipdate` date NOT NULL,
  `l_commitdate` date NOT NULL,
  `l_receiptdate` date NOT NULL,
  `l_shipinstruct` varchar(25) NOT NULL,
  `l_shipmode` varchar(10) NOT NULL,
  `l_comment` varchar(44) NOT NULL,
  KEY `IDX_LINEITEM_PARTKEY` (`l_partkey`),
  KEY `IDX_SUPPKEY` (`l_suppkey`),
  KEY `IDX_LINEITEM_SHIPDATE` (`l_shipdate`),
  PRIMARY KEY (`l_orderkey`,`l_linenumber`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY KEY(`l_orderkey`) PARTITIONS 192;

CREATE TABLE `orders` (
  `o_orderkey` bigint(20) NOT NULL,
  `o_custkey` int(11) NOT NULL,
  `o_orderstatus` varchar(1) NOT NULL,
  `o_totalprice` decimal(15,2) NOT NULL,
  `o_orderdate` date NOT NULL,
  `o_orderpriority` varchar(15) NOT NULL,
  `o_clerk` varchar(15) NOT NULL,
  `o_shippriority` bigint(20) NOT NULL,
  `o_comment` varchar(79) NOT NULL,
  PRIMARY KEY (`o_orderkey`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY KEY(`o_orderkey`) PARTITIONS 192;

CREATE TABLE `part` (
  `p_partkey` int(11) NOT NULL,
  `p_name` varchar(55) NOT NULL,
  `p_mfgr` varchar(25) NOT NULL,
  `p_brand` varchar(10) NOT NULL,
  `p_type` varchar(25) NOT NULL,
  `p_size` int(11) NOT NULL,
  `p_container` varchar(10) NOT NULL,
  `p_retailprice` decimal(15,2) NOT NULL,
  `p_comment` varchar(23) NOT NULL,
  PRIMARY KEY (`p_partkey`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY KEY(`p_partkey`) PARTITIONS 192;

CREATE TABLE `partsupp` (
  `ps_partkey` int(11) NOT NULL,
  `ps_suppkey` int(11) NOT NULL,
  `ps_availqty` int(11) NOT NULL,
  `ps_supplycost` decimal(15,2) NOT NULL,
  `ps_comment` varchar(199) NOT NULL,
  KEY `IDX_PARTSUPP_SUPPKEY` (`ps_suppkey`),
  PRIMARY KEY (`ps_partkey`,`ps_suppkey`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY KEY(`ps_partkey`) PARTITIONS 192;

CREATE TABLE `supplier` (
  `s_suppkey` int(11) NOT NULL,
  `s_name` varchar(25) NOT NULL,
  `s_address` varchar(40) NOT NULL,
  `s_nationkey` int(11) NOT NULL,
  `s_phone` varchar(15) NOT NULL,
  `s_acctbal` decimal(15,2) NOT NULL,
  `s_comment` varchar(101) NOT NULL,
  PRIMARY KEY (`s_suppkey`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY KEY(`s_suppkey`) PARTITIONS 192;

CREATE TABLE `nation` (
  `n_nationkey` int(11) NOT NULL,
  `n_name` varchar(25) NOT NULL,
  `n_regionkey` int(11) NOT NULL,
  `n_comment` varchar(152) DEFAULT NULL,
  PRIMARY KEY (`n_nationkey`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 BROADCAST;

CREATE TABLE `region` (
  `r_regionkey` int(11) NOT NULL,
  `r_name` varchar(25) NOT NULL,
  `r_comment` varchar(152) DEFAULT NULL,
  PRIMARY KEY (`r_regionkey`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 BROADCAST;
Adjust instance parameters.
Note: To achieve optimal performance in stress testing, adjust some parameters of the compute node of the PolarDB-X instance.
Change the values of the XPROTO_MAX_DN_CONCURRENT and XPROTO_MAX_DN_WAIT_CONNECTION parameters to 4000. For more information, see Parameter settings.
Connect to the PolarDB-X instance by using a command-line client. Then, execute the following SQL statements in the same session to disable logging and CPU statistics sampling and to configure parallelism for the test:
set GLOBAL RECORD_SQL = false;
set GLOBAL MPP_METRIC_LEVEL = 0;
set GLOBAL ENABLE_CPU_PROFILE = false;
set GLOBAL ENABLE_SORT_AGG = false;
set GLOBAL MPP_PARALLELISM = 192;
set GLOBAL ENABLE_MASTER_MPP = true;
set GLOBAL GROUP_PARALLELISM = 8;
Prepare data for stress testing.
Run the following command to download and decompress the tpchData.tar.gz script:
tar xzvf tpchData.tar.gz
cd tpchData/
vi params.conf
Specify the connection information about the PolarDB-X instance in the configuration file params.conf.
#!/bin/bash
### remote generating directory
export remoteGenDir=./
### target path
export targetPath=../tpch/tpchRaw
### cores per worker, default value is 1
export coresPerWorker=`cat /proc/cpuinfo| grep "processor"| wc -l`
### threads per worker, default value is 1
export threadsPerWorker=`cat /proc/cpuinfo| grep "processor"| wc -l`
#export threadsPerWorker=1
export hint=""
export insertMysql="mysql -h{HOST} -P{PORT} -u{USER} -p{PASSWORD} -Ac --local-infile tpch_100g -e"
You must specify the following parameters in the configuration file:
{HOST}: the host of the PolarDB-X instance.
{PORT}: the port number used to access the PolarDB-X instance.
{USER}: the username used to log on to the PolarDB-X instance.
{PASSWORD}: the password used to log on to the PolarDB-X instance.
To generate data more efficiently, set the threadsPerWorker parameter in the script to a larger value, such as the number of vCPUs on the ECS instance that is used for stress testing.
Run the script to generate 100 GB of data by using multiple processes.
cd datagen
sh generateTPCH.sh 100
You can view the generated data in the tpch/tpchRaw/SF100/ directory.
ls ../tpch/tpchRaw/SF100/
customer  lineitem  nation  orders  part  partsupp  region  supplier
Import the data to the PolarDB-X instance.
cd ../loadTpch
sh loadTpch.sh 100
Verify data integrity.
Connect to the PolarDB-X instance by using a command-line client and check whether the number of rows in each table is as expected.
MySQL [tpch_100g]> select
    (select count(*) from customer) as customer_cnt,
    (select count(*) from lineitem) as lineitem_cnt,
    (select count(*) from nation) as nation_cnt,
    (select count(*) from orders) as order_cnt,
    (select count(*) from part) as part_cnt,
    (select count(*) from partsupp) as partsupp_cnt,
    (select count(*) from region) as region_cnt,
    (select count(*) from supplier) as supplier_cnt;
+--------------+--------------+------------+-----------+----------+--------------+------------+--------------+
| customer_cnt | lineitem_cnt | nation_cnt | order_cnt | part_cnt | partsupp_cnt | region_cnt | supplier_cnt |
+--------------+--------------+------------+-----------+----------+--------------+------------+--------------+
|     15000000 |    600037902 |         25 | 150000000 | 20000000 |     80000000 |          5 |      1000000 |
+--------------+--------------+------------+-----------+----------+--------------+------------+--------------+
Collect statistics.
Connect to the PolarDB-X instance by using a command-line client and execute the ANALYZE TABLE statement to collect statistics on each table.
analyze table customer;
analyze table lineitem;
analyze table nation;
analyze table orders;
analyze table part;
analyze table partsupp;
analyze table region;
analyze table supplier;
Perform stress testing.
Download and decompress the test script package tpch-queries.tar.gz.
tar xzvf tpch-queries.tar.gz
Run the script to perform queries and record the time taken to run each query.
cd tpch-queries
'time' -f "%e" sh all_query.sh {HOST} {USER} {PASSWORD} {DB} {PORT}
Test results (5.4.15-16704996)
The following table lists the test results. Unit: seconds.
Query | Execution time |
Q1 | 48.45 |
Q2 | 2.06 |
Q3 | 14.49 |
Q4 | 3.35 |
Q5 | 9.11 |
Q6 | 10.32 |
Q7 | 29.77 |
Q8 | 10.82 |
Q9 | 52.06 |
Q10 | 6.47 |
Q11 | 9.76 |
Q12 | 11.88 |
Q13 | 6.69 |
Q14 | 2.86 |
Q15 | 6.8 |
Q16 | 1.92 |
Q17 | 2.12 |
Q18 | 13.35 |
Q19 | 4.26 |
Q20 | 13.66 |
Q21 | 18.69 |
Q22 | 3.22 |
Total duration | 280.21 |
Test results (5.4.15-16715927)
The following table lists the test results. Unit: seconds.
Query | Execution time |
Q1 | 49.62 |
Q2 | 2.08 |
Q3 | 14.67 |
Q4 | 3.36 |
Q5 | 9.26 |
Q6 | 10.48 |
Q7 | 30.34 |
Q8 | 10.94 |
Q9 | 50.84 |
Q10 | 6.37 |
Q11 | 9.27 |
Q12 | 12.17 |
Q13 | 7.06 |
Q14 | 2.9 |
Q15 | 6.76 |
Q16 | 2.01 |
Q17 | 2.1 |
Q18 | 13.7 |
Q19 | 4.05 |
Q20 | 13.62 |
Q21 | 18.42 |
Q22 | 3.33 |
Total duration | 280.21 |
Test results (5.4.16-16717637)
The following table lists the test results. Unit: seconds.
Query | Execution time |
Q1 | 49.37 |
Q2 | 2.18 |
Q3 | 14.82 |
Q4 | 3.34 |
Q5 | 9.4 |
Q6 | 10.21 |
Q7 | 30.77 |
Q8 | 10.99 |
Q9 | 51.17 |
Q10 | 6.58 |
Q11 | 9.91 |
Q12 | 11.97 |
Q13 | 7.97 |
Q14 | 3.06 |
Q15 | 6.89 |
Q16 | 2.08 |
Q17 | 2.02 |
Q18 | 13.83 |
Q19 | 4.3 |
Q20 | 13.6 |
Q21 | 18.78 |
Q22 | 3.31 |
Total duration | 286.55 |
Test results (5.4.17-16835173)
The following table lists the test results. Unit: seconds.
Query | Execution time |
Q1 | 47.37 |
Q2 | 1.84 |
Q3 | 15.31 |
Q4 | 3.33 |
Q5 | 9.17 |
Q6 | 9.58 |
Q7 | 30.32 |
Q8 | 10.95 |
Q9 | 52.17 |
Q10 | 7.04 |
Q11 | 9.43 |
Q12 | 12.11 |
Q13 | 7.41 |
Q14 | 3.08 |
Q15 | 6.87 |
Q16 | 2.04 |
Q17 | 2.08 |
Q18 | 14.06 |
Q19 | 4.43 |
Q20 | 13 |
Q21 | 18.9 |
Q22 | 3.39 |
Total duration | 283.88 |
Test results (5.4.18-16977056)
The following table lists the test results. Unit: seconds.
Query | Execution time |
Q1 | 50.63 |
Q2 | 3.15 |
Q3 | 15.36 |
Q4 | 3.62 |
Q5 | 10.06 |
Q6 | 10.03 |
Q7 | 30.51 |
Q8 | 10.91 |
Q9 | 49.74 |
Q10 | 8.05 |
Q11 | 4.1 |
Q12 | 12.2 |
Q13 | 5.45 |
Q14 | 2.68 |
Q15 | 7.02 |
Q16 | 2.18 |
Q17 | 2.16 |
Q18 | 14.2 |
Q19 | 3.66 |
Q20 | 14.07 |
Q21 | 19.04 |
Q22 | 3.46 |
Total duration | 282.28 |
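When comparing results across instance versions, it can be useful to recompute aggregates from the per-query rows. A minimal sketch that parses lines in the `Qn | seconds |` format used above and sums them (`total_seconds` is a hypothetical helper, not part of the test script package; note that the reported total duration is the measured end-to-end time, which need not equal the sum of the row values):

```python
def total_seconds(rows: str) -> float:
    """Sum the per-query times from lines formatted like 'Q1 | 48.45 |'."""
    total = 0.0
    for line in rows.strip().splitlines():
        parts = [p.strip() for p in line.split("|") if p.strip()]
        if len(parts) == 2 and parts[0].startswith("Q"):
            total += float(parts[1])
    return round(total, 2)

sample = """\
Q1 | 48.45 |
Q2 | 2.06 |
Q3 | 14.49 |
"""
print(total_seconds(sample))  # 65.0
```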