All Products
Search
Document Center

Test the DRDS performance based on TPC-H 50G

Last Updated: Sep 17, 2020

Introduction to TPC-H

Developed and released by the Transaction Processing Performance Council (TPC), the TPC Benchmark H (TPC-H) is widely used in the industry to evaluate analytical query capabilities of databases. The TPC-H testing suite contains 8 data tables and 22 complex SQL queries. Most of the queries contain multi-table join operations, subqueries, and aggregation operations based on GROUP BY clauses.

Test design

The following test results are based on 50 GB data with the scale factor of 50. Major tables and their data volumes are as follows: The LINEITEM table contains approximately 300 million rows of data, the ORDERS table contains 75 million rows of data, and the PARSUPP table contains 40 million rows of data.

The following takes Q18 as an example. Q18 contains join operations on four tables with tens of millions of data rows to hundreds of millions of data rows, including a Semi-Join operation of subqueries. In addition, Q18 contains aggregation operations based on the columns specified in the GROUP BY clause. A Distributed Relational Database Service (DRDS) instance of the Enterprise Edition took approximately 11 seconds to execute this query.

  1. select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
  2. from CUSTOMER, ORDERS, LINEITEM
  3. where o_orderkey in (
  4. select l_orderkey
  5. from LINEITEM
  6. group by l_orderkey
  7. having sum(l_quantity) > 314
  8. )
  9. and c_custkey = o_custkey
  10. and o_orderkey = l_orderkey
  11. group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
  12. order by o_totalprice desc, o_orderdate
  13. limit 100;

For more information about TPC-H, see Analysis of 22 SQL statements in TPC-H.

Test results

  • Test environment for the Enterprise Edition: 32-core 128 GB DRDS instance of the Enterprise Edition (16-core 64 GB per DRDS server node) and four 8-core 32 GB dedicated ApsaraDB RDS for MySQL instances 5.7
  • Test environment for the Standard Edition: 16-core 64 GB DRDS instance of the Standard Edition (8-core 32 GB per DRDS server node) and four 4-core 32 GB dedicated ApsaraDB RDS for MySQL instances 5.7

Note: DRDS instances of the Starter Edition do not support parallel query. We recommend that you not use them to execute analytical queries.

Query Enterprise Edition (Unit: seconds) Standard Edition (Unit: seconds)
Q01 55.82 111.84
Q02 6.12 11.54
Q03 15.99 30
Q04 17.71 36.56
Q05 10.89 23.01
Q06 8.06 16.76
Q07 17.09 34.80
Q08 13.44 26.09
Q09 53.81 101.51
Q10 8.73 19.67
Q11 18.25 19.74
Q12 8.80 18.60
Q13 14.15 31.33
Q14 17.49 42.43
Q15 20.62 42.79
Q16 2.13 4.15
Q17 1.93 4.07
Q18 11.01 22.82
Q19 12.97 27.61
Q20 27.77 49.25
Q21 38.84 68.08
Q22 5.27 11.29
Total 386.77 754.65

Test process

Step 1 Prepare an ECS instance as the stress test machine

First, prepare an Elastic Compute Service (ECS) instance on which you will perform all the following operations such as preparing data and running stress tests.

  • We recommend that you select VPC. ApsaraDB RDS for MySQL instances of the classic network type for some specifications may have insufficient inventory. You need to record the ID and name of the VPC. All the instances that are described in subsequent steps will be deployed in this VPC.

  • We recommend that you use the latest Debian or CentOS image to prevent lack of dependency libraries during compilation.

Step 2 Prepare an ApsaraDB RDS for MySQL instance for data transfer

Prepare an ApsaraDB RDS for MySQL instance to import data, because TPC-H DBGEN produces .tbl files, which must be imported into MySQL databases by running the LOAD DATA command that is not supported by DRDS.

Create and deploy an ApsaraDB RDS for MySQL instance in the same VPC as the ECS instance.

Download and compile TPC-H DBGEN on the ECS instance. For more information, see Use TPC-H to test performance and Import TPC-H data into a MySQL database.

Run the LOAD DATA command to import the generated data into the ApsaraDB RDS for MySQL instance:

  1. mysql --local-infile -hrm-XXXXXXXXXX.mysql.rds.aliyuncs.com -u<user> -p<password>
  1. LOAD DATA LOCAL INFILE 'customer.tbl' INTO TABLE customer FIELDS TERMINATED BY '|';

You can also import data of other tables by following the preceding step.

Step 3 Prepare the DRDS instance that you want to test

Create a DRDS instance and the corresponding ApsaraDB RDS for MySQL instances in the same VPC with the ECS instance.

Create a database and tables in the DRDS instance and specify database and table sharding methods. We recommend that you use the following table structure:

  1. CREATE TABLE `customer` (
  2. `c_custkey` int(11) NOT NULL,
  3. `c_name` varchar(25) NOT NULL,
  4. `c_address` varchar(40) NOT NULL,
  5. `c_nationkey` int(11) NOT NULL,
  6. `c_phone` varchar(15) NOT NULL,
  7. `c_acctbal` decimal(15,2) NOT NULL,
  8. `c_mktsegment` varchar(10) NOT NULL,
  9. `c_comment` varchar(117) NOT NULL,
  10. PRIMARY KEY (`c_custkey`)
  11. ) ENGINE=InnoDB DEFAULT CHARSET=latin1 dbpartition by hash(`c_custkey`) tbpartition by hash(`c_custkey`) tbpartitions 4;
  12. CREATE TABLE `lineitem` (
  13. `l_orderkey` int(11) NOT NULL,
  14. `l_partkey` int(11) NOT NULL,
  15. `l_suppkey` int(11) NOT NULL,
  16. `l_linenumber` int(11) NOT NULL,
  17. `l_quantity` decimal(15,2) NOT NULL,
  18. `l_extendedprice` decimal(15,2) NOT NULL,
  19. `l_discount` decimal(15,2) NOT NULL,
  20. `l_tax` decimal(15,2) NOT NULL,
  21. `l_returnflag` varchar(1) NOT NULL,
  22. `l_linestatus` varchar(1) NOT NULL,
  23. `l_shipdate` date NOT NULL,
  24. `l_commitdate` date NOT NULL,
  25. `l_receiptdate` date NOT NULL,
  26. `l_shipinstruct` varchar(25) NOT NULL,
  27. `l_shipmode` varchar(10) NOT NULL,
  28. `l_comment` varchar(44) NOT NULL,
  29. KEY `IDX_LINEITEM_PARTKEY` (`l_partkey`),
  30. PRIMARY KEY (`l_orderkey`,`l_linenumber`)
  31. ) ENGINE=InnoDB DEFAULT CHARSET=latin1 dbpartition by RIGHT_SHIFT(`l_orderkey`,6) tbpartition by RIGHT_SHIFT(`l_orderkey`,6) tbpartitions 4;
  32. CREATE TABLE `orders` (
  33. `o_orderkey` int(11) NOT NULL,
  34. `o_custkey` int(11) NOT NULL,
  35. `o_orderstatus` varchar(1) NOT NULL,
  36. `o_totalprice` decimal(15,2) NOT NULL,
  37. `o_orderdate` date NOT NULL,
  38. `o_orderpriority` varchar(15) NOT NULL,
  39. `o_clerk` varchar(15) NOT NULL,
  40. `o_shippriority` int(11) NOT NULL,
  41. `o_comment` varchar(79) NOT NULL,
  42. PRIMARY KEY (`O_ORDERKEY`)
  43. ) ENGINE=InnoDB DEFAULT CHARSET=latin1 dbpartition by RIGHT_SHIFT(`O_ORDERKEY`,6) tbpartition by RIGHT_SHIFT(`O_ORDERKEY`,6) tbpartitions 4;
  44. CREATE TABLE `part` (
  45. `p_partkey` int(11) NOT NULL,
  46. `p_name` varchar(55) NOT NULL,
  47. `p_mfgr` varchar(25) NOT NULL,
  48. `p_brand` varchar(10) NOT NULL,
  49. `p_type` varchar(25) NOT NULL,
  50. `p_size` int(11) NOT NULL,
  51. `p_container` varchar(10) NOT NULL,
  52. `p_retailprice` decimal(15,2) NOT NULL,
  53. `p_comment` varchar(23) NOT NULL,
  54. PRIMARY KEY (`p_partkey`)
  55. ) ENGINE=InnoDB DEFAULT CHARSET=latin1 dbpartition by hash(`p_partkey`) tbpartition by hash(`p_partkey`) tbpartitions 4;
  56. CREATE TABLE `partsupp` (
  57. `ps_partkey` int(11) NOT NULL,
  58. `ps_suppkey` int(11) NOT NULL,
  59. `ps_availqty` int(11) NOT NULL,
  60. `ps_supplycost` decimal(15,2) NOT NULL,
  61. `ps_comment` varchar(199) NOT NULL,
  62. KEY `IDX_PARTSUPP_SUPPKEY` (`PS_SUPPKEY`),
  63. PRIMARY KEY (`ps_partkey`,`ps_suppkey`)
  64. ) ENGINE=InnoDB DEFAULT CHARSET=latin1 dbpartition by hash(`ps_partkey`) tbpartition by hash(`ps_partkey`) tbpartitions 4;
  65. CREATE TABLE `supplier` (
  66. `s_suppkey` int(11) NOT NULL,
  67. `s_name` varchar(25) NOT NULL,
  68. `s_address` varchar(40) NOT NULL,
  69. `s_nationkey` int(11) NOT NULL,
  70. `s_phone` varchar(15) NOT NULL,
  71. `s_acctbal` decimal(15,2) NOT NULL,
  72. `s_comment` varchar(101) NOT NULL,
  73. PRIMARY KEY (`s_suppkey`)
  74. ) ENGINE=InnoDB DEFAULT CHARSET=latin1 dbpartition by hash(`s_suppkey`) tbpartition by hash(`s_suppkey`) tbpartitions 4;
  75. CREATE TABLE `nation` (
  76. `n_nationkey` int(11) NOT NULL,
  77. `n_name` varchar(25) NOT NULL,
  78. `n_regionkey` int(11) NOT NULL,
  79. `n_comment` varchar(152) DEFAULT NULL,
  80. PRIMARY KEY (`n_nationkey`)
  81. ) ENGINE=InnoDB DEFAULT CHARSET=latin1 broadcast;
  82. CREATE TABLE `region` (
  83. `r_regionkey` int(11) NOT NULL,
  84. `r_name` varchar(25) NOT NULL,
  85. `r_comment` varchar(152) DEFAULT NULL,
  86. PRIMARY KEY (`r_regionkey`)
  87. ) ENGINE=InnoDB DEFAULT CHARSET=latin1 broadcast;

Step 4 Synchronize data to the DRDS instance by using DTS

For detailed procedure, see Synchronize data to the DRDS instance.

Step 5 Perform a TPC-H test

Before you run a TPC-H test, run the ANALYZE command to collect statistics. This operation is required because the current statistics may be incorrect after a large amount of data was imported.

  1. analyze table customer;
  2. analyze table lineitem;
  3. analyze table nation;
  4. analyze table orders;
  5. analyze table part;
  6. analyze table partsupp;
  7. analyze table region;
  8. analyze table supplier;

On the ECS instance, connect the MySQL client to the DRDS instance to execute TPC-H queries.