This topic describes the design, procedure, and results of the TPC Benchmark H (TPC-H) test for PolarDB-X 1.0.

TPC-H overview

TPC-H is a widely used decision support benchmark developed and released by the Transaction Processing Performance Council (TPC) to evaluate the query capabilities of databases. The TPC-H benchmark runs 22 complex SQL queries against 8 tables. Most queries contain multi-table joins, subqueries, and GROUP BY clauses.

Note This implementation of TPC-H is derived from the TPC-H Benchmark and is not comparable to published TPC-H Benchmark results, as this implementation does not comply with all the requirements of the TPC-H Benchmark.

Test design

  • Test environment for Enterprise Edition instances: one PolarDB-X 1.0 Enterprise Edition instance that contains two nodes, each with 16 CPU cores and 64 GB of memory, and four dedicated ApsaraDB RDS for MySQL 5.7 instances, each with 8 CPU cores and 32 GB of memory.
  • Test environment for Standard Edition instances: one PolarDB-X 1.0 Standard Edition instance that contains two nodes, each with 8 CPU cores and 32 GB of memory, and four dedicated ApsaraDB RDS for MySQL 5.7 instances, each with 4 CPU cores and 32 GB of memory.

The following test results are based on queries performed on 50 GB of data, which corresponds to a scale factor of 50. The main tables are the LINEITEM, ORDERS, and PARTSUPP tables. The LINEITEM table has approximately 300 million rows, the ORDERS table has 75 million rows, and the PARTSUPP table has 40 million rows.

In the following example, Q18 joins four table references that range from tens of millions to hundreds of millions of rows, including a subquery that is executed as a semi-join. In addition, Q18 aggregates on the columns specified in the GROUP BY clause. A PolarDB-X 1.0 instance takes approximately 11 seconds to execute this query.

select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
from CUSTOMER, ORDERS, LINEITEM
where o_orderkey in (
        select l_orderkey
        from LINEITEM
        group by l_orderkey
        having sum(l_quantity) > 314
    )
    and c_custkey = o_custkey
    and o_orderkey = l_orderkey
group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
order by o_totalprice desc, o_orderdate
limit 100;

Test procedure

Note In the following test procedure, the LoadData feature is used to import TPC-H data. If you want to use this feature, the kernel version of your instance must be V5.4.7-16000638 or later.
  1. Create an Elastic Compute Service (ECS) instance and use it as the stress test machine. Subsequent operations, such as preparing test data and running the stress test, are performed on this ECS instance.
    Note
    • We recommend that you select a virtual private cloud (VPC) when you create the ECS instance. If you use the classic network type, some specifications of ApsaraDB RDS for MySQL instances may be unavailable. Record the ID and name of the VPC. All instances described in subsequent steps must be deployed in this VPC, and all corresponding data must be stored in it.
    • We recommend that you use the latest Debian or CentOS image. This ensures that all the required dependency libraries are available during compilation.
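    Before you generate data in step 4, it can help to confirm that a C toolchain and unzip are present on the ECS instance (this assumes the data-generation script compiles the TPC-H dbgen tool locally; package names differ between Debian and CentOS):

```shell
# Check for the tools the data-preparation steps rely on: gcc and make
# to compile dbgen, and unzip to extract the script package.
for tool in gcc make unzip; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: ok"
  else
    echo "$tool: missing (install it first)"
  fi
done
```

    On CentOS, missing tools can typically be installed with `yum install gcc make unzip`; on Debian, with `apt-get install build-essential unzip`.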
  2. Create a PolarDB-X 1.0 instance and an ApsaraDB RDS for MySQL instance. Make sure that these instances are deployed in the same VPC as the ECS instance created in the preceding step.
  3. Create databases and tables on the PolarDB-X 1.0 instance. In this step, you must specify a sharding method. We recommend that you use the table schemas shown in the following code:
    CREATE TABLE `customer` (
      `c_custkey` int(11) NOT NULL,
      `c_name` varchar(25) NOT NULL,
      `c_address` varchar(40) NOT NULL,
      `c_nationkey` int(11) NOT NULL,
      `c_phone` varchar(15) NOT NULL,
      `c_acctbal` decimal(15,2) NOT NULL,
      `c_mktsegment` varchar(10) NOT NULL,
      `c_comment` varchar(117) NOT NULL,
      PRIMARY KEY (`c_custkey`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 dbpartition by hash(`c_custkey`) tbpartition by hash(`c_custkey`) tbpartitions 4;
    CREATE TABLE `lineitem` (
      `l_orderkey` bigint(20) NOT NULL,
      `l_partkey` int(11) NOT NULL,
      `l_suppkey` int(11) NOT NULL,
      `l_linenumber` bigint(20) NOT NULL,
      `l_quantity` decimal(15,2) NOT NULL,
      `l_extendedprice` decimal(15,2) NOT NULL,
      `l_discount` decimal(15,2) NOT NULL,
      `l_tax` decimal(15,2) NOT NULL,
      `l_returnflag` varchar(1) NOT NULL,
      `l_linestatus` varchar(1) NOT NULL,
      `l_shipdate` date NOT NULL,
      `l_commitdate` date NOT NULL,
      `l_receiptdate` date NOT NULL,
      `l_shipinstruct` varchar(25) NOT NULL,
      `l_shipmode` varchar(10) NOT NULL,
      `l_comment` varchar(44) NOT NULL,
      KEY `IDX_LINEITEM_SUPPKEY` (`l_suppkey`),
      KEY `IDX_LINEITEM_PARTKEY` (`l_partkey`),
      PRIMARY KEY (`l_orderkey`,`l_linenumber`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 dbpartition by RIGHT_SHIFT(`l_orderkey`,6) tbpartition by RIGHT_SHIFT(`l_orderkey`,6) tbpartitions 4;
    CREATE TABLE `orders` (
      `o_orderkey` bigint(20) NOT NULL,
      `o_custkey` int(11) NOT NULL,
      `o_orderstatus` varchar(1) NOT NULL,
      `o_totalprice` decimal(15,2) NOT NULL,
      `o_orderdate` date NOT NULL,
      `o_orderpriority` varchar(15) NOT NULL,
      `o_clerk` varchar(15) NOT NULL,
      `o_shippriority` bigint(20) NOT NULL,
      `o_comment` varchar(79) NOT NULL,
      PRIMARY KEY (`o_orderkey`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 dbpartition by RIGHT_SHIFT(`o_orderkey`,6) tbpartition by RIGHT_SHIFT(`o_orderkey`,6) tbpartitions 4;
    CREATE TABLE `part` (
      `p_partkey` int(11) NOT NULL,
      `p_name` varchar(55) NOT NULL,
      `p_mfgr` varchar(25) NOT NULL,
      `p_brand` varchar(10) NOT NULL,
      `p_type` varchar(25) NOT NULL,
      `p_size` int(11) NOT NULL,
      `p_container` varchar(10) NOT NULL,
      `p_retailprice` decimal(15,2) NOT NULL,
      `p_comment` varchar(23) NOT NULL,
      PRIMARY KEY (`p_partkey`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 dbpartition by hash(`p_partkey`) tbpartition by hash(`p_partkey`) tbpartitions 4;
    CREATE TABLE `partsupp` (
      `ps_partkey` int(11) NOT NULL,
      `ps_suppkey` int(11) NOT NULL,
      `ps_availqty` int(11) NOT NULL,
      `ps_supplycost` decimal(15,2) NOT NULL,
      `ps_comment` varchar(199) NOT NULL,
      KEY `IDX_PARTSUPP_SUPPKEY` (`ps_suppkey`),
      PRIMARY KEY (`ps_partkey`,`ps_suppkey`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 dbpartition by hash(`ps_partkey`) tbpartition by hash(`ps_partkey`) tbpartitions 4;
    CREATE TABLE `supplier` (
      `s_suppkey` int(11) NOT NULL,
      `s_name` varchar(25) NOT NULL,
      `s_address` varchar(40) NOT NULL,
      `s_nationkey` int(11) NOT NULL,
      `s_phone` varchar(15) NOT NULL,
      `s_acctbal` decimal(15,2) NOT NULL,
      `s_comment` varchar(101) NOT NULL,
      PRIMARY KEY (`s_suppkey`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 dbpartition by hash(`s_suppkey`) tbpartition by hash(`s_suppkey`) tbpartitions 4;
    CREATE TABLE `nation` (
      `n_nationkey` int(11) NOT NULL,
      `n_name` varchar(25) NOT NULL,
      `n_regionkey` int(11) NOT NULL,
      `n_comment` varchar(152) DEFAULT NULL,
      PRIMARY KEY (`n_nationkey`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 broadcast;
    CREATE TABLE `region` (
      `r_regionkey` int(11) NOT NULL,
      `r_name` varchar(25) NOT NULL,
      `r_comment` varchar(152) DEFAULT NULL,
      PRIMARY KEY (`r_regionkey`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 broadcast;
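    In the schemas above, the ORDERS and LINEITEM tables are both partitioned by RIGHT_SHIFT on the order key: shifting the key right by 6 bits maps 64 consecutive order keys to the same partition value, so an order and its line items land on the same shard and their join can be pushed down. The bit arithmetic can be sketched in the shell (the sample order keys are hypothetical):

```shell
# RIGHT_SHIFT(l_orderkey, 6) discards the low 6 bits, so any run of 64
# consecutive order keys shares one partition value (keys are made up).
for key in 6400 6401 6463 6464; do
  echo "orderkey=$key partition_value=$(( key >> 6 ))"
done
# orderkey=6400 partition_value=100
# orderkey=6401 partition_value=100
# orderkey=6463 partition_value=100
# orderkey=6464 partition_value=101
```

    The small NATION and REGION tables are created as broadcast tables instead: a full copy exists on every shard, so joins against them never need to cross shards.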
  4. Import data to the PolarDB-X 1.0 instance. Download the script from tpchData to the ECS instance.
    # Extract the script from the tpchData.zip file.
    unzip tpchData.zip
    cd tpchData-update-20210809
    # Edit the conf file.
    vim params.conf

    In the last line of the file, replace the placeholder connection information with the actual endpoint, user name, password, and database name of your PolarDB-X 1.0 instance.

    #!/bin/bash
    
    ### remote generating directory
    export remoteGenDir=./
    
    ### target path
    export targetPath=../tpch/tpchRaw
    export sourcePath=../tpch/tpchRaw
    
    ### cores per worker, default value is 1
    export coresPerWorker=1
    
    ### threads per worker, default value is 1
    export threadsPerWorker=1
    
    export hint=""
    ### If you need to test data of a different size, change the database name. For example, if you want to test 100 GB data, change the name to tpch_100g.
    export insertMysql="mysql -hxxxxxxxxxx.drds.aliyuncs.com -P3306 -uxxx -pxxxxxx -Ac --local-infile tpch_50g -e"

    Generate 50 GB of data.

    cd datagen
    # Generate data. This step needs to be performed only once; the generated data can be reused in later runs of this test.
    sh generateTPCH.sh 50
    # Load data into the PolarDB-X 1.0 instance.
    cd ../loadTpch
    sh loadTpch.sh 50

    Verify that the data was imported correctly by checking the row counts.

    MySQL [tpch_50g]> select (select count(*) from customer) as customer_cnt,
        ->        (select count(*)  from lineitem) as lineitem_cnt,
        ->        (select count(*)  from nation) as nation_cnt,
        ->        (select count(*)  from orders) as order_cnt,
        ->        (select count(*) from part) as part_cnt,
        ->        (select count(*) from partsupp) as partsupp_cnt,
        ->        (select count(*) from region) as region_cnt,
        ->        (select count(*) from supplier) as supplier_cnt;
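    In the TPC-H specification, most table cardinalities scale linearly with the scale factor (SF), while NATION and REGION are fixed at 25 and 5 rows. As a rough cross-check for the query above, the expected counts at SF = 50 can be computed as follows (LINEITEM has no fixed formula; it averages about 4 line items per order):

```shell
# Expected TPC-H row counts at scale factor 50, from the spec's base
# cardinalities (lineitem is approximate).
SF=50
echo "customer=$(( SF * 150000 ))"   # 7500000
echo "orders=$((   SF * 1500000 ))"  # 75000000
echo "part=$((     SF * 200000 ))"   # 10000000
echo "partsupp=$(( SF * 800000 ))"   # 40000000
echo "supplier=$(( SF * 10000 ))"    # 500000
echo "nation=25 region=5"
# lineitem is roughly 6000000 * SF, i.e. about 300 million rows at SF=50.
```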
  5. Execute the ANALYZE statement to collect statistics before you run the TPC-H test.
    analyze table customer;
    analyze table lineitem;
    analyze table nation;
    analyze table orders;
    analyze table part;
    analyze table partsupp;
    analyze table region;
    analyze table supplier;
  6. Run the TPC-H test.
    cd tpchData
    cd runTpch
    sh runTpch.sh
    # After the test is run, go to the result_fixed_mget directory and check the result of each SQL query.
    cd result_fixed_mget

Test results

Note PolarDB-X 1.0 Starter Edition instances do not support parallel queries. We recommend that you do not use these instances to perform analytical queries.
Query   Enterprise Edition (seconds)   Standard Edition (seconds)
Q01     55.82                          111.84
Q02     6.12                           11.54
Q03     15.99                          30
Q04     17.71                          36.56
Q05     10.89                          23.01
Q06     8.06                           16.76
Q07     17.09                          34.80
Q08     13.44                          26.09
Q09     53.81                          101.51
Q10     8.73                           19.67
Q11     18.25                          19.74
Q12     8.80                           18.60
Q13     14.15                          31.33
Q14     17.49                          42.43
Q15     20.62                          42.79
Q16     2.13                           4.15
Q17     1.93                           4.07
Q18     11.01                          22.82
Q19     12.97                          27.61
Q20     27.77                          49.25
Q21     38.84                          68.08
Q22     5.27                           11.29
Total   386.77                         754.65