
PolarDB: Performance test for parallel queries

Last Updated: Mar 05, 2025

This topic describes the OLAP performance test that is performed on a PolarDB for MySQL 8.0 Cluster Edition cluster based on the TPC-H benchmark. You can evaluate the performance of your cluster by using the methods described in this topic.

Test environment

  • The ECS instance and the PolarDB for MySQL cluster are deployed in the same region and zone, and in the same VPC.

  • PolarDB for MySQL cluster configurations:

    • One cluster

    • Database engine: MySQL 8.0.1 or MySQL 8.0.2

    • Product edition: Enterprise Edition

    • Edition: Cluster Edition

    • Specification type: Dedicated

    • Node specifications: polar.mysql.x8.4xlarge (32 cores and 256 GB)

    • Number of nodes: 2 (1 primary node and 1 read-only node)

    Note

    The primary endpoint is used to connect to the cluster in the test. For more information about how to view the primary endpoint of a PolarDB for MySQL cluster, see Manage the endpoints of a cluster.

  • ECS instance configurations:

    • One instance

    • Instance type: ecs.c5.4xlarge (16 vCPUs and 32 GiB of memory)

    • Image: CentOS 7.0 64-bit

    • System disk: ESSD with a capacity of 1,000 GB

Test tool

TPC-H is a commonly used benchmark that is developed and released by the Transaction Processing Performance Council (TPC) to evaluate the analytical query capabilities of databases. The TPC-H benchmark contains eight tables and 22 complex SQL statements. Most of the queries contain JOIN clauses on several tables, subqueries, and GROUP BY clauses.

Note

The TPC-H performance test described in this topic is implemented based on the TPC-H benchmark test but cannot meet all requirements of the TPC-H benchmark test. Therefore, the test results cannot be compared with the published results of the TPC-H benchmark test.

Install TPC-H tools

Important
  • Log on to the ECS instance as the root user to run the commands in the following steps.

  • The TPC-H version used in this topic is TPC-H_Tools_v2.18.0. You must register on the TPC website before you can download the tool.

To install TPC-H on the ECS instance, perform the following steps:

  1. Register and download the TPC-H package. Upload the TPC-H package to the ECS instance. For more information, see Upload files.

  2. Run the following command to decompress the TPC-H package. In the example, the TPC-H-Tool.zip file is used. Replace it with the actual file name.

    unzip TPC-H-Tool.zip
  3. After the decompression is complete, go to the dbgen folder. In the example, the TPC-H-Tool/dbgen path is used. Replace it with the actual path.

    cd TPC-H-Tool/dbgen
  4. Copy the makefile.suite file in this directory to create the makefile file.

    cp makefile.suite makefile
  5. Install GCC.

    sudo yum install gcc
    Note

    In this example, the CentOS operating system is used. If you are using another operating system, adjust the installation command accordingly. For example, in Ubuntu, run the command sudo apt install gcc.

  6. Modify the values of the CC, DATABASE, MACHINE, and WORKLOAD parameters in the makefile file.

    1. Open the makefile file.

      vim makefile
    2. Press the i key to enter edit mode, and then modify the parameter values as shown in the following example:

      ################
      ## CHANGE NAME OF ANSI COMPILER HERE
      ################
      CC      = gcc
      # Current values for DATABASE are: INFORMIX, DB2, TDAT (Teradata)
      #                                  SQLSERVER, SYBASE, ORACLE, VECTORWISE
      # Current values for MACHINE are:  ATT, DOS, HP, IBM, ICL, MVS, 
      #                                  SGI, SUN, U2200, VMS, LINUX, WIN32 
      # Current values for WORKLOAD are:  TPCH
      DATABASE= MYSQL
      MACHINE = LINUX
      WORKLOAD = TPCH
    3. Press the Esc key, enter :wq, and then press the Enter key to save your changes and exit the edit mode.

  7. Modify the tpcd.h file and add new macros.

    1. Open the tpcd.h file.

      vim tpcd.h
    2. Press the i key to enter edit mode, and then add the following macro definitions to the database portability defines section:

      #ifdef MYSQL
      #define GEN_QUERY_PLAN "EXPLAIN PLAN"
      #define START_TRAN "START TRANSACTION"
      #define END_TRAN "COMMIT"
      #define SET_OUTPUT ""
      #define SET_ROWCOUNT "limit %d;\n"
      #define SET_DBASE "use %s;\n"
      #endif
    3. Press the Esc key, enter :wq, and then press the Enter key to save and close the file.

  8. Compile the file.

    make

    After the file is compiled, the following executable files are generated in the directory:

    • dbgen: the tool that is used to generate data. If you use the InfiniDB test script, you must use this tool to generate data for TPC-H.

    • qgen: the tool that is used to generate SQL statements. This tool generates initial statements for the test. The queries generated by using different seeds are different from each other. To ensure that test results are repeatable, use the 22 complex SQL statements of TPC-H. You can click here to download the 22 statements.

  9. Use dbgen to generate test data.

    ./dbgen -s 100
    Note

    The -s parameter specifies the Scale Factor for generating data, which indicates the size of the benchmark dataset. -s 100 indicates that a test dataset of about 100 GB is generated (the specific size may vary slightly depending on the table structure and data distribution).
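After dbgen finishes, you can sanity-check the output before you load it. The following sketch lists the eight .tbl flat files that dbgen writes to the current directory and reports their sizes; the file names follow the TPC-H convention and are the same names referenced by load.ddl later in this topic.

```shell
# List the eight TPC-H flat files generated by dbgen and report their sizes.
# Run this in the dbgen directory after ./dbgen -s 100 completes.
for t in customer region nation supplier part partsupp orders lineitem; do
    if [ -f "$t.tbl" ]; then
        du -h "$t.tbl"
    else
        echo "missing: $t.tbl"
    fi
done
```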

  10. (Optional) Use TPC-H to generate queries.

    Note

    To ensure that test results are repeatable, skip this step and use the 22 complex SQL statements of TPC-H. You can click here to download the 22 statements.

    1. Copy qgen and dists.dss into the queries directory.

      cp qgen queries
      cp dists.dss queries
    2. Generate the queries.

      1. Create the generate_queries.sh script.

        vim generate_queries.sh
      2. Press the i key and add the following content to the file:

        #!/usr/bin/bash
        
        # Go to the query directory.
        cd queries
        
        # Generate 22 queries.
        for i in {1..22}
        do
          ./qgen -d $i -s 100 > db"$i".sql
        done
      3. Press the Esc key, enter :wq, and then press the Enter key to save and close the file.

      4. Set the script execution permissions.

        chmod +x generate_queries.sh
      5. Execute the generate_queries.sh script.

        ./generate_queries.sh
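After the script finishes, you can verify that all 22 query files were generated. A minimal check, assuming the queries directory layout and db<N>.sql naming used by the script above:

```shell
# Count the generated query files; 22 files named db1.sql..db22.sql are expected.
count=$(ls queries/db*.sql 2>/dev/null | wc -l)
echo "generated $count query files"
if [ "$count" -ne 22 ]; then
    echo "warning: expected 22 files"
fi
```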

Test procedure

Connect to the PolarDB for MySQL cluster from the ECS instance to initialize data and test data.

  1. Create an execution script to load data.

    1. Create the load.ddl file in the dbgen folder of TPC-H.

      vim load.ddl
    2. Press the i key and add the following content to the file:

      load data local INFILE 'customer.tbl' INTO TABLE customer FIELDS TERMINATED BY '|';
      load data local INFILE 'region.tbl' INTO TABLE region FIELDS TERMINATED BY '|';
      load data local INFILE 'nation.tbl' INTO TABLE nation FIELDS TERMINATED BY '|';
      load data local INFILE 'supplier.tbl' INTO TABLE supplier FIELDS TERMINATED BY '|';
      load data local INFILE 'part.tbl' INTO TABLE part FIELDS TERMINATED BY '|';
      load data local INFILE 'partsupp.tbl' INTO TABLE partsupp FIELDS TERMINATED BY '|';
      load data local INFILE 'orders.tbl' INTO TABLE orders FIELDS TERMINATED BY '|';
      load data local INFILE 'lineitem.tbl' INTO TABLE lineitem FIELDS TERMINATED BY '|';
    3. Press the Esc key, enter :wq, and then press the Enter key to save and close the file.

  2. Modify the dss.ri file to create the primary keys and foreign keys of the tables.

    1. Back up the dss.ri file, and then empty the original dss.ri file.

      cp dss.ri dss_bk.ri
      > dss.ri
    2. Open the dss.ri file.

      vim dss.ri
    3. Press the i key and add the following content to the file:

      use tpch100g;
      -- ALTER TABLE REGION DROP PRIMARY KEY;
      -- ALTER TABLE NATION DROP PRIMARY KEY;
      -- ALTER TABLE PART DROP PRIMARY KEY;
      -- ALTER TABLE SUPPLIER DROP PRIMARY KEY;
      -- ALTER TABLE PARTSUPP DROP PRIMARY KEY;
      -- ALTER TABLE ORDERS DROP PRIMARY KEY;
      -- ALTER TABLE LINEITEM DROP PRIMARY KEY;
      -- ALTER TABLE CUSTOMER DROP PRIMARY KEY;
      -- For table REGION
      ALTER TABLE REGION
      ADD PRIMARY KEY (R_REGIONKEY);
      -- For table NATION
      ALTER TABLE NATION
      ADD PRIMARY KEY (N_NATIONKEY);
      ALTER TABLE NATION
      ADD FOREIGN KEY NATION_FK1 (N_REGIONKEY) references REGION(R_REGIONKEY);
      COMMIT WORK;
      -- For table PART
      ALTER TABLE PART
      ADD PRIMARY KEY (P_PARTKEY);
      COMMIT WORK;
      -- For table SUPPLIER
      ALTER TABLE SUPPLIER
      ADD PRIMARY KEY (S_SUPPKEY);
      ALTER TABLE SUPPLIER
      ADD FOREIGN KEY SUPPLIER_FK1 (S_NATIONKEY) references NATION(N_NATIONKEY);
      COMMIT WORK;
      -- For table PARTSUPP
      ALTER TABLE PARTSUPP
      ADD PRIMARY KEY (PS_PARTKEY,PS_SUPPKEY);
      COMMIT WORK;
      -- For table CUSTOMER
      ALTER TABLE CUSTOMER
      ADD PRIMARY KEY (C_CUSTKEY);
      ALTER TABLE CUSTOMER
      ADD FOREIGN KEY CUSTOMER_FK1 (C_NATIONKEY) references NATION(N_NATIONKEY);
      COMMIT WORK;
      -- For table LINEITEM
      ALTER TABLE LINEITEM
      ADD PRIMARY KEY (L_ORDERKEY,L_LINENUMBER);
      COMMIT WORK;
      -- For table ORDERS
      ALTER TABLE ORDERS
      ADD PRIMARY KEY (O_ORDERKEY);
      COMMIT WORK;
      -- For table PARTSUPP
      ALTER TABLE PARTSUPP
      ADD FOREIGN KEY PARTSUPP_FK1 (PS_SUPPKEY) references SUPPLIER(S_SUPPKEY);
      COMMIT WORK;
      ALTER TABLE PARTSUPP
      ADD FOREIGN KEY PARTSUPP_FK2 (PS_PARTKEY) references PART(P_PARTKEY);
      COMMIT WORK;
      -- For table ORDERS
      ALTER TABLE ORDERS
      ADD FOREIGN KEY ORDERS_FK1 (O_CUSTKEY) references CUSTOMER(C_CUSTKEY);
      COMMIT WORK;
      -- For table LINEITEM
      ALTER TABLE LINEITEM
      ADD FOREIGN KEY LINEITEM_FK1 (L_ORDERKEY)  references ORDERS(O_ORDERKEY);
      COMMIT WORK;
      ALTER TABLE LINEITEM
      ADD FOREIGN KEY LINEITEM_FK2 (L_PARTKEY,L_SUPPKEY) references 
              PARTSUPP(PS_PARTKEY,PS_SUPPKEY);
      COMMIT WORK;
    4. Press the Esc key, enter :wq, and then press the Enter key to save and close the file.

  3. Install the MySQL client.

    sudo yum install mysql
    Note

    In this example, the CentOS operating system is used. If you are using another operating system, adjust the installation command accordingly.

  4. Use the privileged account and the primary endpoint of the cluster to connect to the PolarDB for MySQL cluster. For more information, see Create a privileged account and Connect to a database.

    mysql -h <Primary endpoint of the PolarDB cluster> -P <Port> -u <Username of the database> -p<Password of the database>
    Note

    If an error message "Access denied for user 'xxx'@'xxx' (using password: YES)" appears, the username or password is incorrect. Check whether the information you entered is correct.

  5. Create the tpch100g database and switch to it.

    CREATE DATABASE tpch100g;
    
    use tpch100g;
  6. Create a table.

    source ./dss.ddl
    Note

    dss.ddl is in the dbgen directory.

  7. Load data.

    source ./load.ddl
    Note

    load.ddl is in the dbgen directory and was created in the first step.
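Note that LOAD DATA LOCAL INFILE requires the MySQL client to allow loading local files. If the error message "ERROR 1148: The used command is not allowed with this MySQL version" appears, reconnect with local file loading enabled on the client side, as sketched below with placeholder connection values. If the server still rejects the statement, the local_infile parameter must also be enabled on the cluster.

```shell
# Reconnect with local file loading enabled on the client side.
# Replace the placeholders with your cluster's endpoint and account.
mysql --local-infile=1 -h <Primary endpoint of the PolarDB cluster> -P <Port> -u <Username of the database> -p
```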

  8. Create primary keys and foreign keys.

    source ./dss.ri
    Note

    dss.ri is in the dbgen directory and was modified in the second step.

  9. (Optional) Create an index.

    You must run the exit; command to exit the MySQL client. Then, create a script on the ECS instance and execute it. Sample script:

    #!/usr/bin/bash
    host=$1
    port=$2
    user=$3
    password=$4
    db=$5
    sqls=("create index i_s_nationkey on supplier (s_nationkey);"
    "create index i_ps_partkey on partsupp (ps_partkey);"
    "create index i_ps_suppkey on partsupp (ps_suppkey);"
    "create index i_c_nationkey on customer (c_nationkey);"
    "create index i_o_custkey on orders (o_custkey);"
    "create index i_o_orderdate on orders (o_orderdate);"
    "create index i_l_orderkey on lineitem (l_orderkey);"
    "create index i_l_partkey on lineitem (l_partkey);"
    "create index i_l_suppkey on lineitem (l_suppkey);"
    "create index i_l_partkey_suppkey on lineitem (l_partkey, l_suppkey);"
    "create index i_l_shipdate on lineitem (l_shipdate);"
    "create index i_l_commitdate on lineitem (l_commitdate);"
    "create index i_l_receiptdate on lineitem (l_receiptdate);"
    "create index i_n_regionkey on nation (n_regionkey);"
    "analyze table supplier"
    "analyze table part"
    "analyze table partsupp"
    "analyze table customer"
    "analyze table orders"
    "analyze table lineitem"
    "analyze table nation"
    "analyze table region")
    for sql in "${sqls[@]}"
    do
        mysql -h$host -P$port -u$user -p$password -D$db  -e "$sql"
    done
    Note
    • Before you execute the script, make sure that it has execute permissions. In this example, the create_indexes.sh script is used. You can run the chmod +x create_indexes.sh command to grant the permissions.

    • When you execute the script, you must pass the required parameters. In this example, the script is executed by running the ./create_indexes.sh <host> <port> <user> <password> <database> command. The following parameters are used in the script:

      • <host>: the endpoint of the cluster.

      • <port>: the port.

      • <user>: the username of the database.

      • <password>: the password of the database.

      • <database>: the name of the database.

    • If the warning message "[Warning] Using a password on the command line interface can be insecure" appears when the script is executed, it is because the script passes the password in plaintext on the command line in the mysql -h <database connection address> -P <database port> -u <database account> -p<database password> -e "<SQL>" command. Other users on the system may be able to view the password by running commands such as ps. This message does not affect the test. After the test is complete, you can change the password of the database in the PolarDB console.

    • To measure the performance improvement delivered by parallel queries, execute the following script to load the indexes into the buffer pool:

      #!/bin/bash
      host=$1
      port=$2
      user=$3
      password=$4
      dbname=$5
      MYSQL="mysql -h$host -P$port -u$user -p$password -D$dbname"
      if [ -z ${dbname} ]; then
          echo "dbname not defined."
          exit 1
      fi
      table_indexes=(
              "supplier PRIMARY"
              "supplier i_s_nationkey"
              "part PRIMARY"
              "partsupp PRIMARY"
              "partsupp i_ps_partkey"
              "partsupp i_ps_suppkey"
              "customer PRIMARY"
              "customer i_c_nationkey"
              "orders PRIMARY"
              "orders i_o_custkey"
              "orders i_o_orderdate"
              "lineitem PRIMARY"
              "lineitem i_l_orderkey"
              "lineitem i_l_partkey"
              "lineitem i_l_suppkey"
              "lineitem i_l_partkey_suppkey"
              "lineitem i_l_shipdate"
              "lineitem i_l_commitdate"
              "lineitem i_l_receiptdate"
              "nation i_n_regionkey"
              "nation PRIMARY"
              "region PRIMARY"
      )
      for table_index in "${table_indexes[@]}"
      do
          ti=($table_index)
          table=${ti[0]}
          index=${ti[1]}
          SQL="select count(*) from ${table} force index(${index})"
          echo "$MYSQL -e '$SQL'"
          $MYSQL -e "$SQL"
      done
  10. Run a query.

    You must run the exit; command to exit the MySQL client. Then, create a script on the ECS instance and execute it. Sample script:

    #!/usr/bin/env bash
    host=$1
    port=$2
    user=$3
    password=$4
    database=$5
    resfile=$6
    echo "start test run at"`date "+%Y-%m-%d %H:%M:%S"`|tee -a ${resfile}.out
    for (( i=1; i<=22;i=i+1 ))
    do
    queryfile="./queries/Q"${i}".sql"
    start_time=`date "+%s.%N"`
    echo "run query ${i}"|tee -a ${resfile}.out
    mysql -h ${host}  -P${port} -u${user} -p${password} $database -e" source $queryfile;" |tee -a ${resfile}.out
    end_time=`date "+%s.%N"`
    start_s=${start_time%.*}
    start_nanos=${start_time#*.}
    end_s=${end_time%.*}
    end_nanos=${end_time#*.}
    if [ "$end_nanos" -lt "$start_nanos" ];then
            end_s=$(( 10#$end_s -1 ))
            end_nanos=$(( 10#$end_nanos + 10 ** 9))
    fi
    time=$(( 10#$end_s - 10#$start_s )).`printf "%03d\n" $(( (10#$end_nanos - 10#$start_nanos)/10**6 ))`
    echo ${queryfile}" query "${i}" cost "${time}" second start at "`date -d @$start_time "+%Y-%m-%d %H:%M:%S"`" stop at "`date -d @$end_time "+%Y-%m-%d %H:%M:%S"` >> ${resfile}.time
    done
    Note
    • Before you execute the script, confirm the following items:

      • Make sure that the script has execute permissions. In this example, the run_queries.sh script is used. You can run the chmod +x run_queries.sh command to grant the permissions.

      • Check your current working directory. The script in the example is run from the dbgen directory. If you run the script from another directory, modify queryfile="./queries/Q"${i}".sql" accordingly. Otherwise, the error message "ERROR at line 1: Failed to open file 'QXX.sql'" appears.

      • Make sure that you have uploaded and decompressed the 22 queries in attached files to the dbgen/queries directory. If you use other query statements, modify queryfile="./queries/Q"${i}".sql" as needed. Otherwise, the error message "ERROR at line 1: Failed to open file 'QXX.sql'" appears.

    • When you execute the script, you must pass the required parameters. In this example, the script is executed by running the ./run_queries.sh <host> <port> <user> <password> <database> <resfile> command. The following parameters are used in the script:

      • <host>: the endpoint of the cluster.

      • <port>: the port.

      • <user>: the username of the database.

      • <password>: the password of the database.

      • <database>: the name of the database.

      • <resfile>: the name of the result file.

    • If the warning message "[Warning] Using a password on the command line interface can be insecure" appears when the script is executed, it is because the script passes the password in plaintext on the command line in the mysql -h <database connection address> -P <database port> -u <database account> -p<database password> -e "<SQL>" command. Other users on the system may be able to view the password by running commands such as ps. This message does not affect the test. After the test is complete, you can change the password of the database in the PolarDB console.
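Once the run completes, the per-query timings accumulate in the <resfile>.time file. The following sketch totals and averages them; it assumes the "cost <seconds> second" wording written by the script above, and result is a placeholder for the <resfile> prefix that you passed in.

```shell
# Summarize per-query runtimes recorded by run_queries.sh.
resfile=result   # placeholder: the <resfile> prefix passed to run_queries.sh
if [ -f "${resfile}.time" ]; then
    awk '{ for (i = 1; i <= NF; i++) if ($i == "cost") { sum += $(i + 1); n++ } }
         END { printf "queries: %d, total: %.3f s, avg: %.3f s\n", n, sum, n ? sum / n : 0 }' "${resfile}.time"
else
    echo "${resfile}.time not found"
fi
```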

Results

For more information, see Parallel query performance.
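As a rough way to observe the effect of parallel queries on your own cluster, you can run the same query at different degrees of parallelism and compare the elapsed times. The sketch below assumes that the cluster exposes the max_parallel_degree session variable to control parallel query; verify the variable name and supported values for your cluster version, and replace the placeholder connection values.

```shell
# Run Q1 with parallel query disabled, then with a parallel degree of 32.
# Assumption: max_parallel_degree controls parallel query on the cluster.
for degree in 0 32; do
    echo "running Q1 with max_parallel_degree=$degree"
    { echo "SET max_parallel_degree = $degree;"; cat ./queries/Q1.sql; } | \
        mysql -h <host> -P <port> -u <user> -p<password> -D tpch100g
done
```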