Background information
TPC-H is a benchmark that is widely used to evaluate the analytic query capabilities of databases. It is developed and released by Transaction Processing Performance Council (TPC). In the TPC-H benchmark, 22 complex SQL queries are performed on eight tables. Most queries contain joins on several tables, subqueries, and GROUP BY clauses.
In this example, a test based on the TPC-H benchmark is implemented, but it does not meet all the requirements of a TPC-H benchmark test. Therefore, the test results are incomparable with the published results of the TPC-H benchmark test.
Test design
Data volume
1 TB (Scalar Factor=1000), consisting of the following parts:
LINEITEM: approximately 6 billion lines
ORDERS: 1.5 billion lines
PART_SUPP: 0.8 billion lines
Instance specifications for the test
Node: 16c128g
Number of nodes: 3, 4, and 6
Dataset size: 1,000 GB
ECS instance type for the test
ecs.g7.4xlarge (with 16 vCPUs and 64 GB RAM)
Test procedure
Step 1: Prepare the ECS instance
Prepare an Elastic Compute Service (ECS) instance that meets the preceding requirements. The ECS instance will be used in later steps of importing data and applying loads.
Deploy the ECS instance in a virtual private cloud (VPC) and record the VPC name and ID. Other required resources will also be deployed in this VPC later.
We recommend that you run CentOS and enable Internet access on this ECS instance.
Step 2: Prepare a PolarDB-X instance for the test
Create a PolarDB-X instance. For more information, see Create a PolarDB-X instance.
You must deploy the PolarDB-X instance in the same VPC as the ECS instance.
You must add the internal IP address of the ECS instance to the whitelist of the PolarDB-X instance.
Step 3: Adjust the PolarDB-X instance
Specific parameters of the PolarDB-X instance must be adjusted to provide optimal performance in the test.
Change the value of the XPROTO_MAX_DN_CONCURRENT parameter to 4000. For more information, see Set parameters.
Connect to the PolarDB-X instance by using the CLI and run the following SQL statements in the same session to disable logging, CPU sampling, and statistics collection:
set global RECORD_SQL = false; set global MPP_METRIC_LEVEL = 0; set global ENABLE_CPU_PROFILE = false; set global ENABLE_BACKGROUND_STATISTIC_COLLECTION=false; set global ENABLE_STATISTIC_FEEDBACK=false;
Step 4: Install Benchmark Boot
Open the 4121 port in the security group of the ECS instance. For more information, see Add a security group rule.
Log on to the ECS instance and run the
bash -c "$(curl -fsSL https://benchmark-boot.oss-cn-hangzhou.aliyuncs.com/setup.sh)"command to install Benchmark Boot. For more information, see Download and install Benchmark Boot.Visit http://{Public IP address of the ECS instance}:4121 in the browser. If the Benchmark Boot homepage appears, the installation is successful.
Step 5: Prepare data
On the Benchmark Boot homepage, choose . On the page that appears, configure the host IP address, port, username, password, TPC-H library, and library creation mode (AUTO) to create a PolarDB-X connection. Then, click Submit.
Choose , set the size of the input dataset to 1,000 GB and the number of physical shards to 16, and then click Submit to import 1,000 GB of test data to the primary PolarDB-X instance.
NoteAfter the data is imported, Benchmark Boot automatically executes analyze table to collect statistics.
After the data is imported, execute the following SQL statements on the primary PolarDB-X instance to create a Clustered Columnar Index (CCI). For more information, see Create and use a CCI.
create clustered columnar index `nation_col_index` on nation(`n_nationkey`) partition by hash(`n_nationkey`) partitions 1; create clustered columnar index `region_col_index` on region(`r_regionkey`) partition by hash(`r_regionkey`) partitions 1; create clustered columnar index `customer_col_index` on customer(`c_custkey`) partition by hash(`c_custkey`) partitions 96; create clustered columnar index `part_col_index` on part(`p_size`) partition by hash(`p_partkey`) partitions 96; create clustered columnar index `partsupp_col_index` on partsupp(`ps_partkey`) partition by hash(`ps_partkey`) partitions 96; create clustered columnar index `supplier_col_index` on supplier(`s_suppkey`) partition by hash(`s_suppkey`) partitions 96; create clustered columnar index `orders_col_index` on orders(`o_orderdate`,`o_orderkey`) partition by hash(`o_orderkey`) partitions 96; create clustered columnar index `lineitem_col_index` on lineitem(`l_shipdate`,`l_orderkey`) partition by hash(`l_orderkey`) partitions 96;NoteYou can execute
show columnar indexto view the status of the CCI. For more information, see SHOW COLUMNAR INDEX.It takes about 45 minutes to create a CCI.
Step 6: Add a read-only column store instance
Add a read-only column store instance to the primary PolarDB-X instance. For more information, see Add a column store read-only instance.
You must also add the internal IP address of the ECS instance to the whitelist of the read-only PolarDB-X column store instance.
Step 7: Adjust the read-only column store instance
For optimal performance to be achieved in the test, you must also adjust specific parameters of the read-only PolarDB-X column store instance.
Connect to the read-only PolarDB-X column store instance by using the CLI and run the following SQL statements in the same session to set parameters:
set global SCHEDULE_BY_PARTITION=true;
set global BLOCK_CACHE_MEMORY_SIZE_FACTOR=0.3;
set global OSS_FS_USE_BYTES_CACHE=true;
set global OSS_FS_MEMORY_RATIO_OF_BYTES_CACHE=0.3;
set global ONLY_CACHE_PRIMARY_KEY_IN_BLOCK_CACHE=true;
set global MPP_TASK_LOCAL_MAX_BUFFER_SIZE=8000000000;
set global GLOBAL_RF_ROWS_UPPER_BOUND=50000000;Step 8: Run the TPC-H test
Choose . On the page that appears, configure the TPC-H library name and library creation mode (Auto Detect) to create a connection string for the read-only column store instance. Then, click Submit.
Choose , select All Queries, specify a task description, and click Submit to run the 22 query statements for the TPC-H test.
Choose to view the test results.
NoteA warm-up is required if the TPC-H test is run for the first time. Stable results are produced after several rounds of test running.
Test results
The following table lists the test results. Unit: seconds.
Query | 3*16C128 GB | 4*16C128 GB | 6*16C128 GB |
Q1 | 28.15 | 24.81 | 14 |
Q2 | 11.55 | 9.02 | 5.12 |
Q3 | 20.98 | 16.54 | 8.73 |
Q4 | 16.26 | 14.59 | 7.06 |
Q5 | 28.25 | 21.28 | 14.01 |
Q6 | 7.76 | 6.84 | 1.95 |
Q7 | 25.81 | 17.83 | 11.14 |
Q8 | 23.97 | 21.38 | 10.37 |
Q9 | 71.63 | 57.86 | 36.24 |
Q10 | 35.98 | 27.91 | 16.85 |
Q11 | 6.61 | 4.65 | 2.54 |
Q12 | 8.46 | 7.1 | 3.37 |
Q13 | 33.15 | 23.87 | 16.8 |
Q14 | 9.32 | 8.1 | 3.31 |
Q15 | 7.44 | 6.2 | 3.11 |
Q16 | 10.18 | 6.07 | 4.08 |
Q17 | 18.2 | 11.97 | 7.46 |
Q18 | 47.26 | 38.65 | 20.15 |
Q19 | 24.86 | 21.81 | 13.48 |
Q20 | 11.16 | 8.59 | 5.35 |
Q21 | 50.24 | 44.85 | 22.76 |
Q22 | 11.54 | 8.53 | 6.05 |
Total time | 508.76 | 408.45 | 233.93 |