This topic describes how to use TPC Benchmark H (TPC-H) to test the performance of StarRocks.
Prerequisites
An EMR Serverless StarRocks instance is created. For more information, see Create an instance.
Background information
TPC-H is a decision support benchmark that is developed by the Transaction Processing Performance Council (TPC). TPC-H is widely used in academia and industry to evaluate the performance of decision support systems.
TPC-H models production data to simulate the data warehouse of a sales system. In this example, 22 SQL queries are run against nine tables in each of two datasets, which are 1 GB and 100 GB in size. The table at the end of this section describes the datasets and lists the tables.
The tests in this example are performed based on the TPC-H benchmark, but the tests do not meet all requirements of the TPC-H benchmark test. As a result, the test results in this example may not match the published results of the TPC-H benchmark test.
We recommend that you use E-MapReduce (EMR) Serverless StarRocks instances of the following specifications to perform tests:
1 GB dataset: one frontend (FE) that has eight compute units (CUs) and three backends (BEs) or compute nodes (CNs), each of which has eight CUs
100 GB dataset: one FE that has eight CUs and three BEs or CNs, each of which has 16 CUs
Data size | Description | Tables |
100 GB | Tests the StarRocks performance based on a TPC-H 100 GB dataset. | customer, lineitem, nation, orders, part, partsupp, region, revenue0, supplier |
1 GB | Tests the StarRocks performance based on a TPC-H 1 GB dataset. | customer, lineitem, nation, orders, part, partsupp, region, revenue0, supplier |
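To get a feel for how the two dataset sizes differ, the following minimal Python sketch estimates the number of rows in each base table. It assumes that the 1 GB and 100 GB datasets correspond to TPC-H scale factors 1 and 100 and were generated with the standard cardinalities from the TPC-H specification. The sample data in your instance may differ. The names `BASE_ROWS`, `FIXED_ROWS`, and `expected_rows` are made up for this illustration, and revenue0 is left out because this topic does not state how it is populated.

```python
# Approximate row counts at scale factor 1, per the TPC-H specification.
# Assumption: the sample datasets follow these standard cardinalities.
BASE_ROWS = {
    "supplier": 10_000,
    "part": 200_000,
    "partsupp": 800_000,
    "customer": 150_000,
    "orders": 1_500_000,
    "lineitem": 6_000_000,  # approximate: the exact count is slightly higher
}

# nation and region have a fixed size regardless of the scale factor.
FIXED_ROWS = {"nation": 25, "region": 5}


def expected_rows(scale_factor: int) -> dict[str, int]:
    """Return approximate row counts per table for the given scale factor."""
    rows = {name: base * scale_factor for name, base in BASE_ROWS.items()}
    rows.update(FIXED_ROWS)
    return rows


if __name__ == "__main__":
    for sf in (1, 100):
        print(f"Scale factor {sf}:")
        for table, count in sorted(expected_rows(sf).items()):
            print(f"  {table:<10} {count:>13,}")
```

These estimates are only a rough sanity check. They can be compared with the row counts that you see after the test data is loaded in Step 2.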
Step 1: Go to the SQL editor
Go to the Instances tab of the E-MapReduce (EMR) console.
Log on to the EMR console.
In the left-side navigation pane, choose .
In the top navigation bar, select a region based on your business requirements.
Click StarRocks Manager, or find the instance that you created and click Connect in the operation column.
For more information, see Use EMR StarRocks Manager to connect to an EMR Serverless StarRocks instance.
In the left-side navigation pane, click SQL Editor.
On the Queries tab, view the files that contain the sample code of TPC-H tests.
Step 2: Run tests
This section describes how to test the StarRocks performance based on TPC-H datasets of 1 GB and 100 GB.
TPC-H 100 GB dataset
Initialize the database and tables.
On the Queries tab, double-click the file named "TPC-H-100G - 01. Initialize the database and tables". In the SQL editor that appears, view the SQL statements used to initialize the database and tables.
Click Run to execute the SQL statements. After the SQL statements are executed, the database and tables are initialized.
Load the test data.
On the Queries tab, double-click the file named "TPC-H-100G - 02. Load the test data". In the SQL editor that appears, view the SQL statements used to load the 100 GB of test data.
Click Run to execute the SQL statements. After the SQL statements are executed, the test data is loaded.
Execute the SQL statements for testing.
On the Queries tab, double-click the file named "TPC-H-100G - 03. Execute the SQL statements for testing". In the SQL editor that appears, view the SQL statements used to run the test on the 100 GB of test data.
Click Run to execute the SQL statements and view the returned test results.
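Before you interpret the test results, you may want to confirm that the load step filled every table. The following Python sketch builds a single SQL statement that returns one row count per table, which you could paste into the SQL editor. It is an illustration, not part of the sample files: the function `build_count_query` and the database name `tpch_100g` are hypothetical, and revenue0 is omitted on the assumption that it is a helper object for one query and not a loaded base table.

```python
# Table names as listed in this topic, without revenue0 (see the assumption above).
TPCH_TABLES = (
    "customer", "lineitem", "nation", "orders",
    "part", "partsupp", "region", "supplier",
)


def build_count_query(database: str, tables=TPCH_TABLES) -> str:
    """Build one SQL statement that returns a row count for each table."""
    parts = [
        f"SELECT '{t}' AS table_name, COUNT(*) AS row_count "
        f"FROM `{database}`.`{t}`"
        for t in tables
    ]
    return "\nUNION ALL\n".join(parts) + ";"


if __name__ == "__main__":
    # "tpch_100g" is a hypothetical name. Use the database created in step 1.
    print(build_count_query("tpch_100g"))
```

The same function can be used for the 1 GB dataset by passing the name of that database instead.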
TPC-H 1 GB dataset
Initialize the database and tables.
On the Queries tab, double-click the file named "TPC-H-1G - 01. Initialize the database and tables". In the SQL editor that appears, view the SQL statements used to initialize the database and tables.
Click Run to execute the SQL statements. After the SQL statements are executed, the database and tables are initialized.
Load the test data.
On the Queries tab, double-click the file named "TPC-H-1G - 02. Load the test data". In the SQL editor that appears, view the SQL statements used to load the 1 GB of test data.
Click Run to execute the SQL statements. After the SQL statements are executed, the test data is loaded.
Execute the SQL statements for testing.
On the Queries tab, double-click the file named "TPC-H-1G - 03. Execute the SQL statements for testing". In the SQL editor that appears, view the SQL statements used to run the test on the 1 GB of test data.
Click Run to execute the SQL statements and view the returned test results.
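After both tests finish, it can be useful to reduce the per-query execution times to a few summary numbers so that the 1 GB and 100 GB runs are easier to compare. The following Python sketch computes the total, the arithmetic mean, and the geometric mean of a set of timings. It assumes that you copy the execution times from the returned results by hand. The function `summarize` and the sample values are hypothetical and are not measured results.

```python
import math


def summarize(timings: dict[str, float]) -> dict[str, float]:
    """Return the total, arithmetic mean, and geometric mean of query times."""
    values = list(timings.values())
    if not values or any(v <= 0 for v in values):
        raise ValueError("timings must be non-empty and strictly positive")
    total = sum(values)
    # Geometric mean via logarithms to avoid overflow on long lists.
    geomean = math.exp(sum(math.log(v) for v in values) / len(values))
    return {
        "total_s": total,
        "mean_s": total / len(values),
        "geomean_s": geomean,
    }


if __name__ == "__main__":
    # Hypothetical numbers for illustration only, in seconds.
    sample = {"Q1": 2.0, "Q2": 0.5, "Q3": 1.0}
    for name, value in summarize(sample).items():
        print(f"{name}: {value:.3f}")
```

The geometric mean is less sensitive to a single slow query than the total, so reporting both gives a more balanced picture of the 22 queries.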