E-MapReduce: Test cases

Last Updated: Dec 12, 2024

This topic describes how to use TPC Benchmark H (TPC-H) to test the performance of StarRocks.

Prerequisites

An EMR Serverless StarRocks instance is created. For more information, see Create an instance.

Background information

TPC-H is a decision support benchmark developed by the Transaction Processing Performance Council (TPC). It is widely used in academia and industry to evaluate the performance of decision support systems.

TPC-H models production data to simulate the data warehouse of a sales system. In this example, 22 SQL queries are run against nine tables in each of two datasets, one 1 GB and one 100 GB in size. The following table describes the datasets and lists the tables.

Note
  • The tests in this example are based on the TPC-H benchmark but do not meet all of its requirements. As a result, the test results in this example may not match published TPC-H results.

  • We recommend that you use E-MapReduce (EMR) Serverless StarRocks instances of the following specifications to perform tests:

    • 1 GB dataset: one frontend (FE) with 8 compute units (CUs), and three backends (BEs) or compute nodes (CNs) with 8 CUs each

    • 100 GB dataset: one FE with 8 CUs, and three BEs or CNs with 16 CUs each
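The recommended specifications above can be totaled to compare the two configurations. The following is a minimal sketch of that arithmetic only. The `InstanceSpec` class and its field names are illustrative assumptions and are not part of any EMR API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceSpec:
    """Hypothetical container for the node counts and CU sizes listed in the note."""
    fe_count: int   # number of frontends (FEs)
    fe_cus: int     # CUs per FE
    be_count: int   # number of backends (BEs) or compute nodes (CNs)
    be_cus: int     # CUs per BE or CN

    def total_cus(self) -> int:
        # Total CUs = FE CUs plus BE/CN CUs.
        return self.fe_count * self.fe_cus + self.be_count * self.be_cus


# Values copied from the recommendation in the note.
RECOMMENDED = {
    "1 GB": InstanceSpec(fe_count=1, fe_cus=8, be_count=3, be_cus=8),
    "100 GB": InstanceSpec(fe_count=1, fe_cus=8, be_count=3, be_cus=16),
}

if __name__ == "__main__":
    for dataset, spec in RECOMMENDED.items():
        print(f"{dataset}: {spec.total_cus()} CUs in total")
```

Under these figures, the 1 GB configuration totals 32 CUs and the 100 GB configuration totals 56 CUs.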

Data size | Description | Tables
100 GB | Tests the StarRocks performance based on a TPC-H 100 GB dataset. | customer, lineitem, nation, orders, part, partsupp, region, revenue0, supplier
1 GB | Tests the StarRocks performance based on a TPC-H 1 GB dataset. | customer, lineitem, nation, orders, part, partsupp, region, revenue0, supplier
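To get a feel for how large each table is at the two data sizes, the following sketch estimates row counts from the base cardinalities in the TPC-H specification. It assumes that the 1 GB and 100 GB datasets correspond to TPC-H scale factors 1 and 100, which this topic does not state explicitly. The lineitem figure is approximate because its row count does not scale exactly linearly. revenue0 is omitted because the specification does not define a row count for it.

```python
# Row counts at scale factor 1, from the TPC-H specification.
# lineitem is rounded: the exact count is slightly above 6 million.
BASE_ROWS = {
    "customer": 150_000,
    "lineitem": 6_000_000,
    "nation": 25,
    "orders": 1_500_000,
    "part": 200_000,
    "partsupp": 800_000,
    "region": 5,
    "supplier": 10_000,
}

# nation and region have a fixed size regardless of the scale factor.
FIXED_TABLES = {"nation", "region"}


def estimated_rows(scale_factor: int) -> dict:
    """Return an approximate row count per table for the given scale factor."""
    return {
        table: rows if table in FIXED_TABLES else rows * scale_factor
        for table, rows in BASE_ROWS.items()
    }


if __name__ == "__main__":
    # Assumption: 1 GB ~ scale factor 1, 100 GB ~ scale factor 100.
    for label, sf in (("1 GB", 1), ("100 GB", 100)):
        print(label)
        for table, rows in estimated_rows(sf).items():
            print(f"  {table:<10} ~{rows:,}")
```

The estimate shows why the 100 GB test is recommended on larger BEs or CNs: lineitem grows to roughly 600 million rows, while nation and region stay the same size.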

Step 1: Go to the SQL editor

  1. Go to the Instances tab of the E-MapReduce (EMR) console.

    1. Log on to the EMR console.

    2. In the left-side navigation pane, choose EMR Serverless > StarRocks.

    3. In the top navigation bar, select a region based on your business requirements.

    4. Click StarRocks Manager, or click Connect in the operation column of the instance that you created.

      For more information, see Use EMR StarRocks Manager to connect to an EMR Serverless StarRocks instance.

  2. In the left-side navigation pane, click SQL Editor.

  3. On the Queries tab, view the files that contain the sample code of TPC-H tests.

Step 2: Run tests

This section describes how to test the StarRocks performance based on TPC-H datasets of 1 GB and 100 GB.

TPC-H 100 GB dataset

  1. Initialize the database and tables.

    1. On the Queries tab, double-click TPC-H-100G - 01. Initialize the database and tables. In the SQL editor that appears, view the SQL statements used to initialize the database and tables.

    2. Click Run to execute the SQL statements. After the SQL statements are executed, the database and tables are initialized.

  2. Load the test data.

    1. On the Queries tab, double-click TPC-H-100G - 02. Load the test data. In the SQL editor that appears, view the SQL statements used to load the test data that is 100 GB in size.

    2. Click Run to execute the SQL statements. After the SQL statements are executed, the test data is loaded.

  3. Execute the SQL statements for testing.

    1. On the Queries tab, double-click TPC-H-100G - 03. Execute the SQL statements for testing. In the SQL editor that appears, view the SQL statements used to perform a test on 100 GB of test data.

    2. Click Run to execute the SQL statements and view the returned test results.

TPC-H 1 GB dataset

  1. Initialize the database and tables.

    1. On the Queries tab, double-click TPC-H-1G - 01. Initialize the database and tables. In the SQL editor that appears, view the SQL statements used to initialize the database and tables.

    2. Click Run to execute the SQL statements. After the SQL statements are executed, the database and tables are initialized.

  2. Load the test data.

    1. On the Queries tab, double-click TPC-H-1G - 02. Load the test data. In the SQL editor that appears, view the SQL statements used to load the test data that is 1 GB in size.

    2. Click Run to execute the SQL statements. After the SQL statements are executed, the test data is loaded.

  3. Execute the SQL statements for testing.

    1. On the Queries tab, double-click TPC-H-1G - 03. Execute the SQL statements for testing. In the SQL editor that appears, view the SQL statements used to perform a test on 1 GB of test data.

    2. Click Run to execute the SQL statements and view the returned test results.
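The SQL editor returns the test results directly. If you also want to record per-query latency yourself, for example to compare the 1 GB and 100 GB runs, a small timing harness along the following lines could be used. This is a sketch under stated assumptions and is not how the SQL editor measures results: `run_query` is a hypothetical callable that you supply (for instance, a wrapper around a MySQL-compatible client connected to the instance), and the query names and SQL text shown are placeholders.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple


def time_queries(
    queries: Iterable[Tuple[str, str]],
    run_query: Callable[[str], object],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Run each (name, sql) pair once and return elapsed seconds keyed by name."""
    timings: Dict[str, float] = {}
    for name, sql in queries:
        start = clock()
        run_query(sql)  # hypothetical executor supplied by the caller
        timings[name] = clock() - start
    return timings


def summarize(timings: Dict[str, float]) -> Tuple[float, Optional[str]]:
    """Return the total elapsed time and the name of the slowest query."""
    total = sum(timings.values())
    slowest = max(timings, key=timings.get) if timings else None
    return total, slowest


if __name__ == "__main__":
    # Placeholder queries and a stand-in executor that does no real work.
    sample = [("Q1", "SELECT 1"), ("Q2", "SELECT 2")]
    results = time_queries(sample, run_query=lambda sql: None)
    total, slowest = summarize(results)
    print(f"total: {total:.6f} s, slowest: {slowest}")
```

Passing the executor and the clock as parameters keeps the harness independent of any particular client library and makes it easy to check with a fake clock.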