MaxCompute: TPC-DS performance testing

Last Updated: Jan 09, 2024

MaxCompute is designed for high-performance queries over terabytes, petabytes, or even exabytes of data. This topic describes how to run the TPC-DS big data benchmark against the public datasets and with the test tool that MaxCompute provides, so that you can verify the performance of MaxCompute.

Preparations

  1. Prepare an environment.

    • Before you perform a TPC-DS test, activate MaxCompute and create a project. For more information, see Create a project.

    • Activate MaxCompute Query Acceleration (MCQA) for a subscription MaxCompute project. For more information, see MaxCompute Query Acceleration.

  2. Prepare a test tool.

    MaxCompute provides a TPC-DS automated performance test tool to help you quickly complete a TPC-DS test and automatically generate test results.

    Important

    The test tool runs only on Linux with Java Development Kit (JDK) 1.7 or later installed.

    Download the mc_tpcds_benchmark package of the test tool, and then run the following command on the Linux server to decompress the package:

    unzip mc_tpcds_benchmark.zip

    The decompressed package has the following directory structure:

    .
    |_t1c7039e3-2a1d-451b-bfda-d14c49016243-tpc-ds-tool.zip
    |_config
    |_init_tools.sh
    |_load_table.sh
    |_logs
    |_odps_clt
    |_patches
    |_pt.sh
    |_queries_1
    |_queries_1.quality
    |_queries_10
    |_queries_100
    |_queries_1000
    |_queries_10000
    |_queries_100000
    |_querygen.sh
    |_results
    |_run_stream.sh
    |_run_stream.sh.offline
    |_sqls
    |_start_session_only.sh
    |_start_session.sql
    |_start_session.sql_tmp
    |_tools_file
    |_tt.sh
    |_v2.10.1rc3

  3. Obtain a test dataset.

    MaxCompute provides public datasets. You do not need to prepare test data. All test data is stored in the public project BIGDATA_PUBLIC_DATASET of MaxCompute. For more information, see Overview.

    TPC-DS test datasets are divided into 10 GB, 100 GB, 1 TB, and 10 TB datasets based on the data size. The following table describes the datasets.

    Type: TPC-DS

    Description: TPC-DS is a decision support benchmark that models several generally applicable aspects of a decision support system, including queries and data maintenance. TPC-DS enables emerging technologies, such as big data systems, to perform benchmark tests.

    Dataset name                              Schema name
    TPC-DS 10-GB performance test dataset     tpcds_10g
    TPC-DS 100-GB performance test dataset    tpcds_100g
    TPC-DS 1-TB performance test dataset      tpcds_1t
    TPC-DS 10-TB performance test dataset     tpcds_10t
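
The environment requirements listed above (Linux, JDK 1.7 or later) can be checked before the tool is run. The following is a minimal sketch, not part of the test tool: the helper names jdk_major, jdk_ok, and is_linux are hypothetical, and the example assumes that java -version prints the version in double quotes on its first line, which is the usual format but is not guaranteed for every JDK build.

```shell
#!/bin/sh
# Hypothetical preflight helpers; not part of mc_tpcds_benchmark.

# Print the comparable major version for a Java version string.
# Legacy strings such as "1.7.0_80" use the second field (7);
# newer strings such as "11.0.2" use the first field (11).
jdk_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 | sed 's/[^0-9].*$//' ;;
    esac
}

# Succeed if the given version string is JDK 1.7 or later.
jdk_ok() {
    major=$(jdk_major "$1")
    [ -n "$major" ] && [ "$major" -ge 7 ] 2>/dev/null
}

# Succeed if the kernel name reported by uname is Linux.
is_linux() {
    [ "$(uname -s)" = "Linux" ]
}

# Example: report on the local machine.
if is_linux; then echo "OS: Linux"; else echo "OS: not Linux"; fi
if command -v java >/dev/null 2>&1; then
    # Assumption: the version appears in double quotes on the first line.
    ver=$(java -version 2>&1 | head -n 1 | sed 's/.*"\(.*\)".*/\1/')
    if jdk_ok "$ver"; then
        echo "JDK $ver: OK"
    else
        echo "JDK $ver: too old or unrecognized"
    fi
else
    echo "java not found on PATH"
fi
```

The version parsing is kept in a separate function so that it can be checked against known strings without a JDK installed.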

Procedure

Modify the configuration file of the test tool

Go to the mc_tpcds_benchmark directory of the decompressed package of the test tool and modify the config file. The following table describes the configuration items that you need to modify.

Configuration item: ODPS_CLT_CMD
Description: The absolute path of the executable file of the MaxCompute client.
Value: The package includes a client in the odps_clt directory under the working directory. You can modify its configuration. For more information, see Install and configure the MaxCompute client.
Example: /xxxxx/mc_tpcds_benchmark/odps_clt/bin/odpscmd.

Configuration item: PROJECT
Description: The MaxCompute project that is used for the test.
Value: Example: tpcds_test.

Configuration item: SF
Description: The data size of the TPC-DS test.
Value: Unit: GB. 1 indicates 1 GB. 1000 indicates 1 TB. You can change the value based on your test requirements.
Default value: 1000

Configuration item: SQL_FLAGS
Description: The built-in flag parameters of MaxCompute. You do not need to modify the configuration of these parameters.
Value:

  • set odps.sql.session.result.cache.enable=false: Disable the result cache feature for a MaxCompute project in MCQA mode. This ensures that each query is executed independently.

  • set odps.sql.allow.cartesian=true: Allow SQL statements to compute Cartesian products.

  • set odps.sql.session.query.timeout=600: Specify the timeout period of a Fuxi job for a MaxCompute project in MCQA mode.
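
Because SF is expressed in GB, each public dataset size corresponds to one SF value. The following sketch makes that correspondence explicit and checks that a path is absolute, as ODPS_CLT_CMD requires. The helper names sf_to_schema and is_absolute_path are hypothetical and are not part of the test tool, and the mapping assumes that the tool reads the public schema whose size matches SF, which is not confirmed here.

```shell
#!/bin/sh
# Hypothetical helpers; not part of mc_tpcds_benchmark.

# Map an SF value (in GB) to the public dataset schema of the same size.
# Only the four sizes listed for BIGDATA_PUBLIC_DATASET are known.
sf_to_schema() {
    case "$1" in
        10)    echo "tpcds_10g" ;;
        100)   echo "tpcds_100g" ;;
        1000)  echo "tpcds_1t" ;;
        10000) echo "tpcds_10t" ;;
        *)     echo "no public dataset listed for SF=$1" >&2
               return 1 ;;
    esac
}

# Succeed if the argument is an absolute path.
is_absolute_path() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

sf_to_schema 1000    # prints: tpcds_1t
```

An SF value with no matching public schema, such as 1, makes sf_to_schema return a non-zero status so that a calling script can stop early.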

Start the test

Run the following command in the mc_tpcds_benchmark directory to start the TPC-DS test:

nohup sh pt.sh > pt.log 2>&1 &

The command runs the test in the background and writes its output to the pt.log file in the mc_tpcds_benchmark directory. You can run the following command to view the logs of the job:

tail -f pt.log
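
Because the start command ends with &, the shell returns immediately and the test keeps running in the background. The following sketch shows one way to record the process ID of a background job and check whether it is still running. It uses sleep as a stand-in for pt.sh so that it runs anywhere, and the helper name is_running is hypothetical.

```shell
#!/bin/sh
# Succeed while the process with the given PID is still alive.
# kill -0 sends no signal; it only checks that the process exists
# and that the caller is allowed to signal it.
is_running() {
    kill -0 "$1" 2>/dev/null
}

# Stand-in for: nohup sh pt.sh > pt.log 2>&1 &
sleep 1 &
pid=$!    # PID of the most recent background command

if is_running "$pid"; then
    echo "job $pid is running"
fi

wait "$pid"    # block until the background job exits
echo "job $pid finished with status $?"
```

Note that wait works only in the shell that started the job. From another terminal, only the kill -0 check applies.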

View the execution information about MaxCompute jobs

You can view the execution information about a job on the Jobs page in the MaxCompute console. For more information, see Manage jobs in the new MaxCompute console.

View test results

If the execution is successful, a test result file named console_test_result.csv is generated in the mc_tpcds_benchmark directory. You can view test results in the file, including the total test duration, the execution time of each query, and the related LogView information.
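
The total test duration can be cross-checked by summing the per-query times. The column layout of console_test_result.csv is not described in this topic, so the following sketch assumes a hypothetical layout in which the first field is the query name and the second field is the elapsed time in seconds. Adjust the field number to match the real file. The helper name sum_seconds is also hypothetical.

```shell
#!/bin/sh
# Sum field 2 of a comma-separated file. Rows whose field 2 is not a
# plain non-negative number (for example, a header row) are skipped.
sum_seconds() {
    awk -F, '$2 ~ /^[0-9]+(\.[0-9]+)?$/ { total += $2 }
             END { printf "%.3f\n", total }' "$1"
}

# Demo file in the assumed layout, using three of the 64-CU reference times.
tmp=$(mktemp)
printf 'query,seconds\nquery01,6.762\nquery02,14.148\nquery03,11.566\n' > "$tmp"
sum_seconds "$tmp"    # prints: 32.476
rm -f "$tmp"
```

If the real file also contains a total row, exclude it before summing, or the total is counted twice.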

Test result references

The following table provides the TPC-DS 1-TB performance test result for subscription MaxCompute projects that use 64 compute units (CUs) and 128 CUs.

Item          Test specifications    Time consumed in the TPC-DS test (seconds)
MaxCompute    64 CUs                 2694.92
MaxCompute    128 CUs                1347.62

Note

One CU is equal to 4 GB of memory and 1 CPU core.
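
The CU definition and the two totals above allow a quick arithmetic check of how the run time scales with resources. The following sketch is illustrative only, and the helper names cu_resources and speedup are hypothetical.

```shell
#!/bin/sh
# Print the CPU cores and memory that a CU count corresponds to,
# using the stated ratio: 1 CU = 1 CPU core and 4 GB of memory.
cu_resources() {
    echo "$1 cores, $(( $1 * 4 )) GB"
}

# Print the ratio of two elapsed times, rounded to two decimals.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}

cu_resources 64            # prints: 64 cores, 256 GB
cu_resources 128           # prints: 128 cores, 512 GB
speedup 2694.92 1347.62    # prints: 2.00
```

For these reference totals, doubling the CUs from 64 to 128 roughly halves the total time.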

The following table describes the test result of each query.

Query        MaxCompute (64 CUs)        MaxCompute (128 CUs)
             Time consumed (seconds)    Time consumed (seconds)
query01      6.762                      5.443
query02      14.148                     6.897
query03      11.566                     3.976
query04      144.66                     69.903
query05      21.408                     10.47
query06      13.52                      6.612
query07      21.245                     11.78
query08      12.265                     7.116
query09      36.593                     16.632
query10      8.122                      5.32
query11      87.64                      36.659
query12      4.832                      3.15
query13      36.385                     21.059
query14_1    51.514                     23.185
query14_2    51.514                     23.185
query15      10.798                     5.082
query16      38.785                     18.592
query17      62.474                     26.586
query18      21.512                     15.546
query19      18.07                      8.265
query20      7.528                      6.05
query21      4.209                      2.487
query22      11.714                     9.449
query23_1    108.57                     48.22
query23_2    114.723                    52.581
query24_1    37.552                     19.499
query24_2    37.552                     19.499
query25      65.682                     27.925
query26      16.169                     8.208
query27      21.118                     10.624
query28      51.973                     21.805
query29      52.938                     23.139
query30      5.913                      5.186
query31      22.735                     12.598
query32      6.319                      4.86
query33      24.736                     12.623
query34      13.912                     7.348
query35      22.332                     15.09
query36      9.554                      5.067
query37      7.521                      4.516
query38      23.075                     12.383
query39_1    6.392                      5.416
query39_2    5.874                      5.103
query40      11.921                     7.784
query41      2.251                      2.325
query42      9.763                      4.153
query43      7.354                      3.959
query44      13.956                     5.597
query45      16.837                     11.118
query46      18.565                     9.506
query47      24.091                     11.06
query48      34.251                     16.782
query49      18.269                     9.803
query50      38.274                     15.834
query51      15.488                     8.011
query52      9.671                      4.767
query53      11.719                     9.567
query54      22.255                     12.879
query55      9.625                      6.552
query56      28.238                     13.62
query57      14.05                      7.098
query58      15.003                     7.153
query59      17.664                     10.237
query60      22.724                     11.404
query61      25.064                     15.224
query62      6.247                      3.723
query63      11.569                     6.753
query64      65.078                     30.954
query65      24.433                     13.206
query66      14.68                      6.369
query67      177.557                    87.298
query68      21.597                     14.711
query69      9.36                       4.694
query70      12.77                      6.85
query71      24.038                     9.816
query72      40.065                     20.605
query73      12.876                     5.305
query74      63.063                     24.245
query75      62.983                     27.108
query76      18.532                     11.337
query77      10.406                     6.047
query78      60.976                     25.741
query79      23.916                     14.605
query80      32.804                     14.048
query81      7.194                      7.519
query82      14.972                     6.258
query83      4.644                      3.897
query84      4.722                      3.57
query85      11.669                     7.434
query86      4.219                      3.083
query87      25.945                     11.291
query88      18.668                     8.806
query89      16.146                     6.265
query90      4.522                      3.099
query91      4.093                      3.663
query92      4.766                      2.895
query93      38.735                     16.587
query94      22.634                     13.693
query95      29.991                     17.616
query96      10.929                     8.17
query97      23.508                     10.563
query98      6.518                      4.003
query99      9.157                      6.229
Total        2694.919                   1347.623