MaxCompute is designed for high-performance queries over terabytes, petabytes, or even exabytes of data. This topic describes how to run the TPC-DS big data benchmark by using the public datasets and test tools that MaxCompute provides, so that you can verify the performance of MaxCompute.
Preparations
Prepare an environment.
Before you perform a TPC-DS test, activate MaxCompute and create a project. For more information, see Create a project.
Activate MaxCompute Query Acceleration (MCQA) for a subscription MaxCompute project. For more information, see MaxCompute Query Acceleration.
Prepare a test tool.
MaxCompute provides a TPC-DS automated performance test tool to help you quickly complete a TPC-DS test and automatically generate test results.
Important: The test tool runs only on Linux with Java Development Kit (JDK) 1.7 or later installed.
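The environment requirement above can be checked before you download the tool. The following is a minimal sketch and is not part of the test tool: the function names java_feature_version and check_env are hypothetical, and the sketch assumes that `java -version` prints a banner that contains `version "..."`, as common JDK builds do.

```shell
#!/bin/sh
# Hypothetical helpers; they are not shipped with mc_tpcds_benchmark.

# Print the Java feature version for a version string:
# "1.7.0_80" -> 7, "1.8.0_292" -> 8, "11.0.2" -> 11.
java_feature_version() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 | sed 's/[^0-9].*//' ;;
    esac
}

# Return 0 if the host is Linux and the java on PATH is 1.7 or later.
check_env() {
    [ "$(uname -s)" = "Linux" ] || { echo "This host is not Linux." >&2; return 1; }
    command -v java >/dev/null 2>&1 || { echo "java was not found on PATH." >&2; return 1; }
    # java -version writes its banner to stderr, so merge it into stdout.
    ver=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    [ -n "$ver" ] || { echo "Could not parse the Java version." >&2; return 1; }
    [ "$(java_feature_version "$ver")" -ge 7 ] || { echo "JDK 1.7 or later is required, found $ver." >&2; return 1; }
    echo "Environment OK: Linux, Java $ver"
}
```

Calling check_env on the test server reports the first requirement that is not met, or confirms that the environment is suitable.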
You can click mc_tpcds_benchmark to download the package of the test tool and run the following command on the Linux server to decompress the package:
unzip mc_tpcds_benchmark.zip
The following code shows the directory structure of the decompressed file.
.
|_t1c7039e3-2a1d-451b-bfda-d14c49016243-tpc-ds-tool.zip
|_config
|_init_tools.sh
|_load_table.sh
|_logs
|_odps_clt
|_patches
|_pt.sh
|_queries_1
|_queries_1.quality
|_queries_10
|_queries_100
|_queries_1000
|_queries_10000
|_queries_100000
|_querygen.sh
|_results
|_run_stream.sh
|_run_stream.sh.offline
|_sqls
|_start_session_only.sh
|_start_session.sql
|_start_session.sql_tmp
|_tools_file
|_tt.sh
|_v2.10.1rc3
Obtain a test dataset.
MaxCompute provides public datasets, so you do not need to prepare test data. All test data is stored in the public project BIGDATA_PUBLIC_DATASET of MaxCompute. For more information, see Overview.
TPC-DS test datasets are provided in four sizes: 10 GB, 100 GB, 1 TB, and 10 TB. The following table describes the datasets.
Type | Description | Dataset name | Schema name |
TPC-DS | TPC-DS is a decision support benchmark that models several generally applicable aspects of a decision support system, including queries and data maintenance. TPC-DS enables emerging technologies, such as big data systems, to perform benchmark tests. | TPC-DS 10-GB performance test dataset | tpcds_10g |
TPC-DS | Same as above. | TPC-DS 100-GB performance test dataset | tpcds_100g |
TPC-DS | Same as above. | TPC-DS 1-TB performance test dataset | tpcds_1t |
TPC-DS | Same as above. | TPC-DS 10-TB performance test dataset | tpcds_10t |
Procedure
Modify the configuration file of the test tool
Go to the mc_tpcds_benchmark directory of the decompressed package of the test tool and modify the config file. The following table describes the configuration items that you need to modify.
Configuration item | Description | Value |
ODPS_CLT_CMD | The absolute path of the executable file of the MaxCompute client. The client that is provided in the package is odps_clt in the working directory. You can modify the related configuration. For more information, see Install and configure the MaxCompute client. | Example: /xxxxx/mc_tpcds_benchmark/odps_clt/bin/odpscmd. |
PROJECT | The MaxCompute project that is used for the test. | Example: tpcds_test. |
SF | The data size of the TPC-DS test. Unit: GB. 1 indicates 1 GB. 1000 indicates 1 TB. You can change the value based on your test requirements. | Default value: 1000 |
SQL_FLAGS | The built-in flag parameters of MaxCompute. You do not need to modify these parameters. | Keep the default. |
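The SF value is expressed in GB, and the public datasets are listed in four sizes. The following sketch maps an SF value to the schema name of the public dataset of the same size. It is an illustration only: the helper name sf_to_schema is hypothetical, and whether the test tool itself derives the schema from SF in this way is an assumption.

```shell
#!/bin/sh
# Hypothetical helper; not part of mc_tpcds_benchmark.
# Map an SF value (in GB) to the public dataset schema of the same size.
sf_to_schema() {
    case "$1" in
        10)    echo "tpcds_10g" ;;
        100)   echo "tpcds_100g" ;;
        1000)  echo "tpcds_1t" ;;
        10000) echo "tpcds_10t" ;;
        *)     echo "No public dataset is listed for SF=$1" >&2; return 1 ;;
    esac
}

sf_to_schema 1000   # prints tpcds_1t
```

With the default SF of 1000, the matching dataset is the 1-TB dataset, which is also the size used in the reference results in this topic.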
Start the test
Run the following command in the mc_tpcds_benchmark directory to start the TPC-DS test:
nohup sh pt.sh > pt.log 2>&1 &
The command runs the test in the background and redirects its output to the pt.log file in the mc_tpcds_benchmark directory. You can run the following command to view the logs of the job:
tail -f pt.log
View the execution information about MaxCompute jobs
You can view the execution information about a job on the Jobs page in the MaxCompute console. For more information, see Manage jobs in the new MaxCompute console.
View test results
If the execution is successful, a test result file named console_test_result.csv is generated in the mc_tpcds_benchmark directory. You can view test results in the file, including the total test duration, the execution time of each query, and the related LogView information.
Test result references
The following table provides the TPC-DS 1-TB performance test results for subscription MaxCompute projects that use 64 compute units (CUs) and 128 CUs.
Item | Test specifications | Time consumed in the TPC-DS test (seconds) |
MaxCompute | 64 CUs | 2694.92 |
MaxCompute | 128 CUs | 1347.62 |
One CU is equal to 4 GB of memory and 1 CPU core.
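As a quick worked example of this conversion, the following sketch prints the CPU and memory that correspond to a CU count. The helper name cu_resources is hypothetical and is used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: convert a CU count to CPU cores and memory,
# using 1 CU = 1 CPU core + 4 GB of memory.
cu_resources() {
    echo "$1 cores, $(( $1 * 4 )) GB"
}

cu_resources 64    # prints: 64 cores, 256 GB
cu_resources 128   # prints: 128 cores, 512 GB
```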
The following table describes the test result of each query.
Query | MaxCompute (64 CUs), time consumed (seconds) | MaxCompute (128 CUs), time consumed (seconds) |
query01 | 6.762 | 5.443 |
query02 | 14.148 | 6.897 |
query03 | 11.566 | 3.976 |
query04 | 144.66 | 69.903 |
query05 | 21.408 | 10.47 |
query06 | 13.52 | 6.612 |
query07 | 21.245 | 11.78 |
query08 | 12.265 | 7.116 |
query09 | 36.593 | 16.632 |
query10 | 8.122 | 5.32 |
query11 | 87.64 | 36.659 |
query12 | 4.832 | 3.15 |
query13 | 36.385 | 21.059 |
query14_1 | 51.514 | 23.185 |
query14_2 | 51.514 | 23.185 |
query15 | 10.798 | 5.082 |
query16 | 38.785 | 18.592 |
query17 | 62.474 | 26.586 |
query18 | 21.512 | 15.546 |
query19 | 18.07 | 8.265 |
query20 | 7.528 | 6.05 |
query21 | 4.209 | 2.487 |
query22 | 11.714 | 9.449 |
query23_1 | 108.57 | 48.22 |
query23_2 | 114.723 | 52.581 |
query24_1 | 37.552 | 19.499 |
query24_2 | 37.552 | 19.499 |
query25 | 65.682 | 27.925 |
query26 | 16.169 | 8.208 |
query27 | 21.118 | 10.624 |
query28 | 51.973 | 21.805 |
query29 | 52.938 | 23.139 |
query30 | 5.913 | 5.186 |
query31 | 22.735 | 12.598 |
query32 | 6.319 | 4.86 |
query33 | 24.736 | 12.623 |
query34 | 13.912 | 7.348 |
query35 | 22.332 | 15.09 |
query36 | 9.554 | 5.067 |
query37 | 7.521 | 4.516 |
query38 | 23.075 | 12.383 |
query39_1 | 6.392 | 5.416 |
query39_2 | 5.874 | 5.103 |
query40 | 11.921 | 7.784 |
query41 | 2.251 | 2.325 |
query42 | 9.763 | 4.153 |
query43 | 7.354 | 3.959 |
query44 | 13.956 | 5.597 |
query45 | 16.837 | 11.118 |
query46 | 18.565 | 9.506 |
query47 | 24.091 | 11.06 |
query48 | 34.251 | 16.782 |
query49 | 18.269 | 9.803 |
query50 | 38.274 | 15.834 |
query51 | 15.488 | 8.011 |
query52 | 9.671 | 4.767 |
query53 | 11.719 | 9.567 |
query54 | 22.255 | 12.879 |
query55 | 9.625 | 6.552 |
query56 | 28.238 | 13.62 |
query57 | 14.05 | 7.098 |
query58 | 15.003 | 7.153 |
query59 | 17.664 | 10.237 |
query60 | 22.724 | 11.404 |
query61 | 25.064 | 15.224 |
query62 | 6.247 | 3.723 |
query63 | 11.569 | 6.753 |
query64 | 65.078 | 30.954 |
query65 | 24.433 | 13.206 |
query66 | 14.68 | 6.369 |
query67 | 177.557 | 87.298 |
query68 | 21.597 | 14.711 |
query69 | 9.36 | 4.694 |
query70 | 12.77 | 6.85 |
query71 | 24.038 | 9.816 |
query72 | 40.065 | 20.605 |
query73 | 12.876 | 5.305 |
query74 | 63.063 | 24.245 |
query75 | 62.983 | 27.108 |
query76 | 18.532 | 11.337 |
query77 | 10.406 | 6.047 |
query78 | 60.976 | 25.741 |
query79 | 23.916 | 14.605 |
query80 | 32.804 | 14.048 |
query81 | 7.194 | 7.519 |
query82 | 14.972 | 6.258 |
query83 | 4.644 | 3.897 |
query84 | 4.722 | 3.57 |
query85 | 11.669 | 7.434 |
query86 | 4.219 | 3.083 |
query87 | 25.945 | 11.291 |
query88 | 18.668 | 8.806 |
query89 | 16.146 | 6.265 |
query90 | 4.522 | 3.099 |
query91 | 4.093 | 3.663 |
query92 | 4.766 | 2.895 |
query93 | 38.735 | 16.587 |
query94 | 22.634 | 13.693 |
query95 | 29.991 | 17.616 |
query96 | 10.929 | 8.17 |
query97 | 23.508 | 10.563 |
query98 | 6.518 | 4.003 |
query99 | 9.157 | 6.229 |
Total | 2694.919 | 1347.623 |
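One way to read the table above is to compute, for each row, the ratio of the 64-CU time to the 128-CU time. The following sketch does this with awk. The helper name speedup is hypothetical, and the input values are copied from the table.

```shell
#!/bin/sh
# Hypothetical helper: ratio of the 64-CU time to the 128-CU time,
# rounded to two decimal places. A value near 2 means the query
# scaled almost linearly when the CU count doubled.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

speedup 2694.919 1347.623   # Total:   prints 2.00
speedup 177.557 87.298      # query67: prints 2.03
speedup 2.251 2.325         # query41: prints 0.97
```

A ratio below 1, as for query41, means that the query did not run faster with 128 CUs in this test. This is typical of very short queries, in which fixed overhead is a large share of the execution time.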