Before you execute and test a TPC-H dataset on Alibaba Cloud AnalyticDB for MySQL, make the following preparations:
- Complete the following steps to create a directory in OSS to store the TPC-H dataset:
- Activate OSS
Make sure that OSS and AnalyticDB for MySQL are located in the same region.
- Create buckets
- Create folders
The following storage path in Object Storage Service (OSS) is used in this example:
- Activate OSS
- You have created an AnalyticDB for MySQL cluster, configured a whitelist, and created accounts and databases for the cluster. For more information, see Quick start of AnalyticDB for MySQL.
- Download the TPC-H dataset.
- Generate TPC-H test data.
After you download the TPC-H dataset, use the DBGEN tool built into the dataset to generate CSV-formatted data at a specified scale factor. The scale factor can be set based on the size of data to be generated. For example, if you need to generate 100 GB of TPC-H test data, set the scale factor to 100 by using the
100:dbgen -s100command. If you need to generate a larger data volume in the dataset, you can use the concurrency function of DBGEN to accelerate data generation.
Typically, a TPC-H table corresponds to a single data file. However, if the scale factor is set to a large value such as 10000, multiple data files will be created for a single TPC-H table.
- Upload the test data to OSS.
After you generate the TPC-H test data, upload the data to the following path in OSS:
If a table corresponds to a single data file, you can upload all the data files to an OSS directory. If multiple data files are created for a single table, we recommend that you upload the data files corresponding to the same table to a separate subdirectory. This way, you only need to specify the subdirectories while exporting external tables.
- Create TPC-H tables and OSS external tables in AnalyticDB for MySQL, and use OSS external tables to import data into AnalyticDB for MySQL.
- Execute and test TPC-H queries.