This topic describes how to import 100 GB of TPC-H test data to AnalyticDB for MySQL and Presto.
The following table lists the number of rows in different tables of the TPC-H dataset.
Table name | Number of rows |
customer | 15,000,000 |
lineitem | 600,037,902 |
nation | 25 |
orders | 150,000,000 |
part | 20,000,000 |
partsupp | 80,000,000 |
region | 5 |
supplier | 1,000,000 |
Import data to AnalyticDB for MySQL
Execute the LOAD DATA statement to import the files that are generated by DBGEN to AnalyticDB for MySQL.
LOAD DATA LOCAL INFILE 'customer.tbl' INTO TABLE CUSTOMER FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'orders.tbl' INTO TABLE ORDERS FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'lineitem.tbl' INTO TABLE LINEITEM FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'nation.tbl' INTO TABLE NATION FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'partsupp.tbl' INTO TABLE PARTSUPP FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL NFILE 'part.tbl' INTO TABLE PART FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'region.tbl' INTO TABLE REGION FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n'; LOAD DATA LOCAL INFILE 'supplier.tbl' INTO TABLE SUPPLIER FIELDS TERMINATED BY '|' LINES TERMINATED BY '\r\n';
NoteYou can also use Object Storage Service (OSS) external tables to import test data to AnalyticDB for MySQL. For more information, see Use external tables to import data to Data Warehouse Edition.
Collect statistics.
ANALYZE TABLE customer UPDATE HISTOGRAM; ANALYZE TABLE lineitem UPDATE HISTOGRAM; ANALYZE TABLE nation UPDATE HISTOGRAM; ANALYZE TABLE orders UPDATE HISTOGRAM; ANALYZE TABLE part UPDATE HISTOGRAM; ANALYZE TABLE partsupp UPDATE HISTOGRAM; ANALYZE TABLE region UPDATE HISTOGRAM; ANALYZE TABLE supplier UPDATE HISTOGRAM;
NoteThe query optimizer converts queries into execution plans that are executed by the execution engine. The quality of an execution plan affects its query performance. Statistics can be used to help the query optimizer generate high-quality execution plans. After you import data, you must collect histogram statistics on all tables to obtain the optimal performance. For more information about statistics, see Statistics.
Import data to Presto
You can use external tables to import test data to Presto.
The customer table
INSERT overwrite TABLE customer SELECT c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, c_mktsegment, c_comment FROM ${source_db}.customer;
The lineitem table
INSERT overwrite TABLE lineitem SELECT l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment FROM ${source_db}.lineitem;
The nation table
INSERT overwrite TABLE nation SELECT n_nationkey, n_name, n_regionkey, n_comment FROM ${source_db}.nation;
The orders table
INSERT overwrite TABLE orders SELECT o_orderkey, o_custkey, o_orderstatus, o_totalprice, o_orderdate, o_orderpriority, o_clerk, o_shippriority, o_comment FROM ${source_db}.orders;
The part table
INSERT overwrite TABLE part SELECT p_partkey, p_name, p_mfgr, p_brand, p_type, p_size, p_container, p_retailprice, p_comment FROM ${source_db}.part;
The partsupp table
INSERT overwrite TABLE partsupp SELECT ps_partkey, ps_suppkey, ps_availqty, ps_supplycost, ps_comment FROM ${source_db}.partsupp;
The region table
INSERT overwrite TABLE region SELECT r_regionkey, r_name, r_comment FROM ${source_db}.region;
The supplier table
INSERT overwrite TABLE supplier SELECT s_suppkey, s_name, s_address, s_nationkey, s_phone, s_acctbal, s_comment FROM ${source_db}.supplier;