All Products
Search
Document Center

AnalyticDB for MySQL:Import data

Last Updated:Nov 30, 2023

This topic describes how to import 100 GB of TPC-H test data to AnalyticDB for MySQL and Presto.

The following table lists the number of rows in different tables of the TPC-H dataset.

Table name

Number of rows

customer

15,000,000

lineitem

600,037,902

nation

25

orders

150,000,000

part

20,000,000

partsupp

80,000,000

region

5

supplier

1,000,000

Import data to AnalyticDB for MySQL

  1. Execute the LOAD DATA statement to import the files that are generated by DBGEN to AnalyticDB for MySQL.

    LOAD DATA LOCAL INFILE 'customer.tbl' INTO TABLE CUSTOMER 
    FIELDS TERMINATED BY '|'  LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'orders.tbl'   INTO TABLE ORDERS   
    FIELDS TERMINATED BY '|'  LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'lineitem.tbl' INTO TABLE LINEITEM 
    FIELDS TERMINATED BY '|'  LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'nation.tbl'   INTO TABLE NATION   
    FIELDS TERMINATED BY '|'  LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'partsupp.tbl' INTO TABLE PARTSUPP 
    FIELDS TERMINATED BY '|'  LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL NFILE 'part.tbl'     INTO TABLE PART     
    FIELDS TERMINATED BY '|'   LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'region.tbl'   INTO TABLE REGION   
    FIELDS TERMINATED BY '|'  LINES TERMINATED BY '\r\n';
    LOAD DATA LOCAL INFILE 'supplier.tbl' INTO TABLE SUPPLIER 
    FIELDS TERMINATED BY '|'  LINES TERMINATED BY '\r\n';
    Note

    You can also use Object Storage Service (OSS) external tables to import test data to AnalyticDB for MySQL. For more information, see Use external tables to import data to Data Warehouse Edition.

  2. Collect statistics.

    ANALYZE TABLE customer UPDATE HISTOGRAM;
    ANALYZE TABLE lineitem UPDATE HISTOGRAM;
    ANALYZE TABLE nation UPDATE HISTOGRAM;
    ANALYZE TABLE orders UPDATE HISTOGRAM;
    ANALYZE TABLE part UPDATE HISTOGRAM;
    ANALYZE TABLE partsupp UPDATE HISTOGRAM;
    ANALYZE TABLE region UPDATE HISTOGRAM;
    ANALYZE TABLE supplier UPDATE HISTOGRAM;
    Note

    The query optimizer converts queries into execution plans that are executed by the execution engine. The quality of an execution plan affects its query performance. Statistics can be used to help the query optimizer generate high-quality execution plans. After you import data, you must collect histogram statistics on all tables to obtain the optimal performance. For more information about statistics, see Statistics.

Import data to Presto

You can use external tables to import test data to Presto.

  • The customer table

    INSERT overwrite TABLE customer
    SELECT
        c_custkey,
            c_name,
            c_address,
            c_nationkey,
            c_phone,
            c_acctbal,
            c_mktsegment,
            c_comment
    FROM ${source_db}.customer;
  • The lineitem table

    INSERT overwrite TABLE lineitem
    SELECT
        l_orderkey,
        l_partkey,
        l_suppkey,
        l_linenumber,
        l_quantity,
        l_extendedprice,
        l_discount,
        l_tax,
        l_returnflag,
        l_linestatus,
        l_shipdate,
        l_commitdate,
        l_receiptdate,
        l_shipinstruct,
        l_shipmode,
        l_comment
    FROM ${source_db}.lineitem;
  • The nation table

    INSERT overwrite TABLE nation
    SELECT
            n_nationkey,
            n_name,
            n_regionkey,
            n_comment
    FROM ${source_db}.nation;
  • The orders table

    INSERT overwrite TABLE orders
    SELECT
            o_orderkey,
            o_custkey,
            o_orderstatus,
            o_totalprice,
            o_orderdate,
            o_orderpriority,
            o_clerk,
            o_shippriority,
            o_comment
    FROM ${source_db}.orders;
  • The part table

    INSERT overwrite TABLE part
    SELECT
            p_partkey,
            p_name,
            p_mfgr,
            p_brand,
            p_type,
            p_size,
            p_container,
            p_retailprice,
            p_comment
    FROM ${source_db}.part;
  • The partsupp table

    INSERT overwrite TABLE partsupp
    SELECT
            ps_partkey,
            ps_suppkey,
            ps_availqty,
            ps_supplycost,
            ps_comment
    FROM ${source_db}.partsupp;
  • The region table

    INSERT overwrite TABLE region
    SELECT
            r_regionkey,
            r_name,
            r_comment
    FROM ${source_db}.region;
  • The supplier table

    INSERT overwrite TABLE supplier
    SELECT
        s_suppkey,
        s_name,
        s_address,
        s_nationkey,
        s_phone,
        s_acctbal,
        s_comment
    FROM ${source_db}.supplier;