Parquet

Last Updated: Jul 25, 2019

This topic uses customer.tbl as an example to describe how to convert text files to Parquet files.

Procedure

  1. Create an OSS schema.

    CREATE SCHEMA dla_oss_db WITH DBPROPERTIES(
      catalog = 'oss',
      location = 'oss://dlaossfile1/TPC-H/'
    )
  2. Create a table named customer_txt in DLA and set LOCATION to the path of customer.tbl in OSS.

    CREATE EXTERNAL TABLE customer_txt (
      c_custkey int,
      c_name string,
      c_address string,
      c_nationkey int,
      c_phone string,
      c_acctbal double,
      c_mktsegment string,
      c_comment string
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    STORED AS TEXTFILE LOCATION 'oss://dlaossfile1/TPC-H/customer/customer.tbl'
  3. Create the target table customer_parquet in DLA and set LOCATION to the required path in OSS.

    Note: LOCATION must be an existing directory in OSS, and the path must end with a forward slash (/).

    CREATE EXTERNAL TABLE customer_parquet (
      c_custkey int,
      c_name string,
      c_address string,
      c_nationkey int,
      c_phone string,
      c_acctbal double,
      c_mktsegment string,
      c_comment string
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    STORED AS PARQUET LOCATION 'oss://dlaossfile1/TPC-H/customer_parquet/'

    STORED AS PARQUET: indicates that the table is stored in Parquet format.

  4. Run the INSERT...SELECT statement to copy the data from the customer_txt table into the customer_parquet table.

    INSERT INTO customer_parquet SELECT * FROM customer_txt;
  5. View the data in table customer_parquet.

    After the INSERT...SELECT statement is executed, view the Parquet file created in OSS.
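
As an optional sanity check after the conversion, the queries below compare the two tables. This is a minimal sketch and not part of the original procedure. It assumes that customer_txt and customer_parquet were created as shown above, that the INSERT...SELECT statement has finished, and that the session is using the dla_oss_db schema.

```sql
-- Row counts of the source and target tables; the two numbers should match.
SELECT COUNT(*) AS txt_rows FROM customer_txt;
SELECT COUNT(*) AS parquet_rows FROM customer_parquet;

-- Spot-check a few converted rows from the Parquet table.
SELECT c_custkey, c_name, c_acctbal
FROM customer_parquet
ORDER BY c_custkey
LIMIT 10;
```

If the counts differ, the delimiter or column types declared for customer_txt are a likely place to start looking.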

More information

Create a table in Parquet format.