PARQUET

Last Updated: Jul 30, 2019

This topic describes how to create a table for Parquet files in DLA.

Parquet is a columnar storage file format supported by Apache Hadoop. Data stored in ORC or Parquet format can be scanned faster than the same data stored in CSV format.

Prerequisites

To prepare Parquet test data, see File format conversion.

Procedure

  1. Create an OSS schema.

    CREATE SCHEMA dla_oss_db WITH DBPROPERTIES (
      catalog = 'oss',
      location = 'oss://dlaossfile1/dla/'
    )

  2. Create a Parquet table.

    ```sql
    CREATE EXTERNAL TABLE customer_parqet_date (
      c_custkey int,
      c_name string,
      c_address string,
      c_nationkey int,
      c_phone string,
      c_acctbal double,
      c_mktsegment string,
      c_comment string
    )
    STORED AS PARQUET
    LOCATION 'oss://dlaossfile1/TPC-H/customer_parquet/'
    ```

    `STORED AS PARQUET`: specifies that the data files are stored in Parquet format.

  3. View the Parquet table data.

    ```sql
    SELECT * FROM customer_parqet_date
    ```

Parquet data query result
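
Because Parquet stores data by column, a query that names only the columns it needs generally has less data to scan than `SELECT *`. The following is a minimal sketch, assuming the `customer_parqet_date` table from the procedure exists; the filter condition, the `LIMIT` value, and the `customer_count` alias are illustrative and not part of the original procedure.

```sql
-- Read only two columns instead of every column in the table.
SELECT c_name, c_acctbal
FROM customer_parqet_date
WHERE c_acctbal > 0   -- illustrative filter, not from the original procedure
LIMIT 10;

-- Count customers per market segment; only c_mktsegment is needed.
SELECT c_mktsegment, COUNT(*) AS customer_count
FROM customer_parqet_date
GROUP BY c_mktsegment;
```

Both statements use standard SQL. Whether a given engine actually skips the unused columns depends on its Parquet reader, so treat the performance benefit as expected rather than guaranteed.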