All Products
Search
Document Center

MaxCompute:ORC external tables

Last Updated:Jul 16, 2025

This topic describes how to create, read, and write ORC-formatted OSS external tables.

Prerequisites

  • The Alibaba Cloud account or RAM user is granted permissions to access OSS. Alibaba Cloud accounts, RAM users, and RAM roles can access OSS external tables. For more information about authorization, see STS authorization.

  • A MaxCompute project is created.

    MaxCompute is deployed only in specific regions. To prevent a cross-region data connectivity issue, we recommend that you use a bucket in the same region as your MaxCompute project.
  • The Alibaba Cloud account or RAM user is granted the CreateTable permission on your project. For more information about table operation permissions, see MaxCompute permissions.

  • (Optional) An OSS bucket, OSS directories, and OSS data files are prepared. For more information, see Create a bucket, Manage directories, and Simple upload.

    MaxCompute can automatically create an OSS directory in OSS. For SQL statements that include external tables and UDFs, you can execute operations to read and write external tables and UDFs with a single SQL statement. You can also manually create an OSS directory.

Limits

  • OSS external tables do not support the clustering.

  • A single file cannot exceed 3 GB. If one does, we recommend that you split it into multiple files.

Supported data types

Note

In the following tables, 已开通 indicates supported and 未开通 indicates not supported.

  • For more information about MaxCompute data types, see MaxCompute V1.0 data type edition and MaxCompute V2.0 data type edition.

  • JNI mode (Native ORC Reader is not used when reading tables): set odps.ext.oss.orc.native=false;, supports both reading and writing.

  • Native mode (Native ORC Reader is used when reading tables): set odps.ext.oss.orc.native=true;, supports only reading.

Data type

JNI mode support

Native mode support

TINYINT

已开通

已开通

SMALLINT

已开通

已开通

INT

已开通

已开通

BIGINT

已开通

已开通

BINARY

已开通

已开通

FLOAT

已开通

已开通

DOUBLE

已开通

已开通

DECIMAL(precision,scale)

已开通

已开通

VARCHAR(n)

已开通

已开通

CHAR(n)

已开通

已开通

STRING

已开通

已开通

DATE

已开通

已开通

DATETIME

已开通

已开通

TIMESTAMP

已开通

已开通

TIMESTAMP_NTZ

已开通

已开通

BOOLEAN

已开通

已开通

ARRAY

已开通

已开通

MAP

已开通

已开通

STRUCT

已开通

已开通

JSON

未开通

未开通

Compression formats

To read or write compressed OSS files, add the with serdeproperties attribute to the table creation statement. For more information, see With serdeproperties parameters.

Supported data file formats for reading and writing: ORC files compressed using SNAPPY or ZLIB.

Create an external table

Syntax

For more information about the syntax for creating external tables of different formats, see OSS external tables.

  • Simplified syntax

    CREATE EXTERNAL TABLE [IF NOT EXISTS] <mc_oss_extable_name>
    (
      <col_name> <data_type>,
      ...
    )
    [COMMENT <table_comment>]
    [PARTITIONED BY (<col_name> <data_type>, ...)]
    STORED AS orc
    LOCATION '<oss_location>';
  • Complete syntax

    CREATE EXTERNAL TABLE [IF NOT EXISTS] <mc_oss_extable_name>
    (
      <col_name> <data_type>,
      ...
    )
    [COMMENT <table_comment>]
    [PARTITIONED BY (<col_name> <data_type>, ...)]
    ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
    WITH serdeproperties(
        'odps.properties.rolearn'='acs:ram::<uid>:role/aliyunodpsdefaultrole'
    )
    STORED AS orc 
    LOCATION '<oss_location>' 
    tblproperties (
        '<xxx>'='<yyy>'
    );

Common parameters

For more information about common parameters, see Basic syntax parameters.

Specific parameters

With serdeproperties attribute parameters

property_name

Scenario

Description

property_value

Default value

mcfed.orc.schema.resolution

Add this property when the schemas of data in the same OSS external table are different.

Used to set the ORC file parsing method. name indicates parsing based on column names.

name

By default, parsing is based on column numbers.

Equivalent to: 'mcfed.orc.schema.resolution'='position'.

Tblproperties attribute parameters

property_name

Scenario

Description

property_value

Default value

mcfed.orc.compress

Add this property to write data in an ORC data file to OSS using a compression algorithm.

ORC compression property. Specifies the compression algorithm of ORC data files.

  • SNAPPY

  • ZLIB

None

io.compression.codecs

Add this property when the OSS data file is in the raw Snappy format.

If you set this parameter to True, MaxCompute normally reads data from compressed files. Otherwise, the data read operation fails.

com.aliyun.odps.io.compress.SnappyRawCodec

None

Write data

For more information about the syntax for writing data, see Write syntax.

Query and analyze

  • For more information about the SELECT syntax, see Syntax.

  • For more information about how to optimize query plans, see Query optimization.

Scenarios

Create an external table in ORC format with SNAPPY compression, and perform data reading and writing operations.

Note
  • When you execute the following sample code, replace <uid> with your Alibaba Cloud account ID.

  • The role used in the following examples is aliyunodpsdefaultrole. To use a different role, replace aliyunodpsdefaultrole with the name of the target role and grant the target role permissions to access OSS.

  1. Prepare a SNAPPY format data file.

    Create the orc_snappy/dt=20250526 directory hierarchy in the oss-mc-test bucket in Sample data, and store the Snappy file in the partition directory dt=20250526.

  2. Create an ORC external table with SNAPPY compression format.

    CREATE EXTERNAL TABLE orc_data_type_snappy
    (
        vehicleId INT,
        recordId INT,
        patientId INT,
        calls INT,
        locationLatitute DOUBLE,
        locationLongtitue DOUBLE,
        recordTime STRING,
        direction STRING
    )
    PARTITIONED BY (dt STRING )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
    WITH serdeproperties (
     'odps.properties.rolearn'='acs:ram::<uid>:role/aliyunodpsdefaultrole'
    ) 
    STORED AS ORC  
    LOCATION 'oss://oss-cn-hangzhou-internal.aliyuncs.com/oss-mc-test/orc_snappy/'
    tblproperties (
        'mcfed.orc.compress'='SNAPPY');
    
    -- Import partition data
    MSCK REPAIR TABLE orc_data_type_snappy ADD PARTITIONS;
  3. Read data from the ORC external table.

    SELECT * FROM orc_data_type_snappy WHERE dt=20250526 LIMIT 10;

    The following result is returned:

    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
    | vehicleid  | recordid   | patientid  | calls      | locationlatitute | locationlongtitue | recordtime     | direction  | dt         |
    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
    | 1          | 12         | 76         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:10 | SW         | 20250526   |
    | 1          | 1          | 51         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:00 | S          | 20250526   |
    | 1          | 2          | 13         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:01 | NE         | 20250526   |
    | 1          | 3          | 48         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:02 | NE         | 20250526   |
    | 1          | 4          | 30         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:03 | W          | 20250526   |
    | 1          | 5          | 47         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:04 | S          | 20250526   |
    | 1          | 6          | 9          | 1          | 46.81006         | -92.08174         | 9/14/2014 0:05 | S          | 20250526   |
    | 1          | 7          | 53         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:06 | N          | 20250526   |
    | 1          | 8          | 63         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:07 | SW         | 20250526   |
    | 1          | 9          | 4          | 1          | 46.81006         | -92.08174         | 9/14/2014 0:08 | NE         | 20250526   |
    | 1          | 10         | 31         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:09 | N          | 20250526   |
    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
  4. Write data to the ORC external table and query.

    INSERT INTO orc_data_type_snappy PARTITION (dt ='20250526') 
      VALUES (1,16,76,1,46.81006,-92.08174,'9/14/2014 0:10','SW');
    
    -- Query the newly written data
    SELECT * FROM orc_data_type_snappy WHERE dt = '20250526' AND recordid=16;

    The following result is returned:

    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
    | vehicleid  | recordid   | patientid  | calls      | locationlatitute | locationlongtitue | recordtime     | direction  | dt         |
    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
    | 1          | 16         | 76         | 1          | 46.81006         | -92.08174         | 9/14/2014 0:10 | SW         | 20250526   |
    +------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+