Use RCFile external tables to query and write RCFile-format data in Object Storage Service (OSS) without moving the data into MaxCompute. OSS manages the storage; MaxCompute manages the schema and query execution.
Constraints
| Constraint | Detail |
|---|---|
| Cluster property | Not supported for OSS external tables |
| File size limit | 2 GB per file. Split files larger than 2 GB before referencing them. |
| Region | MaxCompute and OSS must be in the same region |
Supported data types
For the full list of MaxCompute data types, see Data types (Version 1.0) and Data types (Version 2.0).
Data type | Support | Data type | Support |
TINYINT |
| STRING |
|
SMALLINT |
| DATE |
|
INT |
| DATETIME |
|
BIGINT |
| TIMESTAMP |
|
BINARY |
| TIMESTAMP_NTZ |
|
FLOAT |
| BOOLEAN |
|
DOUBLE |
| ARRAY |
|
DECIMAL(precision, scale) |
| MAP |
|
VARCHAR(n) |
| STRUCT |
|
CHAR(n) |
| JSON |
|
Create an external table
Syntax
CREATE EXTERNAL TABLE [IF NOT EXISTS] mc_oss_rcfile_extable
(
<col_name> <data_type>,
...
)
[COMMENT <table_comment>]
[PARTITIONED BY (<col_name> <data_type>, ...)]
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe'
STORED AS rcfile
LOCATION 'oss_location';For parameter descriptions, see Basic syntax parameters.
Write data
For the write syntax, see Write data to OSS.
Query and analysis
For the SELECT syntax, see Read data from OSS.
For query plan tuning, see Query optimization.
End-to-end example
This example creates a partitioned RCFile external table, syncs partition metadata from OSS, queries data, and writes a new record.
Prerequisites
Before you begin, ensure that you have:
A MaxCompute project. See Create a MaxCompute project.
An OSS bucket in the same region as your MaxCompute project. See Create a bucket and Manage folders. MaxCompute can create OSS folders automatically — you do not need to create them before running SQL.
OSS access permissions via an Alibaba Cloud account, a Resource Access Management (RAM) user, or a RAM role. See Authorize access in STS mode for OSS.
The
CreateTablepermission in your MaxCompute project. See MaxCompute permissions.
Step 1: Upload test data
Log in to the OSS console and upload the RCFile data file to the oss-mc-test/Rcfile/dt=20241223/ folder. See Upload files to OSS.
Step 2: Create an RCFile external table
CREATE EXTERNAL TABLE mc_oss_rcfile_extable
(
vehicleId STRING,
recordId STRING,
patientId STRING,
calls STRING,
locationLatitute STRING,
locationLongtitue STRING,
recordTime STRING,
direction STRING
)
PARTITIONED BY (dt STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe'
STORED AS rcfile
LOCATION 'oss://oss-cn-hangzhou-internal.aliyuncs.com/oss-mc-test/Rcfile/';Step 3: Sync partition metadata
For partitioned external tables, sync partition metadata from OSS to MaxCompute after uploading data:
-- Import partition data.
MSCK REPAIR TABLE mc_oss_rcfile_extable ADD PARTITIONS;For the full syntax, see Syntax for adding partition data to an OSS external table.
Step 4: Query the external table
SELECT * FROM mc_oss_rcfile_extable2_pt WHERE dt=20241223;Expected result:
+------------+------------+------------+------------+------------------+-------------------+------------------+------------+------------+
| vehicleid | recordid | patientid | calls | locationlatitute | locationlongtitue | recordtime | direction | dt |
+------------+------------+------------+------------+------------------+-------------------+------------------+------------+------------+
| 1 | 1 | 7 | 1 | 39.995353 | 116.410101 | 2025/12/23 10:01 | NW | 20241223 |
| 1 | 2 | 15 | 1 | 39.996123 | 116.412345 | 2025/12/24 10:01 | NE | 20241223 |
| 2 | 3 | 22 | 2 | 40.001234 | 116.425678 | 2025/12/25 10:01 | SE | 20241223 |
| 1 | 4 | 71 | 1 | 39.998888 | 106.419999 | 2025/12/26 10:01 | NW | 20241223 |
| 3 | 5 | 50 | 1 | 29.989999 | 116.401111 | 2025/12/27 10:01 | NE | 20241223 |
| 2 | 6 | 28 | 3 | 40.003456 | 110.42891 | 2025/12/28 10:01 | NW | 20241223 |
| 4 | 7 | 20 | 2 | 39.991234 | 116.435555 | 2025/12/29 10:01 | NE | 20241223 |
| 1 | 8 | 13 | 1 | 39.994567 | 116.41789 | 2025/12/30 10:01 | SE | 20241223 |
+------------+------------+------------+------------+------------------+-------------------+------------------+------------+------------+Step 5: Write data
INSERT INTO TABLE mc_oss_rcfile_extable2_pt PARTITION (dt=20251223) VALUES ('1','13','76','1','46.81006','-92.08174','9/14/2014 0:10','SW');Step 6: Verify the write
SELECT * FROM mc_oss_rcfile_extable2_pt WHERE dt=20250521 AND recordId='13';Expected result:
+------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
| vehicleid | recordid | patientid | calls | locationlatitute | locationlongtitue | recordtime | direction | dt |
+------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+
| 1 | 13 | 76 | 1 | 46.81006 | -92.08174 | 9/14/2014 0:10 | SW | 20251223 |
+------------+------------+------------+------------+------------------+-------------------+----------------+------------+------------+