This topic describes how to create, read from, and write to OSS foreign tables in Delta Lake format.
MaxCompute supports the foreign table feature only with a specific version of the Delta Lake software development kit (SDK), and will not update the SDK version or further develop this feature. We recommend that you use Paimon foreign tables to read data in data lake table formats instead.
Scope
The cluster property is not supported for OSS foreign tables.
A single file cannot exceed 3 GB. If a file is larger than 3 GB, you must split it.
MaxCompute and OSS must be deployed in the same region.
Delta Lake foreign tables support only reading all data from the files mapped to the foreign table. They do not support automatic hiding of system columns, incremental reads, snapshot reads, or write operations. To perform read and write operations that support atomicity, consistency, isolation, and durability (ACID), use features such as MaxCompute Delta tables or Paimon foreign tables.
The default Delta Lake SDK version integrated into MaxCompute is io.delta:delta-core_2.11:0.2.0.5. Forward and backward compatibility for the Delta Lake SDK is not guaranteed; compatibility is maintained by the open source community.
Supported data types
For more information about MaxCompute data types, see Data types (version 1.0) and Data types (version 2.0).
| Data type | Supported | Data type | Supported |
| --- | --- | --- | --- |
| TINYINT | Yes | STRING | Yes |
| SMALLINT | Yes | DATE | Yes |
| INT | Yes | DATETIME | Yes |
| BIGINT | Yes | TIMESTAMP | Yes |
| BINARY | Yes | TIMESTAMP_NTZ | Yes |
| FLOAT | Yes | BOOLEAN | Yes |
| DOUBLE | Yes | ARRAY | Yes |
| DECIMAL(precision,scale) | Yes | MAP | Yes |
| VARCHAR(n) | Yes | STRUCT | Yes |
| CHAR(n) | Yes | JSON | Yes |
Create a foreign table
Syntax
CREATE EXTERNAL TABLE [IF NOT EXISTS] <mc_oss_extable_name>
(
<col_name> <data_type>,
...
)
[COMMENT <table_comment>]
[PARTITIONED BY (<col_name> <data_type>, ...)]
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS
INPUTFORMAT 'io.delta.hive.DeltaInputFormat'
OUTPUTFORMAT 'io.delta.hive.DeltaOutputFormat'
LOCATION '<oss_location>';
Parameters
For more information, see Basic syntax parameters.
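As an illustration of the syntax above, the following sketch creates a non-partitioned Delta Lake foreign table. The table name, column names, and OSS endpoint, bucket, and directory are hypothetical placeholders, not values defined in this topic:

```sql
-- Sketch only: the table name, columns, and OSS location below are placeholders.
CREATE EXTERNAL TABLE IF NOT EXISTS mc_oss_delta_demo
(
    vehicle_id   BIGINT,
    vehicle_name STRING
)
COMMENT 'Delta Lake foreign table mapped to an OSS directory'
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS
INPUTFORMAT 'io.delta.hive.DeltaInputFormat'
OUTPUTFORMAT 'io.delta.hive.DeltaOutputFormat'
LOCATION 'oss://<oss_endpoint>/<bucket_name>/<directory>/';
```

The LOCATION value must point to the OSS directory that contains the Delta Lake data files, following the same pattern as the partitioned example later in this topic.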
Query and analysis
For more information about the SELECT syntax, see Read data from OSS.
For more information about query plan optimization, see Query optimization.
Example
This example shows how to use the built-in open source data resolver to create a Delta Lake foreign table, write data to OSS, and query the data.
Prepare the data.
Log on to the OSS console and upload a test data file in Delta Lake format to the oss-mc-test/Demo_delta_pt/dt=20250612/ folder in your OSS bucket. For more information, see Upload files to OSS.
Create a Delta Lake foreign table.
CREATE EXTERNAL TABLE test_delta_pt
(
    id INT
)
PARTITIONED BY (dt STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS
INPUTFORMAT 'io.delta.hive.DeltaInputFormat'
OUTPUTFORMAT 'io.delta.hive.DeltaOutputFormat'
LOCATION 'oss://oss-cn-hangzhou-internal.aliyuncs.com/oss-mc-test/Demo_delta_pt/';
Import partition data. If the created OSS foreign table is a partitioned table, you must also import the partition data. For more information, see Syntax for adding partition data to an OSS foreign table.
MSCK REPAIR TABLE test_delta_pt ADD PARTITIONS;
Query the Delta Lake foreign table.
SELECT * FROM test_delta_pt WHERE dt='20250612';
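As a quick sanity check after the query above, you can count the rows in the imported partition. This sketch assumes only the test_delta_pt table and dt partition column created in this example:

```sql
-- Count rows per partition; assumes the example table created above.
SELECT dt, COUNT(*) AS row_cnt
FROM test_delta_pt
WHERE dt = '20250612'
GROUP BY dt;
```

Because Delta Lake foreign tables support only full reads of the mapped files, a partition filter such as dt = '20250612' is the main way to limit the data scanned.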