MaxCompute lets you create external tables that point to Apache Iceberg data stored in Object Storage Service (OSS). You can query and write that data directly without moving it into MaxCompute.
Iceberg external table support is currently in beta. Run SET odps.service.mode=off; before any read or write (DML) operation.
Limitations
-
SDK version: MaxCompute bundles Iceberg SDK 1.6.1.
-
Read: Supports Iceberg table format v2. Time travel is not supported.
-
Write: Supports
INSERT INTOandINSERT OVERWRITE.UPSERTis not supported.
Choose a table mode
MaxCompute provides two modes for Iceberg external tables. Review the differences before creating a table.
| Standard external table | Delegate-mode external table | |
|---|---|---|
| Schema definition | Specify column names and data types at create time | Not required — detected automatically from Iceberg metadata |
| Schema evolution | Does not auto-update when the underlying Iceberg schema changes | Re-read from the current file on each read |
| Empty OSS directory | Allowed — Iceberg creates the schema on first write | Not allowed — reads and writes fail on an empty directory |
| Partition handling | Specify partition columns in PARTITIONED BY |
No partition columns in the CREATE statement; partition values appear as the last column in query results |
Use a standard external table when:
-
Your team defines and controls the schema, and it will not evolve frequently.
-
You are writing new data and the OSS directory does not yet contain Iceberg files.
Use a delegate-mode external table when:
-
Iceberg data already exists in OSS and the schema may evolve.
-
You want MaxCompute to always reflect the current Iceberg schema without manual updates.
Prerequisites
Before you begin, make sure you have:
-
An OSS bucket with the Iceberg data path ready
-
A RAM role (
aliyunodpsdefaultrole) that grants MaxCompute access to your OSS bucket. Record the role ARN in the formatacs:ram::<uid>:role/aliyunodpsdefaultrole -
For delegate-mode: readable Iceberg files already present at the OSS path
Create a standard external table
Syntax
CREATE EXTERNAL TABLE [IF NOT EXISTS] <mc_oss_extable_name>
(
<col_name> <data_type>,
...
)
[COMMENT <table_comment>]
[PARTITIONED BY (<col_name> <data_type>, ...)]
STORED AS ICEBERG
WITH SERDEPROPERTIES (
'odps.properties.rolearn'='acs:ram::<uid>:role/aliyunodpsdefaultrole'
)
LOCATION '<oss_location>'
TBLPROPERTIES ('<tbproperty_name>'='<tbproperty_value>', ...);
The schema you specify at create time is fixed. When the underlying Iceberg data schema evolves (for example, columns are added), you must update the external table definition manually.
For common parameters (storage format, compression, and so on), see Common parameter descriptions.
TBLPROPERTIES for Iceberg data
When your OSS data is in Iceberg format, add the following properties to TBLPROPERTIES:
| Property | Description | Values |
|---|---|---|
iceberg_uuid_fields |
Marks columns whose Iceberg type is UUID. Use ; to separate multiple columns. |
col1;col2, col_struct.field1, col_list.entry, col_map.key, col_map.val |
iceberg_time_fields |
Marks columns whose Iceberg type is TIME. Use ; to separate multiple columns. |
col1;col2, col_struct.field1, col_list.entry, col_map.key, col_map.val |
iceberg_write_data_format |
Output file format. Default: Parquet. | ORC, Parquet, Avro |
Example
SET odps.service.mode=off;
SET odps.sql.type.system.odps2=true;
SET odps.sql.decimal.odps2=true;
SET odps.sql.hive.compatible=true;
CREATE EXTERNAL TABLE ext_tbl_iceberg_ordinary
(
id bigint COMMENT 'User unique identifier',
name string COMMENT 'User name',
age bigint COMMENT 'User age',
gender string COMMENT 'User gender',
height float COMMENT 'User height in centimeters',
birthday date COMMENT 'User birthday',
phone_number string COMMENT 'User phone number',
email string COMMENT 'User email address',
address string COMMENT 'User address',
salary decimal(18, 2) COMMENT 'User salary in USD',
create_time timestamp COMMENT 'Time when user information was created',
update_time timestamp COMMENT 'Time when user information was last updated',
is_deleted boolean COMMENT 'Flag indicating whether the user record has been deleted'
)
STORED AS ICEBERG
LOCATION 'oss://oss-cn-<region>-internal.aliyuncs.com/<oss-path>/'
TBLPROPERTIES (
'iceberg_uuid_fields'='id',
'iceberg_write_data_format'='parquet',
'odps.properties.rolearn'='acs:ram::<uid>:role/aliyunodpsdefaultrole'
);
Replace the following placeholders:
| Placeholder | Description | Example |
|---|---|---|
<region> |
OSS region ID | cn-hangzhou |
<oss-path> |
Path to the Iceberg data directory in the bucket | myproject/iceberg/users |
<uid> |
Your Alibaba Cloud account UID | 123456789012 |
Create a delegate-mode external table
Syntax
Delegate mode reads schema and partition information directly from the Iceberg metadata in OSS. No column definitions are needed.
The OSS path must contain readable Iceberg data. Pointing a delegate-mode table at an empty directory causes all reads and writes to fail.
CREATE EXTERNAL TABLE [IF NOT EXISTS] <mc_oss_extable_name>
[COMMENT <table_comment>]
STORED AS ICEBERG
LOCATION '<oss_location>'
TBLPROPERTIES (
'iceberg_write_data_format'='[parquet|orc|avro]',
'odps.properties.rolearn'='acs:ram::<uid>:role/aliyunodpsdefaultrole'
);
For common parameters, see Common parameter descriptions.
Parameters
When your OSS data files are in Iceberg format, you can add the following TBLPROPERTIES attribute:
| Property | Description |
|---|---|
iceberg_write_data_format |
Specifies the output data format. |
Example
SET odps.service.mode=off;
CREATE EXTERNAL TABLE ext_tbl_iceberg_delegate
STORED AS ICEBERG
LOCATION 'oss://oss-cn-<region>-internal.aliyuncs.com/<oss-path>/'
TBLPROPERTIES (
'odps.properties.rolearn'='acs:ram::<uid>:role/aliyunodpsdefaultrole'
);
Write data
Both INSERT INTO and INSERT OVERWRITE are supported. For the full write syntax, see Syntax reference.
-- Turn off online mode (required in beta)
SET odps.service.mode=off;
INSERT INTO <table-name> VALUES (...);
INSERT OVERWRITE TABLE <table-name> VALUES (...);
Writing to a partitioned delegate-mode table: Do not specify partition columns in the INSERT statement. Append the partition values at the end of each row. MaxCompute creates the corresponding partition directories in OSS automatically.
Query data
For the full SELECT syntax, see Syntax reference. For query optimization guidance, see Query optimization.
-- Turn off online mode (required in beta)
SET odps.service.mode=off;
SELECT * FROM <table-name> WHERE ...;
Querying a partitioned delegate-mode table: Partition values appear in the last column of the result set. No special syntax is needed.
Partition operations
Standard external table partitions
| Operation | Supported |
|---|---|
| Add partition | Yes |
| Update partition timestamp | Yes |
| List all partitions | Yes |
| View partition information | Yes |
| Delete partition | Yes |
| Modify partition value | No |
| Merge partitions | No |
| Delete partition data | No |
If you cannot list partitions or read partitioned data, run ADD PARTITION to register the partition. For syntax, see Partition data syntax.
Delegate-mode table partitions
Delegate-mode tables discover partitions automatically from the Iceberg metadata. The CREATE statement is the same for partitioned and non-partitioned tables — do not specify partition keys.
Example: create and query a partitioned delegate-mode table
-- Create the table (no partition keys)
CREATE EXTERNAL TABLE ext_tbl_iceberg_pt
STORED AS ICEBERG
LOCATION 'oss://oss-cn-<region>-internal.aliyuncs.com/<oss-path>/'
TBLPROPERTIES (
'odps.properties.rolearn'='acs:ram::<uid>:role/aliyunodpsdefaultrole'
);
-- Query: partition values appear in the last column
SELECT * FROM ext_tbl_iceberg_pt;
DDL statements
Show the CREATE TABLE statement
SHOW CREATE TABLE <table-name>;
View table details
By default, DESC does not return schema information for delegate-mode tables. To include it, enable the following flag first:
SET odps.console.forward.commands.to.sql=true;
DESC [EXTENDED] <table-name>;
Supported data types
MaxCompute maps Iceberg types to MaxCompute types as shown below. For the full Iceberg type specification, see Iceberg primitive types.
| Iceberg type | MaxCompute type | Notes |
|---|---|---|
Types.BooleanType |
BOOLEAN |
|
Types.IntegerType |
INT |
|
Types.LongType |
BIGINT |
|
Types.FloatType |
FLOAT |
|
Types.DoubleType |
DOUBLE |
|
Types.DecimalType |
DECIMAL(precision, scale) |
0 <= precision <= 38; 0 <= scale <= 18 |
Types.DateType |
DATE |
|
Types.TimeType |
BIGINT |
Written with microsecond precision. Specify TIME fields using the iceberg_time_fields TBLPROPERTIES key (e.g., col1;col2, col1.entry, col1.key;col1.value, col1.field1). |
Types.TimestampType |
TIMESTAMP_NTZ |
|
Types.TimestampType_z |
TIMESTAMP |
|
Types.StringType |
STRING |
|
Types.UUIDType |
BINARY |
UUID length must not exceed 16 bytes. Specify UUID fields using the iceberg_uuid_fields TBLPROPERTIES key (e.g., col1;col2, col1.entry, col1.key;col1.val, col1.field1). For UUID byte-array conversion, see the Java UUID reference and conversion guide. |
Types.FixedType |
BINARY |
|
Types.BinaryType |
BINARY |
|
TypeID.STRUCT |
STRUCT |
|
TypeID.LIST |
ARRAY |
|
TypeID.MAP |
MAP |
|
| — | TINYINT |
No Iceberg equivalent. Cannot be used in Iceberg external tables. |
| — | SMALLINT |
No Iceberg equivalent. Cannot be used in Iceberg external tables. |
| — | VARCHAR(n) |
No Iceberg equivalent. Cannot be used in Iceberg external tables. |
| — | CHAR(n) |
No Iceberg equivalent. Cannot be used in Iceberg external tables. |
| — | DATETIME |
No Iceberg equivalent. Cannot be used in Iceberg external tables. |
| — | JSON |
No Iceberg equivalent. Cannot be used in Iceberg external tables. |