Hologres V3.2 and later supports accessing Paimon tables managed by a Data Lake Formation (DLF) catalog. Map a DLF catalog to an External Database or a foreign server, then query, write to, and time-travel Paimon tables using SQL—without migrating data into Hologres native storage.
Usage notes
Capability by version:
| Capability | Minimum version |
|---|---|
| Read Paimon tables via External Database | V3.2 |
| Time Travel, branch queries, and data mirroring | V3.2 |
| Dynamic Table (full, incremental, and auto refresh) | V3.2 |
Write to Paimon primary key tables (INSERT INTO, INSERT OVERWRITE, UPDATE, DELETE) | V4.0 |
| Read data from tags corresponding to expired snapshots | V3.2.6 / V4.0.6 |
Key exclusions:
DLF services must be in the same region as the Hologres instance.
Time Travel, branch queries, and data mirroring are only available for foreign tables mapped via an External Database—not for foreign tables created via a foreign server.
Time Travel is not supported on tables with schema evolution, or in Dynamic Table queries.
Data mirroring does not support Paimon primary key tables.
Read-only replica instances do not support data lake acceleration.
Writing to Paimon dynamic bucketing tables (created with
bucket = -1or without abucketparameter) is not supported.TRUNCATEis not supported on foreign tables. UseINSERT OVERWRITEwith empty data instead.UPDATEandDELETEare only supported on primary key tables where the Merge Engine isdeduplicateorpartial-update. Updating primary key columns is not supported.For
partial-updatetables, enable thepartial-update.remove-record-on-deleteproperty before runningDELETE.Foreign tables do not support mixed DML transactions. Only single DML operations use a two-phase commit protocol to guarantee atomicity.
Default partitioned data (empty or NULL partition values) cannot be read from Paimon.
Prerequisites
Before you begin, make sure you have:
Activated DLF service V2.5 or later and created a DLF catalog. See Authorize and activate DLF and Data catalog.
For DLF service endpoints, see Service endpoints.
A Hologres instance with the data lake acceleration service enabled. See Purchase a Hologres instance and Data lake acceleration for OSS access using DLF.
Connect to a DLF catalog
Hologres supports two approaches to connect to a DLF catalog:
| Approach | Best for |
|---|---|
| External Database (recommended) | Most use cases. Supports Time Travel, branch queries, and data mirroring. Requires V3.2+. |
| Foreign server (legacy) | DLF 1.0 environments or workloads that don't require Time Travel. |
Use an External Database (recommended)
An External Database maps a DLF catalog into Hologres so you can query, manage, and mirror Paimon tables. It supports two authentication modes:
| Mode | When to use |
|---|---|
| SLR mode (default) | Access the catalog with your current Alibaba Cloud identity. Suitable for single-account setups. |
| STS mode | Cross-account access, or when Hologres BASIC accounts need to access the catalog. |
SLR mode
A service-linked role (SLR) lets Hologres access other Alibaba Cloud services on your behalf using your current identity. For instances that are newly purchased or upgraded to V2.2 or later, the SLR is created automatically.
For instances running earlier than V2.2, create the SLR manually on the RAM Quick Authorization page.
Connect to the Hologres instance and create an External Database.
Parameter Required Description catalog_typeYes Set to paimon.metastore_typeYes Set to dlf-restfor DLF V2.5+.dlf_catalogYes The name of the DLF catalog to map. commentNo A description for the External Database. CREATE EXTERNAL DATABASE <ext_database_name> WITH catalog_type 'paimon', metastore_type 'dlf-rest', dlf_catalog '<dlf_catalog_name>', comment 'dlf catalog';List the schemas and tables in the External Database.
-- List schemas SELECT * FROM hologres.hg_external_schemas('<ext_database_name>'); -- List tables in a schema SELECT * FROM hologres.hg_external_tables('<ext_database_name>', '<ext_schema_name>');Query table data.
SELECT * FROM <ext_database_name>.<ext_schema_name>.<ext_table_name>;
STS mode
Security Token Service (STS) issues temporary credentials with limited scope and validity. Use STS mode for cross-account access or when BASIC accounts need to access the catalog.
In the RAM console, create a RAM role and grant it the
AliyunDLFFullAccessorAliyunDLFReadOnlyAccesspermission. See Create a RAM role and grant permissions.Update the trust policy of the RAM role to allow Hologres to assume it. See Modify the trust policy of a RAM role.
{ "Statement": [ { "Action": "sts:AssumeRole", "Effect": "Allow", "Principal": { "Service": [ "hologres.aliyuncs.com" ] } } ], "Version": "1" }In the Data Lake Formation console, grant the RAM role access to the target databases and tables in the DLF catalog.
Create an External Database in Hologres, specifying the RAM role as the authentication policy.
Parameter Required Description catalog_typeYes Set to paimon.metastore_typeYes Set to dlf-restfor DLF V2.5+.dlf_catalogYes The name of the DLF catalog to map. rolearnYes The ARN of the RAM role to assume. Format: acs:ram::<account_id>:role/<role_name>.commentNo A description for the External Database. CREATE EXTERNAL DATABASE <ext_database_name> WITH catalog_type 'paimon', metastore_type 'dlf-rest', dlf_catalog '<dlf_catalog_name>', rolearn 'acs:ram::<account_id>:role/<role_name>', comment 'dlf catalog';Create a USER MAPPING to associate the RAM user or BASIC account with the RAM role.
- For USER MAPPING syntax, see CREATE USER MAPPING. - To create a RAM user, see Create a RAM user. - To create a BASIC account, see User management.
CREATE USER MAPPING FOR "<ram_user_or_basic_account>" EXTERNAL DATABASE <ext_database_name> OPTIONS ( rolearn 'acs:ram::<account_id>:role/<role_name>' );Log in to the Hologres console as the RAM user, or reconnect to the Hologres instance as the BASIC account, then run queries.
SELECT * FROM <ext_database_name>.<ext_schema_name>.<ext_table_name>;
More operations:
External Database: EXTERNAL DATABASE
External Schema: EXTERNAL SCHEMA
External Table: EXTERNAL TABLE
Use a foreign server (legacy)
The foreign server approach does not support Time Travel, branch queries, or data mirroring. Use an External Database for those features.
Step 1: Create a foreign server.
For DLF V2.5 and later:
CREATE SERVER IF NOT EXISTS <server_name> FOREIGN DATA WRAPPER dlf_fdw OPTIONS (
catalog_type 'paimon',
metastore_type 'dlf-rest',
dlf_catalog '<dlf_catalog_name>'
);| Parameter | Required | Description |
|---|---|---|
catalog_type | Yes | Set to paimon. |
metastore_type | Yes | Set to dlf-rest for DLF V2.5+. |
dlf_catalog | Yes | The name of the DLF catalog to map. |
For DLF V1.0:
CREATE SERVER IF NOT EXISTS <server_name> FOREIGN DATA WRAPPER dlf_fdw OPTIONS (
dlf_region '',
dlf_endpoint 'dlf-share.<region>.aliyuncs.com',
oss_endpoint 'oss-<region_id>-internal.aliyuncs.com',
dlf_catalog '<dlf_catalog_name>'
);| Parameter | Required | Description |
|---|---|---|
dlf_region | Yes | The region ID of the DLF service. |
dlf_endpoint | Yes | The DLF service endpoint. Format: dlf-share.<region>.aliyuncs.com. |
oss_endpoint | Yes | The OSS internal endpoint. Format: oss-<region_id>-internal.aliyuncs.com. |
dlf_catalog | Yes | The name of the DLF catalog to map. |
For more foreign server operations, see More foreign server operations.
Step 2: Create foreign tables.
Use IMPORT FOREIGN SCHEMA to import all tables in a DLF database:
-- Import all tables
IMPORT FOREIGN SCHEMA <dlf_db_name>
FROM SERVER <server_name>
INTO <holo_schema_name>
OPTIONS (if_table_exist 'update');
-- Import specific tables
IMPORT FOREIGN SCHEMA <dlf_db_name>
LIMIT TO (<table_name>[, ...])
FROM SERVER <server_name>
INTO <holo_schema_name>
OPTIONS (if_table_exist 'update');Or create a single foreign table with CREATE FOREIGN TABLE:
CREATE FOREIGN TABLE <foreign_table_name>
(
<column_name> <data_type>
[, ...]
)
SERVER <server_name>
OPTIONS
(
schema_name '<dlf_db_name>',
table_name '<dlf_table_name>'
);For syntax details, see IMPORT FOREIGN SCHEMA and CREATE FOREIGN TABLE.
Read Paimon data
Collect statistics for query optimization
Run ANALYZE or AUTO ANALYZE to collect table statistics. The query optimizer uses these statistics to build optimal query plans.
-- Collect statistics for all columns
ANALYZE <ext_database_name>.<ext_schema_name>.<ext_table_name>;
-- Collect statistics for specific columns (more accurate; use for filter columns)
ANALYZE <ext_database_name>.<ext_schema_name>.<ext_table_name>(<col_name>[, <col_name>]);
-- Enable AUTO ANALYZE for an External Database
ALTER EXTERNAL DATABASE <ext_database_name> WITH enable_auto_analyze 'true';For more details, see ANALYZE and AUTO ANALYZE.
Query with Time Travel
Hologres V3.2 and later supports querying historical snapshots of Paimon tables by timestamp, version (snapshot ID), or tag.
Time Travel is only supported for foreign tables mapped via an External Database. It is not supported on tables with schema evolution or in Dynamic Table queries.
List available snapshots:
SELECT * FROM hologres.hg_list_snapshots('<ext_database_name>.<ext_schema_name>.<ext_table_name>');The snapshots table returns the following fields:
| Field | Type | Description |
|---|---|---|
branch_name | TEXT | Branch name |
snapshot_id | TEXT | Snapshot ID |
schema_id | TEXT | Schema ID |
commit_kind | TEXT | Commit type |
commit_time | TIMESTAMPTZ | Commit time |
extend_info | TEXT (JSON) | Remaining Paimon snapshot properties |
You can also query the Paimon Snapshots system table using the hg_list_versions function. For Paimon tables, Version corresponds to Snapshot ID.
SELECT * FROM hologres.hg_list_versions('<ext_database_name>.<ext_schema_name>.<ext_table_name>');List available tags:
SELECT * FROM hologres.hg_list_tags('<ext_database_name>.<ext_schema_name>.<ext_table_name>');| Field | Type | Description |
|---|---|---|
branch_name | TEXT | Branch name |
tag_name | TEXT | Tag name |
snapshot_id | TEXT | Snapshot ID |
schema_id | TEXT | Schema ID |
commit_time | TIMESTAMPTZ | Commit time |
extend_info | TEXT (JSON) | Remaining Paimon tag properties |
Query by timestamp:
SELECT * FROM <ext_database_name>.<ext_schema_name>.<ext_table_name>
FOR TIMESTAMP AS OF '<timestamp>';Query by version (snapshot ID):
SELECT * FROM '<ext_database_name>.<ext_schema_name>.<ext_table_name>'
FOR VERSION AS OF '<version>';Query by tag:
SELECT * FROM '<ext_database_name>.<ext_schema_name>.<ext_table_name>'
FOR TAG AS OF '<tag>';Hologres V3.2.6 and V4.0.6 and later support reading data from tags that correspond to expired Paimon snapshots.
For more information on Paimon system tables, see Paimon system tables.
Query Paimon branches
Hologres V3.2 and later supports querying data from a specific Paimon branch and querying historical snapshots within a branch.
List available branches:
SELECT * FROM hologres.hg_list_branches('<ext_database_name>.<ext_schema_name>.<ext_table_name>');| Field | Type | Description |
|---|---|---|
branch_name | TEXT | Branch name |
create_time | TIMESTAMPTZ | Branch creation time |
extend_info | TEXT (JSON) | Remaining Paimon branch properties |
Query the current data of a branch:
SELECT * FROM '<ext_database_name>.<ext_schema_name>.<ext_table_name>'
FOR branch AS OF '<branch_name>';If scan.fallback-branch is set on the Paimon table, Hologres automatically reads missing partitions from the fallback branch. See Manage branches.Query historical snapshots within a branch:
-- By timestamp
SELECT * FROM '<ext_database_name>.<ext_schema_name>.<ext_table_name>'
FOR branch AS OF '<branch_name>' TIMESTAMP AS OF '<timestamp>';
-- By tag
SELECT * FROM '<ext_database_name>.<ext_schema_name>.<ext_table_name>'
FOR branch AS OF '<branch_name>' TAG AS '<tag_name>';Accelerate queries with lake table mirroring
Hologres V3.2 and later supports lake table mirroring, which syncs Paimon table metadata and data into Hologres in near-real-time or at scheduled intervals. Mirrored data is served from Hologres storage, significantly reducing query latency. Both full table mirroring and partial mirroring of partitioned tables are supported.
Lake table mirroring does not support Paimon primary key tables.
For setup details, see Lake table mirroring.
Use Dynamic Table for continuous refresh
A Dynamic Table automatically processes and stores aggregated results from one or more Paimon foreign tables, with built-in refresh strategies. Starting from V3.0, MaxCompute foreign tables and DLF foreign tables are supported as base tables for a Dynamic Table.
Dynamic Table does not support Time Travel queries on Paimon tables or branches.
Full refresh — replaces all data on each refresh cycle:
CREATE DYNAMIC TABLE paimon_dt_table
WITH (
auto_refresh_mode = 'full',
freshness = '3 minutes'
) AS SELECT * FROM <ext_database_name>.<ext_schema_name>.<ext_table_name>;Incremental refresh — processes only new or changed data:
CREATE DYNAMIC TABLE paimon_dt_table_incremental
WITH (
auto_refresh_mode = 'incremental',
freshness = '3 minutes'
) AS SELECT * FROM <ext_database_name>.<ext_schema_name>.<ext_table_name>;Auto refresh — automatically selects full or incremental refresh based on data changes:
CREATE DYNAMIC TABLE paimon_dt_table_auto
WITH (
auto_refresh_mode = 'auto',
freshness = '3 minutes'
) AS SELECT * FROM <ext_database_name>.<ext_schema_name>.<ext_table_name>;For more details, see Dynamic Table overview.
Write to Paimon tables
Hologres V4.0 and later supports writing to Paimon primary key tables via an External Database. The write process shuffles data by bucket before writing to reduce small file generation. A memory-based caching mechanism supports tables that require caching, including:
Tables with
Merge Engineset toFirst RowTables that generate a Changelog through the
lookupmechanismTables with
Deletion Vectorenabled
INSERT INTO
The following example creates a Paimon primary key table and inserts rows.
-- Create a Paimon primary key table
CREATE EXTERNAL TABLE <ext_database_name>.<ext_schema_name>.<ext_table_name> (
id BIGINT,
customer_id BIGINT,
item TEXT,
amount FLOAT,
dt TEXT,
PRIMARY KEY (id)
)
WITH (
"table_format" = 'paimon',
"file_format" = 'orc',
"merge-engine" = 'deduplicate', -- Adjust as needed
"bucket" = '4' -- Adjust as needed
);
-- Insert rows
INSERT INTO <ext_database_name>.<ext_schema_name>.<ext_table_name>
(id, customer_id, item, amount, dt)
VALUES
(1, 1001, 'apple', 3.50, '2025-09-10'),
(2, 1002, 'banana', 2.00, '2025-09-10'),
(3, 1003, 'orange', 4.20, '2025-09-11');INSERT OVERWRITE
The following example creates a partitioned Paimon append-only table and overwrites a partition.
-- Create a partitioned Paimon append-only table
CREATE EXTERNAL TABLE <ext_database_name>.<ext_schema_name>.<ext_table_name> (
id BIGINT,
name TEXT,
amount FLOAT8,
dt TEXT
)
LOGICAL PARTITION BY LIST(dt)
WITH (
"table_format" = 'paimon',
"file_format" = 'parquet',
"merge-engine" = 'deduplicate',
"bucket-key" = 'id',
"bucket" = '2'
);
-- Overwrite data
INSERT OVERWRITE <ext_database_name>.<ext_schema_name>.<ext_table_name>
(id, name, amount, dt)
VALUES
(1, 'Alice', 100.50, '2025-09-01'),
(2, 'Bob', 200.00, '2025-09-01'),
(3, 'Carol', 150.75, '2025-09-02');UPDATE
UPDATE is supported only on Paimon primary key tables where the Merge Engine is deduplicate or partial-update. Updating primary key columns is not supported.
UPDATE <ext_database_name>.<ext_schema_name>.<ext_table_name>
SET amount = 5
WHERE id = 1;DELETE
DELETE is supported only on Paimon primary key tables where the Merge Engine is deduplicate or partial-update. For partial-update tables, enable the partial-update.remove-record-on-delete property before running DELETE.
DELETE FROM <ext_database_name>.<ext_schema_name>.<ext_table_name>
WHERE id = 2;