All Products
Search
Document Center

Hologres:Access Paimon data with DLF

Last Updated:Mar 17, 2026

Hologres V3.2 and later supports accessing Paimon tables managed by a Data Lake Formation (DLF) catalog. Map a DLF catalog to an External Database or a foreign server, then query, write to, and time-travel Paimon tables using SQL—without migrating data into Hologres native storage.

Usage notes

Capability by version:

CapabilityMinimum version
Read Paimon tables via External DatabaseV3.2
Time Travel, branch queries, and data mirroringV3.2
Dynamic Table (full, incremental, and auto refresh)V3.2
Write to Paimon primary key tables (INSERT INTO, INSERT OVERWRITE, UPDATE, DELETE)V4.0
Read data from tags corresponding to expired snapshotsV3.2.6 / V4.0.6

Key exclusions:

  • DLF services must be in the same region as the Hologres instance.

  • Time Travel, branch queries, and data mirroring are only available for foreign tables mapped via an External Database—not for foreign tables created via a foreign server.

  • Time Travel is not supported on tables with schema evolution, or in Dynamic Table queries.

  • Data mirroring does not support Paimon primary key tables.

  • Read-only replica instances do not support data lake acceleration.

  • Writing to Paimon dynamic bucketing tables (created with bucket = -1 or without a bucket parameter) is not supported.

  • TRUNCATE is not supported on foreign tables. Use INSERT OVERWRITE with empty data instead.

  • UPDATE and DELETE are only supported on primary key tables where the Merge Engine is deduplicate or partial-update. Updating primary key columns is not supported.

  • For partial-update tables, enable the partial-update.remove-record-on-delete property before running DELETE.

  • Foreign tables do not support mixed DML transactions. Only single DML operations use a two-phase commit protocol to guarantee atomicity.

  • Default partitioned data (empty or NULL partition values) cannot be read from Paimon.

Prerequisites

Before you begin, make sure you have:

Connect to a DLF catalog

Hologres supports two approaches to connect to a DLF catalog:

ApproachBest for
External Database (recommended)Most use cases. Supports Time Travel, branch queries, and data mirroring. Requires V3.2+.
Foreign server (legacy)DLF 1.0 environments or workloads that don't require Time Travel.

Use an External Database (recommended)

An External Database maps a DLF catalog into Hologres so you can query, manage, and mirror Paimon tables. It supports two authentication modes:

ModeWhen to use
SLR mode (default)Access the catalog with your current Alibaba Cloud identity. Suitable for single-account setups.
STS modeCross-account access, or when Hologres BASIC accounts need to access the catalog.

SLR mode

A service-linked role (SLR) lets Hologres access other Alibaba Cloud services on your behalf using your current identity. For instances that are newly purchased or upgraded to V2.2 or later, the SLR is created automatically.

For instances running earlier than V2.2, create the SLR manually on the RAM Quick Authorization page.
  1. Connect to the Hologres instance and create an External Database.

    ParameterRequiredDescription
    catalog_typeYesSet to paimon.
    metastore_typeYesSet to dlf-rest for DLF V2.5+.
    dlf_catalogYesThe name of the DLF catalog to map.
    commentNoA description for the External Database.
       CREATE EXTERNAL DATABASE <ext_database_name> WITH
         catalog_type 'paimon',
         metastore_type 'dlf-rest',
         dlf_catalog '<dlf_catalog_name>',
         comment 'dlf catalog';
  2. List the schemas and tables in the External Database.

       -- List schemas
       SELECT * FROM hologres.hg_external_schemas('<ext_database_name>');
    
       -- List tables in a schema
       SELECT * FROM hologres.hg_external_tables('<ext_database_name>', '<ext_schema_name>');
  3. Query table data.

       SELECT * FROM <ext_database_name>.<ext_schema_name>.<ext_table_name>;

STS mode

Security Token Service (STS) issues temporary credentials with limited scope and validity. Use STS mode for cross-account access or when BASIC accounts need to access the catalog.

  1. In the RAM console, create a RAM role and grant it the AliyunDLFFullAccess or AliyunDLFReadOnlyAccess permission. See Create a RAM role and grant permissions.

  2. Update the trust policy of the RAM role to allow Hologres to assume it. See Modify the trust policy of a RAM role.

       {
         "Statement": [
           {
             "Action": "sts:AssumeRole",
             "Effect": "Allow",
             "Principal": {
               "Service": [
                 "hologres.aliyuncs.com"
               ]
             }
           }
         ],
         "Version": "1"
       }
  3. In the Data Lake Formation console, grant the RAM role access to the target databases and tables in the DLF catalog.

  4. Create an External Database in Hologres, specifying the RAM role as the authentication policy.

    ParameterRequiredDescription
    catalog_typeYesSet to paimon.
    metastore_typeYesSet to dlf-rest for DLF V2.5+.
    dlf_catalogYesThe name of the DLF catalog to map.
    rolearnYesThe ARN of the RAM role to assume. Format: acs:ram::<account_id>:role/<role_name>.
    commentNoA description for the External Database.
       CREATE EXTERNAL DATABASE <ext_database_name> WITH
         catalog_type 'paimon',
         metastore_type 'dlf-rest',
         dlf_catalog '<dlf_catalog_name>',
         rolearn 'acs:ram::<account_id>:role/<role_name>',
         comment 'dlf catalog';
  5. Create a USER MAPPING to associate the RAM user or BASIC account with the RAM role.

    - For USER MAPPING syntax, see CREATE USER MAPPING. - To create a RAM user, see Create a RAM user. - To create a BASIC account, see User management.
       CREATE USER MAPPING FOR "<ram_user_or_basic_account>"
       EXTERNAL DATABASE <ext_database_name>
       OPTIONS
       (
         rolearn 'acs:ram::<account_id>:role/<role_name>'
       );
  6. Log in to the Hologres console as the RAM user, or reconnect to the Hologres instance as the BASIC account, then run queries.

       SELECT * FROM <ext_database_name>.<ext_schema_name>.<ext_table_name>;

More operations:

Use a foreign server (legacy)

The foreign server approach does not support Time Travel, branch queries, or data mirroring. Use an External Database for those features.

Step 1: Create a foreign server.

For DLF V2.5 and later:

CREATE SERVER IF NOT EXISTS <server_name> FOREIGN DATA WRAPPER dlf_fdw OPTIONS (
  catalog_type 'paimon',
  metastore_type 'dlf-rest',
  dlf_catalog '<dlf_catalog_name>'
);
ParameterRequiredDescription
catalog_typeYesSet to paimon.
metastore_typeYesSet to dlf-rest for DLF V2.5+.
dlf_catalogYesThe name of the DLF catalog to map.

For DLF V1.0:

CREATE SERVER IF NOT EXISTS <server_name> FOREIGN DATA WRAPPER dlf_fdw OPTIONS (
  dlf_region '',
  dlf_endpoint 'dlf-share.<region>.aliyuncs.com',
  oss_endpoint 'oss-<region_id>-internal.aliyuncs.com',
  dlf_catalog '<dlf_catalog_name>'
);
ParameterRequiredDescription
dlf_regionYesThe region ID of the DLF service.
dlf_endpointYesThe DLF service endpoint. Format: dlf-share.<region>.aliyuncs.com.
oss_endpointYesThe OSS internal endpoint. Format: oss-<region_id>-internal.aliyuncs.com.
dlf_catalogYesThe name of the DLF catalog to map.

For more foreign server operations, see More foreign server operations.

Step 2: Create foreign tables.

Use IMPORT FOREIGN SCHEMA to import all tables in a DLF database:

-- Import all tables
IMPORT FOREIGN SCHEMA <dlf_db_name>
FROM SERVER <server_name>
INTO <holo_schema_name>
OPTIONS (if_table_exist 'update');

-- Import specific tables
IMPORT FOREIGN SCHEMA <dlf_db_name>
LIMIT TO (<table_name>[, ...])
FROM SERVER <server_name>
INTO <holo_schema_name>
OPTIONS (if_table_exist 'update');

Or create a single foreign table with CREATE FOREIGN TABLE:

CREATE FOREIGN TABLE <foreign_table_name>
(
  <column_name> <data_type>
  [, ...]
)
SERVER <server_name>
OPTIONS
(
  schema_name '<dlf_db_name>',
  table_name '<dlf_table_name>'
);

For syntax details, see IMPORT FOREIGN SCHEMA and CREATE FOREIGN TABLE.

Read Paimon data

Collect statistics for query optimization

Run ANALYZE or AUTO ANALYZE to collect table statistics. The query optimizer uses these statistics to build optimal query plans.

-- Collect statistics for all columns
ANALYZE <ext_database_name>.<ext_schema_name>.<ext_table_name>;

-- Collect statistics for specific columns (more accurate; use for filter columns)
ANALYZE <ext_database_name>.<ext_schema_name>.<ext_table_name>(<col_name>[, <col_name>]);

-- Enable AUTO ANALYZE for an External Database
ALTER EXTERNAL DATABASE <ext_database_name> WITH enable_auto_analyze 'true';

For more details, see ANALYZE and AUTO ANALYZE.

Query with Time Travel

Hologres V3.2 and later supports querying historical snapshots of Paimon tables by timestamp, version (snapshot ID), or tag.

Important

Time Travel is only supported for foreign tables mapped via an External Database. It is not supported on tables with schema evolution or in Dynamic Table queries.

List available snapshots:

SELECT * FROM hologres.hg_list_snapshots('<ext_database_name>.<ext_schema_name>.<ext_table_name>');

The snapshots table returns the following fields:

FieldTypeDescription
branch_nameTEXTBranch name
snapshot_idTEXTSnapshot ID
schema_idTEXTSchema ID
commit_kindTEXTCommit type
commit_timeTIMESTAMPTZCommit time
extend_infoTEXT (JSON)Remaining Paimon snapshot properties

You can also query the Paimon Snapshots system table using the hg_list_versions function. For Paimon tables, Version corresponds to Snapshot ID.

SELECT * FROM hologres.hg_list_versions('<ext_database_name>.<ext_schema_name>.<ext_table_name>');

List available tags:

SELECT * FROM hologres.hg_list_tags('<ext_database_name>.<ext_schema_name>.<ext_table_name>');
FieldTypeDescription
branch_nameTEXTBranch name
tag_nameTEXTTag name
snapshot_idTEXTSnapshot ID
schema_idTEXTSchema ID
commit_timeTIMESTAMPTZCommit time
extend_infoTEXT (JSON)Remaining Paimon tag properties

Query by timestamp:

SELECT * FROM <ext_database_name>.<ext_schema_name>.<ext_table_name>
FOR TIMESTAMP AS OF '<timestamp>';

Query by version (snapshot ID):

SELECT * FROM '<ext_database_name>.<ext_schema_name>.<ext_table_name>'
FOR VERSION AS OF '<version>';

Query by tag:

SELECT * FROM '<ext_database_name>.<ext_schema_name>.<ext_table_name>'
FOR TAG AS OF '<tag>';
Hologres V3.2.6 and V4.0.6 and later support reading data from tags that correspond to expired Paimon snapshots.

For more information on Paimon system tables, see Paimon system tables.

Query Paimon branches

Hologres V3.2 and later supports querying data from a specific Paimon branch and querying historical snapshots within a branch.

List available branches:

SELECT * FROM hologres.hg_list_branches('<ext_database_name>.<ext_schema_name>.<ext_table_name>');
FieldTypeDescription
branch_nameTEXTBranch name
create_timeTIMESTAMPTZBranch creation time
extend_infoTEXT (JSON)Remaining Paimon branch properties

Query the current data of a branch:

SELECT * FROM '<ext_database_name>.<ext_schema_name>.<ext_table_name>'
FOR branch AS OF '<branch_name>';
If scan.fallback-branch is set on the Paimon table, Hologres automatically reads missing partitions from the fallback branch. See Manage branches.

Query historical snapshots within a branch:

-- By timestamp
SELECT * FROM '<ext_database_name>.<ext_schema_name>.<ext_table_name>'
FOR branch AS OF '<branch_name>' TIMESTAMP AS OF '<timestamp>';

-- By tag
SELECT * FROM '<ext_database_name>.<ext_schema_name>.<ext_table_name>'
FOR branch AS OF '<branch_name>' TAG AS '<tag_name>';

Accelerate queries with lake table mirroring

Hologres V3.2 and later supports lake table mirroring, which syncs Paimon table metadata and data into Hologres in near-real-time or at scheduled intervals. Mirrored data is served from Hologres storage, significantly reducing query latency. Both full table mirroring and partial mirroring of partitioned tables are supported.

Important

Lake table mirroring does not support Paimon primary key tables.

For setup details, see Lake table mirroring.

Use Dynamic Table for continuous refresh

A Dynamic Table automatically processes and stores aggregated results from one or more Paimon foreign tables, with built-in refresh strategies. Starting from V3.0, MaxCompute foreign tables and DLF foreign tables are supported as base tables for a Dynamic Table.

Important

Dynamic Table does not support Time Travel queries on Paimon tables or branches.

Full refresh — replaces all data on each refresh cycle:

CREATE DYNAMIC TABLE paimon_dt_table
WITH (
  auto_refresh_mode = 'full',
  freshness = '3 minutes'
) AS SELECT * FROM <ext_database_name>.<ext_schema_name>.<ext_table_name>;

Incremental refresh — processes only new or changed data:

CREATE DYNAMIC TABLE paimon_dt_table_incremental
WITH (
  auto_refresh_mode = 'incremental',
  freshness = '3 minutes'
) AS SELECT * FROM <ext_database_name>.<ext_schema_name>.<ext_table_name>;

Auto refresh — automatically selects full or incremental refresh based on data changes:

CREATE DYNAMIC TABLE paimon_dt_table_auto
WITH (
  auto_refresh_mode = 'auto',
  freshness = '3 minutes'
) AS SELECT * FROM <ext_database_name>.<ext_schema_name>.<ext_table_name>;

For more details, see Dynamic Table overview.

Write to Paimon tables

Hologres V4.0 and later supports writing to Paimon primary key tables via an External Database. The write process shuffles data by bucket before writing to reduce small file generation. A memory-based caching mechanism supports tables that require caching, including:

  • Tables with Merge Engine set to First Row

  • Tables that generate a Changelog through the lookup mechanism

  • Tables with Deletion Vector enabled

INSERT INTO

The following example creates a Paimon primary key table and inserts rows.

-- Create a Paimon primary key table
CREATE EXTERNAL TABLE <ext_database_name>.<ext_schema_name>.<ext_table_name> (
  id         BIGINT,
  customer_id BIGINT,
  item       TEXT,
  amount     FLOAT,
  dt         TEXT,
  PRIMARY KEY (id)
)
WITH (
  "table_format"  = 'paimon',
  "file_format"   = 'orc',
  "merge-engine"  = 'deduplicate',  -- Adjust as needed
  "bucket"        = '4'             -- Adjust as needed
);

-- Insert rows
INSERT INTO <ext_database_name>.<ext_schema_name>.<ext_table_name>
  (id, customer_id, item, amount, dt)
VALUES
  (1, 1001, 'apple',  3.50, '2025-09-10'),
  (2, 1002, 'banana', 2.00, '2025-09-10'),
  (3, 1003, 'orange', 4.20, '2025-09-11');

INSERT OVERWRITE

The following example creates a partitioned Paimon append-only table and overwrites a partition.

-- Create a partitioned Paimon append-only table
CREATE EXTERNAL TABLE <ext_database_name>.<ext_schema_name>.<ext_table_name> (
  id     BIGINT,
  name   TEXT,
  amount FLOAT8,
  dt     TEXT
)
LOGICAL PARTITION BY LIST(dt)
WITH (
  "table_format" = 'paimon',
  "file_format"  = 'parquet',
  "merge-engine" = 'deduplicate',
  "bucket-key"   = 'id',
  "bucket"       = '2'
);

-- Overwrite data
INSERT OVERWRITE <ext_database_name>.<ext_schema_name>.<ext_table_name>
  (id, name, amount, dt)
VALUES
  (1, 'Alice', 100.50, '2025-09-01'),
  (2, 'Bob',   200.00, '2025-09-01'),
  (3, 'Carol', 150.75, '2025-09-02');

UPDATE

UPDATE is supported only on Paimon primary key tables where the Merge Engine is deduplicate or partial-update. Updating primary key columns is not supported.

UPDATE <ext_database_name>.<ext_schema_name>.<ext_table_name>
SET amount = 5
WHERE id = 1;

DELETE

DELETE is supported only on Paimon primary key tables where the Merge Engine is deduplicate or partial-update. For partial-update tables, enable the partial-update.remove-record-on-delete property before running DELETE.

DELETE FROM <ext_database_name>.<ext_schema_name>.<ext_table_name>
WHERE id = 2;

What's next