CREATE EXTERNAL TABLE creates an external table in an external schema sourced from Data Lake Formation 2.0 (DLF 2.0). Only Apache Paimon format tables are supported.
Limitations
Requires Hologres V3.0 or later.
External tables must reside in an external schema sourced from DLF 2.0.
Only the Apache Paimon table format is supported.
Only ORC and Parquet file formats are supported.
Syntax
CREATE EXTERNAL TABLE [IF NOT EXISTS] <ext_db_name>.<ext_schema_name>.<ext_table_name>
(
[{
<col_name> <col_type> [<col_constraints> [, ...]] |
<table_constraints> [, ...]
}]
)
[LOGICAL PARTITION BY LIST(<col_name> [, ...])]
[
WITH
(
"<property>" = '<value>' [, ...]
)
];
WITH clause parameters
| Parameter | Description | Example |
|---|---|---|
| table_format | The table format of the external table. Only paimon is supported. | "table_format" = 'paimon' |
| file_format | The file format of the external table. Valid values: orc, parquet. | "file_format" = 'orc' |
| bucket | The number of buckets. Data in a non-partitioned table, or in each partition of a partitioned table, is distributed across buckets to enable concurrent reads and writes. For details, see Data Distribution. | "bucket" = '1' |
| bucket-key | The column used to distribute data into buckets. | "bucket-key" = 'id' |
| changelog-producer | The method for generating changelogs on the table. Changelogs capture complete INSERT, DELETE, and UPDATE records, similar to a binary log in a database, and enable downstream streaming consumption. For details, see Changelog Producer. | "changelog-producer" = 'input' |
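The parameters above can be combined in a single WITH clause. The following is an illustrative sketch only; the database, schema, table, and column names are placeholders, not names from this document.

```sql
-- Illustrative sketch: all WITH clause parameters used together.
-- ext_db.ext_schema.demo_pk_table and its columns are placeholder names.
CREATE EXTERNAL TABLE ext_db.ext_schema.demo_pk_table (
    id TEXT,
    payload TEXT,
    PRIMARY KEY (id)
) WITH (
    "table_format" = 'paimon',        -- only paimon is supported
    "file_format" = 'parquet',        -- valid values: orc, parquet
    "bucket" = '4',                   -- number of buckets
    "bucket-key" = 'id',              -- column used to distribute data into buckets
    "changelog-producer" = 'input'    -- generate changelogs for streaming consumers
);
```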
Examples
Create a partitioned table in DLF 2.0
The following example creates an Apache Paimon append-only table with a logical partition on created_at.
CREATE EXTERNAL TABLE ext_db_dlf.ext_schema_dlf.ext_par_table_dlf(
id TEXT,
created_at BIGINT, -- logical partition key
type TEXT,
actor_id TEXT,
actor_login TEXT,
repo_id TEXT,
repo_name TEXT,
org TEXT,
org_login TEXT
)
LOGICAL PARTITION BY LIST(created_at)
WITH (
"file_format" = 'orc',
"bucket" = '6',
"bucket-key" = 'id' -- distribute data by id across 6 buckets
);
Create a non-partitioned table in DLF 2.0
The following example creates an Apache Paimon primary key table. The changelog-producer parameter specifies the method for generating changelogs to allow downstream consumption in streaming mode.
CREATE EXTERNAL TABLE openlake_win.github_events.gh_event_ods(
id TEXT,
created_at BIGINT,
type TEXT,
actor_id TEXT,
actor_login TEXT,
repo_id TEXT,
repo_name TEXT,
org TEXT,
org_login TEXT,
PRIMARY KEY(id)
) WITH (
"changelog-producer" = 'input', -- generate changelogs for streaming consumers
"bucket" = 6,
"bucket-key" = 'id'
);
More operations
Query the table creation statement
SELECT * FROM hologres.hg_dump_script_external('<ext_db_name>.<ext_schema_name>.<ext_table_name>');
Query column and partition properties
SELECT * FROM hologres.hg_external_columns('<ext_db_name>', '<ext_schema_name>', '<ext_table_name>');
Query all tables in an external schema
SELECT * FROM hologres.hg_external_tables('<ext_db_name>', '<ext_schema_name>');
Refresh the metadata of an external table
REFRESH CACHE FOR EXTERNAL TABLE <ext_db_name>.<ext_schema_name>.<ext_table_name> WITH (cache_level = 'metadata');
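As a worked illustration, the operations above can be run against the partitioned example table created earlier (ext_db_dlf.ext_schema_dlf.ext_par_table_dlf):

```sql
-- Dump the creation statement of the example table
SELECT * FROM hologres.hg_dump_script_external('ext_db_dlf.ext_schema_dlf.ext_par_table_dlf');

-- List its column and partition properties
SELECT * FROM hologres.hg_external_columns('ext_db_dlf', 'ext_schema_dlf', 'ext_par_table_dlf');

-- List all external tables in the schema
SELECT * FROM hologres.hg_external_tables('ext_db_dlf', 'ext_schema_dlf');

-- Refresh the cached metadata after the table changes in DLF
REFRESH CACHE FOR EXTERNAL TABLE ext_db_dlf.ext_schema_dlf.ext_par_table_dlf WITH (cache_level = 'metadata');
```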