CREATE EXTERNAL DATABASE maps an external data source—MaxCompute or a Data Lake Formation (DLF) data lake—into a Hologres instance as a single database object. Once mapped, you can query external tables using the three-part path ext_db.schema.table, run federated queries across internal and external data, and manage all metadata from one place.
External databases are globally visible within a Hologres instance, so any internal database can reference them. You can also connect directly to an external database to run SQL statements and set Grand Unified Configuration (GUC) parameters.
Supported in Hologres V3.0 and later.
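As a sketch of the access patterns described above, the following queries show a federated join and a direct connection. The database, schema, and table names (ext_db, public.orders, default.mc_sales) are placeholders, not names from this document:

```sql
-- Federated query: join an internal Hologres table with a mapped external table
-- using the three-part path ext_db.schema.table.
SELECT o.order_id, s.total_amount
FROM   public.orders o
JOIN   ext_db.default.mc_sales s
ON     o.order_id = s.order_id;

-- Or connect directly to the external database and run SQL there,
-- including setting GUC parameters for the session.
-- \c ext_db
-- SET statement_timeout = '5min';
```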
Choose your data source
The two supported data sources work differently. Pick the right one before writing your statement.
| | MaxCompute | DLF |
|---|---|---|
| What it connects to | A MaxCompute project | A DLF catalog (Paimon format) |
| Storage location | MaxCompute storage | Object Storage Service (OSS) managed by DLF |
| Supported formats | MaxCompute native | Paimon (DLF managed); Paimon, Apache Hudi, Delta Lake, ORC, Parquet, CSV, SequenceFile (DLF V1.0) |
| Use when | Your data lives in MaxCompute projects | Your data lives in OSS data lakes managed by DLF |
Hologres uses its self-developed High-QPS Engine (HQE) for high-performance reads and writes on data lake formats such as Paimon, Hudi, and Delta Lake.
Prerequisites
Instance configuration
Enable data lake acceleration for the instance before running CREATE EXTERNAL DATABASE:
Open the Hologres console and go to the Instance List or Instance Details page.
Find the target instance and click Data Lake Acceleration in the Actions column.
Click Confirm in the dialog box.
Service activation
DLF: Activate the DLF service. See Quick start and Available regions and endpoints.
DLF V1.0: Also activate Object Storage Service (OSS) and complete OSS authorization. Without the required OSS permissions, queries fail even if the foreign table is created successfully. See Bucket policy.
OSS-HDFS (optional): Activate the OSS-HDFS service if you plan to use OSS-HDFS storage. See Activate the OSS-HDFS service.
Permissions
The account running CREATE EXTERNAL DATABASE must have superuser permissions in Hologres.
Permissions on the external data (tables and schemas) are controlled by MaxCompute or DLF, not Hologres. Grant the service-linked role (SLR) AliyunServiceRoleForHologresIdentityMgmt the required permissions before accessing external data. See Hologres service-linked role. To use a RAM role instead of the SLR, see RAM role overview.
For non-Alibaba Cloud accounts and RAM users (for example, Hologres BASIC accounts), use CREATE USER MAPPING to bind and authorize the account. See CREATE USER MAPPING.
Limitations
Cross-database access is one-directional. When connected to an internal database, access external tables via the full path ext_db.schema.table. When connected directly to an external database, internal database tables are not accessible.
Metadata cache TTL is fixed at 5 minutes. Hologres caches external metadata for 5 minutes to improve performance. This value is not configurable. To force a refresh, use the REFRESH CACHE statement described in Refresh external metadata.
MaxCompute: only internal projects are supported. External MaxCompute projects cannot be mapped.
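The one-directional rule can be illustrated as follows. The names ext_db, mc_table, and my_internal_db are placeholders for this sketch:

```sql
-- Connected to an internal database: the full three-part path works.
SELECT * FROM ext_db.default.mc_table LIMIT 10;

-- Connected directly to ext_db: only external objects are visible,
-- so referencing an internal database's table is not possible.
-- SELECT * FROM my_internal_db.public.t1;  -- fails from within ext_db
```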
Syntax
MaxCompute data source
CREATE EXTERNAL DATABASE <ext_database_name> WITH
metastore_type 'maxcompute'
mc_project '<project_name>'
    [comment '<description>'];
Parameters
| Parameter | Required | Description |
|---|---|---|
| metastore_type | Yes | Metastore type. Set to maxcompute. Case-insensitive. |
| mc_project | Yes | Name of the MaxCompute project to map. Only internal MaxCompute projects are supported. |
| comment | No | Description of the external database. |
DLF data source
DLF currently supports only the managed storage mode and the Paimon catalog type. Set metastore_type to dlf-rest and catalog_type to paimon. Authentication defaults to the SLR (AliyunServiceRoleForHologresIdentityMgmt). To use a RAM role instead, configure it via CREATE USER MAPPING.
CREATE EXTERNAL DATABASE <ext_database_name> WITH
catalog_type 'paimon'
metastore_type 'dlf-rest'
dlf_catalog '<dlf_catalog_name>'
    [comment '<description>'];
Parameters
| Parameter | Required | Description |
|---|---|---|
| metastore_type | Yes | Metastore type for the DLF data source. Set to dlf-rest. |
| catalog_type | Yes | Lake table format. Set to paimon. |
| dlf_catalog | Yes | Name of the DLF catalog to map. |
| comment | No | Description of the external database. |
DLF V1.0 data source
DLF V1.0 supports custom OSS storage and additional formats: Paimon, Apache Hudi, Delta Lake, ORC, Parquet, CSV, and SequenceFile. See Data lake acceleration. DLF V1.0 does not support SLR or RAM role authentication. Use CREATE USER MAPPING to provide credentials per user.
CREATE EXTERNAL DATABASE <ext_database_name> WITH
metastore_type 'dlf'
dlf_region '<region>'
dlf_endpoint '<dlf_endpoint>'
dlf_catalog '<catalog_name>'
oss_endpoint '<oss_endpoint>'
    [comment '<description>'];
Parameters
| Parameter | Required | Description | Example |
|---|---|---|---|
| metastore_type | Yes | Metastore type. Set to dlf for DLF V1.0. | dlf |
| catalog_type | No | Lake table format. DLF V1.0 supports Paimon, Hudi, Delta Lake, ORC, Parquet, CSV, and SequenceFile. Not required for DLF V1.0. | paimon |
| dlf_region | Yes | Region where the DLF service is deployed. Format: <nation>-<region>. | cn-beijing |
| dlf_endpoint | Yes | Internal network endpoint for DLF access. Format: dlf-share.<nation>-<region>.aliyuncs.com. See Available regions and endpoints. | dlf-share.cn-beijing.aliyuncs.com |
| dlf_catalog | Yes | Name of the DLF catalog to map. | hive_catalog |
| dlf_access_id | Yes | AccessKey ID for DLF access. Provide via CREATE USER MAPPING. | — |
| dlf_access_key | Yes | AccessKey secret for DLF access. Provide via CREATE USER MAPPING. | — |
| oss_endpoint | Yes | Endpoint for OSS access. Use the internal endpoint for better performance. For OSS-HDFS, use the OSS-HDFS internal endpoint. See Get the OSS-HDFS domain name. | oss-cn-beijing-internal.aliyuncs.com |
| oss_access_id | Yes | AccessKey ID for OSS access. Provide via CREATE USER MAPPING. | — |
| oss_access_key | Yes | AccessKey secret for OSS access. Provide via CREATE USER MAPPING. | — |
| comment | No | Description of the external database. | — |
Examples
MaxCompute data source
When you map a MaxCompute project that uses a two-layer model, Hologres displays it as a three-layer structure: ext_db.schema.mc_table. The schema name defaults to default. For example: SELECT * FROM ext_db.default.mc_table.
-- Create an external database for a MaxCompute project.
CREATE EXTERNAL DATABASE ext_database_mc WITH
metastore_type 'maxcompute'
mc_project 'mc_3_layer_project'
comment 'mc three layer project';
-- Query a table in the external database.
SELECT * FROM ext_database_mc.mc_schema.mc_table;
DLF data source
-- Create an external database for a DLF Paimon catalog.
CREATE EXTERNAL DATABASE ext_database_dlf WITH
metastore_type 'dlf-rest'
catalog_type 'paimon'
dlf_catalog 'dlf_paimon_catalog'
comment 'DLF paimon catalog';
-- Query a table in the external database.
SELECT * FROM ext_database_dlf.dlf_db.paimon_table;
DLF V1.0 data source
DLF V1.0 requires an AccessKey pair for both DLF and OSS. Provide credentials using CREATE USER MAPPING rather than embedding them in the CREATE EXTERNAL DATABASE statement.
-- Create an external database.
CREATE EXTERNAL DATABASE ext_database_dlf1 WITH
metastore_type 'dlf'
dlf_region 'cn-beijing'
dlf_endpoint 'dlf-share.cn-beijing.aliyuncs.com'
dlf_catalog 'hive_catalog'
oss_endpoint 'oss-cn-beijing-internal.aliyuncs.com'
comment 'DLF1.0 hive catalog';
-- Bind credentials for the current user.
CREATE USER MAPPING FOR current_user
EXTERNAL DATABASE ext_database_dlf1
OPTIONS
(
dlf_access_id 'LTxxxxxxxxxx',
dlf_access_key 'y8xxxxxxxxxxxxx',
oss_access_id 'LTxxxxxxxxxx',
oss_access_key 'y8xxxxxxxxxxxxx'
);
-- Query a table in the external database.
SELECT * FROM ext_database_dlf1.dlf_db.hive_table;
More operations
View all external databases
SELECT database_name, options FROM hologres.hg_external_databases();
Refresh external metadata
Force an immediate metadata refresh instead of waiting for the cache to expire:
REFRESH CACHE FOR EXTERNAL DATABASE <EXT_DB_NAME> WITH (cache_level = 'metadata');
Modify the metadata refresh interval
The metadata_refresh_interval_sec parameter controls the full schema refresh interval per frontend (FE) node. The default is 7,200 seconds. Adjust it with:
ALTER EXTERNAL DATABASE <EXT_DB_NAME> WITH metadata_refresh_interval_sec 1800;