All Products
Search
Document Center

Hologres:Access OSS data with Hive Metastore (beta)

Last Updated:Mar 26, 2026

Starting from V2.2, Hologres supports Hive Metastore (HMS) as a metadata source for data lakes built on Object Storage Service (OSS). If your data lake runs on an E-MapReduce (EMR) cluster with OSS or OSS-HDFS as the storage layer, connect Hologres to HMS to query OSS data directly using SQL — without migrating or copying data.

Prerequisites

Before you begin, ensure that you have:

  • An activated OSS service. See Quick Start.

  • An EMR data lake cluster with test data loaded. See Create test data and Create a cluster. The cluster must meet all of the following conditions:

    • Hive version 3.1.3 or later

    • Kerberos authentication disabled

    • Metadata set to Self-managed RDS or Built-in MySQL

  • A Hologres instance with data lake acceleration enabled and a database created. See Purchase a Hologres instance and Create a database.

    To enable data lake acceleration, go to the Hologres instance list. In the Actions column of the target instance, click Data Lake Acceleration and confirm.
  • A network connection between Hologres and the EMR cluster. Because Hologres is deployed in a classic network and EMR runs in a virtual private cloud (VPC), a reverse endpoint is required for the two services to communicate. Submit a network connection request. The Hologres support team will then guide you through the following steps:

    1. Log on to the VPC console and create a reverse endpoint. See Access Alibaba Cloud services.

    2. For the Endpoint Service parameter, select Other Endpoint Services and enter the endpoint service name for the region where your EMR cluster is located:

      Region

      Endpoint service name

      Beijing

      com.aliyuncs.privatelink.cn-beijing.epsrv-2zeokrydzjd6kx3cbwmb

      Shanghai

      com.aliyuncs.privatelink.cn-shanghai.epsrv-uf61fvlfwta7f7dv9n3x

      Zhangjiakou

      com.aliyuncs.privatelink.cn-zhangjiakou.epsrv-8vbno4k4wwvys0eg2swp

    If your region is not listed, the Hologres team will create an endpoint service and provide the name after you submit the request. The connection uses an IP address. If the EMR cluster's IP address changes, reconfigure the connection.

Limitations

  • Read-only secondary Hologres instances do not support data lake acceleration.

  • UPDATE, DELETE, and TRUNCATE are not supported on foreign tables.

  • Auto Load (batch foreign table mapping from HMS) is not supported.

  • EMR clusters with Kerberos authentication enabled are not supported.

Connect Hologres to HMS

Step 1: Create the extension

Run the following SQL command to install the hive_fdw foreign data wrapper (FDW). This operation requires superuser permissions and only needs to run once per database.

CREATE EXTENSION IF NOT EXISTS hive_fdw;

Step 2: Create a foreign server

Create a foreign server that points to your HMS instance and OSS storage.

Before running the command, collect the following values:

  • HMS IP address: In the E-MapReduce console, click Node Management for your cluster. On the Node Management tab, find the Internal IP of the master node.

  • OSS endpoint: In the OSS console, open the bucket overview page and check the Access Ports area.

CREATE SERVER IF NOT EXISTS <server_name> FOREIGN DATA WRAPPER hive_fdw
OPTIONS (
  hive_metastore_uris 'thrift://<hms_ip>:<port>',
  oss_endpoint '<oss_endpoint>'
);
Parameter Required Description Example
server_name Yes A custom name for the foreign server. hive_server
hive_metastore_uris Yes The URI of the Hive Metastore. Format: thrift://<IP>:<port>. Default port is 9083. thrift://172.16.0.250:9083
oss_endpoint Yes The OSS endpoint. For native OSS, use the internal endpoint for better performance. For OSS-HDFS, only the internal endpoint is supported. See examples below

For oss_endpoint, choose based on your storage type:

  • Native OSS: Use the internal endpoint.

    oss-cn-shanghai-internal.aliyuncs.com
  • OSS-HDFS: Only internal network access is supported.

    <bucket_name>.cn-beijing.oss-dls.aliyuncs.com

Step 3: (Optional) Create a user mapping

A user mapping controls which Hologres accounts can access external data through a foreign server. For example, a foreign server owner can grant a Resource Access Management (RAM) user access to OSS data.

For details on the CREATE USER MAPPING syntax, see the PostgreSQL documentation.

-- Grant the current user access to the foreign server
CREATE USER MAPPING FOR current_user SERVER <server_name> OPTIONS (
  dlf_access_id  '<AccessKey_ID>',
  dlf_access_key '<AccessKey_Secret>',
  oss_access_id  '<AccessKey_ID>',
  oss_access_key '<AccessKey_Secret>'
);

-- Grant a RAM user (123xxx) access to the foreign server
CREATE USER MAPPING FOR "p4_123xxx" SERVER <server_name> OPTIONS (
  dlf_access_id  '<AccessKey_ID>',
  dlf_access_key '<AccessKey_Secret>',
  oss_access_id  '<AccessKey_ID>',
  oss_access_key '<AccessKey_Secret>'
);

-- Remove user mappings
DROP USER MAPPING FOR CURRENT_USER SERVER <server_name>;
DROP USER MAPPING FOR "p4_123xxx" SERVER <server_name>;

Step 4: Create a foreign table

Hologres provides two commands for creating foreign tables:

Command Best for
CREATE FOREIGN TABLE A small number of tables, or when you need to map a subset of columns or assign a custom table name.
IMPORT FOREIGN SCHEMA Batch mapping of multiple tables from an external schema.
Hologres supports partitioned tables in OSS. Supported partition key types are TEXT, VARCHAR, and INT.
With CREATE FOREIGN TABLE: define partition fields as regular columns, since this command maps the schema without storing data.
With IMPORT FOREIGN SCHEMA: the system handles field mapping automatically.
If an external table name conflicts with an existing Hologres internal table, IMPORT FOREIGN SCHEMA skips that table. Use CREATE FOREIGN TABLE to map it with a different name.
-- Create a single foreign table
CREATE FOREIGN TABLE <holo_schema_name>.<table_name>
(
  column_name data_type
  [, ...]
)
SERVER <hive_server_name>
OPTIONS (
  schema_name '<ext_db_name>',
  table_name  '<ext_table_name>'
);

-- Import multiple foreign tables in batch
IMPORT FOREIGN SCHEMA <ext_db_name>
[
  { LIMIT TO | EXCEPT }
  ( table_name [, ...] )
]
FROM SERVER <hive_server_name>
INTO <holo_schema_name>
OPTIONS (
  if_table_exist        'update',
  if_unsupported_type   'error'
);

Step 5: Query the foreign table

After creating the foreign table, query it directly to read data from OSS.

Non-partitioned table:

SELECT * FROM <holo_schema>.<hive_table>;

Partitioned table:

SELECT * FROM <holo_schema>.<hive_partition_table>
WHERE <partition_key> = '<partition_value>';