E-MapReduce: Hudi Catalog

Last Updated: Mar 26, 2026

A Hudi catalog is an external catalog that lets you query Apache Hudi data directly in StarRocks without importing it. Use INSERT INTO with a Hudi catalog to transform and load Hudi data into StarRocks internal tables. StarRocks supports Hudi catalogs from version 2.4.

Use cases

| Scenario | Description |
| --- | --- |
| Query acceleration | Run StarRocks queries directly against Hudi tables in your data lake without moving data. |
| Data integration | Read Hudi data and write it into StarRocks internal tables using INSERT INTO. |

Supported capabilities

| Category | Details |
| --- | --- |
| Storage systems | Hadoop Distributed File System (HDFS), Object Storage Service (OSS) |
| Metadata services | Data Lake Formation (DLF), Hive Metastore (HMS) |
| File format | Parquet |
| Compression formats | SNAPPY, LZ4, ZSTD, GZIP, NO_COMPRESSION |
| Table types | Copy On Write (COW), Merge On Read (MOR) |

Create a Hudi catalog

Syntax

CREATE EXTERNAL CATALOG <catalog_name>
[COMMENT <comment>]
PROPERTIES
(
    "type" = "hudi",
    MetastoreParams,
    StorageCredentialParams,
    MetadataUpdateParams
)

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| catalog_name | Yes | Name of the Hudi catalog. Must start with a letter and contain only letters, digits, and underscores (_). Length: 1–64 characters. |
| comment | No | Description of the Hudi catalog. |
| type | Yes | Type of the data source. Set to hudi. |
| MetastoreParams | Yes | Parameters for connecting to the metadata service. See MetastoreParams. |

MetastoreParams

Configure one of the following, depending on your metadata service.

Use DLF

| Property | Required | Description |
| --- | --- | --- |
| hive.metastore.type | Yes | Type of metadata service. Set to dlf. |
| dlf.catalog.id | No | ID of an existing data catalog in DLF. If not specified, StarRocks uses the default DLF catalog. |

Use HMS

| Property | Required | Description |
| --- | --- | --- |
| hive.metastore.type | Yes | Type of metadata service. Set to hive. |
| hive.metastore.uris | Yes | URI of the Hive Metastore service. Format: thrift://<metastore-host>:<port>. The default port is 9083. |

Examples

Example 1: OSS with DLF

CREATE EXTERNAL CATALOG hudi_catalog_dlf
PROPERTIES
(
    "type" = "hudi",
    "hive.metastore.type" = "dlf",
    "dlf.catalog.id" = "<your-dlf-catalog-id>"
);

Example 2: OSS with HMS

CREATE EXTERNAL CATALOG hudi_catalog_hms
PROPERTIES
(
    "type" = "hudi",
    "hive.metastore.type" = "hive",
    "hive.metastore.uris" = "thrift://<metastore-host>:9083"
);

Example 3: HDFS with HMS

CREATE EXTERNAL CATALOG hudi_catalog
PROPERTIES
(
    "type" = "hudi",
    "hive.metastore.type" = "hive",
    "hive.metastore.uris" = "thrift://xx.xx.xx.xx:9083"
);
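
The syntax also accepts an optional COMMENT clause, which none of the examples above use. The following is a minimal sketch, assuming an HMS-backed catalog. The catalog name hudi_catalog_commented and the comment text are placeholders chosen for illustration, not values from this document:

```sql
-- Hypothetical catalog name and comment text.
-- Replace <metastore-host> with the address of your Hive Metastore.
CREATE EXTERNAL CATALOG hudi_catalog_commented
COMMENT "Hudi tables registered in the Hive Metastore"
PROPERTIES
(
    "type" = "hudi",
    "hive.metastore.type" = "hive",
    "hive.metastore.uris" = "thrift://<metastore-host>:9083"
);
```

The comment should then appear in the output of SHOW CREATE CATALOG for that catalog.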

View Hudi catalogs

List all catalogs in your StarRocks cluster:

SHOW CATALOGS;

View the creation statement of a specific Hudi catalog:

SHOW CREATE CATALOG hudi_catalog;

Switch to a Hudi catalog and database

Use either of the following methods.

Option 1: Set catalog, then switch database

-- Switch to the Hudi catalog for the current session:
SET CATALOG <catalog_name>;
-- Switch to the target database:
USE <db_name>;

Option 2: Switch catalog and database in one statement

USE <catalog_name>.<db_name>;
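
As a concrete sketch of Option 1, the following assumes the catalog hudi_catalog from Example 3 and a hypothetical database named hudi_db. SHOW DATABASES and SHOW TABLES are included only as a way to confirm that the switch took effect:

```sql
-- Switch the session to the Hudi catalog created in Example 3.
SET CATALOG hudi_catalog;

-- List the databases that the metadata service exposes through this catalog.
SHOW DATABASES;

-- hudi_db is a hypothetical database name; use one returned by SHOW DATABASES.
USE hudi_db;

-- List the Hudi tables in the current database.
SHOW TABLES;
```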

Query a Hudi table

View the schema of a Hudi table:

DESC[RIBE] <catalog_name>.<database_name>.<table_name>;

View the schema and file storage location:

SHOW CREATE TABLE <catalog_name>.<database_name>.<table_name>;

Query data in a Hudi table:

SELECT * FROM <catalog_name>.<database_name>.<table_name>;
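
In practice, a query usually names the columns it needs and filters on a partition column rather than selecting everything. A minimal sketch, assuming a hypothetical table orders in database hudi_db with columns order_id, amount, and a partition column dt (none of these names come from this document):

```sql
-- Fully qualified name, so the statement works regardless of the session's
-- current catalog and database.
-- Table and column names are hypothetical.
SELECT order_id, amount
FROM hudi_catalog.hudi_db.orders
WHERE dt = '2024-01-01'
LIMIT 10;
```

Because Hudi data is stored as Parquet, reading fewer columns and fewer partitions generally reduces the amount of data scanned.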

Import Hudi data

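Use INSERT INTO to transform and load Hudi data into a StarRocks internal table. The following example loads data from a Hudi table named hudi_table into an OLAP table named olap_tbl in the olap_db database of default_catalog. Because hudi_table is not fully qualified, the statement assumes that the current session has already switched to the Hudi catalog and database that contain it: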

INSERT INTO default_catalog.olap_db.olap_tbl SELECT * FROM hudi_table;
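
The transformation step can be sketched as follows. This example assumes that olap_tbl already exists with columns order_id, amount, and dt, and that the Hudi source is a hypothetical table hudi_catalog.hudi_db.orders with matching columns. All table and column names here are illustrative:

```sql
-- Load a filtered, type-converted subset of a Hudi table into an internal table.
-- Both tables are fully qualified, so the current catalog does not matter.
INSERT INTO default_catalog.olap_db.olap_tbl (order_id, amount, dt)
SELECT
    order_id,
    CAST(amount AS DECIMAL(18, 2)) AS amount,  -- normalize the numeric type
    dt
FROM hudi_catalog.hudi_db.orders
WHERE amount IS NOT NULL;                      -- skip rows without an amount
```

Listing the target columns explicitly keeps the load from depending on the column order of the Hudi table, which SELECT * would rely on.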

Refresh metadata cache

StarRocks caches Hudi metadata and, by default, updates the cache asynchronously to improve query performance. After a schema change or data update on a Hudi table, the cached metadata can lag behind the table. To make sure StarRocks generates accurate query plans immediately, refresh the metadata cache manually:

REFRESH EXTERNAL TABLE <table_name> [PARTITION ('partition_name', ...)];
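
A short sketch of both forms of the statement, assuming the hypothetical table orders in database hudi_db is partitioned by a column dt. The Hive-style partition name 'dt=2024-01-01' is an assumption about how partitions are named in your table. Check the actual partition names before using the PARTITION clause:

```sql
-- Switch to the catalog and database that contain the table (hypothetical names).
SET CATALOG hudi_catalog;
USE hudi_db;

-- Refresh the cached metadata for the whole table.
REFRESH EXTERNAL TABLE orders;

-- Refresh only one partition; the partition name format is assumed.
REFRESH EXTERNAL TABLE orders PARTITION ('dt=2024-01-01');
```

Refreshing a single partition is the narrower operation and may be preferable when only recent partitions have changed.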

Delete a Hudi catalog

DROP CATALOG hudi_catalog;

What's next

For an overview of Apache Hudi, see Overview.