E-MapReduce:Use a DLF Catalog

Last Updated: Mar 26, 2026

EMR Serverless StarRocks can query Paimon, Iceberg, and Hive tables stored in Data Lake Formation (DLF) by connecting through an external catalog. To enable this, you set up RAM-based access control and create a DLF-backed catalog in StarRocks.

Choose your DLF version

DLF has two versions with different catalog types:

  • DLF (current): Paimon catalog (REST) and Iceberg catalog (REST). Use for catalogs created in DLF with REST-based access.

  • DLF 1.0 (legacy): Hive catalog, Iceberg catalog, and Paimon catalog. Use if your catalog was created in the legacy DLF 1.0 service.

Use DLF

Prerequisites

Before you begin, ensure that you have:

  • A Serverless StarRocks instance at version 3.3 or later, with a Minor Version of 3.3.8-1.99 or later. To check the minor version, go to the Version Information section on the Instance Details page. If the minor version is earlier than 3.3.8-1.99, update it. To create an instance, see Create an instance.

  • A data catalog in DLF.

  • A RAM user. To create one, see Create a RAM user.

Step 1: Add a RAM user in StarRocks

DLF uses Resource Access Management (RAM) for access control. By default, StarRocks users have no permissions on DLF resources. Add an existing RAM user to StarRocks before granting DLF permissions.

  1. Go to the instance list page.

    1. Log on to the E-MapReduce console.

    2. In the navigation pane on the left, choose EMR Serverless > StarRocks.

    3. In the top menu bar, select the required region.

  2. On the Instance List page, find your instance and click Connect in the Actions column. For more information, see Connect to a StarRocks instance using EMR StarRocks Manager. Connect using the admin user or a StarRocks super administrator account.

  3. In the left-side menu, choose Security Center > User Management, then click Create User.

  4. In the Create User dialog box, set the following parameters and click OK.

    • User Source: RAM User
    • Username: Select the RAM user (for example, dlf-test)
    • Password / Confirm Password: Enter a custom password
    • Roles: Keep the default value public
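After the user is created, you can confirm that a session is authenticated as the expected RAM user. A quick check to run in the SQL Editor once reconnected (the output depends on your account):

```sql
-- Show the identity of the current session
SELECT CURRENT_USER();

-- List the privileges granted to the current user
SHOW GRANTS;
```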

Step 2: Grant catalog permissions in DLF

  1. Log on to the Data Lake Formation console.

  2. On the Catalogs page, click the name of your catalog.

  3. Click the Permissions tab, then click Grant Permissions.

  4. From the Select DLF User drop-down list, select the RAM user you added in Step 1 (for example, dlf-test).

  5. Set Preset Permission Type to Custom and grant the ALL permission on the current data catalog and all its resources.

  6. Click OK.

Step 3: Create a DLF catalog in StarRocks

Reconnect to the StarRocks instance using the RAM user you added in Step 1. All catalog creation and data access in the following steps uses this RAM user.

To create a query in the SQL Editor, go to the Queries page and click the icon for creating a new query.

Paimon catalog

Run the following SQL statement to create a Paimon catalog backed by DLF:

CREATE EXTERNAL CATALOG `dlf_catalog`
PROPERTIES (
  'type' = 'paimon',
  'uri' = 'http://cn-hangzhou-vpc.dlf.aliyuncs.com',
  'paimon.catalog.type' = 'rest',
  'paimon.catalog.warehouse' = 'StarRocks_test',
  'token.provider' = 'dlf'
);
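Before querying, you can verify that the catalog was registered and that StarRocks can reach the DLF metadata service. A minimal check, assuming the statement above succeeded:

```sql
-- Confirm that the new catalog appears alongside default_catalog
SHOW CATALOGS;

-- Switch the session to the catalog and list its databases
SET CATALOG dlf_catalog;
SHOW DATABASES;
```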

Iceberg catalog

Run the following SQL statement to create an Iceberg catalog backed by DLF:

CREATE EXTERNAL CATALOG `iceberg_catalog`
PROPERTIES
(
  'type' = 'iceberg',
  'iceberg.catalog.type' = 'dlf_rest',
  'uri' = 'http://cn-hangzhou-vpc.dlf.aliyuncs.com/iceberg',
  'warehouse' = 'iceberg_test',
  'rest.signing-region' = 'cn-hangzhou'
);
Note: Iceberg external tables are read-only. You can run SELECT queries but cannot write data to Iceberg tables from StarRocks.
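Before running queries, you can confirm that the catalog resolves metadata correctly. A sketch that uses the `default` database and test_iceberg table from the query example in Step 4 (substitute your own names):

```sql
-- List databases visible through the Iceberg catalog
SHOW DATABASES FROM iceberg_catalog;

-- Inspect the schema of a table; Iceberg access is read-only
DESC iceberg_catalog.`default`.test_iceberg;
```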

Step 4: Read and write data

Read and write data (Paimon catalog)

Create a database and table, insert data, and then run a query:

-- Create a database
CREATE DATABASE IF NOT EXISTS dlf_catalog.sr_dlf_db;

-- Create a table
CREATE TABLE dlf_catalog.sr_dlf_db.ads_age_pvalue_analytics (
  final_gender_code STRING COMMENT 'Gender',
  age_level         STRING COMMENT 'Age level',
  pvalue_level      STRING COMMENT 'Consumption level',
  clicks            INT    COMMENT 'Number of clicks',
  total_behaviors   INT    COMMENT 'Total number of behaviors'
);

-- Insert data
INSERT INTO dlf_catalog.sr_dlf_db.ads_age_pvalue_analytics
  (final_gender_code, age_level, pvalue_level, clicks, total_behaviors)
VALUES
  ('M', '18-24', 'Low',    1500, 2500),
  ('F', '25-34', 'Medium', 2200, 3300),
  ('M', '35-44', 'High',   2800, 4000);

-- Query data
SELECT * FROM dlf_catalog.sr_dlf_db.ads_age_pvalue_analytics;

The query returns the inserted rows.

Query data (Iceberg catalog)

SELECT * FROM iceberg_catalog.`default`.test_iceberg;

The query returns the rows in the test_iceberg table.

Use DLF 1.0 (legacy)

Prerequisites

Before you begin, ensure that you have a Serverless StarRocks instance and a data catalog created in the legacy DLF 1.0 service.

Create a Hive catalog

Use the following syntax to create a Hive catalog that points to DLF 1.0 as the metastore:

Syntax

CREATE EXTERNAL CATALOG <catalog_name>
[COMMENT <comment>]
PROPERTIES
(
    "type" = "hive",
    GeneralParams,
    MetastoreParams
)

Parameters

  • catalog_name (required): Name of the Hive catalog. The name must start with a letter and can contain only letters (a–z, A–Z), digits (0–9), and underscores (_). Maximum length: 64 characters.

  • comment (optional): Description of the Hive catalog.

  • type (required): Type of the data source. Set the value to hive.

GeneralParams supports the following parameter:

  • enable_recursive_listing (optional): Specifies whether StarRocks recursively reads data from the subdirectories of a table or partition directory. Valid values: true (default), which traverses subdirectories, and false, which reads only the top-level directory.

MetastoreParams specifies how StarRocks accesses Hive metadata:

  • hive.metastore.type (required): Type of the metadata service. Set the value to dlf.

  • dlf.catalog.id (optional): ID of an existing data catalog in DLF 1.0. If this parameter is not specified, the default DLF catalog is used.
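The GeneralParams and MetastoreParams entries share the same PROPERTIES clause. As an illustration, a catalog that disables recursive listing (the catalog name here is a placeholder):

```sql
CREATE EXTERNAL CATALOG hive_catalog_flat
PROPERTIES
(
    "type" = "hive",
    "hive.metastore.type" = "dlf",
    -- Read only the top-level directory of each table or partition
    "enable_recursive_listing" = "false"
);
```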

Example

CREATE EXTERNAL CATALOG hive_catalog
PROPERTIES
(
    "type" = "hive",
    "hive.metastore.type" = "dlf",
    "dlf.catalog.id" = "sr_dlf"
);

For more information, see Hive catalog.
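After the catalog is created, Hive tables are addressed with three-part names (catalog.database.table). A usage sketch in which hive_db and hive_table are placeholder names for objects in your DLF 1.0 catalog:

```sql
-- Query a Hive table through the external catalog
SELECT * FROM hive_catalog.hive_db.hive_table LIMIT 10;
```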

Create an Iceberg catalog

Use the following syntax to create an Iceberg catalog that points to DLF 1.0 as the metastore:

Syntax

CREATE EXTERNAL CATALOG <catalog_name>
[COMMENT <comment>]
PROPERTIES
(
    "type" = "iceberg",
    MetastoreParams
)

Parameters

  • catalog_name (required): Name of the Iceberg catalog. The name must start with a letter and can contain only letters (a–z, A–Z), digits (0–9), and underscores (_). Maximum length: 64 characters. The name is case-sensitive.

  • comment (optional): Description of the Iceberg catalog.

  • type (required): Type of the data source. Set the value to iceberg.

MetastoreParams specifies how StarRocks accesses Iceberg metadata:

  • iceberg.catalog.type (required): Type of the Iceberg catalog. Set the value to dlf.

  • dlf.catalog.id (optional): ID of an existing data catalog in DLF 1.0. If this parameter is not specified, the default DLF catalog is used.

Example

CREATE EXTERNAL CATALOG iceberg_catalog_hms
PROPERTIES
(
    "type" = "iceberg",
    "iceberg.catalog.type" = "dlf",
    "dlf.catalog.id" = "sr_dlf"
);

For more information, see Iceberg catalog.
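If you plan to run several queries against the catalog, you can make it the session default so that two-part names (database.table) resolve against it:

```sql
-- Switch the session default from default_catalog to the Iceberg catalog
SET CATALOG iceberg_catalog_hms;
SHOW DATABASES;
```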

Create a Paimon catalog

Use the following syntax to create a Paimon catalog that points to DLF 1.0 as the metastore:

Syntax

CREATE EXTERNAL CATALOG <catalog_name>
[COMMENT <comment>]
PROPERTIES
(
    "type" = "paimon",
    CatalogParams,
    StorageCredentialParams
);

Parameters

  • catalog_name (required): Name of the Paimon catalog. The name must start with a letter and can contain only letters (a–z, A–Z), digits (0–9), and underscores (_). Maximum length: 64 characters.

  • comment (optional): Description of the Paimon catalog.

  • type (required): Type of the data source. Set the value to paimon.

CatalogParams specifies how StarRocks accesses Paimon metadata:

  • paimon.catalog.type (required): Type of the catalog. Set the value to dlf.

  • paimon.catalog.warehouse (required): Storage path of the warehouse where Paimon data is stored. HDFS, OSS, and OSS-HDFS are supported. For OSS or OSS-HDFS, use the format oss://<yourBucketName>/<yourPath>.

  • dlf.catalog.id (optional): ID of an existing data catalog in DLF 1.0. If this parameter is not specified, the default DLF catalog is used.

StorageCredentialParams specifies how StarRocks accesses file storage:

  • If you use HDFS, no additional configuration is required.

  • If you use OSS or OSS-HDFS, add the following parameter:

    Important

    After setting aliyun.oss.endpoint, go to the Parameter Configuration page in the EMR Serverless StarRocks console and update fs.oss.endpoint in both core-site.xml and jindosdk.cfg to match this value.

    • aliyun.oss.endpoint: The endpoint of your OSS or OSS-HDFS storage. For OSS, find the endpoint in the Port section on the Overview page of your bucket, or see OSS regions and endpoints. Example: oss-cn-hangzhou.aliyuncs.com. For OSS-HDFS, find the endpoint under OSS-HDFS in the Port section. Example: cn-hangzhou.oss-dls.aliyuncs.com.

    "aliyun.oss.endpoint" = "<YourAliyunOSSEndpoint>"

Example

CREATE EXTERNAL CATALOG paimon_catalog
PROPERTIES
(
    "type" = "paimon",
    "paimon.catalog.type" = "dlf",
    "paimon.catalog.warehouse" = "oss://<yourBucketName>/<yourPath>",
    "dlf.catalog.id" = "paimon_dlf_test",
    "aliyun.oss.endpoint" = "<YourAliyunOSSEndpoint>"
);

For more information, see Paimon catalog.
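If a catalog is no longer needed, it can be unregistered from StarRocks. Dropping an external catalog removes only the mapping; it does not delete any metadata in DLF or data in OSS:

```sql
DROP CATALOG paimon_catalog;
```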

What's next