E-MapReduce: Use DLF for unified metadata storage

Last Updated: Mar 26, 2026

The default built-in MySQL metastore in E-MapReduce (EMR) is tied to the cluster lifecycle: deleting a cluster also deletes its metadata, and the metastore cannot be shared across clusters. Data Lake Formation (DLF) is a fully managed Alibaba Cloud service for centralized metadata management, user permission management, data ingestion, and data exploration. If you switch to DLF, metadata is stored centrally, so multiple clusters share the same metadata and permissions, and the metadata remains available after individual clusters are deleted. For more information about DLF, see Overview.

DLF can be used as the Hive metastore only on EMR V3.33.0 or later, or EMR V4.5.0 or later.
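
The version rule above can be sketched as a small check. This is only an illustration under stated assumptions: the function names are hypothetical, the version string is assumed to contain a `major.minor.patch` number (for example `V3.33.0` or `EMR-4.5.0`), and the rule is read as "at least 3.33.0 on the V3 line, at least 4.5.0 on the V4 line, and any newer major line counts as later than V4.5.0".

```python
import re

# Minimum EMR versions that can use DLF as the Hive metastore, keyed by major line.
# Assumption: V3.x needs at least 3.33.0 and V4.x needs at least 4.5.0.
MIN_DLF_VERSIONS = {3: (3, 33, 0), 4: (4, 5, 0)}


def parse_emr_version(version: str) -> tuple:
    """Extract (major, minor, patch) from strings such as 'V3.33.0' or 'EMR-4.5.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized EMR version: {version!r}")
    return tuple(int(part) for part in match.groups())


def supports_dlf_metastore(version: str) -> bool:
    """Return True if the version meets the minimum for its major line."""
    parsed = parse_emr_version(version)
    major = parsed[0]
    if major < 3:
        return False
    if major > 4:
        # Assumption: major lines after V4 are treated as "later than V4.5.0".
        return True
    return parsed >= MIN_DLF_VERSIONS[major]
```

For example, `supports_dlf_metastore("V3.32.1")` is false under this reading, while `supports_dlf_metastore("EMR-4.5.0")` is true.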

Prerequisites

Before you begin, make sure that you have:

Compatibility

DLF is compatible with the following compute engines in EMR:

| Engine | Supported versions |
| --- | --- |
| Hive | 2.x, 3.x |
| Presto | |
| Spark SQL | |

Change the metastore type

  1. Go to the Hive service page.

    1. Log on to the EMR console.

    2. In the top navigation bar, select the region where your cluster resides and select a resource group.

    3. On the EMR on ECS page, find your cluster and click Services in the Actions column.

    4. On the Services tab, find the Hive service and click Configure.

  2. On the Configure tab, enter hive.imetastoreclient.factory.class in the search box and click the Search icon. Set the parameter value based on your target metastore type:

    | Metastore type | Parameter value |
    | --- | --- |
    | Built-in MySQL, unified metadatabase, or ApsaraDB RDS for MySQL | org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClientFactory |
    | DLF unified metadata storage | com.aliyun.datalake.metastore.hive2.DlfMetaStoreClientFactory |

  3. Save the configuration.

    1. In the lower-left corner of the Configure tab, click Save.

    2. In the Save dialog box, set the Execution Reason parameter and click Save.

  4. Restart the Hive service.

    1. In the upper-right corner of the Hive service page, choose More > Restart.

    2. In the Restart HIVE Services dialog box, set the Execution Reason parameter and click OK.

    3. In the Confirm dialog box, click OK.

    To track progress, click Operation History in the upper-right corner.
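
As a rough illustration of the setting changed in step 2, the sketch below maps each metastore type to its parameter value and renders the result as a Hadoop-style `<property>` element. The parameter name and class names come from the table in step 2. The dictionary keys (`"local"`, `"dlf"`) and the function name are hypothetical, and it is an assumption that the console stores the value in this form. The console procedure above remains the way to apply the change.

```python
import xml.etree.ElementTree as ET

PARAM_NAME = "hive.imetastoreclient.factory.class"

# Values copied from the table in step 2; the keys are hypothetical short labels.
FACTORY_CLASSES = {
    # Built-in MySQL, unified metadatabase, or ApsaraDB RDS for MySQL
    "local": "org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClientFactory",
    # DLF unified metadata storage
    "dlf": "com.aliyun.datalake.metastore.hive2.DlfMetaStoreClientFactory",
}


def factory_property_xml(metastore_type: str) -> str:
    """Render the parameter as a <property> element with <name> and <value> children."""
    value = FACTORY_CLASSES[metastore_type]  # raises KeyError for unknown labels
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = PARAM_NAME
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


print(factory_property_xml("dlf"))
```

Keeping both values in one mapping makes it easy to switch back: passing `"local"` yields the element for the built-in MySQL, unified metadatabase, or ApsaraDB RDS for MySQL metastore types.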