E-MapReduce: EMR Metadata Migration Notice

Last Updated: Mar 26, 2026

E-MapReduce (EMR) supports migrating Hive metadata from the legacy storage types (Built-in MySQL or Unified Metabases) to Data Lake Formation (DLF). Alibaba Cloud EMR launched DLF Unified Metadata in 2020 as a new storage type that provides a better unified metadata service. This document covers when to migrate, what DLF provides, and how the four-phase migration process works.

When to migrate

Migrate to DLF if any of the following applies to your cluster:

  • Your cluster uses Built-in MySQL. The local MySQL database runs in standalone mode, which cannot guarantee high availability and leaves the metadata service prone to interruptions.

  • Your cluster uses Unified Metabases. This storage type is being gradually discontinued. Clusters must switch to DLF Unified Metadata, available in the new EMR console.

  • Your cluster uses ApsaraDB RDS. Migration is optional but provides better storage performance and scalability.
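
If you manage several clusters, the criteria above can be encoded as a small lookup to triage them. The following is a minimal sketch, not part of any EMR API: the storage-type labels, the `Urgency` levels, and the `migration_advice` function are hypothetical names chosen for illustration, and treating Built-in MySQL as "recommended" rather than "required" is an interpretation of the list above.

```python
from enum import Enum


class Urgency(Enum):
    REQUIRED = "required"
    RECOMMENDED = "recommended"
    OPTIONAL = "optional"
    NONE = "none"


# Hypothetical labels for the metadata storage types named in this document;
# they are not identifiers returned by any EMR API.
_RULES = {
    "built-in mysql": (
        Urgency.RECOMMENDED,
        "standalone MySQL cannot guarantee high availability",
    ),
    "unified metabases": (
        Urgency.REQUIRED,
        "this storage type is being gradually discontinued",
    ),
    "apsaradb rds": (
        Urgency.OPTIONAL,
        "DLF provides better storage performance and scalability",
    ),
    "dlf unified metadata": (Urgency.NONE, "already on DLF"),
}


def migration_advice(storage_type: str) -> tuple[Urgency, str]:
    """Return (urgency, reason) for a cluster's metadata storage type."""
    key = storage_type.strip().lower()
    if key not in _RULES:
        raise ValueError(f"unknown metadata storage type: {storage_type!r}")
    return _RULES[key]


urgency, reason = migration_advice("Unified Metabases")
print(urgency.value, "-", reason)
```

An unknown storage type raises `ValueError` instead of defaulting silently, so a mistyped inventory entry is not mistaken for a cluster that needs no action.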

Why DLF

DLF is a fully managed, highly available, and high-performance metadata service. It is compatible with multiple Hive metastore versions and integrates with open-source compute engines in EMR. Capabilities include:

  • Data profiling, data exploration, and data lake management

  • Data permission management

  • Integration with MaxCompute, Databricks DataInsight (DDI), and Hologres

For more information, see DLF overview.

Migration process

The Alibaba Cloud EMR and DLF teams support the entire migration. The following table describes each phase, the steps involved, and the estimated duration.

Important

During migration (Phase 2), all cluster tasks must be suspended. Plan for approximately 30 minutes of task downtime.

| Phase | Steps | Participant | Estimated duration |
| --- | --- | --- | --- |
| 1. Preparations | 1. Search for DingTalk group 33719678 and join the EMR metadata migration group. Engineers will survey your cluster configuration and resource usage to confirm migration feasibility and schedule. | EMR team + you | 2 hours |
| 2. Migration | 1. Suspend running tasks and stop the metadata service. 2. Back up existing metadata. 3. Migrate metadata to DLF using the metadata migration feature, and check whether the migration is performed as expected. 4. Set the Type parameter to DLF Unified Metadata. 5. Resume suspended tasks. | EMR team + you | 30 minutes |
| 3. Check | Observe task execution for at least one week. If tasks run as expected, the migration is complete. If issues occur, determine whether to fix them online or initiate a rollback (see Phase 4). | EMR team + you | 1 week |
| 4. Rollback (optional) | 1. Suspend running tasks. 2. Compare metadata between DLF and the Hive metastore; write incremental data back to the Hive metastore. 3. Set the Type parameter to Unified Metabases. 4. Start the Hive metastore. 5. Resume suspended tasks and verify results. | EMR team + you | 30 minutes |
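
Both the post-migration check in Phase 2 and the metadata comparison in Phase 4 come down to diffing two metadata snapshots. The sketch below illustrates that idea only; it is not the tooling the EMR team uses. It assumes each snapshot has already been exported as a dictionary keyed by `(database, table)` with a comparable fingerprint (here, a tuple of column definitions) as the value. `MetadataDiff`, `diff_metadata`, and the sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataDiff:
    missing_in_target: list = field(default_factory=list)  # in source only
    extra_in_target: list = field(default_factory=list)    # in target only
    changed: list = field(default_factory=list)            # in both, but different

    @property
    def in_sync(self) -> bool:
        return not (self.missing_in_target or self.extra_in_target or self.changed)


def diff_metadata(source: dict, target: dict) -> MetadataDiff:
    """Compare two snapshots keyed by (database, table)."""
    diff = MetadataDiff()
    # Walk the union of keys in sorted order so the report is deterministic.
    for key in sorted(source.keys() | target.keys()):
        if key not in target:
            diff.missing_in_target.append(key)
        elif key not in source:
            diff.extra_in_target.append(key)
        elif source[key] != target[key]:
            diff.changed.append(key)
    return diff


# Hypothetical snapshots: the Hive metastore before migration, DLF after it.
hive = {
    ("sales", "orders"): ("id:bigint", "amount:double"),
    ("sales", "users"): ("id:bigint",),
}
dlf = {
    ("sales", "orders"): ("id:bigint", "amount:double"),
}

result = diff_metadata(hive, dlf)
print(result.missing_in_target)  # [('sales', 'users')]
print(result.in_sync)            # False
```

For a rollback, pass the DLF snapshot as `source` and the Hive metastore snapshot as `target`. The `missing_in_target` and `changed` entries are then the incremental metadata that would need to be written back.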

Get support

To start the migration, join the DingTalk group by searching for group number 33719678. Engineers will reach out to plan the migration with you.