OpenLake: January 2026

Last Updated: Jan 28, 2026

Product

Feature

Description

Related Documentation

DLF

Commercialization (billing enabled + SLA)

DLF began commercial operations in late December 2025. After commercialization, billing is enabled and a Service-Level Agreement (SLA) is provided.

DLF commercialization announcement

DLF

Public preview instructions (free preview/how to enable)

Instructions on how to participate in the free public preview of DLF and how to enable the service.

DLF public preview instructions

DLF

DLF 3.0: Omni Catalog

DLF 3.0 features Omni Catalog for multi-engine access and unified metadata management.

Alibaba Cloud DLF 3.0: An intelligent, omni-modal data lakehouse management platform for the AI era

DLF

DLF 3.0: Omni-modal data lakehouse management (structured/unstructured)

Extends unified management from structured data to unstructured data such as text, images, audio, and video.

Alibaba Cloud DLF 3.0: An intelligent, omni-modal data lakehouse management platform for the AI era

DLF

DLF 3.0: Intelligent storage and performance optimization (for AI workloads)

Features data organization, storage, and performance optimization capabilities for AI workloads.

Alibaba Cloud DLF 3.0: An intelligent, omni-modal data lakehouse management platform for the AI era

Realtime Compute for Apache Flink

Access DLF using Flink SQL (Paimon REST)

Connect Flink SQL to a DLF Catalog using a Paimon REST Catalog.

Access DLF using Flink SQL

Realtime Compute for Apache Flink

Access DLF using Flink DataStream (Paimon REST)

Access a DLF Catalog from Flink DataStream jobs using a Paimon REST Catalog.

Access DLF using Flink DataStream

Realtime Compute for Apache Flink

Access DLF using Flink SQL (Iceberg REST)

Connect Flink SQL to a DLF Catalog for Iceberg tables using an Iceberg REST Catalog.

Access DLF using Flink SQL

EMR Serverless Spark

Use a DLF Catalog in EMR Serverless Spark

Configure and use a DLF Catalog as the metadata catalog in EMR Serverless Spark.

EMR Serverless Spark

Use a DLF Catalog in OpenSearch

Use a DLF Catalog in OpenSearch for Data Lake Formation, synchronization, and indexing.

DataWorks

OpenData: Unified management of objects such as metadata, instances, and members

Use OpenData as a unified entry point to manage objects such as metadata, instances, and members in a workspace.

DataWorks

OpenData: Feature overview

Outlines the capabilities and usage of OpenData.

Manage open data

DataWorks

OpenData: Table schema and object field descriptions

Provides descriptions of OpenData table schemas and fields to facilitate integration and custom development.

Manage open data

DataWorks

OpenLake quick start (DLF-based)

A quick start guide for the OpenLake solution, which features unified metadata in DLF and multi-engine integration.

DataWorks

EMR Serverless Spark environment preparation: Select the DLF metadata service

When you create or configure a Serverless Spark environment in DataWorks, you can select DLF as the metadata service (DLF Catalog).

Prepare an environment

DataWorks

Offline sync task: Field vectorization (embedding)

Vectorize fields in the synchronization pipeline to generate vector fields for downstream vector retrieval or knowledge bases.

Vectorization

DataWorks

Node: PAI Flow node (scheduling/orchestration)

Orchestrate and schedule PAI Flow workflows in DataWorks.

PAI Flow node

DataWorks

Data source: Milvus (vector database read/write/sync)

DataWorks supports Milvus as a data source for reading, writing, and synchronizing vector data.

Milvus

EMR Serverless Spark

Data catalog: Add HMS, DLF-Legacy, and DLF 3.0 Catalogs at the same time

The platform-level data catalog is enhanced to support adding multiple types of catalogs at the same time. This facilitates unified metadata and cross-system access.

Version of 2025-11-12

EMR Serverless Spark

Lake table read/write: DLF table read/write optimization

The engine layer is optimized for reading from and writing to DLF tables to improve the lake table access experience.

Version of 2025-11-12

EMR Serverless Spark

Storage access: Passwordless access to pvfs

Supports passwordless access to pvfs, which facilitates integrated access on the lake storage side.

Version of 2025-11-12

EMR Serverless Spark

Lake format: Added/Enhanced support for the Lance file format

Adds support for the Lance file format for scenarios involving AI and vector data.

Version of 2025-11-12

EMR Serverless Spark

Paimon: Optimization and lineage enhancement

Optimizes Paimon and enhances its lineage capabilities. You can view lineage and other information in DataWorks.

Version of 2025-11-12

EMR Serverless Spark

Kyuubi Gateway: Associate authorization tokens with a DLF 2.5 Catalog

The gateway layer supports associating authorization tokens with a DLF 2.5 Catalog for unified user authentication.

Version of 2025-09-17

EMR Serverless Spark

DLF 2.5: Full support for PaimonCatalog and IcebergCatalog

DLF 2.5 Catalog is compatible with PaimonCatalog and IcebergCatalog.

Version of 2025-09-17

EMR Serverless Spark

DLF Lance tables: New support

Adds support for DLF Lance tables in DLF 2.5 Catalog scenarios.

Version of 2025-09-17

EMR Serverless Spark

Workspace: Support for adding multiple DLF Catalogs for federated queries

You can add multiple DLF (formerly DLF 2.5) Catalogs at the workspace level to run federated queries.

Version of 2025-07-31

EMR Serverless Spark

Livy Gateway: Read from DLF Catalog by default

Livy Gateway reads metadata from a DLF Catalog by default. This allows jobs submitted through Livy to directly access lake tables.

Version of 2025-07-31

EMR Serverless Spark

Manage data catalogs (HMS / DLF 1.0 / DLF 2.5)

Manage data catalogs in the console. Entry points are provided for operations such as adding, viewing, and deleting catalogs.

Manage data catalogs

EMR Serverless StarRocks

V1.19: Associate with the DLF data lake service when creating an instance

You can associate an instance with the DLF data lake service when you create the instance. This enables metadata and permission linkage in the data lakehouse.

Console release notes

Realtime Compute for Apache Flink

Manage Paimon Catalogs (can connect to DLF)

Create, view, and delete Paimon Catalogs. You can directly access Paimon tables in DLF.

Manage Paimon Catalogs

Hologres

Access Paimon Catalogs based on DLF 2.0

Access and manage Paimon Catalogs (such as External Database) through the DLF REST metastore.

Access Paimon Catalogs based on DLF

Hologres

DLF_FDW: Read from and write to OSS (data lake acceleration)

Read from and write to an OSS data lake through a dlf_fdw foreign table.

Accelerate access to an OSS data lake based on DLF

Hologres

Serverless Lakehouse: Paimon-based solution

Instructions on how to build a Serverless Lakehouse solution using Hologres and Paimon.

Hologres Serverless data lake solution based on Paimon

OpenSearch

Vector Search Edition: Data Lake Formation (DLF)

Guidelines on how to build vector indexes and perform retrieval from data lake tables such as DLF and Paimon tables.

Synchronize vector data from OpenLake-DLF to Alibaba Cloud OpenSearch

OpenSearch

Retrieval Engine Edition: Data Lake Formation (DLF)

Guidelines on how to build indexes and perform retrieval from data lake tables such as DLF and Paimon tables.

Data Lake Formation (DLF)

DataWorks

DataStudio (new version): Serverless StarRocks node

Provides a Serverless StarRocks node (including an SQL node) in DataWorks to develop and schedule StarRocks jobs and SQL.

Serverless StarRocks

EMR Serverless StarRocks

Associate with the DLF data lake service when creating an instance

You can associate an instance with the DLF data lake service when you create the instance. This enables metadata and permission linkage in the data lakehouse.

Console release notes

EMR Serverless StarRocks

StarRocks Manager supports associating users with RAM Roles

StarRocks users can be associated with RAM Roles to adapt to DLF's RAM Role-based access control and data lakehouse permission linkage.

Console release notes

EMR Serverless StarRocks

Kernel 3.3.13-1.2.0 (2025-10-29): DLF Iceberg Catalog support

New in Lakehouse: Supports DLF Iceberg Catalog, which allows StarRocks to directly connect to the Iceberg metadata catalog managed by DLF.

Kernel release notes

EMR Serverless StarRocks

Kernel 3.3.13-1.2.0 (2025-10-29): Paimon Native Writer

New in Lakehouse: Supports Paimon Native Writer for enhanced write performance in data lakehouse processing. This can be used with DLF and Paimon catalogs.

Kernel release notes

EMR Serverless StarRocks

Kernel 3.3.20-1.3.0 (2025-12-10): Paimon DV V2 / Native Reader

New in Lakehouse: Supports Paimon Deletion Vector V2. Supports Native Reader for reading from and writing to Paimon Format Tables to enhance lake table read/write capabilities.

Kernel release notes

MaxCompute

DLF + OSS external schema: Host foreign table metadata in DLF

Manage metadata such as schemas and tables on OSS through DLF. MaxCompute queries OSS foreign tables through an External Schema. This feature requires DLF 2.6 or later.

DLF+OSS external schema