Product | Feature | Description | Related Documentation |
--- | --- | --- | --- |
DLF | Commercialization (billing enabled + SLA) | DLF began commercial operations in late December 2025. After commercialization, billing is enabled and a Service-Level Agreement (SLA) is provided. | |
DLF | Public preview instructions (free preview/how to enable) | Instructions on how to participate in the free public preview of DLF and how to enable the service. | |
DLF | DLF 3.0: Omni Catalog | DLF 3.0 features Omni Catalog for multi-engine access and unified metadata management. | Alibaba Cloud DLF 3.0: An intelligent, omni-modal data lakehouse management platform for the AI era |
DLF | DLF 3.0: Omni-modal data lakehouse management (structured/unstructured) | Expands from structured data to unified management and administration of unstructured data such as text, images, audio, and video. | Alibaba Cloud DLF 3.0: An intelligent, omni-modal data lakehouse management platform for the AI era |
DLF | DLF 3.0: Intelligent storage and performance optimization (for AI workloads) | Features data organization, storage, and performance optimization capabilities for AI workloads. | Alibaba Cloud DLF 3.0: An intelligent, omni-modal data lakehouse management platform for the AI era |
Realtime Compute for Apache Flink | Access DLF using Flink SQL (Paimon REST) | Connect Flink SQL to a DLF Catalog using a Paimon REST Catalog. | |
Realtime Compute for Apache Flink | Access DLF using Flink DataStream (Paimon REST) | Access a DLF Catalog from Flink DataStream jobs using a Paimon REST Catalog. | |
Realtime Compute for Apache Flink | Access DLF using Flink SQL (Iceberg REST) | Connect Flink SQL to a DLF Catalog for Iceberg tables using an Iceberg REST Catalog. | |
EMR Serverless Spark | Use a DLF Catalog in EMR Serverless Spark | Configure and use a DLF Catalog as the metadata catalog in EMR Serverless Spark. | |
OpenSearch | Use a DLF Catalog in OpenSearch | Use a DLF Catalog in OpenSearch for data lake metadata access, synchronization, and indexing. | |
DataWorks | OpenData: Unified management of objects such as metadata, instances, and members | Use OpenData as a unified entry point to manage objects such as metadata, instances, and members in a workspace. | |
DataWorks | OpenData: Feature overview | Outlines the capabilities and usage of OpenData. | |
DataWorks | OpenData: Table schema and object field descriptions | Provides descriptions of OpenData table schemas and fields to facilitate integration and custom development. | |
DataWorks | OpenLake quick start (DLF-based) | A quick start guide for the OpenLake solution, which features unified metadata in DLF and multi-engine integration. | |
DataWorks | EMR Serverless Spark environment preparation: Select the DLF metadata service | When you create or configure a Serverless Spark environment in DataWorks, you can select DLF as the metadata service (DLF Catalog). | |
DataWorks | Offline sync task: Field vectorization (embedding) | Vectorize fields in the synchronization pipeline to generate vector fields for downstream vector retrieval or knowledge bases. | |
DataWorks | Node: PAI Flow node (scheduling/orchestration) | Orchestrate and schedule PAI Flow workflows in DataWorks. | |
DataWorks | Data source: Milvus (vector database read/write/sync) | DataWorks supports Milvus as a data source for reading, writing, and synchronizing vector data. | |
EMR Serverless Spark | Data catalog: Add HMS, DLF-Legacy, and DLF 3.0 Catalogs at the same time | The platform-level data catalog is enhanced to support adding multiple types of catalogs at the same time. This facilitates unified metadata and cross-system access. | |
EMR Serverless Spark | Lake table read/write: DLF table read/write optimization | The engine layer is optimized for reading from and writing to DLF tables to improve the lake table access experience. | |
EMR Serverless Spark | Storage access: Passwordless access to pvfs | Supports passwordless access to pvfs, which facilitates integrated access on the lake storage side. | |
EMR Serverless Spark | Lake format: Added/Enhanced support for the Lance file format | Adds support for the Lance file format for scenarios involving AI and vector data. | |
EMR Serverless Spark | Paimon: Optimization and lineage enhancement | Optimizes Paimon and enhances its lineage capabilities. You can view Resilient Distributed Dataset (RDD) lineage and other information in DataWorks. | |
EMR Serverless Spark | Kyuubi Gateway: Associate authorization tokens with a DLF 3.0 Catalog | The gateway layer supports associating authorization tokens with a DLF 3.0 Catalog for unified user authentication. | |
EMR Serverless Spark | DLF 2.5: Full support for PaimonCatalog and IcebergCatalog | The DLF 2.5 Catalog fully supports PaimonCatalog and IcebergCatalog. | |
EMR Serverless Spark | DLF Lance tables: New support | Adds support for DLF Lance tables in DLF 2.5 Catalog scenarios. | |
EMR Serverless Spark | Workspace: Support for adding multiple DLF Catalogs for federated queries | The workspace level supports adding multiple DLF (formerly DLF 2.5) Catalogs to allow federated queries. | |
EMR Serverless Spark | Livy Gateway: Read from DLF Catalog by default | Livy Gateway reads metadata from a DLF Catalog by default. This allows jobs submitted through Livy to directly access lake tables. | |
EMR Serverless Spark | Manage data catalogs (HMS / DLF 1.0 / DLF 2.5) | Manage data catalogs in the console. Entry points are provided for operations such as adding, viewing, and deleting catalogs. | |
EMR Serverless StarRocks | V1.19: Associate with the DLF data lake service when creating an instance | You can associate an instance with the DLF data lake service when you create the instance. This enables metadata and permission linkage in the data lakehouse. | |
Realtime Compute for Apache Flink | Manage Paimon Catalogs (can connect to DLF) | Create, view, and delete Paimon Catalogs. You can directly access Paimon tables in DLF. | |
Hologres | Access Paimon Catalogs based on DLF 2.0 | Access and manage Paimon Catalogs (such as External Database) through the DLF REST metastore. | |
Hologres | DLF_FDW: Read from and write to OSS (data lake acceleration) | Read from and write to an OSS data lake through a dlf_fdw foreign table. | |
Hologres | Serverless Lakehouse: Paimon-based solution | Instructions on how to build a Serverless Lakehouse solution using Hologres and Paimon. | |
OpenSearch | Vector Search Edition: Data Lake Formation (DLF) | Guidelines on how to build vector indexes and perform retrieval from data lake tables such as DLF and Paimon tables. | Synchronize vector data from OpenLake-DLF to Alibaba Cloud OpenSearch |
OpenSearch | Retrieval Engine Edition: Data Lake Formation (DLF) | Guidelines on how to build indexes and perform retrieval from data lake tables such as DLF and Paimon tables. | |
DataWorks | DataStudio (new version): Serverless StarRocks node | Provides a Serverless StarRocks node (including an SQL node) in DataWorks to develop and schedule StarRocks jobs and SQL. | |
EMR Serverless StarRocks | StarRocks Manager supports associating users with RAM Roles | StarRocks users can be associated with RAM Roles to adapt to DLF's RAM Role-based access control and data lakehouse permission linkage. | |
EMR Serverless StarRocks | Kernel 3.3.13-1.2.0 (2025-10-29): DLF Iceberg Catalog support | New in Lakehouse: Supports DLF Iceberg Catalog, which allows StarRocks to directly connect to the Iceberg metadata catalog managed by DLF. | |
EMR Serverless StarRocks | Kernel 3.3.13-1.2.0 (2025-10-29): Paimon Native Writer | New in Lakehouse: Supports Paimon Native Writer for enhanced write performance in data lakehouse processing. This can be used with DLF and Paimon catalogs. | |
EMR Serverless StarRocks | Kernel 3.3.20-1.3.0 (2025-12-10): Paimon DV V2 / Native Reader | New in Lakehouse: Supports Paimon Deletion Vector V2. Supports Native Reader for reading from and writing to Paimon Format Tables to enhance lake table read/write capabilities. | |
MaxCompute | DLF + OSS external schema: Host foreign table metadata in DLF | Manage metadata such as schemas and tables on OSS through DLF. MaxCompute queries OSS foreign tables through an External Schema. This feature requires DLF 2.6 or later. | |
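Several of the engine-integration rows above (for example, the Flink SQL and Paimon REST Catalog entries) boil down to registering a DLF-hosted REST catalog in the engine. The following is a minimal sketch of that pattern in Flink SQL, using the Apache Paimon REST catalog option keys; the endpoint, catalog name, and credential-provider values are placeholders and assumptions, not verified settings, so check the DLF and Paimon documentation for the exact properties in your region.

```sql
-- Sketch: register a DLF-backed Paimon REST catalog in Flink SQL.
-- Property keys follow the Apache Paimon REST catalog convention;
-- the endpoint and warehouse values below are placeholders.
CREATE CATALOG dlf_paimon WITH (
  'type' = 'paimon',
  'metastore' = 'rest',
  'uri' = '<dlf-rest-endpoint>',     -- assumption: your region's DLF REST endpoint
  'warehouse' = '<catalog-name>',    -- assumption: the name of your DLF catalog
  'token.provider' = 'dlf'           -- assumption: DLF-based credential provider
);

USE CATALOG dlf_paimon;
-- Tables managed in the DLF catalog are then queryable directly, e.g.:
-- SELECT * FROM my_db.my_table LIMIT 10;
```

The same shape applies to the Iceberg REST entries: only the catalog `type` and the engine-specific option keys change, while the DLF REST endpoint and catalog name stay the same.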