Data Lake Formation: Overview

Last Updated: Dec 18, 2023

Data Lake Formation (DLF) is an Alibaba Cloud service that helps you build a cloud-based data lake within a few days. DLF lets you centrally manage the metadata and user permissions of the data lake, and it can extract metadata automatically.


DLF uses the pay-as-you-go billing method. You are charged for metadata objects and for the resources that are used for data import. Metadata extraction is free of charge. To view your usage, go to the DLF console and click Dosage information in the left-side navigation pane. For more information, see Billing.


Data Lake Formation Architecture

  • Metadata management: lets you view and analyze information about the databases and tables of a data lake in the DLF console. You can also create a metadatabase to manage metadata and integrate the metadata with third-party applications and services. The metadata discovery feature works together with data import tasks to generate metadata automatically.

  • Access control: grants and manages permissions on metadatabases and metadata tables.

  • Data import: uses data import tasks to store data from MySQL databases, PolarDB databases, and Kafka clusters in DLF. If no metadata is defined for an import, the data import task automatically generates the schema of the metadata table.


  • Data lake management

  • Offline computing of big data

  • Real-time computing of big data

  • Machine learning or deep learning

  • Management of the data in data lakes
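
The data import feature described above generates a table schema automatically when no metadata is defined. The following Python sketch illustrates the general idea of inferring a schema from sample records. It is a conceptual illustration only and does not show how DLF works internally or how to call it: the function names (infer_type, merge_types, infer_schema) and the type names (bigint, double, boolean, string) are assumptions made for this example.

```python
from typing import Any, Dict, List, Optional


def infer_type(value: Any) -> str:
    """Map a Python value to a hypothetical column type name."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    return "string"


def merge_types(current: str, observed: str) -> str:
    """Pick a type that can hold values of both input types."""
    if current == observed:
        return current
    # Integers can be widened to floating point without changing type family.
    if {current, observed} == {"bigint", "double"}:
        return "double"
    # Any other conflict falls back to the most general type.
    return "string"


def infer_schema(rows: List[Dict[str, Any]]) -> Dict[str, str]:
    """Infer a column-name-to-type mapping from a list of records."""
    schema: Dict[str, Optional[str]] = {}
    for row in rows:
        for column, value in row.items():
            if value is None:
                # Remember the column, but let non-null values decide its type.
                schema.setdefault(column, None)
                continue
            observed = infer_type(value)
            current = schema.get(column)
            schema[column] = observed if current is None else merge_types(current, observed)
    # Columns that only ever contained nulls default to string.
    return {column: (col_type or "string") for column, col_type in schema.items()}


sample_rows = [
    {"id": 1, "price": 10, "name": "a", "note": None},
    {"id": 2, "price": 9.5, "name": "b", "note": None},
]
print(infer_schema(sample_rows))
```

In this sketch, the price column starts as bigint and is widened to double once a fractional value appears, and a column that holds only null values falls back to string. A real import task would also need to handle nested fields, timestamps, and source-specific types.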