Alibaba Cloud Data Lake Formation (DLF) is a fully managed service for building cloud-based data lakes and lakehouses. It provides unified metadata management, unified permission and security management, and one-click data exploration. DLF integrates with a range of compute engines, which helps you break down data silos and gain business insights.
Pricing
The data exploration, permission management, and lake management features of DLF are in public preview and are free of charge.
The metadata management feature is billed on a pay-as-you-go basis. The first 1 million stored metadata objects per month are free of charge. Any additional metadata objects beyond this limit will be charged. For more information, see Billing.
The first 1 million API requests per month are free of charge. Any additional requests beyond this limit will be charged. For more information, see Billing.
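The free-quota rule above can be sketched as a small calculation. This is only an illustration: the 1 million monthly quota comes from the text, while the function names and the `unit_price` parameter are hypothetical, and the sketch assumes a flat per-unit price. The actual rates and billing granularity are described in Billing.

```python
FREE_QUOTA_PER_MONTH = 1_000_000  # free metadata objects or API requests per month, per the text


def billable_units(used: int, free_quota: int = FREE_QUOTA_PER_MONTH) -> int:
    """Return how many units exceed the monthly free quota (never negative)."""
    if used < 0:
        raise ValueError("used must be non-negative")
    return max(0, used - free_quota)


def estimate_charge(used: int, unit_price: float,
                    free_quota: int = FREE_QUOTA_PER_MONTH) -> float:
    """Estimate a monthly charge, assuming a flat (hypothetical) per-unit price."""
    return billable_units(used, free_quota) * unit_price
```

For example, `billable_units(1_500_000)` yields 500,000 chargeable units, while any usage at or below 1 million yields 0.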
Architecture
Data Catalog: View and manage the data catalog in the data lake through the console.
Databases, tables, and functions: View and manage databases, tables, and functions in the data lake through the console. You can also operate on metadata by calling operations such as CreateDatabase and CreateTable, which lets you integrate DLF into third-party applications. Multi-version management is supported, and metadata can be generated automatically through metadata extraction.
Data permission management: Enhances data permission control on the lake to ensure data security. It supports permissions at five levels of granularity: data catalog, database, data table, data column, and function.
Data lake management: Provides analysis and optimization suggestions for data storage in the lake, strengthens data lifecycle management, optimizes usage costs, and facilitates data O&M.
Data exploration: Provides one-click data exploration with support for Spark 3.0 SQL syntax. You can save historical queries, preview data, export results, and generate TPC-DS test datasets with one click.
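The five permission granularities listed under Data permission management can be pictured with a minimal data model. This is a sketch for illustration only, not the DLF API: the `ResourceLevel` and `Resource` names are hypothetical, and the nesting (columns inside tables, tables and functions inside databases) is an assumption based on common metastore layouts.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ResourceLevel(Enum):
    """The five granularities named in the text."""
    CATALOG = "data catalog"
    DATABASE = "database"
    TABLE = "data table"
    COLUMN = "data column"
    FUNCTION = "function"


@dataclass(frozen=True)
class Resource:
    """Hypothetical identifier for the object a permission is granted on."""
    catalog: str
    database: Optional[str] = None
    table: Optional[str] = None
    column: Optional[str] = None
    function: Optional[str] = None

    def level(self) -> ResourceLevel:
        """Return the most specific level this resource identifies."""
        if self.function is not None:
            return ResourceLevel.FUNCTION
        if self.column is not None:
            return ResourceLevel.COLUMN
        if self.table is not None:
            return ResourceLevel.TABLE
        if self.database is not None:
            return ResourceLevel.DATABASE
        return ResourceLevel.CATALOG
```

A grant on `Resource("main", "sales", "orders", "amount")` would apply at the column level, whereas `Resource("main")` refers to the whole catalog.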
Scenarios
Scenario 1: Building a cloud-based data lake
With DLF integrated with E-MapReduce and OSS, you can quickly build a cloud-based data lake.
Scenario 2: Building a data lakehouse architecture
With DLF integrated with MaxCompute, DataWorks, and E-MapReduce, you can quickly build a data lakehouse architecture.
Scenario 3: Building a fully managed lakehouse data architecture
With DLF integrated with Databricks and OSS, you can build a fully managed lakehouse data architecture on the cloud.
Scenario 4: Data analysis
You can use metadata extraction and data exploration to quickly analyze and explore structured and semi-structured data stored in OSS.