Alibaba Cloud Data Lake Formation (DLF) is a fully managed service that helps users quickly build cloud-based data lakes and lakehouses. It provides unified metadata management, unified permission and security management, and one-click data exploration. With DLF, you can quickly build and manage cloud-native data lakes and lakehouses, integrate seamlessly with various compute engines, break down data silos, and gain business insights.
Pricing
The data exploration, permission management, and lake management features of DLF are in public preview and are free of charge.
The metadata management feature is billed on a pay-as-you-go basis. Storage of up to 1 million metadata objects per month is free. You are charged for the quantity that exceeds this limit. For more information, see Billing.
Up to 1 million API requests per month are free. You are charged for the quantity that exceeds this limit. For more information, see Billing.
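As a rough illustration of how the free quota works, the following Python sketch computes the billable quantity for one month. The function name billable_quantity and the sample usage figures are hypothetical. Unit prices are not stated here, so the sketch stops at the quantity that exceeds the quota. See Billing for the actual rates and metering rules.

```python
# Free monthly quota stated above; it applies separately to
# metadata objects and to API requests.
FREE_QUOTA_PER_MONTH = 1_000_000


def billable_quantity(used: int, free_quota: int = FREE_QUOTA_PER_MONTH) -> int:
    """Return the quantity above the free quota (0 if usage is within it)."""
    if used < 0:
        raise ValueError("usage cannot be negative")
    return max(0, used - free_quota)


# Hypothetical monthly usage figures, for illustration only.
metadata_objects = 1_250_000
api_requests = 800_000

print(billable_quantity(metadata_objects))  # 250000
print(billable_quantity(api_requests))      # 0
```

Only the excess is charged: usage at or below 1 million in a month yields a billable quantity of zero.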
Architecture
Data Catalog: View and manage the data catalog in the data lake through the console.
Database tables and functions: View and manage information about database tables and functions in the data lake through the console. You can also operate on metadata by calling the CreateDatabase and CreateTable API operations, which lets you integrate DLF into third-party application services. Multi-version management is supported, and metadata can be generated automatically through metadata extraction.
Data permission management: Strengthens permission control over data in the lake to ensure data security. Permissions can be granted at five levels of granularity: data catalog, database, data table, data column, and function.
Data lake management: Provides analysis and optimization suggestions for data storage in the lake, strengthens data lifecycle management, optimizes usage costs, and facilitates data O&M.
Data exploration: Provides one-click data exploration with support for Spark 3.0 SQL syntax. You can save historical queries, preview data, export results, and generate TPC-DS test datasets with one click.
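The five permission granularities listed under Data permission management can be pictured with a small Python model. This is only a sketch under stated assumptions, not the DLF permission API: the Grant class, the is_allowed function, the dotted resource paths, and the action names are all hypothetical. The sketch checks exact matches only and does not model whether a grant at a coarser level also covers finer-grained resources.

```python
from dataclasses import dataclass
from enum import Enum


class Granularity(Enum):
    """The five granularity levels named in the feature list."""
    CATALOG = "data catalog"
    DATABASE = "database"
    TABLE = "data table"
    COLUMN = "data column"
    FUNCTION = "function"


@dataclass(frozen=True)
class Grant:
    principal: str            # hypothetical user or role name
    granularity: Granularity
    resource: str             # hypothetical dotted path to the resource
    action: str               # hypothetical action name


def is_allowed(grants, principal, granularity, resource, action) -> bool:
    """Exact-match check: True only if an identical grant exists."""
    return Grant(principal, granularity, resource, action) in grants


# Hypothetical grants, for illustration only.
grants = {
    Grant("analyst", Granularity.COLUMN,
          "sales_catalog.orders_db.orders.amount", "SELECT"),
    Grant("etl", Granularity.DATABASE,
          "sales_catalog.orders_db", "CREATE_TABLE"),
}

print(is_allowed(grants, "analyst", Granularity.COLUMN,
                 "sales_catalog.orders_db.orders.amount", "SELECT"))  # True
print(is_allowed(grants, "analyst", Granularity.TABLE,
                 "sales_catalog.orders_db.orders", "SELECT"))         # False
```

Column-level grants make it possible to expose a single field, such as an order amount, without granting access to the whole table.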
Scenarios
Scenario 1: Building a cloud-based data lake
With DLF integrated with E-MapReduce and OSS, you can quickly build a cloud-based data lake.
Scenario 2: Building a data lakehouse architecture
With DLF integrated with MaxCompute, DataWorks, and E-MapReduce, you can quickly build a data lakehouse architecture.
Scenario 3: Building a fully managed lakehouse data architecture
With DLF integrated with Databricks and OSS, you can build a fully managed lakehouse data architecture on the cloud.
Scenario 4: Data analysis
You can quickly analyze and explore structured and semi-structured data stored in OSS by using the metadata extraction and data exploration features.