Data Lake Formation (DLF) 1.0 is a fully managed service for building cloud-based data lakes and data lakehouses. DLF provides unified metadata management, unified permission and security management, and one-click data exploration. It integrates with multiple compute engines to break down data silos.
Pricing
| Item | Billing model | Free tier |
|---|---|---|
| Metadata management | Pay-as-you-go | First 1 million stored metadata objects per month |
| API requests | Pay-as-you-go | First 1 million API requests per month |
| Data exploration | Free public preview | Not billed |
| Permission management | Free public preview | Not billed |
| Lake management | Free public preview | Not billed |
Metadata objects and API requests that exceed the free tier are billed. For more information, see Billing.
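The free-tier arithmetic can be sketched as follows. This is a minimal illustration, assuming the 1 million monthly allowance applies independently to stored metadata objects and to API requests, as the table suggests. The function name and the usage figures are hypothetical, and unit prices are not given in this section, so the sketch stops at the billable quantity.

```python
# Free monthly allowance per billed item, taken from the pricing table.
FREE_TIER = 1_000_000


def billable_units(monthly_usage: int, free_tier: int = FREE_TIER) -> int:
    """Return the quantity left to bill after subtracting the free tier."""
    if monthly_usage < 0:
        raise ValueError("monthly_usage must be non-negative")
    return max(0, monthly_usage - free_tier)


# Hypothetical usage figures, for illustration only.
usage = {"metadata_objects": 1_250_000, "api_requests": 800_000}
billable = {item: billable_units(count) for item, count in usage.items()}
print(billable)  # {'metadata_objects': 250000, 'api_requests': 0}
```

To estimate a cost, multiply each billable quantity by the unit price listed on the Billing page.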
Architecture
Data Catalog
View and manage the Data Catalog in the data lake through the console.
Databases, tables, and functions
View and manage databases, tables, and functions in the data lake through the console. The CreateDatabase and CreateTable API operations support metadata operations and integration with third-party application services. DLF also supports multi-version metadata management and can generate metadata automatically through metadata extraction.
Data permission management
Data permission management controls access to data in the lake to protect data security. Permissions can be granted at five levels of granularity: data catalog, database, data table, data column, and function.
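One way to picture the five levels is as a small data model. The sketch below is not the DLF API: the `Level`, `Grant`, and `is_allowed` names, the resource paths, and the `SELECT` action are all hypothetical. It checks exact matches only and makes no assumption about whether a grant at a coarser level covers finer-grained resources.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """The five permission granularities named in the text."""
    CATALOG = "data catalog"
    DATABASE = "database"
    TABLE = "data table"
    COLUMN = "data column"
    FUNCTION = "function"


@dataclass(frozen=True)
class Grant:
    """A hypothetical grant record: who may do what on which resource."""
    principal: str
    level: Level
    resource: str
    action: str


def is_allowed(grants, principal, level, resource, action):
    """Return True only if an identical grant exists (exact match, no inheritance)."""
    return Grant(principal, level, resource, action) in grants


# Hypothetical grants, for illustration only.
grants = {
    Grant("analyst", Level.TABLE, "sales_db.orders", "SELECT"),
    Grant("analyst", Level.COLUMN, "sales_db.customers.region", "SELECT"),
}

print(is_allowed(grants, "analyst", Level.TABLE, "sales_db.orders", "SELECT"))
print(is_allowed(grants, "analyst", Level.COLUMN, "sales_db.customers.email", "SELECT"))
```

The second check fails because the grant covers only the `region` column, which shows how column-level granularity can expose part of a table without exposing all of it.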
Data lake management
Data lake management provides analysis and optimization suggestions for data storage in the lake, supports data lifecycle management, helps optimize usage costs, and simplifies data O&M.
Data exploration
Data exploration provides one-click querying and analysis with Spark 3.0 SQL syntax. Features include saving historical queries, previewing data, exporting results, and generating TPC-DS test datasets with one click.
Scenarios
Build a cloud-based data lake
Integrate DLF with E-MapReduce and Object Storage Service (OSS) to quickly build a cloud-based data lake.
Build a data lakehouse architecture
Integrate DLF with MaxCompute, DataWorks, and E-MapReduce to quickly build a data lakehouse architecture.
Build a fully managed data lakehouse architecture
Integrate DLF with Databricks and OSS to build a fully managed data lakehouse architecture on the cloud.
Analyze data in OSS
Use metadata extraction and data exploration to quickly analyze and explore structured and semi-structured data in OSS.