
Data Lake Formation: What is DLF?

Last Updated: Apr 17, 2024

Data Lake Formation (DLF) helps you build cloud-based data lakes with ease and manage all of your data lakes in a centralized manner.

DLF is in public preview. You can activate and use DLF at any time. During the public preview, all features of DLF are free of charge.

Usage procedure

DLF extracts source data into data lakes. The process of using DLF consists of the following steps:

  1. After you activate the DLF public preview, log on to the Alibaba Cloud Management Console. Choose Products > Analytics Computing > Data Lake Formation (DLF). On the page that appears, click Console to go to the DLF console.

  2. Create a data source. In this step, select the data source whose data you want to import to a data lake. For more information, see Manage data sources.

  3. Create a data import template to periodically extract data from the data source to the data lake. For more information, see Manage data import tasks.

  4. Define a metadatabase and a metadata table in the data lake. For more information, see Manage metadata.

Overview of the DLF console

The homepage of the DLF console consists of a left-side navigation pane and a DLF information section. The console provides quick links to the major features of DLF, which helps you get started with ease.

Data lake location

All the data in data lakes that are created by using DLF is stored in Object Storage Service (OSS). You must specify an OSS bucket or an OSS path to store the data of your data lake.
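Because every data lake location is an OSS bucket or path, it is useful to check the location string before you use it. The following is a minimal sketch that parses an OSS location of the standard `oss://bucket/path` form; the validation rules shown here are illustrative, not DLF's own checks.

```python
# Minimal sketch: split an OSS location of the form oss://bucket/path
# into its bucket and path components before using it as a data lake location.
from urllib.parse import urlparse

def parse_oss_location(location: str) -> tuple[str, str]:
    """Return (bucket, path) for a valid OSS location; raise ValueError otherwise."""
    parsed = urlparse(location)
    if parsed.scheme != "oss" or not parsed.netloc:
        raise ValueError(f"not a valid OSS location: {location!r}")
    return parsed.netloc, parsed.path.lstrip("/")

# Example: a hypothetical bucket and path for a data lake.
bucket, path = parse_oss_location("oss://my-datalake-bucket/warehouse/sales")
```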

Metadata management

Metadata in a DLF data lake is organized into metadatabases and metadata tables, and metadata management covers both levels.
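The two-level structure can be pictured as a metadatabase that contains metadata tables, each pointing at data stored in OSS. The sketch below models this with plain dataclasses; the field names are assumptions for illustration, not the DLF API schema.

```python
# Illustrative model of DLF metadata objects: a metadatabase contains
# metadata tables, each describing data stored at an OSS location.
# Field names are assumptions, not the real DLF schema.
from dataclasses import dataclass, field

@dataclass
class MetadataTable:
    name: str
    location: str                                           # OSS path of the table data
    columns: dict[str, str] = field(default_factory=dict)   # column name -> type

@dataclass
class Metadatabase:
    name: str
    location: str                                           # OSS path of the database
    tables: list[MetadataTable] = field(default_factory=list)

# Example: a hypothetical sales database with one table.
db = Metadatabase(name="sales_db", location="oss://my-bucket/sales_db")
db.tables.append(MetadataTable(
    name="orders",
    location="oss://my-bucket/sales_db/orders",
    columns={"order_id": "bigint", "amount": "double"},
))
```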

Data sources

Data is extracted from data sources to the specified data lake location. DLF currently supports ApsaraDB RDS for MySQL data sources. The following parameters are required to create a data source connection.

Connection Name: The unique name of the connection in DLF.

Connection Type: The type of the data source. Only ApsaraDB RDS for MySQL data sources are supported.

Username: The username that is used to connect to the ApsaraDB RDS for MySQL data source.

Password: The password that is used to connect to the ApsaraDB RDS for MySQL data source.

VPC: The virtual private cloud (VPC) in which the ApsaraDB RDS for MySQL data source resides.

VSwitch: The vSwitch in which the ApsaraDB RDS for MySQL data source resides.

Security Group: The security group to which the ApsaraDB RDS for MySQL data source belongs.

Data import templates

You can create data import templates to import data from data sources to a data lake in DLF. A template can be run manually, or it can be scheduled so that data is imported at specified times.
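The difference between the two modes is simply when an import runs: once on demand, or repeatedly on a schedule. The following sketch computes the run times of a periodic import in plain Python; it is an illustration of the scheduling idea, not DLF's internal scheduler.

```python
# Illustrative sketch of a periodic import schedule: given a start time
# and an interval, compute the next scheduled run times.
from datetime import datetime, timedelta

def next_runs(start: datetime, interval: timedelta, count: int) -> list[datetime]:
    """Return the next `count` scheduled run times for a periodic import."""
    return [start + i * interval for i in range(count)]

# Example: a daily extraction starting at 02:00 on Apr 17, 2024.
runs = next_runs(datetime(2024, 4, 17, 2, 0), timedelta(days=1), 3)
```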