Data Lake Formation: Getting Started

Last Updated: Sep 06, 2023

This topic describes how to use Data Lake Formation (DLF).

Prerequisites

An Alibaba Cloud account is created, and real-name verification is complete for the account.

Create a data source

Create a data source to import data to DLF. DLF supports ApsaraDB RDS for MySQL and PolarDB data sources.

  • To create an ApsaraDB RDS for MySQL data source, enter the username and password that are used to connect to it.

  • Specify the virtual private cloud (VPC) in which the ApsaraDB RDS for MySQL data source resides, the vSwitch that is associated with the data source, and the security group to which the data source belongs.

For more information, see Manage data sources.
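The inputs listed above can be gathered and checked before you enter them in the console. The following is a minimal sketch, not part of any DLF SDK: the class name `RdsMySqlDataSource`, its field names, and all sample values are hypothetical and only mirror the items named in this section.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class RdsMySqlDataSource:
    """Hypothetical container for the inputs listed above; not a DLF SDK type."""
    username: str
    password: str
    vpc_id: str
    vswitch_id: str
    security_group_id: str

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are empty or whitespace-only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


source = RdsMySqlDataSource(
    username="dlf_reader",        # placeholder value
    password="example-password",  # placeholder; keep real passwords out of code
    vpc_id="vpc-example",         # placeholder ID
    vswitch_id="vsw-example",     # placeholder ID
    security_group_id="",         # intentionally left empty for the demo
)
print(source.missing_fields())  # ['security_group_id']
```

A check like this only confirms that every required value is present. It does not verify that the VPC, vSwitch, and security group actually match the data source.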

Create a data import template

  • After you create a data import template, the specified data in a data source can be extracted to DLF either automatically at a scheduled time or manually.

  • DLF supports five data import templates. Select or create a data import template based on your requirements for data extraction.

  • When you create the data import template, specify the location from which you want to import data.

  • Specify the RAM role that is used by DLF to access cloud resources. The default RAM role is AliyunDLFWorkFlowDefaultRole.

  • Select the resources that are required to run data extraction tasks, and specify how a task is run.

For more information, see Data import templates.
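The template settings described above can be pictured as a small record with a default RAM role and an optional schedule. This is an illustrative sketch only: `ImportTemplateSettings`, its field names, the sample location, and the cron-style schedule string are assumptions, not DLF API names or formats. The only value taken from this topic is the default role name, AliyunDLFWorkFlowDefaultRole.

```python
from dataclasses import dataclass
from typing import Optional

# Default RAM role named in this topic.
DEFAULT_RAM_ROLE = "AliyunDLFWorkFlowDefaultRole"


@dataclass
class ImportTemplateSettings:
    """Hypothetical settings record; field names are illustrative only."""
    source_location: str             # where the data is imported from
    ram_role: str = DEFAULT_RAM_ROLE
    schedule: Optional[str] = None   # None means the task is run manually

    def run_mode(self) -> str:
        """Report whether the task runs on a schedule or on demand."""
        return "scheduled" if self.schedule else "manual"


# Placeholder location; the cron-style string is an assumed schedule format.
manual = ImportTemplateSettings(source_location="example_db.example_table")
nightly = ImportTemplateSettings(
    source_location="example_db.example_table",
    schedule="0 2 * * *",
)

print(manual.ram_role, manual.run_mode())
print(nightly.run_mode())
```

Leaving `ram_role` unset keeps the default role, which matches the behavior described in the list above. Override it only if DLF must use a different role to access your resources.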

Create metadata in DLF

  • Create a metadatabase.

  • Create a metadata table, and specify the storage location and storage format of the table data.

For more information, see Manage metadata.
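The relationship between a metadatabase and its metadata tables can be sketched as a simple in-memory model. The classes `Metadatabase` and `MetadataTable`, the `oss://` path, and the `parquet` format below are hypothetical placeholders chosen for illustration. They do not call DLF and do not reflect its API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class MetadataTable:
    """Hypothetical table record with the two properties named above."""
    name: str
    location: str     # storage location of the table data
    data_format: str  # storage format of the table data


@dataclass
class Metadatabase:
    """Hypothetical metadatabase that holds tables by name."""
    name: str
    tables: Dict[str, MetadataTable] = field(default_factory=dict)

    def add_table(self, table: MetadataTable) -> None:
        """Register a table; reject it if location or format is missing."""
        if not table.location or not table.data_format:
            raise ValueError("a metadata table needs a storage location and a format")
        self.tables[table.name] = table


db = Metadatabase(name="sales_db")  # placeholder name
db.add_table(
    MetadataTable(
        name="orders",
        location="oss://example-bucket/warehouse/orders/",  # placeholder path
        data_format="parquet",                              # placeholder format
    )
)
print(sorted(db.tables))  # ['orders']
```

The model reflects the order of the steps in this section: the metadatabase exists first, and each table is then added with its storage location and format.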