Dataphin: Create a Data Lake Formation Data Source

Last Updated: Mar 05, 2026

A Data Lake Formation (DLF) data source lets Dataphin read business data from and write data to Data Lake Formation. This topic describes how to create one.

Permissions

Only super administrators, data source administrators, domain architects, project administrators, and custom global roles with the Create Data Source permission can create data sources.

Procedure

  1. In Dataphin, go to the homepage. In the top menu bar, click Management Hub > Datasource Management.

  2. On the Datasource page, click + New Data Source.

  3. On the New Data Source page, in the Big Data section, select Data Lake Formation.

    If you recently used Data Lake Formation, you can also select it in the Recently Used section, or type a keyword in the search box to locate it quickly.

  4. On the New Data Lake Formation Data Source page, configure the connection parameters.

    1. Configure basic information for the data source.

      | Parameter | Description |
      | --- | --- |
      | Datasource Name | Enter a name for the data source. The name can contain only letters, digits, underscores (_), and hyphens (-), and cannot exceed 64 characters. |
      | Datasource Code | After you configure a datasource code, you can access Dataphin data source tables directly in Flink SQL jobs or using the Dataphin JDBC client. Use the format datasource_code.table_name or datasource_code.schema.table_name. To switch data sources automatically based on the task execution environment, use the variable format ${datasource_code}.table or ${datasource_code}.schema.table. For more information, see How to develop Dataphin data source tables. Important: You cannot change the datasource code after you set it. You must set a datasource code before you can preview data on the object details pages in the data catalog and asset checklist. Flink SQL supports only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, SelectDB, and GaussDB (DWS) data sources. |
      | Datasource Description | Enter a brief description of the data source. The description cannot exceed 128 characters. |
      | Data Source Configuration | Select the type of data source to configure. If your business uses separate production and development data sources, select Production + Development Data Source. If your business uses one data source for both production and development, select Production Data Source. |
      | Tag | Add tags to classify your data source. To learn how to create tags, see Manage data source tags. |
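The naming rules and datasource code formats above can be checked before you fill in the form. The following is a minimal Python sketch, not part of Dataphin: the function names (`is_valid_name`, `is_valid_description`, `table_reference`) and the sample code `dlf_sales` are hypothetical, and "letters" is assumed to mean ASCII letters, which may be stricter than what the console accepts.

```python
import re
from typing import Optional

# Rules from the Datasource Name row: letters, digits, underscores, and
# hyphens only, at most 64 characters.
# Assumptions: "letters" means ASCII letters, and an empty name is invalid.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")

# Limit from the Datasource Description row.
MAX_DESCRIPTION_LENGTH = 128


def is_valid_name(name: str) -> bool:
    """Return True if the name satisfies the documented naming rules."""
    return NAME_PATTERN.fullmatch(name) is not None


def is_valid_description(description: str) -> bool:
    """Return True if the description is within the 128-character limit."""
    return len(description) <= MAX_DESCRIPTION_LENGTH


def table_reference(code: str, table: str,
                    schema: Optional[str] = None,
                    as_variable: bool = False) -> str:
    """Build a table reference in one of the documented formats.

    as_variable=False gives datasource_code.table or
    datasource_code.schema.table; as_variable=True gives the
    ${datasource_code} variable form.
    """
    prefix = "${" + code + "}" if as_variable else code
    parts = [prefix, schema, table] if schema else [prefix, table]
    return ".".join(parts)


# Hypothetical values, for illustration only.
print(is_valid_name("dlf_sales-01"))
print(table_reference("dlf_sales", "orders", schema="ods"))
print(table_reference("dlf_sales", "orders", as_variable=True))
```

Because the datasource code cannot be changed after it is set, a quick check like this before saving may help avoid having to recreate the data source.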

    2. Configure connection parameters between the data source and Dataphin.

      If you select Production + Development Data Source, configure the connection information for both the production and the development data source. If you select Production Data Source, configure the connection information for the production data source only.

      Note

      You typically use separate data sources for production and development to isolate environments and minimize the impact of development on production. However, Dataphin also supports using the same data source for both environments, with identical parameter values.

      | Parameter | Description |
      | --- | --- |
      | Endpoint | Enter the Data Lake Formation endpoint. For example, dlfnext.cn-hangzhou.aliyuncs.com. |
      | Access ID and Access Key | Enter your Access ID and Access Key. Make sure you have the required database permissions so jobs run correctly. |
      | DLF Catalog | Enter the DLF catalog name. |
      | Database Name | Enter the database name in the catalog. |
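To make the relationship between the Data Source Configuration choice and the connection parameters concrete, here is a small Python sketch of a pre-flight check. It is not a Dataphin API: the `DlfConnection` class, its field names, and `missing_parameters` are hypothetical and only mirror the parameters listed above, and the credential values are placeholders.

```python
from dataclasses import dataclass, fields


@dataclass
class DlfConnection:
    """Hypothetical container for the connection parameters in the form."""
    endpoint: str = ""
    access_id: str = ""
    access_key: str = ""
    catalog: str = ""
    database: str = ""


# Which environments need connection information for each configuration type.
REQUIRED_ENVIRONMENTS = {
    "Production + Development Data Source": ("production", "development"),
    "Production Data Source": ("production",),
}


def missing_parameters(config_type: str, connections: dict) -> list:
    """List every required parameter that is still empty or not configured."""
    problems = []
    for env in REQUIRED_ENVIRONMENTS[config_type]:
        conn = connections.get(env)
        if conn is None:
            problems.append(f"{env}: connection not configured")
            continue
        for field in fields(conn):
            if not getattr(conn, field.name).strip():
                problems.append(f"{env}: {field.name} is empty")
    return problems


# Placeholder values, for illustration only.
production = DlfConnection(
    endpoint="dlfnext.cn-hangzhou.aliyuncs.com",
    access_id="<access-id>",
    access_key="<access-key>",
    catalog="<catalog-name>",
    database="<database-name>",
)
print(missing_parameters("Production + Development Data Source",
                         {"production": production}))
```

With Production + Development Data Source selected, the check reports the development connection as missing until a second set of parameters is supplied. Both sets may hold identical values if one data source serves both environments.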

  5. Select a default resource group. This group runs tasks associated with the data source, such as database SQL jobs, offline full-database migrations, and data previews.

  6. Click Test Connection to verify the connection, or click OK to save the data source and complete its creation.

    Test Connection checks whether Dataphin can connect to the data source. If you click OK without testing first, Dataphin tests the connection for all selected clusters, and the data source is created even if every test fails.