Create a Data Lake Formation data source to enable Dataphin to read business data from or write data to Data Lake Formation. This topic describes how to create a Data Lake Formation data source.
Permissions
Only super administrators, data source administrators, domain architects, project administrators, and custom global roles with the Create Data Source permission can create data sources.
Procedure
In Dataphin, go to the homepage. In the top menu bar, click Management Hub > Datasource Management.
On the Datasource page, click + New Data Source.
On the New Data Source page, in the Big Data section, select Data Lake Formation.
If you recently used Data Lake Formation, you can also select it in the Recently Used section, or type a keyword for Data Lake Formation in the search box to locate it quickly.
On the New Data Lake Formation Data Source page, configure the connection parameters.
Configure basic information for the data source.
Parameter
Description
Datasource Name
Enter a name for the data source. The name must follow these rules:
It can contain only letters, digits, underscores (_), and hyphens (-).
It cannot exceed 64 characters.
Datasource Code
After you configure a datasource code, you can access Dataphin data source tables directly in Flink SQL jobs or by using the Dataphin JDBC client. Use the format datasource_code.table_name or datasource_code.schema.table_name. This lets you consume data faster. To switch data sources automatically based on the task execution environment, use the variable format ${datasource_code}.table or ${datasource_code}.schema.table. For more information, see How to develop Dataphin data source tables.
Important
You cannot change the datasource code after you set it.
You must set a datasource code before you can preview data on the object details pages in the data catalog and asset checklist.
Flink SQL supports only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, SelectDB, and GaussDB (DWS) data sources.
Datasource Description
Enter a brief description of the data source. The description cannot exceed 128 characters.
Data Source Configuration
Select the type of data source to configure:
If your business uses separate production and development data sources, select Production + Development Data Source.
If your business uses one data source for both production and development, select Production Data Source.
Tag
Add tags to classify your data source. To learn how to create tags, see Manage data source tags.
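The basic information rules above can be checked before you fill in the form. The following Python sketch is illustrative only and is not part of Dataphin: the function names are hypothetical, and it assumes that "letters" means ASCII letters and that an empty name is rejected. It also shows how the two table reference formats described for the datasource code are assembled.

```python
import re
from typing import Optional

# Assumption: "letters" means ASCII letters and an empty name is rejected.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")
MAX_DESCRIPTION_LENGTH = 128


def is_valid_name(name: str) -> bool:
    """Check the Datasource Name rules: allowed characters, at most 64."""
    return NAME_PATTERN.fullmatch(name) is not None


def is_valid_description(description: str) -> bool:
    """Check that the description does not exceed 128 characters."""
    return len(description) <= MAX_DESCRIPTION_LENGTH


def table_reference(code: str, table: str, schema: Optional[str] = None,
                    as_variable: bool = False) -> str:
    """Build datasource_code.table or datasource_code.schema.table.

    With as_variable=True the code is wrapped as ${code}, the form the
    text describes for switching data sources by execution environment.
    """
    prefix = "${" + code + "}" if as_variable else code
    parts = [prefix, schema, table] if schema else [prefix, table]
    return ".".join(parts)


print(is_valid_name("dlf_sales-01"))                             # True
print(is_valid_name("dlf sales"))                                # False (space)
print(table_reference("dlf_sales", "orders"))                    # dlf_sales.orders
print(table_reference("dlf_sales", "orders", schema="ods"))      # dlf_sales.ods.orders
print(table_reference("dlf_sales", "orders", as_variable=True))  # ${dlf_sales}.orders
```

The names dlf_sales, ods, and orders are placeholders. Dataphin performs its own validation when you save the data source, so treat this only as a local pre-check.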
Configure connection parameters between the data source and Dataphin.
If you select Production + Development Data Source, you must configure connection information for both the production and the development data source. If you select Production Data Source, you only need to configure connection information for the production data source.
Note
You typically use separate data sources for production and development to isolate environments and minimize the impact of development on production. However, Dataphin also supports using the same data source for both environments, with identical parameter values.
Parameter
Description
Endpoint
Enter the Data Lake Formation endpoint. For example, dlfnext.cn-hangzhou.aliyuncs.com.
Access ID and Access Key
Enter your Access ID and Access Key. Make sure this account has the required permissions on the database so that jobs run correctly.
DLF Catalog
Enter the DLF catalog name.
Database Name
Enter the database name in the catalog.
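If you keep the connection values in a script or checklist before entering them in the form, a small container can flag any that are still blank. This Python sketch is only an illustration: the DlfConnection class, its field names, and the environment variable names are hypothetical and do not correspond to a Dataphin API. Reading the Access ID and Access Key from environment variables is one way to avoid hardcoding credentials.

```python
import os
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class DlfConnection:
    """Hypothetical container mirroring the connection parameters above."""
    endpoint: str
    access_id: str
    access_key: str
    catalog: str
    database: str

    def missing_fields(self) -> List[str]:
        # Return the names of fields that are empty or whitespace only.
        return [f.name for f in fields(self)
                if not getattr(self, f.name).strip()]


production = DlfConnection(
    endpoint="dlfnext.cn-hangzhou.aliyuncs.com",      # example endpoint from the table
    access_id=os.environ.get("DLF_ACCESS_ID", ""),    # hypothetical variable name
    access_key=os.environ.get("DLF_ACCESS_KEY", ""),  # hypothetical variable name
    catalog="my_catalog",                             # placeholder
    database="my_database",                           # placeholder
)

# Lists any parameters that still need a value before you fill in the form.
print(production.missing_fields())
```

If you selected Production + Development Data Source, you would prepare a second instance with the development values.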
Select a default resource group. This group runs tasks associated with the data source, such as database SQL jobs, offline full-database migrations, and data previews.
Click Test Connection or click OK to save and complete the creation of the Data Lake Formation data source.
Click Test Connection to verify that Dataphin can connect to the data source. If you click OK directly, Dataphin tests all selected clusters. Even if all tests fail, the data source is still created.