Creating a MaxCompute data source allows Dataphin to read from or write to MaxCompute. This topic explains the steps to create a MaxCompute data source.
Background information
MaxCompute is Alibaba Cloud's big data computing service, an enterprise-level SaaS cloud data warehouse optimized for data analysis scenarios. It provides a fast, fully managed, serverless online data warehouse service that removes the resource scalability and elasticity limits of traditional data platforms. This reduces operations and maintenance effort and lets you analyze and process large volumes of data economically and efficiently.
Limits
The MaxCompute data source does not support access to external MaxCompute projects. For more information, see MaxCompute Project Overview.
Permission description
Only custom global roles that have the New Data Source permission point, and roles such as Super Administrator, Data Source Administrator, Section Architect, and Project Administrator, can create data sources.
Procedure
On the Dataphin home page, choose Management Center > Datasource Management in the top menu bar.
On the Datasource page, click + New Data Source.
In the New Data Source dialog box, in the Big Data area, select MaxCompute.
If you have recently used MaxCompute, you can also select it in the Recently Used area. To filter quickly, enter the keyword MaxCompute in the search box.
In the New MaxCompute Data Source dialog box, configure the data source connection parameters.
Configure the basic information of the data source.
Parameter
Description
Datasource Name
Enter the data source name. The naming convention is as follows:
Can only contain Chinese characters, uppercase and lowercase English letters, numbers, underscores (_), or hyphens (-).
Cannot exceed 64 characters in length.
Datasource Code
After you configure the data source code, you can access Dataphin data source tables directly in Flink_SQL tasks or from the Dataphin JDBC client by using the format Datasource Code.Table Name or Datasource Code.Schema.Table Name. To switch data sources automatically based on the task execution environment, use the variable format ${Datasource Code}.table or ${Datasource Code}.schema.table. For more information, see Dataphin Data Source Table Development Method.
Important:
Once the data source code is configured, it cannot be modified.
After the data source code is configured, you can preview data on the object details page of the asset directory and asset checklist.
In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, and SelectDB data sources are currently supported.
Data Source Description
Enter a brief description of the data source. The description cannot exceed 128 characters.
Datasource Config
Select the data source to configure:
If the business data source distinguishes between production data source and development data source, select Production + Development Data Source.
If the business data source does not distinguish between production data source and development data source, select Production Data Source.
Tag
You can use tags to classify and label the data source. For information on how to create tags, see Manage Data Source Tags.
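The naming rules and table reference formats described for the basic information can be sketched in Python as follows. This is only an illustration, not Dataphin's own validation logic: the function names are hypothetical, the range U+4E00 to U+9FFF is an assumed approximation of "Chinese characters", and the sample code ds_mc and table orders are made up.

```python
import re
from typing import List, Optional

# Assumption: "Chinese characters" is approximated by the CJK Unified
# Ideographs range U+4E00-U+9FFF; Dataphin's actual check may differ.
_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]+")

MAX_NAME_LENGTH = 64          # limit stated for Datasource Name
MAX_DESCRIPTION_LENGTH = 128  # limit stated for Data Source Description


def check_basic_info(name: str, description: str = "") -> List[str]:
    """Return a list of rule violations (empty if the inputs look valid)."""
    problems = []
    if not _NAME_PATTERN.fullmatch(name):
        problems.append("name contains unsupported characters or is empty")
    if len(name) > MAX_NAME_LENGTH:
        problems.append("name exceeds 64 characters")
    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append("description exceeds 128 characters")
    return problems


def table_reference(code: str, table: str, schema: Optional[str] = None,
                    as_variable: bool = False) -> str:
    """Build a reference such as code.table or ${code}.schema.table."""
    # The ${...} form is the variable format that follows the task environment.
    prefix = "${" + code + "}" if as_variable else code
    parts = [prefix, schema, table] if schema else [prefix, table]
    return ".".join(parts)
```

For example, table_reference("ds_mc", "orders") returns "ds_mc.orders", and table_reference("ds_mc", "orders", schema="sales", as_variable=True) returns "${ds_mc}.sales.orders".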
Configure the connection parameters between the data source and Dataphin.
If you select Production + Development Data Source as your data source configuration, you must configure the connection information for both the Production and Development Data Sources. However, if you choose Production Data Source as your configuration, you only need to configure the connection information for the Production Data Source.
Note: Typically, production and development data sources should be configured as separate entities to maintain environment isolation and reduce the impact of development activities on production sources. However, Dataphin also supports configuring them as the same data source, with identical parameter values.
Parameter
Description
Endpoint
The Endpoint of MaxCompute. Select the Endpoint that matches your network environment and connection method.
For information on how to obtain the Endpoint, see Endpoint.
Project Name
This is the MaxCompute project name, not the DataWorks workspace name.
To find the project name, log on to the MaxCompute console, switch the region in the upper-left corner, and view the MaxCompute project name on the project management tab.
Access ID, Access Key
The AccessKey ID and AccessKey Secret of the account where the MaxCompute data source is located.
For information on how to obtain them, see Obtain AccessKey.
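As a rough illustration of the rule described before the parameter list (Production + Development Data Source needs two sets of connection information, Production Data Source needs one), the following Python sketch checks a set of values for gaps before you enter them in the dialog box. The class, function, and key names are hypothetical and are not part of any Dataphin or MaxCompute API, and the sample values are placeholders rather than real endpoints or credentials.

```python
from dataclasses import dataclass, fields
from typing import Dict, List


@dataclass
class MaxComputeConnection:
    """Hypothetical container for the four connection parameters."""
    endpoint: str = ""
    project_name: str = ""
    access_id: str = ""
    access_key: str = ""


def missing_settings(config_mode: str,
                     connections: Dict[str, MaxComputeConnection]) -> List[str]:
    """List the parameters that are still empty for the chosen config mode."""
    if config_mode == "Production + Development Data Source":
        required_envs = ["production", "development"]
    elif config_mode == "Production Data Source":
        required_envs = ["production"]
    else:
        raise ValueError(f"unknown Datasource Config value: {config_mode!r}")

    missing = []
    for env in required_envs:
        conn = connections.get(env)
        if conn is None:
            missing.append(f"{env}: all connection parameters")
            continue
        # Report every parameter that is empty or whitespace only.
        for f in fields(conn):
            if not getattr(conn, f.name).strip():
                missing.append(f"{env}: {f.name}")
    return missing


# Placeholder values for illustration only.
prod = MaxComputeConnection("https://endpoint.example/api", "demo_project",
                            "placeholder-id", "placeholder-secret")
print(missing_settings("Production + Development Data Source",
                       {"production": prod}))
```

An empty list means every required parameter has a value. It does not replace Test Connection, which is what actually verifies connectivity.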
Click Test Connection to verify that the data source can communicate properly with Dataphin.
After a successful test, click OK to finalize the creation of the MaxCompute data source.