By creating a MaxCompute data source, you can enable Dataphin to read business data from MaxCompute or write data to MaxCompute. This topic describes how to create a MaxCompute data source.
Background information
MaxCompute is Alibaba Cloud's big data computing service: an enterprise-level Software as a Service (SaaS) cloud data warehouse designed for data analysis scenarios. It provides a fast, fully managed online data warehouse on a serverless architecture, which removes the resource scalability and elasticity limits of traditional data platforms. It also minimizes maintenance costs, so you can analyze and process massive amounts of data economically and efficiently.
Limits
MaxCompute data sources do not support access to external MaxCompute projects. For more information, see MaxCompute project overview.
Permission description
Only custom global roles with the Create Data Source permission and the super administrator, data source administrator, domain architect, and project administrator roles can create data sources.
Procedure
On the Dataphin homepage, click Management Center > Datasource Management in the top navigation bar.
On the Datasource page, click +Create Data Source.
On the Create Data Source page, select MaxCompute in the Big Data section.
If you have recently used MaxCompute, you can also select it from the Recently Used section, or enter MaxCompute as a keyword in the search box to find it quickly.
On the Create MaxCompute Data Source page, configure the parameters for connecting to the data source.
Configure the basic information of the data source.
Parameter
Description
Datasource Name
Enter the name of the data source. The name must meet the following requirements:
It can contain only Chinese characters, uppercase and lowercase letters, digits, underscores (_), and hyphens (-).
The name can be up to 64 characters in length.
Datasource Code
After you configure the data source code, you can directly access Dataphin data source tables in Flink_SQL tasks or by using the Dataphin JDBC client, in the format "data source code.table name" or "data source code.schema.table name", for quick consumption. If you need to automatically switch data sources based on the task execution environment, access them using the variable format "${data source code}.table" or "${data source code}.schema.table". For more information, see Dataphin data source table development method.
Important
The data source code cannot be modified after it is configured successfully.
After the data source code is configured successfully, you can preview data on the object details page in the asset directory and asset inventory.
In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, and SelectDB data sources are currently supported.
Data Source Description
Enter a brief description of the data source. The description cannot exceed 128 characters.
Data Source Configuration
Select the data source that you want to configure:
If your business data source distinguishes between production and development data sources, select Production + Development Data Source.
If your business data source does not distinguish between production and development data sources, select Production Data Source.
Tag
You can categorize and tag data sources based on tags. For information about how to create tags, see Manage data source tags.
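The naming rules and reference formats described in the basic information parameters can be checked before you fill in the form. The following is a minimal sketch, not Dataphin's own validation: the function names are hypothetical, and "Chinese characters" is assumed to mean the CJK Unified Ideographs block (U+4E00 to U+9FFF), which may be narrower than what Dataphin accepts.

```python
import re
from typing import List, Optional

# Assumption: "Chinese characters" is approximated by the CJK Unified
# Ideographs block (U+4E00 to U+9FFF); Dataphin may accept a wider range.
NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")
MAX_DESCRIPTION_LENGTH = 128


def check_basic_info(name: str, description: str = "") -> List[str]:
    """Return a list of problems; an empty list means the values look acceptable."""
    problems = []
    if NAME_PATTERN.fullmatch(name) is None:
        problems.append(
            "name: use only Chinese characters, letters, digits, "
            "underscores, and hyphens, up to 64 characters"
        )
    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append("description: must not exceed 128 characters")
    return problems


def table_reference(
    code: str,
    table: str,
    schema: Optional[str] = None,
    switch_by_environment: bool = False,
) -> str:
    """Build a table reference from a data source code.

    With switch_by_environment=True the code is wrapped as ${code}, the
    variable format used to switch data sources by execution environment.
    """
    head = "${" + code + "}" if switch_by_environment else code
    parts = [head, schema, table] if schema else [head, table]
    return ".".join(parts)
```

For example, with a hypothetical data source code mc_sales, table_reference("mc_sales", "orders") yields mc_sales.orders, and passing schema="dw" with switch_by_environment=True yields ${mc_sales}.dw.orders.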
Configure the connection parameters between the data source and Dataphin.
If you select Production + Development Data Source for Data Source Configuration, you need to configure the connection information for both the production and the development data source. If you select Production Data Source, you only need to configure the connection information for the production data source.
Note
Typically, the production and development data sources should be configured as different data sources. This isolates the two environments and reduces the impact of the development data source on the production data source. However, Dataphin also supports configuring them as the same data source with identical parameter values.
Parameter
Description
Endpoint
The Endpoint of MaxCompute. Select the appropriate Endpoint based on your network environment and connection method.
For information about how to obtain the Endpoint, see Endpoint.
Project Name
This is the MaxCompute project name, not the DataWorks workspace name.
You can log on to the MaxCompute console, switch to the appropriate region in the upper-left corner, and then view the specific MaxCompute Project Name on the Project Management tab.
Access ID, Access Key
The AccessKey ID and AccessKey Secret of the account to which the MaxCompute data source belongs.
For information about how to obtain them, see Create an AccessKey pair.
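Before entering the connection parameters in Dataphin, it can help to confirm outside Dataphin that the Endpoint, Project Name, and AccessKey pair work together. The sketch below is one way to do that, assuming the PyODPS package (pip install pyodps) is available. The environment variable names, the probe table name, and the helper functions are hypothetical and are not part of Dataphin.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class MaxComputeConnection:
    endpoint: str
    project: str      # MaxCompute project name, not the DataWorks workspace name
    access_id: str
    access_key: str


def load_connection(env: Mapping[str, str], prefix: str) -> MaxComputeConnection:
    """Read the four parameters from a mapping such as os.environ.

    Variable names like MC_PROD_ENDPOINT are hypothetical.
    """
    keys = ("ENDPOINT", "PROJECT", "ACCESS_ID", "ACCESS_KEY")
    missing = [f"{prefix}_{k}" for k in keys if not env.get(f"{prefix}_{k}")]
    if missing:
        raise ValueError("missing connection parameters: " + ", ".join(missing))
    return MaxComputeConnection(*(env[f"{prefix}_{k}"] for k in keys))


def is_isolated(prod: MaxComputeConnection, dev: MaxComputeConnection) -> bool:
    """True when production and development point at different projects."""
    return (prod.endpoint, prod.project) != (dev.endpoint, dev.project)


def check_connection(conn: MaxComputeConnection) -> bool:
    """Send one metadata request; PyODPS raises if the endpoint or keys are wrong."""
    from odps import ODPS  # imported lazily so the helpers above work without PyODPS

    client = ODPS(conn.access_id, conn.access_key, conn.project, endpoint=conn.endpoint)
    # The table name is a hypothetical probe; it does not need to exist.
    client.exist_table("dataphin_connection_probe")
    return True
```

A typical use would be to call load_connection(os.environ, "MC_PROD") and load_connection(os.environ, "MC_DEV"), check is_isolated on the pair, and then run check_connection on each. Keeping the AccessKey pair in environment variables rather than in source files avoids committing credentials by accident.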
Select a Default Resource Group, which is used to run tasks related to the current data source, including database SQL, offline database migration, and data preview.
Click Test Connection or directly click OK to save and complete the creation of the MaxCompute data source.
When you click Test Connection, the system tests whether the data source can connect to Dataphin normally. If you directly click OK, the system automatically tests the connection for all selected clusters. However, even if all selected clusters fail the connection test, the data source can still be created normally.