By adding a MySQL data source, Dataphin can read business data from MySQL or write data to MySQL. This topic describes how to add a MySQL data source.
Background information
MySQL is a leading open-source relational database management system that uses Structured Query Language (SQL) to store, retrieve, and manage data. It is commonly used in websites, applications, and commercial products and is a common primary relational data storage system.
Permissions
Only custom global role and super administrator, datasource administrator, tile architects, administrator system role with new datasource permission points are supported to create data sources.
Procedure
On the Dataphin home page, select Management Center > Datasource Management from the top menu bar.
On the Datasource page, click + Add Data Source .
In the Add Data Source dialog box, select Relational Database area, and choose MySQL .
If you have recently used MySQL, you can also select MySQL in the Recently Used area. You can also enter the keyword MySQL in the search box for a quick search.
In the Add Mysql Data Source dialog box, configure the connection data source parameters.
Configure the basic information of the data source.
Parameter
Description
Datasource Name
Enter the data source name. The naming conventions are as follows:
Can only contain Chinese characters, uppercase and lowercase English letters, numbers, underscores (_), or hyphens (-).
Cannot exceed 64 characters in length.
Datasource Code
After configuring the data source code, you can directly access the Dataphin data source table in Flink_SQL tasks or using the Dataphin JDBC client by using the format
Datasource Code.Table Name
orDatasource Code.Schema.Table Name
for quick consumption. If you need to automatically switch data sources based on the task execution environment, access it using the variable format${Datasource Code}.table
or${Datasource Code}.schema.table
. For more information, see Dataphin Data Source Table Development Method.ImportantOnce the data source encoding is successfully configured, it cannot be modified.
After the data source encoding is successfully configured, data preview can be performed on the object details page of the asset directory and asset checklist.
In Flink SQL, currently only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, and SelectDB data sources are supported.
Version
Only supports MySQL5.1.43, MySQL5.6/5.7, MySQL8, and RDS MySQL versions of MySQL data sources.
ImportantIf you are using Alibaba Cloud's MySQL data source, you need to select the RDS MySQL version to connect normally.
Datasource Description
A brief description of the MySQL data source. Must not exceed 128 characters.
Datasource Config
Based on whether the business data source distinguishes between production data sources and development data sources:
If the business data source distinguishes between production data sources and development data sources, select Production + Development Data Source.
If the business data source does not distinguish between production data sources and development data sources, select Production Data Source.
Tag
You can classify and label the data source based on tags. For how to create tags, see Manage Data Source Tags.
Configure the connection parameters between the data source and Dataphin.
If your data source configuration selects Production + Development Data Source, you need to configure the connection information for Production + Development Data Source. If your data source configuration is Production Data Source, you only need to configure the connection information for Production Data Source.
NoteGenerally, production and development data sources should be configured as different entities to ensure environment isolation and minimize the impact of development activities on production data sources. However, Dataphin also supports using the same parameters for both.
Configuration Method can be either JDBC URL or Host , with JDBC URL being the default selection.
JDBC URL Configuration Method
Parameter
Description
JDBC URL
The format of the JDBC URL link address is
jdbc:mysql://host:port/dbname
.NoteThe default port for MySQL data sources is 3306.
Username, Password
The Username and Password of the database.
SSL Encryption
After enabling, you need to upload the Truststore certificate and enter the Truststore certificate password.
Host Configuration Method
Host Configuration Method
Parameter
Description
Server Address
Enter the IP address and port number of the server.
You can click + Add to add multiple sets of IP addresses and port numbers. Click the
icon to delete redundant IP addresses and port numbers, but at least one set must be retained.
dbname
Enter the database name.
Parameter Configuration
Parameter
Description
Parameter
Parameter Name: Supports selecting an existing parameter name or entering a custom parameter name.
Custom parameter names only support uppercase and lowercase English letters, numbers, half-width periods (.), underscores (_), and hyphens (-).
Parameter Value: When a parameter name is selected, the parameter value is required. Only supports uppercase and lowercase English letters, numbers, half-width periods (.), underscores (_), and hyphens (-), with a length not exceeding 256 characters.
NoteSupports adding multiple parameters by clicking + Add Parameter. Click the
icon to delete redundant parameters. Up to 30 parameters can be added.
Username, Password
The username and password to log on to the MySQL instance.
SSL Encryption
After enabling, you need to upload the Truststore certificate and enter the Truststore certificate password.
NoteWhen the Host configuration method is selected and the data source is added, the system will automatically generate a JDBC URL using the server's IP address and port number.
Configure the advanced settings of the data source.
Parameter
Description
connectTimeout
The connectTimeout duration of the database (unit: milliseconds), default is 900000 milliseconds (15 minutes).
NoteIf you have a connectTimeout configuration in the JDBC URL, the connectTimeout is the timeout configured in the JDBC URL.
For data sources created before Dataphin V3.11, the connectTimeout defaults to
-1
, indicating no timeout limit.
socketTimeout
The socketTimeout duration of the database (unit: milliseconds), default is 1800000 milliseconds (30 minutes).
NoteIf you have a socketTimeout configuration in the JDBC URL, the socketTimeout is the timeout configured in the JDBC URL.
For data sources created before Dataphin V3.11, the socketTimeout defaults to
-1
, indicating no timeout limit.
Connection Retries
If the database connection times out, it will automatically retry the connection until the set number of retries is completed. If the maximum number of retries is reached and the connection is still unsuccessful, the connection fails.
NoteThe default number of retries is 1 time, and supports configuring parameters between 0~10.
The connection retry count will be applied by default to Offline Integration Tasks and Global Quality (the asset quality function module needs to be enabled). Offline integration tasks support configuring task-level retry counts separately.
NoteValue rules for duplicate parameters:
If a parameter is specified in both the JDBC URL and Advanced Settings, or in the Host Configuration method, the value in the JDBC URL takes precedence.
If a parameter is present in both the JDBC URL and Advanced Settings, the value in the JDBC URL takes precedence.
If a parameter is present in both the Advanced Settings and the Host Configuration method, the value in the Advanced Settings takes precedence.
Click Test Connection to verify that the data source can communicate properly with Dataphin.
After you have configured the data source information, you can click Test Connection in the operation column to test the connectivity of either the default cluster or a registered scheduling cluster that is registered and functioning normally within Dataphin. By default, the system selects the default cluster, which cannot be deselected. If there is no resource group under the registered scheduling cluster, the connection test cannot be performed. You must first create a resource group before you can test the connection.
The selected cluster is only for testing network connectivity and is not used for subsequent task operations.
Connection tests typically complete within 2 minutes. If it times out, click the
icon to view the specific reason and retry.
Regardless of whether the test result is Connection Failed, Connection Successful, or Succeeded With Warning, the system will record the time when the final result was generated for you.
If the test result shows Connection Failed, you can click the
icon to see the specific reason for the failure.
If the test result shows Succeeded With Warning, this indicates that the application cluster is connected successfully, but there was a failure in connecting to the scheduling cluster. Consequently, the current data source is not suitable for data development and integration tasks. You can click the
icon to access the log details.
NoteTest results for the Default Cluster show one of three connection statuses: Success with Risk, Connection Successful, or Connection Failed. However, test results for the Registered Scheduling Cluster in Dataphin indicate only two statuses: Connection Successful or Connection Failed.
Ensure at least one cluster can connect to the data source; otherwise, saving the data source information is not possible.
After a successful test, click OK to complete the creation of the MySQL data source.