Add an IBM DB2 data source

Create an IBM DB2 data source to enable Dataphin to read business data from or write data to IBM DB2.

Background information

IBM DB2 is a relational database management system. To connect IBM DB2 to Dataphin for data development, you must first create an IBM DB2 data source. For more information about IBM DB2, see the IBM DB2 official website.

Permissions

Only custom global roles with the Create Data Source permission and system roles such as Super Administrator, Data Source Administrator, Domain Architect, and Project Administrator can create data sources.

Procedure

On the Dataphin homepage, choose Management Center > Datasource Management from the top navigation bar.
On the Datasource page, click +Create Data Source.
On the Create Data Source page, select IBM DB2 in the Relational Database section.

If you have recently used IBM DB2, you can also select it in the Recently Used section, or enter IBM DB2 keywords in the search box to find it quickly.

On the Create IBM DB2 Data Source page, configure the parameters for connecting to the data source.

Configure the basic information of the data source.

Parameter	Description
Datasource Name	The name must meet the following requirements: The name can contain only Chinese characters, letters, digits, underscores (_), and hyphens (-). The name cannot exceed 64 characters in length.
Datasource Code	After you configure the data source code, you can reference tables in the data source in a Flink SQL task in the format `data source code.table name` or `data source code.schema.table name`. To automatically access the data source that matches the current environment, use the variable format `${data source code}.table` or `${data source code}.schema.table`. For more information, see Dataphin data source table development method. Important: The data source code cannot be modified after it is configured. You can preview data on the object details page in the asset directory and asset checklist only after the data source code is configured. In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, SelectDB, and GaussDB data warehouse service (DWS) data sources are currently supported.
Data Source Description	A brief description of the data source. The description cannot exceed 128 characters.
Data Source Configuration	Select the data source that you want to configure: If your business data source distinguishes between production and development data sources, select Production + Development Data Source. If your business data source does not distinguish between production and development data sources, select Production Data Source.
Tag	Categorize data sources with tags. For information about how to create tags, see Manage data source tags.
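The naming rule and the reference formats in the table above can be sketched as follows. This is a minimal illustration, not Dataphin's own validation logic: the function names are hypothetical, and the Unicode range used for "Chinese characters" is an assumption.

```python
import re
from typing import Optional

# Chinese characters, letters, digits, underscores, and hyphens.
# Assumption: "Chinese characters" is approximated by the basic CJK block.
_NAME_RE = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]+")


def is_valid_datasource_name(name: str) -> bool:
    """Check a name against the documented character set and 64-character limit."""
    return len(name) <= 64 and _NAME_RE.fullmatch(name) is not None


def table_reference(code: str, table: str, schema: Optional[str] = None,
                    per_environment: bool = False) -> str:
    """Build `code.table` or `code.schema.table`; with per_environment=True,
    use the `${code}` variable form described in the Datasource Code row."""
    prefix = "${" + code + "}" if per_environment else code
    parts = [prefix, schema, table] if schema else [prefix, table]
    return ".".join(parts)
```

For example, `table_reference("db2_src", "orders", schema="SALES", per_environment=True)` produces a reference in the `${data source code}.schema.table` form, where `db2_src`, `SALES`, and `orders` are placeholder names.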

Configure the connection parameters between the data source and Dataphin.

If you select Production + Development Data Source, configure the connection information for both environments. If you select Production Data Source, configure only the production connection information.

Note

We recommend that you configure the production and development data sources as different data sources to isolate the two environments. However, Dataphin also supports using the same data source with identical parameter values for both.

For Configuration Method, you can select either JDBC URL or Host. The default selection is JDBC URL.

JDBC URL configuration method

Parameter	Description
JDBC URL	The format of the connection address is `jdbc:db2://host:port/dbname:currentSchema=schema;`.
Username, Password	The username and password used to log on to the IBM DB2 database.
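As a small sketch of the URL format in the table above, the helper below assembles the connection address from its parts. The function name and the sample host, port, database, and schema values are placeholders for illustration only.

```python
def build_db2_jdbc_url(host: str, port: int, dbname: str, schema: str) -> str:
    """Assemble a URL in the documented form
    jdbc:db2://host:port/dbname:currentSchema=schema;"""
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"jdbc:db2://{host}:{port}/{dbname}:currentSchema={schema};"


# Placeholder values; replace them with those of your own DB2 instance.
url = build_db2_jdbc_url("192.0.2.10", 50000, "SAMPLE", "SALES")
```

Note the trailing semicolon after the `currentSchema` property, which is part of the format shown above.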

Host configuration method

Parameter	Description
Server Address	Enter the IP address and port number of the server. You can click +Add to add more sets of IP addresses and port numbers, and click the delete icon to remove sets that you no longer need. At least one set must remain.
dbname	Enter the database name.
Schema	Enter the schema information.

Parameter configuration

Parameter	Description
Parameter	Parameter name: Select an existing parameter name or enter a custom one. A custom parameter name can contain only uppercase and lowercase letters, digits, periods (.), underscores (_), and hyphens (-). Parameter value: Required when a parameter name is selected. The value can contain only uppercase and lowercase letters, digits, periods (.), underscores (_), and hyphens (-), and cannot exceed 256 characters in length. Note: You can click +Add Parameter to add more parameters, and click the delete icon to remove parameters that you no longer need. You can add up to 30 parameters.
Username, Password	The username and password used to log on to the IBM DB2 database.
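The constraints on parameter names and values can be checked before you enter them in the console. The following is a rough sketch under the rules stated in the table; `validate_parameters` is a hypothetical helper, and `loginTimeout` is used only as an illustrative parameter name.

```python
import re
from typing import Dict, List

# Letters, digits, periods, underscores, and hyphens, as stated in the table.
_TOKEN_RE = re.compile(r"[A-Za-z0-9._-]+")
MAX_PARAMS = 30
MAX_VALUE_LEN = 256


def validate_parameters(params: Dict[str, str]) -> List[str]:
    """Return a list of rule violations; an empty list means all rules pass."""
    errors = []
    if len(params) > MAX_PARAMS:
        errors.append(f"too many parameters: {len(params)} > {MAX_PARAMS}")
    for name, value in params.items():
        if not _TOKEN_RE.fullmatch(name):
            errors.append(f"invalid name: {name!r}")
        if not value:
            errors.append(f"missing value for {name!r}")
        elif len(value) > MAX_VALUE_LEN or not _TOKEN_RE.fullmatch(value):
            errors.append(f"invalid value for {name!r}")
    return errors
```

For instance, `validate_parameters({"loginTimeout": "30"})` returns an empty list, while a name that contains a space is reported as invalid.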

Note

After you create a data source with the Host method, switching to the JDBC URL method causes the system to concatenate the server IP address and port number into a JDBC URL automatically.

Configure advanced settings for the data source.

Parameter	Description
connectionTimeout	The connection timeout duration in seconds. Default value: 900 (15 minutes). Note: If connectTimeout is also configured in the JDBC URL, the value in the JDBC URL takes effect. For data sources created before Dataphin V3.11, the default connectTimeout value is `-1`, which indicates no timeout limit.
Connection Retries	The number of automatic retries when a database connection times out. If the connection still fails after all retries, it is considered failed. Note: The default number of retries is 1. You can configure a value from 0 to 10. By default, the retry count applies to offline integration tasks and to global quality, which requires the asset quality function module to be enabled. For offline integration tasks, you can also configure a retry count at the task level.
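The retry behavior described for Connection Retries can be pictured with the sketch below. It is not Dataphin's implementation: `connect_with_retries` and the `connect` callable are hypothetical, and a timeout is modeled here as Python's built-in `TimeoutError`.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retries(connect: Callable[[], T], retries: int = 1) -> T:
    """Call `connect` once, then retry up to `retries` times on timeout.

    `retries` follows the documented range of 0 to 10 (default 1).
    """
    if not 0 <= retries <= 10:
        raise ValueError("retries must be between 0 and 10")
    last_error = None
    for _ in range(retries + 1):  # one initial attempt plus the retries
        try:
            return connect()
        except TimeoutError as exc:
            last_error = exc
    # Every attempt timed out, so the connection is considered failed.
    raise ConnectionError(f"connection failed after {retries} retries") from last_error
```

With the default of one retry, a connection that times out once and then succeeds is still treated as successful; with `retries=0`, the first timeout is final.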

Note

Rules for duplicate parameters:

If a parameter exists in the JDBC URL, in the Advanced Settings parameters, and in the parameter configuration of the Host configuration method, the value in the JDBC URL takes precedence.
If a parameter exists in both the JDBC URL and the Advanced Settings parameters, the value in the JDBC URL takes precedence.
If a parameter exists in both the Advanced Settings parameters and the parameter configuration of the Host configuration method, the value in the Advanced Settings parameters takes precedence.
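Taken together, these rules amount to a fixed precedence order: JDBC URL first, then Advanced Settings, then the parameter configuration of the Host method. A minimal sketch of that ordering follows; the function name and the sample parameter names and values are illustrative assumptions.

```python
from typing import Dict


def resolve_parameters(jdbc_url_params: Dict[str, str],
                       advanced_settings: Dict[str, str],
                       host_params: Dict[str, str]) -> Dict[str, str]:
    """Merge the three sources so that higher-precedence values win."""
    merged = dict(host_params)          # lowest precedence
    merged.update(advanced_settings)    # overrides Host parameter configuration
    merged.update(jdbc_url_params)      # highest precedence
    return merged


effective = resolve_parameters(
    jdbc_url_params={"connectTimeout": "30"},
    advanced_settings={"connectTimeout": "900"},
    host_params={"connectTimeout": "60", "loginTimeout": "10"},
)
```

Here `connectTimeout` resolves to the JDBC URL value, while `loginTimeout`, which appears only in the Host parameter configuration, is kept unchanged.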

Select a Default Resource Group. This resource group is used to run tasks related to the current data source, including database SQL, offline database migration, data preview, and more.
Click Test Connection, or click OK directly, to save the settings and complete the creation of the IBM DB2 data source.

Click Test Connection to verify that the data source can connect to Dataphin. If you click OK directly, the system automatically tests the connection for all selected clusters. The data source is created even if all cluster connections fail.
Test Connection tests the connection for the Default Cluster or for Registered Scheduling Clusters that are registered in Dataphin and in normal use. The Default Cluster is selected by default and cannot be deselected. If a Registered Scheduling Cluster has no resource groups, connection testing is not supported for it. Create a resource group before you test the connection.
- The selected clusters are only used to test network connectivity with the current data source and are not used for running related tasks later.
- The test connection usually takes less than 2 minutes. If it times out, you can click the icon to view the specific reason and retry.
- Regardless of whether the test result is Connection Failed, Connection Successful, or Succeeded With Warning, the system will record the generation time of the final result.
  Note
  Only the test results for the Default Cluster include three connection statuses: Succeeded With Warning, Connection Successful, and Connection Failed. The test results for Registered Scheduling Clusters in Dataphin only include two connection statuses: Connection Successful and Connection Failed.
- When the test result is Connection Failed, you can click the icon to view the specific failure reason.
- When the test result is Succeeded With Warning, the application cluster connected successfully but the scheduling cluster failed to connect. In this state, the data source cannot be used for data development and integration. You can click the icon to view the log information.