Create an Amazon RDS for Oracle data source

By creating an Amazon RDS for Oracle data source, you can enable Dataphin to read business data from or write data to Amazon RDS for Oracle. This topic describes how to create an Amazon RDS for Oracle data source.

Permissions

Only users who have the Create Data Source permission point in a custom global role and users who have the super administrator, data source administrator, domain architect, or project administrator role can create data sources.

Procedure

In the top navigation bar of the Dataphin homepage, choose Management Hub > Datasource Management.
On the Datasource page, click +Create Data Source.
On the Create Data Source page, select Amazon RDS for Oracle in the Relational Database section.
If you have recently used Amazon RDS for Oracle, you can also select it in the Recently Used section, or enter keywords in the search box to find it quickly.

On the Create Amazon RDS For Oracle Data Source page, configure the parameters for connecting to the data source.

Configure the basic information of the data source.

Parameter	Description
Datasource Name	Enter a name for the data source. The name must meet the following requirements: It can contain only Chinese characters, letters, digits, underscores (_), and hyphens (-). It cannot exceed 64 characters in length.
Datasource Code	After you configure a data source code, you can reference tables of the data source in Flink SQL tasks or through the Dataphin JDBC client in the format `data source code.table name` or `data source code.schema.table name`. To switch data sources automatically based on the task execution environment, use the variable format `${data source code}.table` or `${data source code}.schema.table`. For more information, see Development method for Dataphin data source tables. Important: The data source code cannot be modified after it is configured. After the data source code is configured, you can preview data on the object details page in the asset directory and asset inventory. In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, and SelectDB data sources are currently supported.
Version	Select the database version. Only Oracle19c and Oracle21c are available.
Data Source Description	A brief description of the data source. It cannot exceed 128 characters.
Data Source Configuration	Select the data source that you want to configure: If your business data source distinguishes between production and development data sources, select Production + Development Data Source. If your business data source does not distinguish between production and development data sources, select Production Data Source.
Tag	You can categorize and tag data sources based on tags. For information about how to create tags, see Manage data source tags.
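
The naming rule for Datasource Name can be checked before you submit the form. The following is a minimal Python sketch, not Dataphin's own validation: the function name is hypothetical, "Chinese characters" is approximated by the CJK Unified Ideographs range U+4E00 to U+9FFF, and an empty name is assumed to be invalid.

```python
import re

# Assumption: Chinese characters are approximated by U+4E00..U+9FFF.
# Allowed: Chinese characters, letters, digits, underscores, hyphens; max 64 chars.
_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")


def is_valid_datasource_name(name: str) -> bool:
    """Return True if the name satisfies the documented naming rules."""
    return _NAME_PATTERN.fullmatch(name) is not None


print(is_valid_datasource_name("rds_oracle-prod"))  # allowed characters only
print(is_valid_datasource_name("rds oracle"))       # contains a space
```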

Configure the parameters for connecting the data source to Dataphin.

If you selected Production + Development Data Source for Data Source Configuration, configure the connection information for both the production data source and the development data source. If you selected Production Data Source, configure the connection information for the production data source only.

Note

In most cases, the production data source and development data source should be configured as different data sources to isolate the development environment from the production environment and reduce the impact of the development data source on the production data source. However, Dataphin also supports configuring them as the same data source with identical parameter values.

You can select either JDBC URL or Host for Configuration Method. The default is JDBC URL.

JDBC URL configuration method

Parameter	Description
JDBC URL	The format of the connection address is `jdbc:oracle:thin:@host:port:sid` or `jdbc:oracle:thin:@//host:port/servicename`.
Schema	If the data source is only used for real-time database migration, you do not need to specify a schema. If the data source is used for offline integration or real-time computing, you need to specify a schema (case-sensitive).
Username, Password	Enter the authentication username and password. To ensure that tasks can be executed properly, make sure that the user has the required data permissions.
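
The two JDBC URL formats listed in the table differ only in how the database is identified: by SID or by service name. The following Python sketch assembles both forms. The function name and the host, port, and database values are hypothetical placeholders; replace them with the values of your own instance.

```python
def oracle_jdbc_url(host: str, port: int, name: str, use_service_name: bool = True) -> str:
    """Build an Oracle thin-driver JDBC URL in one of the two documented formats."""
    if use_service_name:
        # Service name format: jdbc:oracle:thin:@//host:port/servicename
        return f"jdbc:oracle:thin:@//{host}:{port}/{name}"
    # SID format: jdbc:oracle:thin:@host:port:sid
    return f"jdbc:oracle:thin:@{host}:{port}:{name}"


# Hypothetical endpoint values, for illustration only.
print(oracle_jdbc_url("mydb.example.com", 1521, "ORCL"))
print(oracle_jdbc_url("mydb.example.com", 1521, "ORCL", use_service_name=False))
```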

Host configuration method

Parameter	Description
Server Address	Enter the IP address and port number of the server. Only one address and port pair can be added.
Service Type	You can select Service Name or SID (System Identifier).
dbname	Enter the database name.

Parameter configuration

Parameter	Description
Parameter	Parameter name: You can select an existing parameter name or enter a custom parameter name. Custom parameter names can only contain letters, digits, periods (.), underscores (_), and hyphens (-). Parameter value: When a parameter name is selected, the parameter value is required. It can only contain letters, digits, periods (.), underscores (_), and hyphens (-), and cannot exceed 256 characters in length. Note You can click +Add Parameter to add multiple parameters, and click the icon to delete unnecessary parameters. You can add up to 30 parameters.
Schema (optional)	If the data source is only used for real-time database migration, you do not need to specify a schema. If the data source is used for offline integration or real-time computing, you need to specify a schema (case-sensitive).
Username, Password	The username and password for logging on to the Amazon RDS for Oracle instance.
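
The constraints on parameter names and values in the table above can be expressed as a small pre-check. This is a sketch under the documented limits only (allowed characters, a 256-character value limit, at most 30 parameters); the function name is hypothetical and Dataphin may apply additional checks.

```python
import re

# Letters, digits, periods, underscores, and hyphens, as documented.
_ALLOWED = re.compile(r"[A-Za-z0-9._-]+")
MAX_PARAMETERS = 30
MAX_VALUE_LENGTH = 256


def validate_parameters(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means no problem was found."""
    errors = []
    if len(params) > MAX_PARAMETERS:
        errors.append(f"too many parameters: {len(params)} (maximum is {MAX_PARAMETERS})")
    for name, value in params.items():
        if not _ALLOWED.fullmatch(name):
            errors.append(f"invalid parameter name: {name!r}")
        if not value:
            # A value is required once a parameter name is set.
            errors.append(f"missing value for parameter: {name!r}")
        elif len(value) > MAX_VALUE_LENGTH or not _ALLOWED.fullmatch(value):
            errors.append(f"invalid value for parameter: {name!r}")
    return errors


# The parameter name below is illustrative; it is not taken from the Dataphin UI.
print(validate_parameters({"oracle.net.CONNECT_TIMEOUT": "5000"}))
```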

Note

If you create a data source with the Host configuration method and later switch to the JDBC URL configuration method, the system concatenates the IP address and port number of the server into a JDBC URL and fills it in automatically.

Configure advanced settings for connecting the data source to Dataphin.
Connection Retries: If the database connection times out, the system will automatically retry the connection until the specified number of retries is reached. If the maximum number of retries is reached and the connection is still unsuccessful, the connection fails. The default number of retries is 1, and you can configure a value between 0 and 10.
The connection retry count will be applied by default to offline integration tasks and global quality (requires the Asset Quality function module to be enabled). In offline integration tasks, you can configure the retry count at the task level separately.
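
The retry behavior described above can be pictured as a bounded loop. The following Python sketch is illustrative only and does not reflect Dataphin's internal implementation. It assumes that a timeout surfaces as `TimeoutError`, that the initial attempt does not count as a retry (so a retry count of 1 allows two attempts in total), and that `connect` is a hypothetical callable that opens the connection.

```python
from typing import Callable


def connect_with_retries(connect: Callable[[], object], retries: int = 1) -> object:
    """Call connect(); on timeout, retry up to `retries` more times."""
    if not 0 <= retries <= 10:
        raise ValueError("retries must be between 0 and 10")
    last_error = None
    # One initial attempt plus `retries` retries (assumption, see lead-in).
    for _attempt in range(retries + 1):
        try:
            return connect()
        except TimeoutError as exc:
            last_error = exc
    raise ConnectionError(f"connection failed after {retries} retries") from last_error
```

With `retries=0`, a single timeout fails the connection immediately; with the default of 1, one timeout is tolerated.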
Note
Rules for duplicate parameter values:
- If a parameter exists in the JDBC URL, the Advanced Settings parameters, and the Host configuration method's parameter configuration, the value in the JDBC URL takes precedence.
- If a parameter exists in both the JDBC URL and Advanced Settings parameters, the value in the JDBC URL takes precedence.
- If a parameter exists in both the Advanced Settings parameters and Host configuration method's parameter configuration, the value in the Advanced Settings parameters takes precedence.
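
Taken together, these rules define a precedence order: JDBC URL, then Advanced Settings, then the Host configuration method's parameters. A minimal Python sketch of that order follows. The function name and the sample parameter names are hypothetical, and the sketch assumes each source has already been parsed into a dictionary.

```python
def resolve_parameters(host_params: dict, advanced_params: dict, jdbc_url_params: dict) -> dict:
    """Merge parameters so that higher-precedence sources overwrite lower ones."""
    resolved = {}
    resolved.update(host_params)       # lowest precedence
    resolved.update(advanced_params)   # overrides Host parameters
    resolved.update(jdbc_url_params)   # highest precedence
    return resolved


# Hypothetical parameter names and values, for illustration only.
merged = resolve_parameters(
    host_params={"timeout": "10", "pool": "5"},
    advanced_params={"timeout": "20", "fetch": "100"},
    jdbc_url_params={"timeout": "30"},
)
print(merged)
```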

Select a Default Resource Group, which is used to run tasks related to the current data source, including database SQL, offline database migration, and data preview.
Click Test Connection or directly click OK to save and complete the creation of the Amazon RDS for Oracle data source.
When you click Test Connection, the system tests whether the data source can connect to Dataphin properly. If you directly click OK, the system automatically tests the connection for all selected clusters. However, even if all selected clusters fail to connect, the data source can still be created normally.
Test Connection tests the connection for the Default Cluster or for Registered Scheduling Clusters that have been registered in Dataphin and are in normal use. The Default Cluster is selected by default and cannot be deselected. If a Registered Scheduling Cluster has no resource groups, connection testing is not supported for it. Create a resource group before you test the connection.
- The selected clusters are only used to test network connectivity with the current data source and are not used for running related tasks later.
- The test connection usually takes less than 2 minutes. If it times out, you can click the icon to view the specific reason and retry.
- Regardless of whether the test result is Connection Failed, Connection Successful, or Succeeded With Warning, the system will record the generation time of the final result.
  Note
  Only the test results for the Default Cluster include three connection statuses: Succeeded With Warning, Connection Successful, and Connection Failed. The test results for Registered Scheduling Clusters in Dataphin only include two connection statuses: Connection Successful and Connection Failed.
- When the test result is Connection Failed, you can click the icon to view the specific failure reason.
- When the test result is Succeeded With Warning, it means that the application cluster connection is successful but the scheduling cluster connection failed. The current data source cannot be used for data development and integration. You can click the icon to view the log information.