By creating an OceanBase data source, you can enable Dataphin to read business data from OceanBase or write data to OceanBase. This topic describes how to create an OceanBase data source.
Background information
OceanBase is a financial-grade distributed relational database independently developed by Alibaba Group and Ant Financial. If you use ApsaraDB for OceanBase and want to connect it to Dataphin for data development or write Dataphin data to OceanBase, you need to create an OceanBase data source first. For more information about OceanBase, see What is OceanBase Database?
Permission requirements
Only users who have the Create Data Source permission point in a custom global role or users who have the super administrator, data source administrator, domain architect, or project administrator system role can create data sources.
Procedure
In the top navigation bar of the Dataphin homepage, choose Management Center > Datasource Management.
On the Datasource page, click +Create Data Source.
On the Create Data Source page, click Relational Database and select OceanBase.
If you have recently used OceanBase, you can select it in the Recently Used section. You can also enter keywords in the search box to find OceanBase quickly.
On the Create OceanBase Data Source page, configure the connection parameters.
Configure the basic information of the data source.
Parameter
Description
Datasource Name
The name must meet the following requirements:
It can contain only Chinese characters, letters, digits, underscores (_), and hyphens (-).
It cannot exceed 64 characters in length.
Datasource Code
After you configure the data source code, you can reference tables in the data source in a Flink SQL task by using the data source code.table name or data source code.schema.table name format. If you need to automatically access the data source in the corresponding environment based on the current environment, use the variable format ${data source code}.table or ${data source code}.schema.table. For more information, see Dataphin data source table development method.
Important:
The data source code cannot be modified after it is configured successfully.
After the data source code is configured successfully, you can preview data on the object details page in the asset directory and asset inventory.
In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, and SelectDB data sources are currently supported.
Data Source Description
The description of the data source, such as business information and source. The description cannot exceed 128 characters in length.
Real-time Development
After this option is enabled, you need to configure real-time development parameters to make the data source available for real-time development.
Data Source Configuration
Select the data source that you want to configure:
If the business data source is divided into production and development data sources, select Production + Development Data Source.
If the business data source is not divided into production and development data sources, select Production Data Source.
Tag
You can categorize and tag data sources based on tags. For information about how to create tags, see Manage data source tags.
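The naming rules and the table reference formats described above can be sketched as a small pre-check. This is only an illustration, not Dataphin's own validation: the function names are hypothetical, "Chinese characters" is assumed to mean the CJK Unified Ideographs block (U+4E00 to U+9FFF), "letters" is assumed to mean ASCII letters, and length is assumed to be counted in characters.

```python
import re

# Assumption: Chinese characters = U+4E00-U+9FFF, letters = ASCII letters.
# Dataphin's real check may accept a broader or narrower set.
_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")


def is_valid_datasource_name(name: str) -> bool:
    """Return True if the name uses only allowed characters and has 1 to 64 of them."""
    return _NAME_PATTERN.fullmatch(name) is not None


def is_valid_description(description: str) -> bool:
    """Return True if the description does not exceed 128 characters."""
    return len(description) <= 128


def table_reference(code: str, table: str, schema: str = "", env_aware: bool = False) -> str:
    """Build code.table or code.schema.table; with env_aware, use ${code} instead of code."""
    prefix = "${" + code + "}" if env_aware else code
    parts = [prefix, schema, table] if schema else [prefix, table]
    return ".".join(parts)
```

For example, with a hypothetical data source code ob_prod, table_reference("ob_prod", "orders") gives the fixed form, and passing env_aware=True gives the variable form that follows the current environment.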
Configure the connection parameters between the data source and Dataphin.
If you select Production + Development Data Source for Data Source Configuration, configure the connection information for both the production and the development data sources. If you select Production Data Source, configure only the connection information for the production data source.
Note: In most cases, the production data source and development data source should be configured as different data sources to isolate the development environment from the production environment and reduce the impact of the development data source on the production data source. However, Dataphin also supports configuring them as the same data source with identical parameter values.
Parameter
Description
Tenant Mode
The tenant mode. MySQL Tenant and Oracle Tenant are supported. MySQL Tenant supports both MySQL and OceanBase protocols. Oracle Tenant supports only the OceanBase protocol.
JDBC URL
The connection address of OceanBase. The format is as follows:
MySQL Tenant:
jdbc:mysql://host:port/dbname
Oracle Tenant:
jdbc:oceanbase://host:port/dbname
Username, Password
The username and password of the target database.
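As a rough illustration of how the tenant mode determines the JDBC URL, the following sketch assembles the URL from its parts. The function name, host, port, and database name are hypothetical. The protocol table follows the Tenant Mode description above; that a MySQL Tenant URL using the OceanBase protocol has the same jdbc:oceanbase:// shape is an assumption.

```python
# Protocols per tenant mode, as described for the Tenant Mode parameter.
# The first entry is used as the default and matches the formats listed above.
_PROTOCOLS = {
    "MySQL Tenant": ("mysql", "oceanbase"),
    "Oracle Tenant": ("oceanbase",),
}


def build_jdbc_url(tenant_mode: str, host: str, port: int, dbname: str, protocol: str = "") -> str:
    """Return jdbc:<protocol>://host:port/dbname for the given tenant mode."""
    if tenant_mode not in _PROTOCOLS:
        raise ValueError(f"unsupported tenant mode: {tenant_mode}")
    allowed = _PROTOCOLS[tenant_mode]
    chosen = protocol or allowed[0]
    if chosen not in allowed:
        raise ValueError(f"{tenant_mode} does not support the {chosen} protocol")
    return f"jdbc:{chosen}://{host}:{port}/{dbname}"


# Hypothetical endpoint, for illustration only.
print(build_jdbc_url("MySQL Tenant", "ob.example.internal", 2881, "sales"))
print(build_jdbc_url("Oracle Tenant", "ob.example.internal", 2881, "sales"))
```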
Configure advanced settings for the data source.
Parameter
Description
connectTimeout
The connectTimeout duration of the database in milliseconds. The default value is 900,000 milliseconds (15 minutes).
Note: If connectTimeout is set in the JDBC URL, the value in the JDBC URL takes precedence.
For data sources created before Dataphin V3.11, the default connectTimeout is -1, which indicates no timeout limit.
socketTimeout
The socketTimeout duration of the database in milliseconds. The default value is 1,800,000 milliseconds (30 minutes).
Note: If socketTimeout is set in the JDBC URL, the value in the JDBC URL takes precedence.
For data sources created before Dataphin V3.11, the default socketTimeout is -1, which indicates no timeout limit.
Connection Retries
If the database connection times out, the system automatically retries the connection until the specified number of retries is reached. If the connection still cannot be established after the maximum number of retries, the connection is reported as failed.
Note: The default number of retries is 1. You can set a value between 0 and 10.
The number of connection retries is applied to offline integration tasks and global quality (the asset quality feature module must be activated). You can configure the number of retries separately for offline integration tasks.
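The interaction between the advanced settings can be sketched as follows. This is not Dataphin's implementation: the function names are hypothetical, and it is assumed that the retry count means additional attempts after the first one and that a timeout surfaces as a TimeoutError.

```python
from urllib.parse import parse_qs

DEFAULT_CONNECT_TIMEOUT_MS = 900_000    # 15 minutes
DEFAULT_SOCKET_TIMEOUT_MS = 1_800_000   # 30 minutes


def effective_timeout(jdbc_url: str, name: str, configured_ms: int) -> int:
    """Return the timeout from the JDBC URL if it is set there, else the configured value."""
    query = jdbc_url.partition("?")[2]
    values = parse_qs(query).get(name)
    return int(values[0]) if values else configured_ms


def connect_with_retries(connect, retries: int = 1):
    """Call connect(); on TimeoutError, retry up to `retries` more times."""
    if not 0 <= retries <= 10:
        raise ValueError("retries must be between 0 and 10")
    last_error = None
    for _ in range(retries + 1):  # first attempt plus the retries (assumption)
        try:
            return connect()
        except TimeoutError as exc:
            last_error = exc
    raise last_error
```

For instance, with a URL ending in ?connectTimeout=5000, effective_timeout returns 5000 for connectTimeout regardless of the configured value, while socketTimeout falls back to the configured value.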
Select Default Resource Group. This resource group is used to run tasks related to the current data source, including database SQL, offline database migration, and data preview.
Click Test Connection or directly click OK to save and complete the creation of the OceanBase data source.
When you click Test Connection, the system tests whether the data source can connect to Dataphin properly. If you directly click OK, the system automatically tests the connection for all selected clusters. However, even if all selected clusters fail the connection test, the data source can still be created normally.
Test Connection tests the connection for the Default Cluster or Registered Scheduling Clusters that have been registered in Dataphin and are in normal use. The Default Cluster is selected by default and cannot be deselected. If there are no resource groups under a Registered Scheduling Cluster, connection testing is not supported. You need to create a resource group first before testing the connection.
The selected clusters are only used to test network connectivity with the current data source and are not used for running related tasks later.
The test connection usually takes less than 2 minutes. If it times out, you can click the icon to view the specific reason and retry.
Regardless of whether the test result is Connection Failed, Connection Successful, or Succeeded With Warning, the system will record the generation time of the final result.
Note: Only the test results for the Default Cluster include three connection statuses: Succeeded With Warning, Connection Successful, and Connection Failed. The test results for Registered Scheduling Clusters in Dataphin only include two connection statuses: Connection Successful and Connection Failed.
When the test result is Connection Failed, you can click the icon to view the specific failure reason.
When the test result is Succeeded With Warning, it means that the application cluster connection is successful but the scheduling cluster connection failed. The current data source cannot be used for data development and integration. You can click the icon to view the log information.
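The three statuses reported for the Default Cluster can be read as a function of two connectivity checks. The sketch below is an interpretation, not Dataphin's logic: only the Succeeded With Warning case is stated above, while the mappings for Connection Successful and Connection Failed are assumptions, and all names are hypothetical.

```python
from enum import Enum


class ConnectionStatus(Enum):
    CONNECTION_SUCCESSFUL = "Connection Successful"
    SUCCEEDED_WITH_WARNING = "Succeeded With Warning"
    CONNECTION_FAILED = "Connection Failed"


def default_cluster_status(application_ok: bool, scheduling_ok: bool) -> ConnectionStatus:
    """Map the two checks to a status for the Default Cluster."""
    if application_ok and scheduling_ok:
        return ConnectionStatus.CONNECTION_SUCCESSFUL   # assumption: both checks pass
    if application_ok:
        return ConnectionStatus.SUCCEEDED_WITH_WARNING  # stated: scheduling cluster failed
    return ConnectionStatus.CONNECTION_FAILED           # assumption: application cluster failed


def usable_for_development(status: ConnectionStatus) -> bool:
    """Only a fully successful connection is treated as usable here (assumption
    for the failed case; the warning case is stated above as not usable)."""
    return status is ConnectionStatus.CONNECTION_SUCCESSFUL
```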