By creating a Vertica data source, you can enable Dataphin to read business data from Vertica or write data to Vertica. This topic describes how to create a Vertica data source.
Background information
Vertica is a database built on a columnar storage architecture. Before you connect Vertica to Dataphin for data development, you must create a Vertica data source. For more information about Vertica, see the Vertica official website.
Permissions
Only custom global roles with the Create Data Source permission and system roles such as super administrator, data source administrator, domain architect, and project administrator can create data sources.
Procedure
On the Dataphin homepage, click Management Center > Datasource Management in the top navigation bar.
On the Datasource page, click +Create Data Source.
On the Create Data Source page, select Vertica in the Relational Database section.
If you have recently used Vertica, you can also select it in the Recently Used section, or enter the keyword Vertica in the search box to find it quickly.
On the Create Vertica Data Source page, configure the connection parameters.
Configure the basic information of the data source.
Parameter
Description
Datasource Name
The name must meet the following requirements:
It can contain only Chinese characters, letters, digits, underscores (_), and hyphens (-).
It cannot exceed 64 characters in length.
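The naming rules above can be checked before you submit the form. The following is a minimal sketch, not Dataphin's own validation: the function name is hypothetical, "Chinese characters" is approximated by the CJK Unified Ideographs range U+4E00 to U+9FFF, "letters" is taken to mean ASCII letters, and an empty name is assumed to be invalid.

```python
import re

# Assumption: Chinese characters are approximated by U+4E00..U+9FFF and
# letters by ASCII A-Z/a-z; the check Dataphin actually applies may differ.
_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")


def is_valid_datasource_name(name: str) -> bool:
    """Return True if name uses only allowed characters and has 1 to 64 of them."""
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["vertica_prod-01", "bad name", "a" * 65]:
        print(repr(candidate[:20]), is_valid_datasource_name(candidate))
```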
Datasource Code
After you configure the data source code, you can reference tables in the data source in Flink_SQL tasks by using the format data source code.table name or data source code.schema.table name. To automatically access the data source that corresponds to the current environment, use the variable format ${data source code}.table or ${data source code}.schema.table. For more information, see Dataphin data source table development method.
Important: The data source code cannot be modified after it is configured successfully.
After the data source code is configured successfully, you can preview data on the object details page in the asset directory and asset inventory.
In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, and SelectDB data sources are currently supported.
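The two reference formats described for the data source code can be illustrated with a small string helper. This is only a sketch of the naming pattern: the function name and the sample code `my_ds` are hypothetical, and whether a given data source type can be referenced this way depends on the Flink SQL support list above.

```python
from typing import Optional


def table_reference(code: str, table: str, schema: Optional[str] = None,
                    use_variable: bool = False) -> str:
    """Build 'code.table' or 'code.schema.table'.

    With use_variable=True the code is wrapped as ${code}, the variable
    format that resolves the data source for the current environment.
    """
    prefix = "${" + code + "}" if use_variable else code
    parts = [prefix]
    if schema:
        parts.append(schema)
    parts.append(table)
    return ".".join(parts)


if __name__ == "__main__":
    print(table_reference("my_ds", "orders"))
    print(table_reference("my_ds", "orders", schema="public", use_variable=True))
```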
Data Source Description
A brief description of the data source. It cannot exceed 128 characters.
Data Source Configuration
Select the data source to configure:
If the business data source distinguishes between production and development data sources, select Production + Development Data Source.
If the business data source does not distinguish between production and development data sources, select Production Data Source.
Tag
You can categorize data sources by tagging them. For information about how to create tags, see Manage data source tags.
Configure the connection parameters between the data source and Dataphin.
If you select Production + Development Data Source for Data Source Configuration, configure the connection information for both the production and the development data source. If you select Production Data Source, configure the connection information for the production data source only.
Note: Typically, production and development data sources should be configured as separate data sources to achieve environment isolation and reduce the impact of development data sources on production data sources. However, Dataphin also supports configuring them as the same data source with identical parameter values.
Parameter
Description
JDBC URL
The format of the connection URL is jdbc:vertica://host:port/dbname. For example, jdbc:vertica://192.168.*.1:5433/dataphin.
Schema
The schema of the Vertica database. If you do not specify a schema, the default schema public is used.
Username, Password
The username and password of the Vertica database.
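To avoid typos in the connection URL, it can help to assemble and check it programmatically before pasting it into the form. The sketch below assumes the standard Vertica JDBC form jdbc:vertica://host:port/dbname; the function names and the sample host are hypothetical and are not part of Dataphin.

```python
import re

# Matches jdbc:vertica://host:port/dbname with an optional ?key=value suffix.
_URL_PATTERN = re.compile(r"jdbc:vertica://([^:/?]+):(\d+)/([^/?]+)(?:\?.*)?")


def build_vertica_jdbc_url(host: str, port: int, dbname: str) -> str:
    """Assemble a Vertica JDBC URL from its three parts."""
    if not host or not dbname:
        raise ValueError("host and dbname must be non-empty")
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return f"jdbc:vertica://{host}:{port}/{dbname}"


def parse_vertica_jdbc_url(url: str) -> tuple:
    """Split a URL into (host, port, dbname); raise ValueError if malformed."""
    match = _URL_PATTERN.fullmatch(url)
    if match is None:
        raise ValueError(f"not a Vertica JDBC URL: {url!r}")
    host, port, dbname = match.groups()
    return host, int(port), dbname


if __name__ == "__main__":
    url = build_vertica_jdbc_url("vertica.example.internal", 5433, "dataphin")
    print(url, parse_vertica_jdbc_url(url))
```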
Configure advanced settings for the data source.
Parameter
Description
loginTimeout
The login timeout for the database connection, in seconds. The default is 900 seconds (15 minutes).
Note: If the JDBC URL already includes a loginTimeout setting, the value in the JDBC URL takes effect.
For data sources created before Dataphin V3.11, the default loginTimeout is -1, which means no timeout limit.
Connection Retries
If the database connection times out, the system will automatically retry the connection until the specified number of retries is reached. If the connection still fails after the maximum number of retries, the connection is considered failed.
Note: The default number of retries is 1. You can configure a value between 0 and 10.
By default, the connection retry count applies to offline integration tasks and global quality (requires the asset quality function module to be enabled). You can separately configure task-level retry counts in offline integration tasks.
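The two advanced settings can be modeled roughly as follows. This is an illustrative sketch, not Dataphin's implementation, and it rests on several assumptions: loginTimeout is assumed to appear in the JDBC URL as a `?loginTimeout=N` query parameter, the retry count is assumed to mean extra attempts after the first one, and `connect` is a hypothetical caller-supplied callable that raises `TimeoutError` when the connection times out.

```python
import time
from urllib.parse import parse_qs


def effective_login_timeout(jdbc_url: str, configured_seconds: int = 900) -> int:
    """Return the timeout from the JDBC URL if present, else the configured value.

    Assumption: the URL carries it as a ?loginTimeout=N query parameter.
    """
    query = jdbc_url.partition("?")[2]
    values = parse_qs(query).get("loginTimeout")
    return int(values[0]) if values else configured_seconds


def connect_with_retries(connect, retries: int = 1, delay_seconds: float = 0.0):
    """Call connect(); on TimeoutError retry up to `retries` more times.

    Assumption: `retries` counts attempts made after the first one.
    """
    if not 0 <= retries <= 10:
        raise ValueError("retries must be between 0 and 10")
    last_error = None
    for attempt in range(retries + 1):
        try:
            return connect()
        except TimeoutError as exc:
            last_error = exc
            if attempt < retries:
                time.sleep(delay_seconds)
    # Every attempt timed out: treat the connection as failed.
    raise ConnectionError(f"connection failed after {retries} retries") from last_error
```

With the default of 1 retry, a connection that times out twice in a row is reported as failed; raising the count to 2 allows a third attempt.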
Select a Default Resource Group. This resource group is used to run tasks related to the current data source, including database SQL, offline database migration, data preview, and more.
Click Test Connection or directly click OK to save and complete the creation of the Vertica data source.
When you click Test Connection, the system tests whether the data source can connect to Dataphin. If you click OK directly, the system automatically tests the connection for all selected clusters. The data source is still created even if every selected cluster fails the connection test.