After you create a Presto data source, you can use it for offline development in Dataphin. This topic describes how to create a Presto data source.
Background information
Presto is a high-performance, distributed SQL query engine that runs fast queries and analysis over a variety of data sources, such as HDFS, MySQL, and Cassandra. It is suited to real-time queries, interactive analysis, and processing of large datasets.
Permission description
Only custom global roles with the Create Data Source permission and system roles such as super administrator, data source administrator, domain architect, and project administrator can create data sources.
Procedure
On the Dataphin homepage, click Management Hub > Datasource Management in the top navigation bar.
On the Datasource page, click +Create Data Source.
On the Create Data Source page, select Presto in the NoSQL section.
If you have recently used Presto, you can select it from the Recently Used section. Alternatively, enter a keyword in the search box to find Presto quickly.
On the Create Presto Data Source page, configure the data source connection parameters.
Configure the basic information of the data source.
Parameter
Description
Datasource Name
Enter the name of the data source. The naming convention is as follows:
The name can contain only Chinese characters, letters, digits, underscores (_), or hyphens (-).
The name cannot exceed 64 characters in length.
Datasource Code
After you configure the data source code, you can directly access Dataphin data source tables in Flink_SQL tasks or through the Dataphin JDBC client by using the format data source code.table name or data source code.schema.table name for quick consumption. If you need to automatically switch data sources based on the task execution environment, access them by using the variable format ${data source code}.table or ${data source code}.schema.table. For more information, see Dataphin data source table development method.
Important
The data source code cannot be modified after it is configured successfully.
After the data source code is configured successfully, you can preview data on the object details page in the asset directory and asset inventory.
In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, and SelectDB data sources are currently supported.
Version
Currently, only Presto version 2.1.5 is supported.
Data Source Description
Enter a brief description of the data source, within 128 characters.
Data Source Configuration
Select the configuration environment for the data source:
If your business data source distinguishes between production and development data sources, select Production + Development Data Source.
If your business data source does not distinguish between production and development data sources, select Production Data Source.
Tag
You can categorize and tag data sources using tags. For information about how to create tags, see Manage data source tags.
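The naming rules and reference formats in the table above can be checked locally before you fill in the form. The following is a minimal Python sketch, not part of Dataphin: the function names are hypothetical, "letters" is assumed to mean ASCII letters, and "Chinese characters" is assumed to mean the basic CJK Unified Ideographs block (U+4E00 to U+9FFF). Dataphin's own validation may accept or reject other characters.

```python
import re
from typing import Optional

# Assumption: ASCII letters/digits, underscore, hyphen, and the basic CJK
# Unified Ideographs block; 1 to 64 characters, per the naming convention.
_NAME_RE = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")


def is_valid_datasource_name(name: str) -> bool:
    """Return True if the name satisfies the documented naming convention."""
    return _NAME_RE.fullmatch(name) is not None


def is_valid_description(text: str) -> bool:
    """Return True if the description is within 128 characters."""
    return len(text) <= 128


def table_reference(code: str, table: str, schema: Optional[str] = None,
                    per_environment: bool = False) -> str:
    """Build a reference such as code.table or ${code}.schema.table.

    per_environment=True uses the ${...} variable format, which the text
    describes for switching data sources by task execution environment.
    """
    prefix = "${" + code + "}" if per_environment else code
    parts = [prefix, table] if schema is None else [prefix, schema, table]
    return ".".join(parts)
```

For example, with a hypothetical data source code presto_ds, table_reference("presto_ds", "orders", schema="sales", per_environment=True) yields ${presto_ds}.sales.orders. The helper only formats the string; it does not check that the table exists.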
Configure the connection parameters between the data source and Dataphin.
If you selected Production + Development Data Source in the previous step, the configuration page for Production + Development Data Source is displayed. If you selected Production Data Source, only the configuration page for Production Data Source is displayed.
Note
Typically, production and development data sources should be configured as separate data sources to achieve environment isolation, reducing the impact of development data sources on production data sources. However, Dataphin also supports configuring them as the same data source with identical parameter values.
Parameter
Description
JDBC URL
The connection address of Presto. The connection format is jdbc:presto://ip:port/catalog/schema. The catalog and schema parameters are optional in the connection address.
Username
Enter the username used to access the Presto data source.
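To illustrate the JDBC URL format described above, the following Python sketch assembles and parses an address of the form jdbc:presto://ip:port/catalog/schema. The function names are hypothetical. Because the segments are positional, the sketch assumes a schema can be given only together with a catalog. It does not handle IPv6 hosts, trailing slashes, or query parameters.

```python
import re
from typing import Optional

# Matches jdbc:presto://host:port, optionally followed by /catalog and /schema.
_JDBC_RE = re.compile(
    r"^jdbc:presto://(?P<host>[^:/]+):(?P<port>\d+)"
    r"(?:/(?P<catalog>[^/]+)(?:/(?P<schema>[^/]+))?)?$"
)


def build_presto_jdbc_url(host: str, port: int,
                          catalog: Optional[str] = None,
                          schema: Optional[str] = None) -> str:
    """Assemble a Presto JDBC URL; catalog and schema are optional."""
    if schema and not catalog:
        # Assumption: the path is positional, so schema needs a catalog.
        raise ValueError("a schema can only be given together with a catalog")
    url = f"jdbc:presto://{host}:{port}"
    if catalog:
        url += f"/{catalog}"
        if schema:
            url += f"/{schema}"
    return url


def parse_presto_jdbc_url(url: str) -> dict:
    """Split a Presto JDBC URL into host, port, catalog, and schema."""
    match = _JDBC_RE.match(url)
    if match is None:
        raise ValueError(f"not a Presto JDBC URL in the expected format: {url!r}")
    parts = match.groupdict()
    parts["port"] = int(parts["port"])
    return parts
```

For example, build_presto_jdbc_url("10.0.0.1", 8080, "hive", "default") produces jdbc:presto://10.0.0.1:8080/hive/default, where the host, port, catalog, and schema are placeholder values.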
Select a Default Resource Group that will be used to run tasks related to the current data source, including database SQL, offline database migration, and data preview.
Click Test Connection or directly click OK to save and complete the creation of the Presto data source.
When you click Test Connection, the system tests whether the data source can connect to Dataphin properly. If you directly click OK, the system automatically tests the connection for all selected clusters. However, the data source can still be created even if the connection tests for all selected clusters fail.
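Because a data source can be saved even when its connection test fails, it can help to confirm basic network reachability yourself. The following Python sketch is a rough pre-check only: it opens a TCP connection to the Presto host and port and makes no claim about how Dataphin's Test Connection works internally. The function name and the sample address are hypothetical, and the check must run from a machine with a network path comparable to that of the Dataphin cluster to be meaningful.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This verifies reachability only; it does not authenticate or run a query.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, can_reach("10.0.0.1", 8080) returns False if the port is closed or blocked by a firewall. A True result means only that the port accepts connections, not that the username or the catalog is valid.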