How to create an Elasticsearch data source

By creating an Elasticsearch data source, you can write data from Dataphin to Elasticsearch. This topic describes how to create an Elasticsearch data source.

Background information

Alibaba Cloud Elasticsearch is a fully managed Elasticsearch cloud service based on open-source Elasticsearch. It is widely used in scenarios such as real-time log analysis and processing, information retrieval, and multi-dimensional data query and statistical analysis.

If you use Alibaba Cloud Elasticsearch and want to export data from Dataphin to Elasticsearch, you must first create an Elasticsearch data source.

Permission requirements

Only the following users can create data sources: users whose custom global role includes the Create Data Source permission, and users with the super administrator, data source administrator, domain architect, or project administrator system role.

Procedure

In the top navigation bar of the Dataphin homepage, choose Management Center > Datasource Management.
On the Datasource page, click +Create Data Source.
On the Create Data Source page, select Elasticsearch in the NoSQL section.
If you have recently used Elasticsearch, you can also select it in the Recently Used section, or enter the keyword Elasticsearch in the search box to find it quickly.

On the Create Elasticsearch Data Source page, configure the connection parameters.

Configure the basic information of the data source.

Parameter	Description
Datasource Name	Enter the name of the data source. The name must meet the following requirements: It can contain only Chinese characters, letters, digits, underscores (_), and hyphens (-). It cannot exceed 64 characters in length.
Datasource Code	After you configure the data source code, you can access Dataphin data source tables directly in Flink SQL tasks or through the Dataphin JDBC client, in the format `data source code.table name` or `data source code.schema.table name`. To switch data sources automatically based on the task execution environment, use the variable format `${data source code}.table` or `${data source code}.schema.table`. For more information, see Development method for Dataphin data source tables. Important: The data source code cannot be modified after it is configured. After the data source code is configured, you can preview data on the object details page in the asset directory and asset inventory. In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, and SelectDB data sources are currently supported.
Version	Elasticsearch 6.x, 7.x, and 8.x are supported.
Data Source Description	A brief description of the data source, not exceeding 128 characters.
Data Source Configuration	Select the data source to configure: If your business data source distinguishes between production and development data sources, select Production + Development Data Source. If your business data source does not distinguish between production and development data sources, select Production Data Source.
Tag	Use tags to categorize data sources. To create tags, see Manage data source tags.
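
The naming rules for Datasource Name can be checked before you fill in the form. The following is a minimal sketch, not Dataphin's own validation: the function name `is_valid_datasource_name` is hypothetical, "Chinese characters" is assumed to mean the basic CJK Unified Ideographs range (U+4E00 to U+9FFF), and an empty name is assumed to be invalid.

```python
import re

# Assumption: "Chinese characters" means the basic CJK Unified Ideographs
# block (U+4E00 to U+9FFF). Dataphin's real check may accept a wider range.
# Assumption: at least one character is required.
_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")


def is_valid_datasource_name(name: str) -> bool:
    """Return True if name uses only the documented characters and is at most 64 long."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_datasource_name("es_prod-01")` passes, while a name that contains a space or is longer than 64 characters does not.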

Configure the connection parameters between the data source and Dataphin.

If you selected Production + Development Data Source in the previous step, the configuration page for Production + Development Data Source is displayed. If you selected Production Data Source, only the configuration page for Production Data Source is displayed.

Note

Typically, production and development data sources should be configured as separate data sources so that the development environment is isolated from production and has less impact on it. However, Dataphin also supports configuring both with the same data source and identical parameter values.

Parameter	Description
ES URL	The connection address of Elasticsearch. We recommend that you use a private network connection address. The format is `http://host:port`. For example: `http://192.168.*.212:9200`.
Username, Password	The username and password for accessing the Elasticsearch instance.
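
A malformed ES URL is an easy mistake to catch before you submit the form. The sketch below checks a value against the documented `http://host:port` format using Python's standard `urllib.parse`. The function name `parse_es_url` and the host `es.example.internal` are hypothetical, and the sketch accepts only the documented `http` scheme; whether Dataphin accepts other schemes is not covered in this topic.

```python
from urllib.parse import urlsplit


def parse_es_url(url: str) -> tuple[str, int]:
    """Split an ES URL of the form http://host:port into (host, port)."""
    parts = urlsplit(url)
    # Assumption: only the documented http:// form is accepted here.
    if parts.scheme != "http":
        raise ValueError(f"expected an http:// URL, got: {url!r}")
    if parts.path not in ("", "/") or parts.query or parts.fragment:
        raise ValueError("ES URL should contain only scheme, host, and port")
    host = parts.hostname
    port = parts.port  # raises ValueError if the port is not a valid number
    if not host or port is None:
        raise ValueError("ES URL must include both a host and a port")
    return host, port
```

Calling `parse_es_url("http://es.example.internal:9200")` returns the host and port as a tuple; a value without a scheme or without a port raises `ValueError`.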

Select a Default Resource Group, which is used to run tasks related to the current data source, including database SQL, offline database migration, data preview, and more.
Click Test Connection, or click OK directly to save the configuration and finish creating the Elasticsearch data source.
If you click Test Connection, the system tests whether the data source can connect to Dataphin. If you click OK directly, the system automatically tests the connection for all selected clusters. The data source is still created even if every selected cluster fails to connect.
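
Because the data source can be saved even when the connection test fails, it can help to confirm the URL and credentials independently. The following is a rough sketch that sends an authenticated GET request to the Elasticsearch root endpoint using only the Python standard library. It is not how Dataphin tests the connection; the function names `build_ping_request` and `can_connect` are hypothetical, HTTP Basic authentication is assumed, and the check runs from your own machine, so a success here does not guarantee that the Dataphin resource group can reach the same address.

```python
import base64
import urllib.request


def build_ping_request(es_url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request for the cluster root endpoint with HTTP Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(es_url.rstrip("/") + "/", method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


def can_connect(es_url: str, username: str, password: str, timeout: float = 5.0) -> bool:
    """Return True if the cluster answers the root endpoint with HTTP 200."""
    request = build_ping_request(es_url, username, password)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError, HTTPError, and socket timeouts are OSError subclasses
        return False
```

A typical call would be `can_connect("http://es.example.internal:9200", "elastic", "your-password")`, where the host and username are placeholders for your own instance.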