By creating an ArgoDB data source, you can read business data from ArgoDB or write data to ArgoDB in Dataphin. This topic describes how to create an ArgoDB data source.
Permissions
Only users who have the permission to create data sources in a custom global role and users who are assigned the super administrator, data source administrator, domain architect, or project administrator role can create data sources.
Procedure
On the Dataphin homepage, click Management Center > Datasource Management in the top navigation bar.
On the Datasource page, click +Create Data Source.
In the Big Data section of the Create Data Source page, select ArgoDB.
If you have recently used ArgoDB, you can also select it in the Recently Used section, or enter a keyword in the search box to quickly find ArgoDB.
On the Create ArgoDB Data Source page, configure the basic information of the data source.
Parameter
Description
Datasource Name
The name must meet the following requirements:
The name can contain only Chinese characters, letters, digits, underscores (_), and hyphens (-).
The name cannot exceed 64 characters in length.
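As an illustration only, the naming rules above can be approximated with a short script (the function name and regex are this document's sketch, not Dataphin's actual validator):

```python
import re

# Allowed: Chinese characters, letters, digits, underscores (_), and hyphens (-);
# at most 64 characters. Illustrative approximation of the rules listed above.
NAME_PATTERN = re.compile(r"^[\u4e00-\u9fffA-Za-z0-9_-]{1,64}$")

def is_valid_datasource_name(name: str) -> bool:
    return bool(NAME_PATTERN.match(name))

print(is_valid_datasource_name("argodb_prod-01"))  # True
print(is_valid_datasource_name("argo db"))         # False: contains a space
print(is_valid_datasource_name("a" * 65))          # False: longer than 64 characters
```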
Datasource Code
After you configure the data source code, you can reference tables in the data source in a Flink_SQL node by using the data source code.table name or data source code.schema.table name format. If you need to automatically access the data source in the corresponding environment based on the current environment, use the variable format ${data source code}.table or ${data source code}.schema.table. For more information, see Development method for Dataphin data source tables.
Important: The data source code cannot be modified after it is configured.
After the data source code is configured successfully, you can preview data on the object details page in the asset directory and asset inventory.
In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, and SelectDB data sources are currently supported.
Version
Currently, only version 5.2 is supported.
Data Source Description
A brief description of the data source. The description cannot exceed 128 characters in length.
Data Source Configuration
Select the data source that you want to configure:
If the data source is divided into a production data source and a development data source, select Production + Development Data Source.
If the data source is not divided into a production data source and a development data source, select Production Data Source.
Tag
You can categorize and tag data sources based on tags. For information about how to create tags, see Manage data source tags.
Configure the connection parameters between the data source and Dataphin.
If you set Data Source Configuration to Production + Development Data Source, you must configure connection information for both the production and development data sources. If you set it to Production Data Source, you only need to configure connection information for the production data source.
Note: In most cases, the production and development data sources must be configured as different data sources to isolate the development environment from the production environment and reduce the impact of development activities on the production data source. However, Dataphin also allows you to configure them as the same data source with identical parameter values.
Configure the parameters in the Cluster Configuration section.
Parameter
Description
NameNode
The hostname or IP address and port of the NameNode in the HDFS cluster.
Example:
host=192.x.x.169,webUiPort=50070,ipcPort=8020. In a TDH environment, the default values of webUiPort and ipcPort are 50070 and 8020. You can specify the ports based on your actual situation.
Configuration File
Upload Hadoop configuration files, such as hdfs-site.xml and core-site.xml. You can export these configuration files from your Hadoop cluster.
Authentication Type
If the HDFS cluster does not require authentication, select No Authentication. If the HDFS cluster requires authentication, Dataphin supports Kerberos.
If you select Kerberos as the authentication method, you need to configure the following authentication information:
Kerberos configuration:
KDC Server: The unified service address of the KDC. Multiple addresses are supported; separate them with semicolons (;).
Krb5 file configuration: Upload the Krb5 file.
HDFS configuration:
HDFS keytab File: The keytab file for HDFS, which is the Kerberos authentication file.
HDFS Principal: The Kerberos authentication principal name. The format is
XXXX/hadoopclient@xxx.xxx.
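As a rough sanity check before uploading, you could verify that your values match the formats described above (a semicolon-separated KDC address list and a primary/instance@REALM principal). The snippet below is purely illustrative and not part of Dataphin; the host names and principal are made up:

```python
import re

def parse_kdc_servers(value: str) -> list[str]:
    """Split a semicolon-separated KDC address list, e.g. 'kdc1:88;kdc2:88'."""
    return [addr.strip() for addr in value.split(";") if addr.strip()]

# Expected principal shape: primary/instance@REALM, e.g. hdfs/hadoopclient@EXAMPLE.COM
PRINCIPAL_RE = re.compile(r"^[^/@\s]+/[^/@\s]+@[^/@\s]+$")

def is_valid_principal(principal: str) -> bool:
    return bool(PRINCIPAL_RE.match(principal))

print(parse_kdc_servers("kdc1.example.com:88;kdc2.example.com:88"))
print(is_valid_principal("hdfs/hadoopclient@EXAMPLE.COM"))  # True
print(is_valid_principal("hdfs@EXAMPLE.COM"))               # False: no instance part
```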
Configure the parameters in the ArgoDB Configuration section.
Parameter
Description
JDBC URL
The JDBC URL for connecting to ArgoDB. The format is jdbc:hive2://host:port/dbname.
Authentication Type
If the ArgoDB cluster does not require authentication, select No Authentication. If the ArgoDB cluster requires authentication, Dataphin supports LDAP or Kerberos. You can select an authentication method based on your actual situation. The details are as follows:
Kerberos: If you select this option, you need to upload a Keytab File and configure a Principal. The Keytab File is the Kerberos authentication file. The format of the Principal is XXXX/hadoopclient@xxx.xxx.
LDAP: If you select this option, you need to configure the username and password for LDAP authentication.
Username
The username for ArgoDB.
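For reference, a minimal sketch (the host name, port, and database name are placeholders) that assembles and sanity-checks a URL in the jdbc:hive2://host:port/dbname format described above:

```python
import re

# Illustrative pattern for the jdbc:hive2://host:port/dbname format.
JDBC_RE = re.compile(r"^jdbc:hive2://(?P<host>[^:/]+):(?P<port>\d+)/(?P<dbname>[^/]+)$")

def build_jdbc_url(host: str, port: int, dbname: str) -> str:
    return f"jdbc:hive2://{host}:{port}/{dbname}"

def parse_jdbc_url(url: str) -> dict:
    m = JDBC_RE.match(url)
    if not m:
        raise ValueError(f"not a valid jdbc:hive2 URL: {url}")
    return m.groupdict()

url = build_jdbc_url("argodb.example.com", 10000, "sales_db")
print(url)                  # jdbc:hive2://argodb.example.com:10000/sales_db
print(parse_jdbc_url(url))  # {'host': 'argodb.example.com', 'port': '10000', 'dbname': 'sales_db'}
```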
Configure the parameters in the Metadatabase Configuration section.
Parameter
Description
Metadata Retrieval Method
Metadata can be retrieved directly from the metadatabase or from Hive Metastore (HMS).
To use HMS, you need to upload the hive-site.xml configuration file. The authentication methods supported are No Authentication, LDAP, and Kerberos. For Kerberos authentication, you also need to upload a Keytab File and configure a Principal.
Database Type
Select the database type based on the type of metadatabase used in your cluster. ArgoDB is supported.
JDBC URL
The JDBC URL of the ArgoDB metadatabase. The format is jdbc:hive2://host:port/dbname.
Authentication Method
Three authentication methods are supported: No Authentication, LDAP, and Kerberos.
For Kerberos authentication, you also need to upload a Keytab File and configure a Principal.
Username, Password
The username and password for logging on to the metadatabase.
Select a Default Resource Group. This resource group is used to run tasks related to the current data source, including database SQL, offline database migration, and data preview.
Click Test Connection or directly click OK to save and complete the creation of the ArgoDB data source.
If you click Test Connection, the system tests whether Dataphin can connect to the data source. If you directly click OK, the system automatically tests connectivity for all selected clusters. However, even if the connectivity test for the selected clusters fails, the data source can still be created.