By creating an ArgoDB data source, you can read business data from ArgoDB or write data to ArgoDB in Dataphin. This topic describes how to create an ArgoDB data source.
Permissions
Only users who have the permission to create data sources in a custom global role and users who are assigned the super administrator, data source administrator, domain architect, or project administrator role can create data sources.
Procedure
On the Dataphin homepage, click Management Center > Datasource Management in the top navigation bar.
On the Datasource page, click +Create Data Source.
In the Big Data section of the Create Data Source page, select ArgoDB.
If you have recently used ArgoDB, you can also select it in the Recently Used section, or enter a keyword in the search box to quickly find ArgoDB.
On the Create ArgoDB Data Source page, configure the basic information of the data source.
Parameter
Description
Datasource Name
The name must meet the following requirements:
The name can contain only Chinese characters, letters, digits, underscores (_), and hyphens (-).
The name cannot exceed 64 characters in length.
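As an illustration only, the naming rules above can be approximated with a short script (the function name and regex are this document's sketch, not Dataphin's actual validator):

```python
import re

# Allowed: Chinese characters, letters, digits, underscores (_), and hyphens (-);
# at most 64 characters. Illustrative approximation of the rules listed above.
NAME_PATTERN = re.compile(r"^[\u4e00-\u9fffA-Za-z0-9_-]{1,64}$")

def is_valid_datasource_name(name: str) -> bool:
    return bool(NAME_PATTERN.match(name))

print(is_valid_datasource_name("argodb_prod-01"))  # True
print(is_valid_datasource_name("argo db"))         # False: contains a space
print(is_valid_datasource_name("a" * 65))          # False: longer than 64 characters
```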
Datasource Code
After you configure the data source code, you can reference tables in the data source in a Flink_SQL node by using the data source code.table name or data source code.schema.table name format. If you need to automatically access the data source in the corresponding environment based on the current environment, use the variable format ${data source code}.table or ${data source code}.schema.table. For more information, see Development method for Dataphin data source tables.
Important: The data source code cannot be modified after it is configured.
After the data source code is configured successfully, you can preview data on the object details page in the asset directory and asset inventory.
In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, and SelectDB data sources are currently supported.
Version
Currently, only version 5.2 is supported.
Data Source Description
A brief description of the data source. The description cannot exceed 128 characters in length.
Data Source Configuration
Select the data source that you want to configure:
If the data source is divided into a production data source and a development data source, select Production + Development Data Source.
If the data source is not divided into a production data source and a development data source, select Production Data Source.
Tag
You can categorize and tag data sources based on tags. For information about how to create tags, see Manage data source tags.
Configure the connection parameters between the data source and Dataphin.
If you set Data Source Configuration to Production + Development Data Source, you must configure connection information for both the production and development data sources. If you set it to Production Data Source, you only need to configure connection information for the production data source.
Note: In most cases, the production and development data sources must be configured as different data sources to isolate the development environment from the production environment and reduce the impact of development activities on the production data source. However, Dataphin also allows you to configure them as the same data source with identical parameter values.
Configure the parameters in the Cluster Configuration section.
Parameter
Description
NameNode
The hostname or IP address and port of the NameNode in the HDFS cluster.
Example:
host=192.x.x.169,webUiPort=50070,ipcPort=8020. In a TDH environment, the default values of webUiPort and ipcPort are 50070 and 8020. You can specify the ports based on your actual situation.
Configuration File
Upload Hadoop configuration files, such as hdfs-site.xml and core-site.xml. You can export these configuration files from your Hadoop cluster.
Authentication Type
If the HDFS cluster does not require authentication, select No Authentication. If the HDFS cluster requires authentication, Dataphin supports Kerberos.
If you select Kerberos as the authentication method, you need to configure the following authentication information:
Kerberos configuration:
KDC Server: The unified service address of the KDC. Multiple addresses are supported; separate them with semicolons (;).
Krb5 file configuration: Upload the Krb5 file.
HDFS configuration:
HDFS keytab File: The keytab file for HDFS, which is the Kerberos authentication file.
HDFS Principal: The Kerberos authentication principal name. The format is
XXXX/hadoopclient@xxx.xxx.
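As a rough sanity check before uploading, you could verify that your values match the formats described above (a semicolon-separated KDC address list and a primary/instance@REALM principal). The snippet below is purely illustrative and not part of Dataphin; the host names and principal are made up:

```python
import re

def parse_kdc_servers(value: str) -> list[str]:
    """Split a semicolon-separated KDC address list, e.g. 'kdc1:88;kdc2:88'."""
    return [addr.strip() for addr in value.split(";") if addr.strip()]

# Expected principal shape: primary/instance@REALM, e.g. hdfs/hadoopclient@EXAMPLE.COM
PRINCIPAL_RE = re.compile(r"^[^/@\s]+/[^/@\s]+@[^/@\s]+$")

def is_valid_principal(principal: str) -> bool:
    return bool(PRINCIPAL_RE.match(principal))

print(parse_kdc_servers("kdc1.example.com:88;kdc2.example.com:88"))
print(is_valid_principal("hdfs/hadoopclient@EXAMPLE.COM"))  # True
print(is_valid_principal("hdfs@EXAMPLE.COM"))               # False: no instance part
```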
Configure the parameters in the ArgoDB Configuration section.
Parameter
Description
JDBC URL
The JDBC URL for connecting to ArgoDB. The format is jdbc:hive2://host:port/dbname.
Authentication Type
If the ArgoDB cluster does not require authentication, select No Authentication. If the ArgoDB cluster requires authentication, Dataphin supports LDAP or Kerberos. You can select an authentication method based on your actual situation. The details are as follows:
Kerberos: If you select this option, you need to upload a Keytab File and configure a Principal. The Keytab File is the Kerberos authentication file. The format of the Principal is XXXX/hadoopclient@xxx.xxx.
LDAP: If you select this option, you need to configure the username and password for LDAP authentication.
Username
The username for ArgoDB.
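For reference, a minimal sketch (the host name, port, and database name are placeholders) that assembles and sanity-checks a URL in the jdbc:hive2://host:port/dbname format described above:

```python
import re

# Illustrative pattern for the jdbc:hive2://host:port/dbname format.
JDBC_RE = re.compile(r"^jdbc:hive2://(?P<host>[^:/]+):(?P<port>\d+)/(?P<dbname>[^/]+)$")

def build_jdbc_url(host: str, port: int, dbname: str) -> str:
    return f"jdbc:hive2://{host}:{port}/{dbname}"

def parse_jdbc_url(url: str) -> dict:
    m = JDBC_RE.match(url)
    if not m:
        raise ValueError(f"not a valid jdbc:hive2 URL: {url}")
    return m.groupdict()

url = build_jdbc_url("argodb.example.com", 10000, "sales_db")
print(url)                  # jdbc:hive2://argodb.example.com:10000/sales_db
print(parse_jdbc_url(url))  # {'host': 'argodb.example.com', 'port': '10000', 'dbname': 'sales_db'}
```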
Configure the parameters in the Metadatabase Configuration section.
Parameter
Description
Metadata Retrieval Method
Metadata can be retrieved directly from the metadatabase or from Hive Metastore (HMS).
To use HMS, you need to upload the hive-site.xml configuration file. The authentication methods supported are No Authentication, LDAP, and Kerberos. For Kerberos authentication, you also need to upload a Keytab File and configure a Principal.
Database Type
Select the database type based on the type of metadatabase used in your cluster. ArgoDB is supported.
JDBC URL
The JDBC URL of the ArgoDB metadatabase. The format is jdbc:hive2://host:port/dbname.
Authentication Method
Three authentication methods are supported: No Authentication, LDAP, and Kerberos.
For Kerberos authentication, you also need to upload a Keytab File and configure a Principal.
Username, Password
The username and password for logging on to the metadatabase.
Select a Default Resource Group. This resource group is used to run tasks related to the current data source, including database SQL, offline database migration, and data preview.
Click Test Connection or directly click OK to save and complete the creation of the ArgoDB data source.
If you click Test Connection, the system tests whether Dataphin can connect to the data source. If you directly click OK, the system automatically tests connectivity for all selected clusters. However, even if the connectivity test for the selected clusters fails, the data source can still be created.