
Dataphin: Create a TDH Inceptor data source

Last Updated: May 28, 2025

By creating a TDH Inceptor data source, you can enable Dataphin to read business data from TDH Inceptor or write data to TDH Inceptor. This topic describes how to create a TDH Inceptor data source.

Limits

Only users who are assigned the super administrator, data source administrator, domain architect, or project administrator role, or a custom global role that has the Create Data Source permission, can create data sources.

Procedure

  1. In the top navigation bar of the Dataphin homepage, choose Management Center > Datasource Management.

  2. On the Datasource page, click +Create Data Source.

  3. On the Create Data Source page, select TDH Inceptor in the Big Data section.

    If you have recently used TDH Inceptor, you can select it in the Recently Used section, or enter a keyword in the search box to quickly filter it.

  4. On the Create TDH Inceptor Data Source page, configure the basic information of the data source.


    Datasource Name

    The name must meet the following requirements:

    • The name can contain only Chinese characters, letters, digits, underscores (_), and hyphens (-).

    • The name cannot exceed 64 characters in length.

    Datasource Code

    After you configure the data source code, you can reference tables of the data source in Flink SQL tasks in the format data source code.table name or data source code.schema.table name. If you need the task to automatically access the data source that corresponds to the current environment, use the variable format ${data source code}.table or ${data source code}.schema.table. For more information, see Dataphin data source table development method. A brief sketch of both reference formats follows the notes below.

    Important
    • The data source code cannot be modified after it is configured successfully.

    • After the data source code is configured successfully, you can preview data on the object details page in the asset directory and asset inventory.

    • In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, and SelectDB data sources are currently supported.
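
    For illustration, the following minimal Java sketch uses the Flink Table API to show both reference formats inside the SQL text. The data source code ds_demo and the table sales.orders are hypothetical, and outside Dataphin the code would have to resolve to a registered catalog:

      import org.apache.flink.table.api.EnvironmentSettings;
      import org.apache.flink.table.api.TableEnvironment;

      public class DatasourceCodeReference {
          public static void main(String[] args) {
              TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());
              // Fixed reference: always reads the data source bound to the code "ds_demo".
              tEnv.executeSql("SELECT id, amount FROM ds_demo.sales.orders").print();
              // Variable reference: in Dataphin, "${ds_demo}.sales.orders" is substituted with the
              // code of the environment-specific data source before the task runs.
          }
      }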

    Version

    You can select 6.2.x or 9.3.x.

    Data Source Description

    A brief description of the data source. It cannot exceed 128 characters.

    Data Source Configuration

    Select the configuration mode for the data source:

    • If the data source distinguishes between production data sources and development data sources, select Production + Development Data Source.

    • If the data source does not distinguish between production data sources and development data sources, select Production Data Source.

    Tag

    You can add tags to categorize and manage data sources. For information about how to create tags, see Manage data source tags.

  5. Configure the connection parameters between the data source and Dataphin.

    If you set Data Source Configuration to Production + Development Data Source, you must configure connection information for both the production data source and the development data source. If you set it to Production Data Source, you only need to configure connection information for the production data source.

    Note

    In general, configure the production and development data sources as different physical data sources to isolate the two environments and reduce the impact of development activities on production. However, Dataphin also allows you to configure them as the same data source, that is, with identical parameter values.


    Cluster Configuration

    NameNode

    The hostname or IP address and the ports of the NameNode in the HDFS cluster.

    Configuration example: host=192.x.x.169,webUiPort=50070,ipcPort=8020. In a TDH environment, webUiPort and ipcPort are 50070 and 8020 by default. Enter the ports that are used in your environment. A port reachability sketch follows this example.
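
    Before you save the configuration, you can verify that the NameNode ports are reachable from the network where Dataphin runs. The following minimal Java sketch uses a placeholder hostname and the default TDH ports; it is not part of Dataphin:

      import java.net.InetSocketAddress;
      import java.net.Socket;

      public class NameNodePortCheck {
          public static void main(String[] args) throws Exception {
              String host = "namenode-host"; // placeholder: your NameNode hostname or IP address
              for (int port : new int[] {50070, 8020}) { // default webUiPort and ipcPort in TDH
                  try (Socket socket = new Socket()) {
                      socket.connect(new InetSocketAddress(host, port), 3000); // 3-second timeout
                      System.out.println(host + ":" + port + " is reachable");
                  }
              }
          }
      }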

    Configuration File

    Used to upload Hadoop configuration files, such as hdfs-site.xml and core-site.xml. These configuration files can be exported from the Hadoop cluster.

    Authentication Type

    If the HDFS cluster does not require authentication, select No Authentication. If the HDFS cluster requires authentication, Dataphin supports selecting Kerberos.

    If you select Kerberos as the authentication method, configure the following authentication information (a login verification sketch follows this list):

    • Kerberos Configuration Method:

      • KDC Server: The unified service address of the KDC. You can configure multiple addresses, separated by semicolons (;).

      • Krb5 File Configuration: You need to upload the Krb5 file.

    • HDFS Configuration:

      • HDFS keytab File: The keytab file of HDFS, which is the Kerberos authentication file.

      • HDFS Principal: The Kerberos authentication Principal name. Example: xxxx/hadoopclient@xxx.xxx.
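
    The following minimal Java sketch, based on the Hadoop client library, verifies that a keytab file and principal can authenticate against the KDC before you upload them. The principal, keytab path, and krb5.conf path are placeholders:

      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.security.UserGroupInformation;

      public class KerberosLoginCheck {
          public static void main(String[] args) throws Exception {
              // Point the JVM at the same Krb5 configuration that you provide to Dataphin.
              System.setProperty("java.security.krb5.conf", "/etc/krb5.conf");
              Configuration conf = new Configuration();
              conf.set("hadoop.security.authentication", "kerberos");
              UserGroupInformation.setConfiguration(conf);
              // Placeholder principal and keytab path; use the values from your cluster.
              UserGroupInformation.loginUserFromKeytab("hdfs/hadoopclient@EXAMPLE.COM", "/path/to/hdfs.keytab");
              System.out.println("Logged in as " + UserGroupInformation.getLoginUser());
          }
      }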

    Inceptor Configuration

    JDBC URL

    Configure the JDBC URL for connecting to Inceptor. The format is jdbc:hive2://host:port/dbname. A connectivity sketch follows at the end of this Inceptor configuration section.

    Authentication Type

    If the Inceptor cluster does not require authentication, select No Authentication. If the Inceptor cluster requires authentication, Dataphin supports selecting LDAP or Kerberos. You can choose based on your actual situation. The details are as follows:

    • Kerberos: After selecting this option, you need to upload the Keytab File and configure the Principal. The Keytab File is the Kerberos authentication file. An example of Principal is xxxx/hadoopclient@xxx.xxx.

    • LDAP: After selecting this option, you need to configure the username and password for LDAP authentication.

    Username

    Configure the username for Inceptor.
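
    To verify the JDBC URL and credentials before you save them, you can run a minimal connectivity check outside Dataphin. The following Java sketch uses the open source hive2 JDBC driver with placeholder host, port, database, and LDAP credentials; TDH also ships its own Inceptor driver, which accepts the same URL scheme:

      import java.sql.Connection;
      import java.sql.DriverManager;

      public class InceptorConnectivityCheck {
          public static void main(String[] args) throws Exception {
              Class.forName("org.apache.hive.jdbc.HiveDriver"); // hive2-compatible driver
              String url = "jdbc:hive2://inceptor-host:10000/default"; // placeholder host, port, and database
              // For LDAP authentication, pass the LDAP username and password.
              try (Connection conn = DriverManager.getConnection(url, "ldap_user", "ldap_password")) {
                  System.out.println("Connected: " + conn.isValid(5));
              }
          }
      }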

    Metadatabase Configuration

    Metadata Retrieval Method

    Metadata can be retrieved by using the Metadatabase method or the HMS method (a metadatabase connectivity sketch follows this section).

    • Metadatabase Method: You need to configure the database type, version, JDBC URL, authentication method, username, and password.

      • Database Type: Select the database type based on the metadatabase type used in the cluster. Supported options include MySQL, PostgreSQL, and Inceptor.

      • Version: If the database type is MySQL, you need to select the corresponding version. Supported versions include MySQL 5.1.43, MySQL 5.6/5.7, and MySQL 8.0.

      • JDBC URL: Fill in the connection address of the corresponding metadatabase.

        • MySQL: The format is jdbc:mysql://host:port/dbname.

        • PostgreSQL: The format is jdbc:postgresql://host:port/dbname.

        • Inceptor: The format is jdbc:hive2://host:port/dbname.

      • Authentication Method: If the database type is Inceptor, you need to fill in the authentication method. Three authentication methods are supported: No Authentication, LDAP, and Kerberos.

        If you use the Kerberos authentication method, you need to select Kerberos as the authentication method in the cluster configuration section.

      • Username, Password: The username and password for logging in to the metadatabase.

    • HMS Method: You need to configure the Authentication Method and upload hive-site.xml.

      • Authentication Method: Supports No Authentication, LDAP, and Kerberos.

        If you use the Kerberos authentication method, you need to select Kerberos as the authentication method in the cluster configuration section. You also need to upload the Keytab File and fill in the Principal.

      • hive-site.xml: Upload the hive-site.xml configuration file of the TDH Inceptor data source cluster.
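
    To verify a metadatabase JDBC URL before you save it, you can run a minimal check outside Dataphin. The following Java sketch assumes a MySQL metadatabase with placeholder host, database, and credentials; for PostgreSQL or Inceptor, replace the URL with the corresponding format listed above:

      import java.sql.Connection;
      import java.sql.DriverManager;

      public class MetadatabaseCheck {
          public static void main(String[] args) throws Exception {
              // The MySQL JDBC driver is loaded automatically if it is on the classpath.
              String url = "jdbc:mysql://meta-host:3306/hive_meta"; // placeholder host, port, and database
              try (Connection conn = DriverManager.getConnection(url, "meta_user", "meta_password")) {
                  System.out.println("Metadatabase reachable: " + conn.isValid(5));
              }
          }
      }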

  6. Select Default Resource Group. This resource group is used to run tasks related to the current data source, including database SQL, offline database migration, data preview, and more.

  7. Click Test Connection, or click OK directly to save the configuration and complete the creation of the TDH Inceptor data source.

    Test Connection checks whether the data source can communicate with Dataphin. If you click OK directly, the system automatically tests connectivity to all selected clusters; the data source can still be created even if the connectivity tests fail.