
Dataphin: Create TDH Inceptor compute source

Last Updated: Jan 23, 2025

A TDH Inceptor compute source binds a Dataphin project space to TDH Inceptor and provides the compute resources for processing tasks in the Dataphin project. If the compute engine of the Dataphin system is set to TDH Inceptor, you must add a TDH Inceptor compute source to the project space to support features such as standard modeling, ad hoc queries, Hive tasks, and general scripts. This topic describes how to create a TDH Inceptor compute source.

Prerequisites

  • To use TDH Inceptor as the metadata warehouse or as the metadatabase retrieval method during metadata warehouse initialization, the following conditions must be met:

    • The dataphin_meta project has been created in TDH Inceptor.

    • The user configured in TDH Inceptor during metadata warehouse initialization must have permission to write and create tables in the dataphin_meta project.

    • If TDH Inceptor is used as the metadatabase retrieval method during metadata warehouse initialization, the user in the Inceptor metadata configuration needs permission to write and create tables in the dataphin_meta project.


Limits

  • User-defined functions (UDF) are not supported: Registering a JAR package with the same name for UDF may cause the Inceptor service to stop and prevent it from restarting successfully. Registering a JAR package with a different name but containing the same class file may result in unpredictable UDF execution outcomes. As such, Dataphin does not permit UDF registration under the TDH Inceptor engine. If UDF addition is necessary, it can be done through the TDH Inceptor client, ensuring the UDF name's uniqueness and the class name's consistency across the cluster.
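    For reference, registering a UDF through the TDH Inceptor client typically uses Hive-style DDL such as the following sketch. The function name, class name, and JAR path are placeholders; choose a function name that is unique in the cluster and keep the class name consistent across all nodes:

    ```sql
    -- Placeholder names for illustration only: pick a cluster-unique function
    -- name, and ensure the class inside the JAR is identical on every node.
    CREATE FUNCTION my_project.normalize_phone
      AS 'com.example.udf.NormalizePhone'
      USING JAR 'hdfs:///udf/normalize_phone_v1.jar';
    ```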

  • When obtaining metadata information through MySQL metadatabase or HMS, the following cannot be retrieved:

    • Asset overview, data block, and project data volume information.

    • Table data volume, partition data volume, and partition record count in the asset directory.

    • Storage-related metric information in resource administration.

    • Data volume and record count of dim_dataphin_table and dim_dataphin_partition in the metadata warehouse shared model.

  • When obtaining metadata information through the TDH Inceptor System library, the following cannot be retrieved:

    • Partition record count information in the asset directory.

    • Record count of dim_dataphin_table and dim_dataphin_partition in the metadata warehouse shared model.

  • The TDH Inceptor engine does not support setting task priority.

    To allocate different resources to tasks with varying priorities, you can set different usernames for different priority queues. After setting priorities on Inceptor SQL tasks, Dataphin will submit these tasks to the TDH Inceptor engine using the corresponding users. Note that the users set here must have the Submit permission for the Inceptor resource queue.
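    Conceptually, the configuration above maps each task priority to an execution user, and Dataphin submits the task to the TDH Inceptor engine as that user. A minimal sketch (the priority names and usernames are hypothetical):

    ```python
    # Hypothetical mapping of task priority to the execution user whose
    # YARN queue should run tasks of that priority. Each user must hold
    # the Submit permission on its Inceptor resource queue.
    PRIORITY_USERS = {
        "highest": "inceptor_high",
        "medium": "inceptor_default",
        "lowest": "inceptor_low",
    }

    def submit_user(priority: str) -> str:
        """Return the execution user for a priority, falling back to the default."""
        return PRIORITY_USERS.get(priority, "inceptor_default")

    print(submit_user("highest"))  # inceptor_high
    print(submit_user("unknown"))  # inceptor_default
    ```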

Procedure

  1. In the top menu bar of the Dataphin home page, select Planning > Compute Source.

  2. On the Compute Source page, click Add Compute Source and select TDH Inceptor Compute Source.

  3. On the Create TDH Inceptor Compute Source page, configure the following parameters.

    1. Configure the parameters in the Compute Engine Source Basic Information area.

      • Compute Source Type: Set the compute source type to TDH Inceptor.

      • Compute Source Name: The name can contain only Chinese characters, digits, uppercase and lowercase letters, underscores (_), and hyphens (-), and can be up to 64 characters in length.

      • Compute Source Description: A brief description of the compute source, up to 128 characters.

    2. Configure the parameters in the Cluster Basic Information area.

      • nameNode: The default value is the NameNode parameter value configured during system initialization and cannot be modified.

      • Configuration File: Upload the HDFS configuration files, including core-site.xml, hdfs-site.xml, hive-site.xml, and yarn-site.xml. To obtain them, contact Transwarp operations and maintenance personnel, or log on to the Transwarp cluster operations and maintenance interface and choose HDFS Service > More Operations > Download Service Configuration.

      • Authentication Type: If the TDH Inceptor cluster uses Kerberos authentication, select Kerberos. Kerberos is an identity authentication protocol based on symmetric-key cryptography that supports single sign-on (SSO); once authenticated, a client can access multiple services, such as HBase and HDFS.

        After you select Kerberos, upload the Krb5 authentication file or configure the KDC Server address:

        • Krb5 authentication file: Upload the krb5 file used for Kerberos authentication.

        • KDC Server address: The KDC server address is required to complete Kerberos authentication. Multiple KDC Server addresses are supported, separated by commas (,).
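      If you configure the KDC Server address rather than uploading a krb5 file, the equivalent krb5.conf entries look like the following fragment. The realm and host names are placeholders; substitute the values from your cluster:

      ```ini
      # Illustrative krb5.conf fragment; realm and KDC hosts are placeholders.
      [libdefaults]
          default_realm = TDH.EXAMPLE.COM

      [realms]
          TDH.EXAMPLE.COM = {
              # Multiple kdc entries provide the same redundancy as a
              # comma-separated KDC Server address list in Dataphin.
              kdc = kdc1.tdh.example.com:88
              kdc = kdc2.tdh.example.com:88
          }
      ```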

    3. Configure the parameters in the HDFS Connection Information area.

      • Execution Username, Password: Enter the username and password used to log on to the compute execution machine. These credentials are required to execute MapReduce tasks, read from and write to HDFS, and so on.

        Important: Ensure that the user has the permissions required to submit MapReduce tasks.

      • Authentication Type: If HDFS is configured with Kerberos authentication, select Kerberos. Kerberos is a secure identity authentication protocol that supports single sign-on (SSO); once authenticated, a client can access multiple services, such as HBase and HDFS.

        • If you select Kerberos, upload the Keytab File and configure the Principal:

          • Keytab File: Upload the keytab file required for Kerberos authentication.

          • Principal: The Kerberos principal (username) associated with the keytab.

        • If you select No Authentication, configure the username used to access HDFS.

    4. Configure the parameters in the Inceptor Configuration area.

      • JDBC URL: The connection address of the Hive Server, in the format jdbc:hive2://{endpoint}:{port}/{database name}.

      • Authentication Type: Select the authentication method for Inceptor based on the engine configuration. Options are No Authentication, LDAP, and Kerberos:

        • No Authentication: No credentials are required.

        • LDAP: Configure the username and password used for access.

        • Kerberos: Upload the HDFS Kerberos authentication file and configure the Hive Principal.

      • Execution User For Development Environment Tasks: Enter the username used to execute tasks in the development environment.

      • Execution User For Periodic Scheduling Tasks: Enter the username used to execute periodic scheduling tasks.

      • Priority Task Queue: Choose Use Default Execution User or Custom. If you select Custom, configure the usernames used to execute tasks of each priority.

        Note: The priority queue allocates resources by creating different YARN queues on the Hadoop cluster; tasks of different priorities run in their respective YARN queues.
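      The jdbc:hive2 URL format described above can be checked with a small sketch; the host, port, and database names below are hypothetical:

      ```python
      import re

      # Pattern for the Hive Server address format:
      # jdbc:hive2://{endpoint}:{port}/{database name}
      HIVE2_URL = re.compile(r"^jdbc:hive2://(?P<host>[^:/]+):(?P<port>\d+)/(?P<db>\w+)$")

      def is_valid_inceptor_jdbc_url(url: str) -> bool:
          """Return True if the URL matches the jdbc:hive2 format."""
          return HIVE2_URL.match(url) is not None

      # Placeholder endpoints for illustration only.
      print(is_valid_inceptor_jdbc_url("jdbc:hive2://inceptor-host:10000/dataphin_meta"))  # True
      print(is_valid_inceptor_jdbc_url("inceptor-host:10000/dataphin_meta"))  # False
      ```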

    5. Configure the parameters in the Inceptor Metadata Connection Information area.

      • Metadata Retrieval Method: Metadata Database and HMS are supported as metadata retrieval methods, each requiring its own configuration:

        • Metadata Database: Configure the Database Type, Database Version (only when the database type is MySQL), JDBC URL, Authentication Type, Username, and Password (only when the database type is MySQL).

        • HMS: Configure the Authentication Type.

        Note:

        • If you select the Metadata Database retrieval method, upload the core-site.xml and hdfs-site.xml configuration files first.

        • If you select the HMS retrieval method, upload the core-site.xml, hdfs-site.xml, and hive-site.xml configuration files first.

      • Database Type: Select Inceptor or MySQL as the database type. For MySQL, also select the database Version: MySQL 5.6/5.7, MySQL 8, or MySQL 5.1.43.

      • JDBC URL: Enter the connection address of the corresponding metadatabase. When the database type is MySQL, use the format jdbc:mysql://<connection address>:<port>/<database name>. When the database type is Inceptor, use the format jdbc:postgresql://<connection address>:<port>/<database name>.

      • Authentication Type: Select No Authentication or LDAP.

      • Username, Password: Enter the username and password used to log on to the metadatabase.
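      As a sketch, the two metadatabase URL formats can be assembled as follows. Note that the Inceptor metadatabase is reached over the PostgreSQL protocol; all host and database names below are placeholders:

      ```python
      # Maps the Dataphin metadatabase type to its JDBC URL scheme.
      # The Inceptor metadatabase uses the PostgreSQL protocol.
      SCHEMES = {"MySQL": "jdbc:mysql", "Inceptor": "jdbc:postgresql"}

      def metadata_jdbc_url(db_type: str, host: str, port: int, database: str) -> str:
          """Build the metadatabase JDBC URL for the given database type."""
          try:
              scheme = SCHEMES[db_type]
          except KeyError:
              raise ValueError(f"unsupported database type: {db_type}")
          return f"{scheme}://{host}:{port}/{database}"

      # Placeholder values for illustration only.
      print(metadata_jdbc_url("MySQL", "meta-host", 3306, "dataphin_meta"))
      # jdbc:mysql://meta-host:3306/dataphin_meta
      print(metadata_jdbc_url("Inceptor", "inceptor-host", 5432, "dataphin_meta"))
      # jdbc:postgresql://inceptor-host:5432/dataphin_meta
      ```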

  4. Click Test Connection to verify connectivity to the compute source.

  5. Once the connection test is successful, click Submit to finalize the configuration.

What to do next

Once you have created the TDH Inceptor compute source, you can associate it with your project. For more information, see Create a general project.