Dataphin: Create an ArgoDB compute source

Last Updated: Jun 23, 2026

Dataphin supports using ArgoDB as an offline computing engine to process offline computing tasks.

Prerequisites

You have initialized the TDH Inceptor metadata warehouse compute engine and set the compute engine to TDH Inceptor. For more information, see Initialize a metadata warehouse by using TDH as the metadata warehouse compute engine and Set the compute engine of a Dataphin instance to TDH or ArgoDB.

Note

Creating an ArgoDB compute source requires a TDH Inceptor metadata warehouse compute engine. Other types of metadata warehouse compute engines are not supported.

Background information

ArgoDB is a distributed analytical database developed by Transwarp that replaces hybrid Hadoop and Massively Parallel Processing (MPP) architectures. It supports standard SQL workloads and provides capabilities such as multi-model data analytics, real-time data processing, decoupled storage and compute, and hybrid deployment on heterogeneous servers. For more details, visit the official ArgoDB website.

Limitations

  • When you use MySQL Metadatabase, ArgoDB System Library, or HMS to acquire metadata, some information may be unavailable or inaccurate, as described below.

    • If Metadata Acquisition Method is set to MySQL Metadatabase or HMS:

      • Data volume for Asset Panorama, Data Portal, and projects is unavailable.

      • Table data volume, partition data volume, and partition record counts in Asset Catalog are unavailable.

      • Storage-related metrics in Resource Management are inaccurate.

      • Data volume and record counts for dim_dataphin_table and dim_dataphin_partition in the metadata warehouse sharing model are unavailable.

    • If Metadata Acquisition Method is set to ArgoDB System Library:

      • Partition record counts in Asset Catalog are unavailable.

      • Data volume and partition data volume for holodesk tables in Asset Catalog are unavailable.

      • Record counts for dim_dataphin_table and dim_dataphin_partition, and data volume for holodesk-format tables in the metadata warehouse sharing model are unavailable.

  • If the HDFS connection uses Non-Kerberos Authentication and the ArgoDB configuration uses Non-LDAP Authentication, unexpected issues may occur. Before you select this combination, confirm it with the Dataphin operations and deployment team.

  • Other limitations:

    • Dataphin does not support table management when ArgoDB is used as the compute source.

    • The salted hash desensitization algorithm (including salted SHA256, salted SHA384, salted SHA512, and salted MD5) and the Gaussian Noise (GaussianNoise) desensitization algorithm are not supported.

    • SQL dialects such as Oracle, IBM DB2, and Teradata are not supported, and neither are Oracle and DB2 stored procedures. SQL that uses them may fail during execution.

    • A range-partitioned table supports only Data Query Language (DQL) statements and a limited number of Data Definition Language (DDL) and Data Manipulation Language (DML) statements.
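As a quick reference, the metadata-related limitations listed above can be encoded as a small lookup, which may help when a script needs to decide which metrics to trust. This is a minimal Python sketch, not a Dataphin API: the `Gap` type, the `gaps_for` helper, and the way areas and metrics are worded are assumptions made for illustration, while the content of each entry is taken from the list above.

```python
from typing import NamedTuple


class Gap(NamedTuple):
    area: str    # Dataphin feature that is affected
    metric: str  # what is missing or unreliable
    status: str  # "unavailable" or "inaccurate"


# Entries restate the Limitations list; MySQL Metadatabase and HMS share one set.
_MYSQL_OR_HMS = [
    Gap("Asset Panorama, Data Portal, projects", "data volume", "unavailable"),
    Gap("Asset Catalog",
        "table data volume, partition data volume, partition record counts",
        "unavailable"),
    Gap("Resource Management", "storage-related metrics", "inaccurate"),
    Gap("Metadata warehouse sharing model",
        "data volume and record counts for dim_dataphin_table and dim_dataphin_partition",
        "unavailable"),
]

_SYSTEM_LIBRARY = [
    Gap("Asset Catalog", "partition record counts", "unavailable"),
    Gap("Asset Catalog",
        "data volume and partition data volume for holodesk tables",
        "unavailable"),
    Gap("Metadata warehouse sharing model",
        "record counts for dim_dataphin_table and dim_dataphin_partition; "
        "data volume for holodesk-format tables",
        "unavailable"),
]

GAPS_BY_METHOD = {
    "MySQL Metadatabase": _MYSQL_OR_HMS,
    "HMS": _MYSQL_OR_HMS,
    "ArgoDB System Library": _SYSTEM_LIBRARY,
}


def gaps_for(method: str, area: str = "") -> list[Gap]:
    """Return the gaps for a metadata acquisition method.

    If `area` is given, keep only entries whose area contains it
    (case-insensitive substring match).
    """
    return [g for g in GAPS_BY_METHOD[method] if area.lower() in g.area.lower()]
```

For example, `gaps_for("ArgoDB System Library", "asset catalog")` returns the two Asset Catalog entries for that method.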

Procedure

  1. On the Dataphin homepage, click Plan.

  2. In the left-side navigation pane, choose Projects > Compute Source.

  3. On the Compute Source page, click + New Compute Source and select ArgoDB Compute Source from the drop-down list.

  4. On the Create Compute Source page, configure the parameters.

    a. Configure Basic Information.

    Parameter

    Description

    Compute Source Type

    Select ArgoDB.

    Compute Source Name

    The naming conventions are as follows:

    • It can contain only Chinese characters, letters, digits, underscores (_), and hyphens (-).

    • It must be 64 characters or fewer.

    Compute Source Description

    A brief description of the compute source.

    b. Configure Basic Cluster Information.

    Parameter

    Description

    namenode

    This field is pre-filled with the NameNode value from your compute settings and cannot be modified.

    core-site.xml, hdfs-site.xml, hive-site.xml, yarn-site.xml, Other Configuration Files

    Upload the HDFS configuration files core-site.xml and hdfs-site.xml, the Hive configuration file hive-site.xml, and the yarn-site.xml file.

    If you have other configuration files, you can upload them in the corresponding section.

    Authentication Method

    If your ArgoDB cluster uses Kerberos authentication, select Kerberos. Kerberos is a symmetric-key authentication protocol that supports Single Sign-On (SSO), allowing a client to access multiple services such as HBase and HDFS after a single authentication.

    If you select Kerberos authentication, you must upload a krb5 file or specify the KDC server address:

    • Krb5 File Configuration: Upload a krb5 file for Kerberos authentication.

    • KDC Server Address: The address of the Key Distribution Center (KDC) server, which assists with Kerberos authentication. You can specify multiple KDC server addresses, separated by commas (,).

    c. Configure parameters in the HDFS Connection Information section.

    Parameter

    Description

    Execution Username, Password

    The username and password for logging in to the execution machine to run MapReduce tasks and read from or write to HDFS.

    Important

    Ensure that the specified user has the required permissions to submit MapReduce tasks.

    Authentication Method

    If your HDFS uses Kerberos authentication, select Kerberos. Kerberos is a symmetric-key authentication protocol that supports SSO, allowing a client to access multiple services such as HBase and HDFS after a single authentication.

    • If you select Kerberos authentication, you must upload a keytab file and configure the principal:

      • Keytab File: Upload the keytab file for Kerberos authentication.

      • Principal: The Kerberos principal name.

    • If you select no authentication, you must specify the username for accessing HDFS.

    d. Configure parameters in the ArgoDB Configuration section.

    Parameter

    Description

    JDBC URL

    Configure the JDBC connection address for Hive Server in the format jdbc:hive2://InceptorServerIP:Port/Database.

    Authentication Method

    Select the authentication method for ArgoDB based on your ArgoDB configuration. Supported options are No Authentication, LDAP, and Kerberos:

    • No Authentication: No authentication is required.

    • LDAP: Provide the username and password for access.

    • Kerberos: Upload a Kerberos authentication file and provide the principal.

    Execution User for Development Tasks

    Based on the selected authentication method, configure the username and password, or upload a Kerberos authentication file and provide the principal for tasks in the development environment.

    Execution User for Periodically Scheduled Tasks

    Based on the selected authentication method, configure the username and password, or upload a Kerberos authentication file and provide the principal for periodically scheduled tasks.

    Priority Task Queue

    Choose how to specify the execution user for priority tasks. You can select Use Default Execution User or Custom.

    If you select Custom, you must configure different usernames for tasks with different priorities.

    Note

    Priority queues allocate resources by creating different YARN queues on the Hadoop cluster. Each task is submitted to the YARN queue that corresponds to its priority level.

    e. Configure ArgoDB Metadata Connection Information.

    Parameter

    Description

    Metadata Acquisition Method

    Select whether to acquire metadata from a metadatabase or from Hive Metastore (HMS). To use HMS, first upload the hdfs-site.xml, hive-site.xml, and core-site.xml files and configure the authentication method in the Basic Cluster Information section.

    Database Type

    Select the metadatabase type for ArgoDB. Currently, only ArgoDB is supported.

    JDBC URL

    Enter the connection address for the corresponding metadata database in the format jdbc:postgresql://{host}:{port}/{database name}.

    Username, Password

    Enter the username and password for logging in to the metadatabase.

    Note

    Ensure the specified account has the necessary data permissions for tasks to run as expected.

  5. Click Test Connection.

  6. After the connection test succeeds, click Submit.
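The format rules from step 4 can be checked locally before the values are entered in the console. The following is a minimal Python sketch, not part of Dataphin: the function names, the sample hosts and ports, and the regular expressions are assumptions made for illustration. In particular, "Chinese characters" is approximated by the CJK Unified Ideographs range U+4E00 to U+9FFF, a non-empty name is assumed, and the URL patterns accept only host names or IPv4 addresses.

```python
import re

# Compute Source Name: Chinese characters, letters, digits, "_" and "-", max 64.
# Assumption: CJK Unified Ideographs stand in for "Chinese characters".
NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")

# ArgoDB Configuration: jdbc:hive2://InceptorServerIP:Port/Database
# An optional trailing ";..." or "?..." parameter string is tolerated (assumption).
ARGODB_URL = re.compile(
    r"jdbc:hive2://(?P<host>[^:/\s]+):(?P<port>\d+)/(?P<database>[^/?;\s]+)(?:[;?].*)?"
)

# Metadata connection: jdbc:postgresql://{host}:{port}/{database name}
METADATA_URL = re.compile(
    r"jdbc:postgresql://(?P<host>[^:/\s]+):(?P<port>\d+)/(?P<database>[^/?;\s]+)(?:[;?].*)?"
)


def is_valid_name(name: str) -> bool:
    """Check a compute source name against the naming conventions."""
    return NAME_PATTERN.fullmatch(name) is not None


def parse_jdbc_url(url: str, pattern: re.Pattern) -> dict:
    """Split a JDBC URL into host, port, and database, or raise ValueError."""
    match = pattern.fullmatch(url)
    if match is None:
        raise ValueError(f"unexpected JDBC URL format: {url!r}")
    port = int(match["port"])
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return {"host": match["host"], "port": port, "database": match["database"]}


def split_kdc_addresses(value: str) -> list[str]:
    """Split a comma-separated KDC Server Address value, dropping empty items."""
    return [part.strip() for part in value.split(",") if part.strip()]
```

A possible use, with hypothetical values: `parse_jdbc_url("jdbc:hive2://10.0.0.12:10000/default", ARGODB_URL)` yields the host, port, and database as a dictionary, and `split_kdc_addresses("kdc1.example.com:88, kdc2.example.com:88")` yields the two addresses as a list. Passing these checks does not guarantee that Test Connection succeeds; it only catches formatting mistakes early.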

Next steps

After creating the ArgoDB compute source, you can bind it to a project. For more information, see Create a general-purpose project.