
Dataphin: Create a Flink compute source

Last Updated: Jan 21, 2026

A Flink compute source hosts the Flink-based compute resources for a Dataphin project. You can develop Flink-based compute jobs only in projects that are attached to a Flink compute source. This topic describes how to create a Flink compute source.

Prerequisites

  • Apache Flink is enabled as the real-time computing engine for the current tenant. For more information, see Set the real-time computing engine.

  • You can create compute sources only if you have a custom user role with the Create Compute Source permission, or if your account has the super administrator or project administrator role. For more information, see Data warehouse plan permissions.

Procedure

  1. On the Dataphin home page, choose Planning > Compute Source in the top menu bar.

  2. On the Compute Source page, click Add Compute Source and select Flink Compute Source.

  3. On the Create Compute Source page, configure the parameters.

    1. Configure basic information for the compute source

      • Compute Type: Select Flink.

      • Compute Source Name: Enter a name for the compute source. The naming conventions are as follows:

        • It can contain letters, digits, underscores (_), and hyphens (-).

        • It must be no more than 64 characters in length.

      • Compute Source Description: Enter a description for the compute source. The description can be up to 128 characters in length.
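The naming rules for the compute source name can be expressed as a single pattern check. The following is an illustrative sketch, not part of Dataphin; `is_valid_source_name` is a hypothetical helper:

```python
import re

# Pattern derived from the stated rules: letters, digits, underscores,
# and hyphens, 1 to 64 characters in total.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9_-]{1,64}$")

def is_valid_source_name(name: str) -> bool:
    """Return True if the name satisfies the stated naming convention."""
    return bool(NAME_PATTERN.fullmatch(name))

print(is_valid_source_name("flink_prod-01"))  # True: allowed characters, within 64
print(is_valid_source_name("flink source"))   # False: spaces are not allowed
print(is_valid_source_name("x" * 65))         # False: exceeds 64 characters
```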

    2. Configure basic cluster information and Flink compute engine settings

      Dataphin supports different deployment modes for clusters, such as Yarn and Kubernetes. The required parameters vary based on the selected deployment mode.

      Yarn deployment mode

      • Basic cluster information

        • Configuration File: Upload the configuration files for the cluster. You must upload the yarn-site.xml, core-site.xml, and hdfs-site.xml files.

        • Cluster Kerberos: Kerberos is an identity authentication protocol based on symmetric-key cryptography. It provides identity authentication for other services and supports single sign-on (SSO): after a client is authenticated once, it can access multiple services, such as HBase and Hadoop Distributed File System (HDFS).

          If your cluster uses Kerberos authentication, enable Cluster Kerberos, and then upload a Krb5 authentication file or configure the KDC Server address.

          • Krb5 authentication file: Upload the krb5 configuration file used for Kerberos authentication.

          • KDC Server address: The address of the Key Distribution Center (KDC) server that completes Kerberos authentication. You can configure multiple KDC Server addresses. Separate them with commas (,).

        • Cluster Type: Optional. Select the cluster type to test the connection. Options include Aliyun E-MapReduce 5.x, CDH 5.x Hadoop, CDH 6.x Hadoop, Cloudera Data Platform 7.x, AsiaInfo DP 5.3 Hadoop, and Transwarp TDH 6.x Hadoop.

          Important: The connection test usually passes even if you do not select a cluster type, but in some cases it fails when no type is selected. We recommend that you select a cluster type.
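Because multiple KDC Server addresses are entered as one comma-separated string, they must be split and trimmed before use. The following is an illustrative sketch only; Dataphin performs its own parsing, and `parse_kdc_addresses` is a hypothetical helper:

```python
def parse_kdc_addresses(value: str) -> list[str]:
    """Split a comma-separated KDC Server address string into individual
    addresses, trimming surrounding whitespace and dropping empty entries."""
    return [part.strip() for part in value.split(",") if part.strip()]

# Whitespace around commas and trailing commas are tolerated.
print(parse_kdc_addresses("kdc1.example.com:88, kdc2.example.com:88,"))
# → ['kdc1.example.com:88', 'kdc2.example.com:88']
```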

      • Flink compute engine configuration

        • Flink Job Resource Queue: The name of the YARN queue to which Flink jobs are submitted. The naming conventions and limits are as follows:

          • Length limit: The queue name cannot exceed 255 characters.

          • Character limit: The queue name can contain only letters, digits, periods (.), and underscores (_). No other special characters are allowed.

          • Case sensitivity: The queue name is case-sensitive. Uppercase and lowercase letters are treated as different characters.

          • Uniqueness: The queue name must be unique within the compute source.

          To configure multiple job queues, click +Add.

          Note
          • You can add a maximum of 10 resource queues.

          • To delete a resource queue, click the delete icon. After a queue is deleted, jobs that use it can no longer be submitted.
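The queue naming rules above can be checked in one pass. This is an illustrative sketch under the stated rules, not a Dataphin API; `validate_queues` is a hypothetical helper:

```python
import re

# Allowed characters per the rules above: letters, digits, periods, underscores.
QUEUE_PATTERN = re.compile(r"^[A-Za-z0-9._]{1,255}$")

def validate_queues(names: list[str]) -> bool:
    """Check queue names against the stated rules: allowed characters,
    255-character limit, case-sensitive uniqueness, at most 10 queues."""
    if len(names) > 10:
        return False
    if len(set(names)) != len(names):  # case-sensitive duplicate check
        return False
    return all(QUEUE_PATTERN.fullmatch(n) for n in names)

print(validate_queues(["default", "flink.prod", "Flink_prod"]))  # True
print(validate_queues(["root-queue"]))  # False: hyphen is not allowed
```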

        • CheckPoint storage: Select a File System. The supported file systems are HDFS, OSS-HDFS, and Aliyun OSS (Aliyun OSS is available only for Flink 1.14 and 1.15). The required parameters vary based on the selected file system.

          Note: The OSS-HDFS file system is supported only by the Aliyun E-MapReduce 5.x Hadoop compute engine.

        • When File System is set to HDFS, configure the following parameters:

          • Directory Path: Enter the directory path for CheckPoint storage in the cluster. Ensure that Flink has permission to access this path. Example: hdfs://cdh-cluster-00001:8020/openflink/savepoint/. If your HDFS is a high availability (HA) cluster, you can enter an HA path, such as hdfs://nameservice/path.

          • Flink Kerberos: If the Flink cluster uses Kerberos authentication, enable Flink Kerberos, upload a Keytab File, and configure the Principal.

            • Keytab File: Upload the keytab file. You can get the keytab file from the Flink Server.

            • Principal: Enter the Kerberos authentication username that corresponds to the Flink Keytab File.

          • Username: If Flink Kerberos is disabled, enter the cluster username for submitting Flink jobs.

        • When File System is set to OSS-HDFS, configure the following parameters:

          • Directory Path: Enter the directory path for CheckPoint storage in the cluster. Ensure that Flink has permission to access this path. Example: hdfs://cdh-cluster-00001:8020/openflink/savepoint/. If your HDFS is an HA cluster, you can enter an HA path, such as hdfs://nameservice/path.

          • AccessKey ID and AccessKey Secret: Enter the AccessKey ID and AccessKey secret for accessing OSS in the cluster. Use an existing AccessKey pair or create a new one. For more information, see Create an AccessKey.

            Note

            To reduce the risk of an AccessKey secret leak, the AccessKey secret is shown only when you create it. You cannot view it later. Store it securely.

          • Flink Kerberos: If the Flink cluster uses Kerberos authentication, enable Flink Kerberos, upload a Keytab File, and configure the Principal.

            • Keytab File: Upload the keytab file. You can get the keytab file from the Flink Server.

            • Principal: Enter the Kerberos authentication username that corresponds to the Flink Keytab File.

          • Username: If Flink Kerberos is disabled, enter the cluster username for submitting Flink jobs.

        • When File System is set to Aliyun OSS, configure the following parameters:

          • Endpoint: Enter the endpoint for the OSS service.

          • Directory Path: Enter the path in the format oss://{Bucket}/{Object}.

          • AccessKey ID and AccessKey Secret: Enter the AccessKey ID and AccessKey secret for accessing OSS in the cluster. Use an existing AccessKey pair or create a new one. For more information, see Create an AccessKey.

            Note

            To reduce the risk of an AccessKey secret leak, the AccessKey secret is shown only when you create it. You cannot view it later. Store it securely.

          • Flink Kerberos: If the Flink cluster uses Kerberos authentication, enable Flink Kerberos, upload a Keytab File, and configure the Principal.

            • Keytab File: Upload the keytab file. You can get the keytab file from the Flink Server.

            • Principal: Enter the Kerberos authentication username that corresponds to the Flink Keytab File.

          • Username: If Flink Kerberos is disabled, enter the cluster username for submitting Flink jobs.

        Important: The AccessKey configured here takes precedence over the AccessKey configured in the core-site.xml file.
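The CheckPoint directory paths above are URIs whose scheme identifies the file system (for example, hdfs:// for HDFS, including HA paths such as hdfs://nameservice/path, and oss:// for OSS-style paths). A minimal sketch of such a scheme check, assuming only these two prefixes; `checkpoint_scheme` is a hypothetical helper, not a Dataphin API:

```python
from urllib.parse import urlparse

# Illustrative set of URI schemes for the file systems discussed above.
ALLOWED_SCHEMES = {"hdfs", "oss"}

def checkpoint_scheme(path: str) -> str:
    """Return the URI scheme of a CheckPoint directory path, raising a
    ValueError if it is not one of the expected file systems."""
    scheme = urlparse(path).scheme
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported checkpoint file system: {scheme!r}")
    return scheme

print(checkpoint_scheme("hdfs://nameservice/path"))              # HA-style HDFS path
print(checkpoint_scheme("oss://my-bucket/openflink/savepoint/"))
```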

      Kubernetes deployment mode

      • Basic cluster information

        You do not need to configure basic cluster information for the Kubernetes deployment mode.

      • Flink compute engine configuration

        In Kubernetes deployment mode, you can select one of the following file systems for the Flink compute engine: NFS, Amazon S3, or Azure Blob Storage. The required parameters vary based on the file system that you select.

        NFS

        • Server: Enter the domain name of the NFS server.

        • Version: Select the NFS version. NFSv3 and NFSv4 are supported.

        • Directory: Enter the directory path for CheckPoint storage on NFS. Example: /data/checkpoint.

        • Maximum Capacity: Enter the maximum storage capacity supported by NFS, in Gi. If this capacity is exceeded, CheckPoint storage is affected.

        Amazon S3

        • Directory Path: Enter the storage path. The default format is s3://{YOUR-BUCKET}/{path}.

        • Access Key and Secret Key: Enter the Access Key and Secret Key for accessing Amazon S3. Click the view icon to display them in plaintext.

        Azure Blob Storage

        • Protocol: Currently, only ABFS is supported.

        • Authentication Type: Currently, only Shared Key is supported.

        • Directory Path: Enter the storage path. The default format is abfs://{YOUR-CONTAINER}@{YOUR-AZURE-ACCOUNT}.dfs.core.windows.net/{object-path}.

        • AccessKey: Enter the access key for the Azure Blob Storage account. Click the view icon to display it in plaintext.
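The ABFS directory path format shown above encodes the container, storage account, and object path in one URI. The following sketch splits such a path into its components; it is an illustration of the stated format only, and `parse_abfs_path` is a hypothetical helper:

```python
import re

# Pattern for the default format described above:
# abfs://{container}@{account}.dfs.core.windows.net/{object-path}
ABFS_PATTERN = re.compile(
    r"^abfs://(?P<container>[^@/]+)@(?P<account>[^./]+)"
    r"\.dfs\.core\.windows\.net/(?P<path>.+)$"
)

def parse_abfs_path(path: str) -> dict:
    """Split an ABFS directory path into container, storage account,
    and object-path components."""
    match = ABFS_PATTERN.match(path)
    if match is None:
        raise ValueError(f"not a valid ABFS path: {path!r}")
    return match.groupdict()

print(parse_abfs_path("abfs://ckpt@mystorage.dfs.core.windows.net/flink/checkpoints"))
# → {'container': 'ckpt', 'account': 'mystorage', 'path': 'flink/checkpoints'}
```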

  4. Click Test Connection to verify the connection between Dataphin and the cluster.

    The Kubernetes deployment mode does not support connection testing. Click Submit directly.

  5. After the connection test is successful, click Submit.

What to do next

After you create the Flink compute source, you can attach it to a project. For more information, see Create a general-purpose project.