Dataphin: Create a Flink compute source

Last Updated: Mar 31, 2026

A Flink compute source provides Flink-based compute resources for projects in Dataphin. You must bind a Flink compute source to a project before you can develop Flink compute jobs. This topic describes how to create a Flink compute source.

Prerequisites

Procedure

  1. In the top menu bar on the Dataphin homepage, select Planning > Compute Source.

  2. On the Compute Source page, click Add Compute Source and select Flink Compute Source.

  3. On the Create Compute Source page, configure the parameters.

    1. Basic Information

      • Compute Type: Select Flink.

      • Compute Source Name: Enter a name for the compute source. The name must meet the following requirements:

        • Can contain Chinese characters, letters, digits, underscores (_), and hyphens (-).

        • Cannot exceed 64 characters in length.

      • Compute Source Description: Enter a description for the compute source. The description must be 128 characters or less.
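The naming rules above can be checked before you open the console. The following is a minimal sketch, not Dataphin's own validation: the function name is hypothetical, "letters" is assumed to mean ASCII letters, and "Chinese characters" is approximated by the CJK Unified Ideographs block.

```python
import re

# Assumption: "Chinese characters" is approximated by the CJK Unified
# Ideographs block (U+4E00 to U+9FFF), and "letters" means ASCII letters.
# Dataphin's actual check may accept a different set.
_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]+")
MAX_NAME_LENGTH = 64


def is_valid_compute_source_name(name: str) -> bool:
    """Return True if name satisfies the two documented naming rules."""
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    return _NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_compute_source_name("flink_source-01")` passes, while a name that contains a space or is longer than 64 characters is rejected.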

    2. Select a deployment mode and configure its parameters

      Dataphin supports two deployment modes: Yarn and Kubernetes. The required parameters depend on the selected deployment mode.

      Yarn

      • Cluster Basic Information

        Parameter

        Description

        Configuration File

        Upload the cluster's yarn-site.xml, core-site.xml, and hdfs-site.xml configuration files.

        Cluster Kerberos

        Kerberos is an authentication protocol that uses symmetric-key cryptography to authenticate client/server applications. It supports Single Sign-On (SSO), allowing an authenticated client to access multiple services such as HBase and HDFS.

        If your cluster uses Kerberos authentication, enable this option and upload a Krb5 authentication file or configure the KDC server address.

        • Krb5 Authentication File: You need to upload a Krb5 file for Kerberos authentication.

        • KDC Server Address: The address of the KDC server for Kerberos authentication. You can specify multiple addresses separated by commas (,).

        Cluster Type (Optional)

        Select the type of your cluster to use for testing the connection. Supported types include Aliyun E-MapReduce 5.x, CDH 5.x Hadoop, CDH 6.x Hadoop, Cloudera Data Platform 7.x, AsiaInfo DP 5.3 Hadoop, and Transwarp TDH 6.x Hadoop.

        Important

        Although the connection test can succeed without a cluster type, we recommend selecting one to prevent potential connection failures.
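Before you upload the three configuration files, it can help to confirm that none is missing and that each one parses as a Hadoop configuration document. The sketch below assumes the standard Hadoop layout (a `<configuration>` root that contains `<property>` elements with `<name>` and `<value>` children). The function names are hypothetical and are not part of Dataphin.

```python
import xml.etree.ElementTree as ET

REQUIRED_FILES = ("yarn-site.xml", "core-site.xml", "hdfs-site.xml")


def missing_config_files(uploaded_names):
    """Return the required file names that are absent from uploaded_names."""
    present = set(uploaded_names)
    return [name for name in REQUIRED_FILES if name not in present]


def parse_hadoop_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    if root.tag != "configuration":
        raise ValueError("expected <configuration> as the root element")
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props
```

Reading `fs.defaultFS` from the parsed core-site.xml is one quick way to see which HDFS address or nameservice the cluster expects.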

      • Flink Compute Resources

        Parameter

        Description

        Compute Resource Type

        Select Resource Queue or Session Cluster.

        Resource Queue

        If you select Resource Queue as the compute resource type, enter the name of the YARN queue to which Flink jobs will be submitted. The name must follow these rules:

        • Length: The queue name cannot exceed 256 characters.

        • Character limit: The queue name can contain only English letters, numbers, spaces, and the following special characters: -_.@'().

        • Case-sensitivity: The queue name is case-sensitive.

        • Uniqueness: The queue name must be unique within the compute source.

        To configure multiple resource queues, click + Add.

        Note
        • You can add a maximum of 10 resource queues.

        • To remove a resource queue, click the delete icon. You must keep at least one resource queue. After you delete a queue, existing jobs that rely on it can no longer be submitted.

        Session Cluster

        If you select Session Cluster as the compute resource type, select one or more Session Clusters. The drop-down list contains all clusters created in Session Cluster, regardless of their status.
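The resource queue rules listed earlier in this table (length, allowed characters, case-sensitivity, uniqueness, and the limit of 1 to 10 queues) can be expressed as a small check. This is a sketch under the assumption that "English letters" means ASCII letters; the function name is hypothetical, and Dataphin's own validation may differ in detail.

```python
import re

# Letters, digits, spaces, and the special characters - _ . @ ' ( )
_QUEUE_PATTERN = re.compile(r"[A-Za-z0-9 \-_.@'()]+")
MAX_QUEUE_NAME_LENGTH = 256
MAX_QUEUES = 10


def validate_queue_names(names):
    """Return a list of problems found; an empty list means all rules pass."""
    problems = []
    if not 1 <= len(names) <= MAX_QUEUES:
        problems.append(f"expected 1 to {MAX_QUEUES} queues, got {len(names)}")
    seen = set()
    for name in names:
        label = repr(name[:20])  # shorten very long names in messages
        if len(name) > MAX_QUEUE_NAME_LENGTH:
            problems.append(f"{label}: longer than {MAX_QUEUE_NAME_LENGTH} characters")
        if not _QUEUE_PATTERN.fullmatch(name):
            problems.append(f"{label}: empty or contains unsupported characters")
        # Names are case-sensitive, so "Default" and "default" are distinct.
        if name in seen:
            problems.append(f"{label}: duplicate queue name")
        seen.add(name)
    return problems
```

Because the comparison is case-sensitive, `validate_queue_names(["root.default", "Root.Default"])` reports no problems, whereas listing the same name twice reports a duplicate.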

      • Flink Kerberos Authentication

        Note

        You can configure Flink Kerberos authentication only if you select Resource Queue as the compute resource type.

        • Flink Kerberos: If the Flink cluster uses Kerberos authentication, enable Flink Kerberos, upload a Keytab File, and configure a Principal.

          • Keytab File: Upload the keytab file. You can obtain the keytab file on the Flink Server.

          • Principal: Enter the Kerberos username that corresponds to the Flink keytab file.

        • Username: When Flink Kerberos is disabled, enter the cluster username used to submit Flink jobs.
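A Kerberos principal conventionally has the form primary[/instance]@REALM. The sketch below splits a principal into those parts so that you can check the value before you enter it. The helper and the example names are hypothetical, and the sketch ignores escaped characters, which real Kerberos principals may contain.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str
    instance: Optional[str]
    realm: str


def parse_principal(text: str) -> Principal:
    """Split 'primary[/instance]@REALM' into its components."""
    name, sep, realm = text.rpartition("@")
    if not sep or not name or not realm:
        raise ValueError(f"principal must look like primary[/instance]@REALM: {text!r}")
    primary, _, instance = name.partition("/")
    if not primary:
        raise ValueError(f"principal has an empty primary: {text!r}")
    # An absent or empty instance is reported as None.
    return Principal(primary, instance or None, realm)
```

For instance, `parse_principal("flink/host01.example.com@EXAMPLE.COM")` yields the primary `flink`, the instance `host01.example.com`, and the realm `EXAMPLE.COM`.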

      • CheckPoint Storage

        File system: Supports HDFS, OSS-HDFS, and Aliyun OSS (supported only for Flink 1.14 and 1.15). Different file systems require different parameters.

        Note

        The OSS-HDFS file system is supported only with the Aliyun E-MapReduce 5.x Hadoop compute engine.

        • If you select HDFS as the file system, configure the following parameter:

          Directory Path: Enter the directory path for Checkpoint storage, and ensure that Flink has permission to access this path. For example, hdfs://cdh-cluster-00001:8020/openflink/savepoint/. If your HDFS is a High-Availability (HA) cluster, you can specify a high-availability path, such as hdfs://nameservice/path.

        • If you select OSS-HDFS as the file system, configure the following parameters:

          • Directory Path: Enter the directory path for Checkpoint storage, and ensure that Flink has permission to access this path. For example, hdfs://cdh-cluster-00001:8020/openflink/savepoint/. If your HDFS is a high-availability (HA) cluster, you can specify the path in the format of hdfs://nameservice/path.

          • AccessKey ID and AccessKey Secret: Enter the AccessKey ID and AccessKey Secret that are used to access the cluster's OSS. You can use an existing AccessKey. To create a new one, see Create an AccessKey.

            Note

            To prevent AccessKey Secret leaks, the AccessKey Secret is displayed only when it is created and cannot be retrieved later.

        • If you select Aliyun OSS as the file system, configure the following parameters:

          • Endpoint: Enter the connection address for the OSS service.

          • Directory path: The format is oss://{Bucket}/{Object}.

          • AccessKey ID and AccessKey Secret: Enter the AccessKey ID and AccessKey Secret to access the cluster's OSS. You can use an existing AccessKey pair or create a new one. For more information, see Create an AccessKey.

            Note

            To prevent AccessKey Secret leaks, the AccessKey Secret is displayed only when it is created and cannot be retrieved later.

        Important

        The AccessKey credentials you configure here override any credentials set in the core-site.xml file.
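The directory path formats described for HDFS and Aliyun OSS can be checked with a short script. This is only a sketch of the documented shapes: the function name is hypothetical, OSS-HDFS is left out, and the check does not verify that the path exists or that Flink has permission to access it.

```python
from urllib.parse import urlparse


def check_checkpoint_path(file_system: str, path: str) -> None:
    """Raise ValueError if path does not match the documented shape."""
    parsed = urlparse(path)
    has_location = bool(parsed.netloc) and bool(parsed.path.strip("/"))
    if file_system == "HDFS":
        # Either hdfs://host:port/dir or, for HA clusters, hdfs://nameservice/dir.
        if parsed.scheme != "hdfs" or not has_location:
            raise ValueError("expected hdfs://<host:port or nameservice>/<path>")
    elif file_system == "Aliyun OSS":
        # Documented format: oss://{Bucket}/{Object}
        if parsed.scheme != "oss" or not has_location:
            raise ValueError("expected oss://{Bucket}/{Object}")
    else:
        raise ValueError(f"no path rule sketched for file system {file_system!r}")
```

Both `hdfs://cdh-cluster-00001:8020/openflink/savepoint/` and `hdfs://nameservice/path` pass the HDFS check, while an `hdfs://` path supplied for Aliyun OSS raises an error.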

      Kubernetes

      • Cluster Basic Information

        No cluster basic information is required for the Kubernetes deployment mode.

      • Flink Compute Engine Configuration

        For the Kubernetes deployment mode, you can select one of the following file systems: NFS, S3, or Azure Blob Storage. The required parameters vary based on the selected file system.

        NFS

        • Server: Enter the domain name of the NFS server.

        • Version: Select the NFS version. Supported versions are NFSv3 and NFSv4.

        • Contents: Enter the storage directory path for CheckPoint on NFS. For example, /data/checkpoint.

        • Maximum capacity: Enter the maximum storage capacity for NFS in GiB. Exceeding this limit will disrupt CheckPoint storage.
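Because exceeding the maximum capacity disrupts CheckPoint storage, you may want to watch how full the checkpoint directory is. The sketch below measures a directory on a mounted NFS share and compares it with a limit given in GiB (1 GiB = 1024³ bytes). The function names are hypothetical, and how Dataphin itself measures or enforces the limit is not stated in this topic.

```python
import os

GIB = 1024 ** 3  # bytes in one gibibyte


def directory_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            if os.path.isfile(full_path):
                total += os.path.getsize(full_path)
    return total


def capacity_used_fraction(used_bytes: int, max_capacity_gib: float) -> float:
    """Return used space as a fraction of the configured maximum capacity."""
    if max_capacity_gib <= 0:
        raise ValueError("maximum capacity must be positive")
    return used_bytes / (max_capacity_gib * GIB)
```

A typical use is `capacity_used_fraction(directory_size_bytes("/data/checkpoint"), 100)`, with an alert when the result approaches 1.0.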

        S3

        • Endpoint (Optional): Enter the endpoint of the S3 service, for example, http://s3.us-east-2.amazonaws.com.

          Note

          This field is optional for Amazon S3 but required for all other S3-compatible services.

        • Directory Path: Enter the storage path. The default path is s3://{YOUR-BUCKET}/{path}. We recommend that you use a dedicated directory for Checkpoint storage and clean it up regularly.

        • Access Key, Secret Key: Enter the AccessKey ID and AccessKey Secret for accessing your S3-compatible storage. Click the icon to view the value in plain text.
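The S3 settings can be sanity-checked as follows. This sketch splits a path of the form s3://{YOUR-BUCKET}/{path} and applies the rule that the endpoint is optional only for Amazon S3. The function names and the `is_amazon_s3` flag are hypothetical and do not correspond to fields in Dataphin.

```python
from urllib.parse import urlparse


def split_s3_path(path: str):
    """Split s3://{YOUR-BUCKET}/{path} into (bucket, key_prefix)."""
    parsed = urlparse(path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected s3://<bucket>/<path>, got {path!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def check_s3_settings(path: str, endpoint: str = "", is_amazon_s3: bool = True):
    """Validate the path and the endpoint rule; return (bucket, key_prefix)."""
    bucket, prefix = split_s3_path(path)
    # The endpoint may be omitted only when the target is Amazon S3 itself.
    if not is_amazon_s3 and not endpoint:
        raise ValueError("an endpoint is required for S3-compatible services other than Amazon S3")
    return bucket, prefix
```

A non-empty key prefix, such as `flink/checkpoints`, matches the recommendation to keep Checkpoint data in a dedicated directory.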

        Azure Blob Storage

        • Protocol: Currently, only ABFS is supported.

        • Authentication Method: Currently, only Shared Key is supported.

        • Directory Path: Enter the storage path. The default is abfs://{YOUR-CONTAINER}@${YOUR-AZURE-ACCOUNT}.dfs.core.windows.net/{object-path}.

        • Access Key: Enter the access key for your Azure Blob Storage account. Click the icon to view the value in plain text.
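The ABFS directory path packs the container, the storage account, and the object path into one URI. The following sketch separates the three parts, assuming the host suffix `.dfs.core.windows.net` shown in the default path. The function name and the example values are hypothetical.

```python
from urllib.parse import urlparse

ABFS_HOST_SUFFIX = ".dfs.core.windows.net"


def split_abfs_path(path: str):
    """Split an ABFS URI into (container, account, object_path)."""
    parsed = urlparse(path)
    if parsed.scheme != "abfs":
        raise ValueError("only the abfs:// scheme is handled here")
    # The part before "@" is the container; the part after it is the host.
    container, sep, host = parsed.netloc.partition("@")
    if not sep or not container or not host.endswith(ABFS_HOST_SUFFIX):
        raise ValueError("expected abfs://<container>@<account>.dfs.core.windows.net/<path>")
    account = host[: -len(ABFS_HOST_SUFFIX)]
    if not account:
        raise ValueError("storage account name is empty")
    return container, account, parsed.path.lstrip("/")
```

For example, `split_abfs_path("abfs://ckpt@myaccount.dfs.core.windows.net/flink/job1")` returns the container `ckpt`, the account `myaccount`, and the object path `flink/job1`.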

  4. Click Test Connection to test the connectivity between Dataphin and the cluster.

    The Kubernetes deployment mode does not support connection testing. If you use this mode, skip the test and click Submit directly.

  5. After the connection test succeeds, click Submit.

Next steps

After you create the Flink compute source, you can bind it to a project. For more information, see Create a general project.