Dataphin: Create a StarRocks data source

Last Updated: Nov 18, 2025

By creating a StarRocks data source, you can enable Dataphin to read business data from StarRocks or write data to StarRocks. This topic describes how to create a StarRocks data source.

Background information

StarRocks is a high-performance analytical database that supports real-time, multi-dimensional, and highly concurrent data analysis. It is highly scalable, highly available, and easy to maintain, and it supports OLAP scenarios such as real-time analysis, ad hoc queries, and data lake analysis. For more information, see the StarRocks official website.

Permissions

Only custom global roles with the Create Data Source permission and system roles such as super administrator, data source administrator, domain architect, and project administrator can create data sources.

Procedure

  1. In the top navigation bar of the Dataphin homepage, choose Management Center > Datasource Management.

  2. On the Datasource page, click +Create Data Source.

  3. On the Create Data Source page, in the Relational Database section, select StarRocks.

    If you have recently used StarRocks, you can also select it in the Recently Used section. Alternatively, enter a keyword such as StarRocks in the search box to filter the list.

  4. On the Create StarRocks Data Source page, configure the connection parameters.

    1. Configure the basic information of the data source.

      Parameter

      Description

      Data Source Name

      The name must meet the following requirements:

      • It can contain only Chinese characters, letters, digits, underscores (_), or hyphens (-).

      • It cannot exceed 64 characters in length.

      Data Source Code

      After you configure the data source code, you can reference tables in the data source in Flink SQL nodes using the data_source_code.table or data_source_code.schema.table format. To automatically access the data source of the corresponding environment, use the ${data_source_code}.table or ${data_source_code}.schema.table format.

      Important
      • The data source code cannot be modified after it is configured.

      • You can preview data on the object details page in the asset directory and asset inventory only after the data source code is configured.

      • In Flink SQL, only MySQL, Hologres, MaxCompute, Oracle, StarRocks, Hive, and SelectDB data sources are supported.

      Data Source Description

      A brief description of the data source. It cannot exceed 128 characters.

      Data Source Configuration

      Select the data source to configure:

      • If the business data source distinguishes between production and development data sources, select Production + Development Data Source.

      • If the business data source does not distinguish between production and development data sources, select Production Data Source.

      Tag

      You can categorize data sources by adding tags. For information about how to create tags, see Manage data source tags.
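The naming rules and reference formats in the table above can be checked before you fill in the form. The following is a minimal sketch, not Dataphin's own validation: the function names are hypothetical, "Chinese characters" is approximated by the Unicode range U+4E00 to U+9FFF, and an empty name is assumed to be invalid.

```python
import re
from typing import Optional

# Assumption: "Chinese characters" is approximated by the CJK Unified
# Ideographs block (U+4E00 to U+9FFF); Dataphin's actual check may differ.
# Assumption: an empty name is rejected (the document states only the maximum).
_NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]{1,64}")


def is_valid_source_name(name: str) -> bool:
    """Documented rules: allowed characters only, at most 64 characters."""
    return _NAME_PATTERN.fullmatch(name) is not None


def is_valid_description(description: str) -> bool:
    """Documented rule: the description cannot exceed 128 characters."""
    return len(description) <= 128


def table_reference(code: str, table: str, schema: Optional[str] = None,
                    env_aware: bool = False) -> str:
    """Build a Flink SQL table reference from a data source code.

    With env_aware=True the code is wrapped as ${code}, the form the
    document gives for resolving the data source of the current environment.
    """
    prefix = "${" + code + "}" if env_aware else code
    parts = [prefix, schema, table] if schema else [prefix, table]
    return ".".join(parts)
```

For example, with a hypothetical data source code sr_orders, table_reference("sr_orders", "orders", env_aware=True) yields ${sr_orders}.orders.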

    2. Configure the connection parameters between the data source and Dataphin.

      If you selected Production + Development Data Source for Data Source Configuration, configure the connection information for both the production and the development data source. If you selected Production Data Source, configure the connection information for the production data source only.

      Note

      Typically, production and development data sources should be configured as separate data sources to achieve environment isolation and reduce the impact of development activities on production. However, Dataphin also supports configuring them as the same data source with identical parameter values.

      Parameter

      Description

      JDBC URL

      Enter the Java Database Connectivity (JDBC) URL for StarRocks. The following formats are supported:

      • jdbc:mysql:loadbalance://{fe1-host}:{port},{fe2-host}:{port},{fe3-host}:{port}/{database}

      • jdbc:mysql://host:port/dbname

      Load URL

      The host and HTTP port of the frontend (FE). Use the format fe_host:http_port,fe_host:http_port.

      Username, Password

      The username and password for logging in to the database.
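As an illustration of the two JDBC URL formats and the Load URL format described above, the following sketch assembles the strings from a list of FE hosts. The function names, host names, and port values are hypothetical, and the sketch assumes that all FE nodes share the same port, although the load-balance format allows a separate port per host.

```python
from typing import Sequence


def build_jdbc_url(fe_hosts: Sequence[str], query_port: int, database: str) -> str:
    """Return a JDBC URL in one of the two documented formats.

    One host gives jdbc:mysql://host:port/dbname; several hosts give the
    jdbc:mysql:loadbalance:// form with a comma-separated host list.
    """
    if not fe_hosts:
        raise ValueError("at least one FE host is required")
    if len(fe_hosts) == 1:
        return f"jdbc:mysql://{fe_hosts[0]}:{query_port}/{database}"
    endpoints = ",".join(f"{host}:{query_port}" for host in fe_hosts)
    return f"jdbc:mysql:loadbalance://{endpoints}/{database}"


def build_load_url(fe_hosts: Sequence[str], http_port: int) -> str:
    """Return the Load URL as fe_host:http_port pairs joined by commas."""
    if not fe_hosts:
        raise ValueError("at least one FE host is required")
    return ",".join(f"{host}:{http_port}" for host in fe_hosts)


if __name__ == "__main__":
    # Example values only: use the hosts and ports configured on your FE nodes.
    hosts = ["fe1.example.com", "fe2.example.com", "fe3.example.com"]
    print(build_jdbc_url(hosts, 9030, "sales"))
    print(build_load_url(hosts, 8030))
```

Generating both values from the same host list helps keep the JDBC URL and the Load URL pointing at the same set of FE nodes.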

    3. Configure advanced settings for the data source.

      Parameter

      Description

      connectTimeout

      The connection timeout for the database, in milliseconds. The default value is 900000 (15 minutes).

      Note
      • If the JDBC URL already contains a connectTimeout setting, the value in the JDBC URL takes effect.

      • For data sources created in Dataphin versions earlier than V3.11, the default value is -1, which means no timeout limit.

      socketTimeout

      The socket timeout for the database, in milliseconds. The default value is 1800000 (30 minutes).

      Note
      • If the JDBC URL already contains a socketTimeout setting, the value in the JDBC URL takes effect.

      • For data sources created in Dataphin versions earlier than V3.11, the default value is -1, which means no timeout limit.

      Connection Retries

      If the database connection times out, the system will automatically retry the connection until the specified number of retries is reached. If the connection still fails after the maximum number of retries, the connection is considered failed.

      Note
      • The default number of retries is 1, and you can configure a value between 0 and 10.

      • By default, the connection retry count applies to offline integration tasks and to global quality, which requires the Asset Quality feature module to be enabled. You can also configure a retry count at the task level in offline integration tasks.
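The precedence and retry rules in the table above can be summarized in code. This is a sketch of the documented behavior rather than Dataphin's implementation: the function names are hypothetical, the timeouts are assumed to appear as query parameters after a ? in the JDBC URL, and a retry count of N is assumed to mean N additional attempts after the first one.

```python
from typing import Callable, Dict, Optional, TypeVar
from urllib.parse import parse_qs

T = TypeVar("T")

# Documented defaults, in milliseconds.
DEFAULT_TIMEOUTS_MS = {"connectTimeout": 900_000, "socketTimeout": 1_800_000}


def effective_timeouts(jdbc_url: str,
                       configured: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Resolve each timeout: JDBC URL value first, then the value
    configured in the advanced settings, then the documented default."""
    url_params = parse_qs(jdbc_url.partition("?")[2])
    resolved = {}
    for key, default in DEFAULT_TIMEOUTS_MS.items():
        if key in url_params:
            resolved[key] = int(url_params[key][0])
        elif configured and key in configured:
            resolved[key] = configured[key]
        else:
            resolved[key] = default
    return resolved


def connect_with_retries(connect: Callable[[], T], retries: int = 1) -> T:
    """Call connect(); on a timeout, retry up to `retries` more times.

    Assumption: `retries` counts additional attempts after the first.
    """
    if not 0 <= retries <= 10:
        raise ValueError("retries must be between 0 and 10")
    for attempt in range(retries + 1):
        try:
            return connect()
        except TimeoutError:
            if attempt == retries:
                raise  # retries exhausted: the connection is considered failed
    raise AssertionError("unreachable")
```

For instance, a URL ending in ?connectTimeout=5000 overrides a connectTimeout entered in the advanced settings, while socketTimeout still falls back to the configured or default value.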

  5. Select a Default Resource Group, which will be used to run tasks related to the current data source, including database SQL, offline database migration, data preview, and more.

  6. Click Test Connection, or click OK directly to save the configuration and create the StarRocks data source.

    When you click Test Connection, the system tests whether the data source can connect to Dataphin. If you click OK directly, the system automatically tests the connection for all selected clusters, but the data source is still created even if every selected cluster fails to connect.

    Test Connection tests the connection for the Default Cluster or for Registered Scheduling Clusters that have been registered in Dataphin and are in normal use. The Default Cluster is selected by default and cannot be deselected. A Registered Scheduling Cluster that has no resource groups does not support connection testing. Create a resource group for it before you test the connection.

    • The selected clusters are only used to test network connectivity with the current data source and are not used for running related tasks later.

    • The connection test usually takes less than 2 minutes. If it times out, you can click the icon to view the specific reason, and then retry.

    • Regardless of whether the test result is Connection Failed, Connection Successful, or Succeeded With Warning, the system will record the generation time of the final result.

      Note

      Only the test results for the Default Cluster include three connection statuses: Succeeded With Warning, Connection Successful, and Connection Failed. The test results for Registered Scheduling Clusters in Dataphin only include two connection statuses: Connection Successful and Connection Failed.

    • When the test result is Connection Failed, you can click the icon to view the specific failure reason.

    • When the test result is Succeeded With Warning, the application cluster connected successfully but the scheduling cluster failed to connect. In this state, the data source cannot be used for data development and integration. You can click the icon to view the log information.
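The relationship between the test statuses described above can be sketched as follows. The type and function names are hypothetical. Two points are assumptions that the text does not state: a failed application cluster connection is reported as Connection Failed regardless of the scheduling cluster, and only Connection Successful leaves the data source usable for data development and integration.

```python
from enum import Enum


class ConnectionStatus(Enum):
    SUCCESSFUL = "Connection Successful"
    WARNING = "Succeeded With Warning"
    FAILED = "Connection Failed"


def classify_default_cluster(app_ok: bool, scheduling_ok: bool) -> ConnectionStatus:
    """Default Cluster results can take any of the three statuses."""
    if not app_ok:
        # Assumption: an application cluster failure is reported as failed.
        return ConnectionStatus.FAILED
    if not scheduling_ok:
        # Documented: application cluster succeeded, scheduling cluster failed.
        return ConnectionStatus.WARNING
    return ConnectionStatus.SUCCESSFUL


def classify_registered_cluster(connected: bool) -> ConnectionStatus:
    """Registered Scheduling Cluster results are only successful or failed."""
    return ConnectionStatus.SUCCESSFUL if connected else ConnectionStatus.FAILED


def usable_for_development(status: ConnectionStatus) -> bool:
    """Documented for WARNING: not usable for data development and
    integration. Assumption: FAILED is not usable either."""
    return status is ConnectionStatus.SUCCESSFUL
```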