Dataphin supports the integration of StarRocks as an offline compute engine, enabling the management of offline computing tasks within Dataphin projects. This topic describes the steps to create a StarRocks compute source.
Background information
StarRocks is a high-performance analytical database designed for real-time, multidimensional, and highly concurrent data analysis. Known for its extensibility, availability, and ease of maintenance, StarRocks excels in OLAP scenarios, offering capabilities for real-time analysis, ad hoc queries, and data lake analytics. For more information, see the StarRocks official website.
Permissions
All system roles except tag business specialists and business members, as well as custom global roles with the Cluster-View permission, can view the details of each cluster.
Super administrators, system administrators, and custom global roles with the Cluster-Manage permission can create and manage StarRocks clusters. These users can also specify which users can reference the cluster when creating a StarRocks compute source and assign cluster administrators.
Cluster administrators can manage the clusters for which they are responsible.
Super administrators, system administrators, and custom global roles with the Compute Source Management-Create permission can create StarRocks compute sources. They can also select and reference StarRocks clusters that they have permission to use.
Procedure
On the Dataphin home page, choose Planning > Compute Source from the top menu bar.
On the Compute Source page, click Add Compute Source and select StarRocks Compute Source.
On the Create StarRocks Compute Source page, configure the following parameters.
Reference a specified cluster
Parameter
Description
Basic Information
Compute Type
Set to StarRocks.
Compute Source Name
The name can contain Chinese characters, letters, digits, underscores (_), and hyphens (-).
Configuration Method
Select Reference A Specified Cluster. From the drop-down list, select a cluster that the current user has permission to reference. Click View to go to the View StarRocks page for cluster details. If the required cluster is not available, click Configure Cluster to go to the Create StarRocks Cluster page and create a new one.
Note: Changes to the cluster information are automatically synced to the current compute source.
Compute Source Description
Enter a brief description for the compute source. The description can be up to 128 characters long.
Configuration
JDBC URL
The Java Database Connectivity (JDBC) URL configured in the selected StarRocks cluster is used by default and cannot be modified.
Catalog
Select Default Catalog or External Catalog.
Default Catalog: Manages internal data in StarRocks.
External Catalog: Lets you select from all external catalogs under the cluster or manually enter a catalog name.
Database
Select a database from the selected catalog, or manually enter a database name.
Authenticated User
Select Same As Cluster or Custom. The default is Same As Cluster. If you select Custom, you must also enter the Username and Password. To ensure that the node runs properly, make sure that the user has the required database permissions.
Task Resource Group
StarRocks uses resource groups to isolate resources and uses classifiers to match tasks to resource groups. Dataphin lets you specify resource group names for tasks with different priorities. This ensures that tasks are automatically assigned to the correct resource group at runtime for resource allocation and isolation.
Use Default Execution User: Uses the default execution user configured in the selected cluster.
Custom: Includes five priority levels: Highest Priority, High Priority, Medium Priority, Low Priority, and Lowest Priority. You must enter a resource group name for each priority level.
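The Custom option amounts to a mapping from task priority to a StarRocks resource group name. The following is a minimal sketch of that mapping and of the "one name per priority" requirement. The priority keys, the function names, and the example group names are hypothetical and are not part of any Dataphin or StarRocks API.

```python
# Priority levels listed for the Custom option (labels shortened for use as keys).
PRIORITIES = ("Highest", "High", "Medium", "Low", "Lowest")


def missing_resource_groups(mapping):
    """Return the priority levels that have no resource group name."""
    return [p for p in PRIORITIES if not mapping.get(p, "").strip()]


def resource_group_for(mapping, priority):
    """Look up the resource group name for a task priority.

    Raises ValueError if the mapping is incomplete or the priority is unknown.
    """
    missing = missing_resource_groups(mapping)
    if missing:
        raise ValueError("no resource group for: " + ", ".join(missing))
    if priority not in PRIORITIES:
        raise ValueError("unknown priority: " + priority)
    return mapping[priority]


# Example mapping; the group names are placeholders and must already exist in StarRocks.
example = {
    "Highest": "rg_p1",
    "High": "rg_p2",
    "Medium": "rg_p3",
    "Low": "rg_p4",
    "Lowest": "rg_p5",
}
```

With the example mapping, `resource_group_for(example, "Medium")` returns `"rg_p3"`, and removing any key makes the lookup fail before a task could be assigned to an undefined group.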
Manual configuration
Parameter
Description
Basic Information
Compute Type
Set to StarRocks.
Compute Source Name
The name can contain Chinese characters, letters, digits, underscores (_), and hyphens (-).
Configuration Method
Select Manual Configuration.
Compute Source Description
Enter a brief description for the compute source. The description can be up to 128 characters long.
Configuration
JDBC URL
Enter the JDBC URL. The following formats are supported:
jdbc:mysql:loadbalance://{fe1-host}:{port},{fe2-host}:{port},{fe3-host}:{port}/{database}
jdbc:mysql://{host}:{port}/database?key1=value1&key2=value2
Catalog
Only Default Catalog is available. To create a compute source from an external catalog, set Configuration Method to Reference A Specified Cluster.
FE Node URL
The connection addresses of Front End (FE) nodes. Separate multiple addresses with commas (,). Example:
fe_host1:http_port01,fe_host02:http_port02
Username and Password
Enter the username and password to log on to the StarRocks compute engine database. To ensure that tasks run properly, make sure the user has the required database permissions.
Task Resource Group
StarRocks uses resource groups to isolate resources and uses classifiers to match tasks to resource groups. Dataphin lets you specify resource group names for tasks with different priorities. This ensures that tasks are automatically assigned to the correct resource group at runtime for resource allocation and isolation.
Use Default Execution User: Uses the default execution user from the preceding configuration.
Custom: Includes five priority levels: Highest Priority, High Priority, Medium Priority, Low Priority, and Lowest Priority. You must enter a resource group name for each priority level.
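The two JDBC URL formats and the comma-separated FE Node URL value in the table above can be checked for shape before you paste them into the form. The following is a rough sketch that uses only the Python standard library. The function names are hypothetical, the regular expression covers only the two formats shown in this topic, and the host names and ports in the usage note are illustrative values.

```python
import re

# Matches jdbc:mysql://... and jdbc:mysql:loadbalance://... as shown in this topic.
JDBC_PATTERN = re.compile(
    r"^jdbc:mysql:(?P<lb>loadbalance:)?//"
    r"(?P<hosts>[^/]+)/(?P<database>[^?]*)(?:\?(?P<params>.*))?$"
)


def parse_host_port(item):
    """Split 'host:port' into (host, int(port)); raise ValueError if malformed."""
    host, sep, port = item.strip().rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected host:port, got: " + repr(item))
    return host, int(port)


def parse_fe_nodes(value):
    """Parse the FE Node URL field: comma-separated host:http_port pairs."""
    return [parse_host_port(item) for item in value.split(",") if item.strip()]


def parse_jdbc_url(url):
    """Break a JDBC URL into load-balance flag, hosts, database, and parameters."""
    match = JDBC_PATTERN.match(url)
    if match is None:
        raise ValueError("unsupported JDBC URL: " + repr(url))
    params = {}
    if match.group("params"):
        for pair in match.group("params").split("&"):
            key, _, val = pair.partition("=")
            params[key] = val
    return {
        "load_balanced": match.group("lb") is not None,
        "hosts": [parse_host_port(h) for h in match.group("hosts").split(",")],
        "database": match.group("database"),
        "params": params,
    }
```

For example, `parse_jdbc_url("jdbc:mysql:loadbalance://fe1:9030,fe2:9030/sales")` reports two hosts and the database `sales`, and `parse_fe_nodes("fe1:8030,fe2:8030")` returns one `(host, port)` tuple per FE node. A malformed entry raises `ValueError` instead of being silently accepted.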
Click Test Connection to test the connection to the compute source.
After the connection test is successful, click Submit.
After you create the StarRocks compute source, you can attach the compute source to a project. For more information, see Manage permissions and compute sources for a project space.
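If compute sources are prepared in bulk, the naming and description constraints from the parameter tables can be checked before the values are entered. The following is a small sketch of such a check. The function name is hypothetical, and treating "Chinese characters" as the basic CJK Unified Ideographs range (U+4E00 to U+9FFF) is an assumption, because this topic does not define the exact character set that Dataphin accepts.

```python
import re

# Assumption: "Chinese characters" means the basic CJK Unified Ideographs block.
NAME_PATTERN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_-]+")
MAX_DESCRIPTION_LENGTH = 128


def check_compute_source(name, description=""):
    """Return a list of problems; an empty list means both fields look valid."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append(
            "name may contain only Chinese characters, letters, digits, "
            "underscores (_), and hyphens (-)"
        )
    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append("description is longer than 128 characters")
    return problems
```

A name such as `starrocks_prod-01` passes, while a name that contains a space or punctuation is reported together with any overlong description.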
What to do next
Bind the StarRocks compute source to a project. For more information, see Create a general project.