Serverless StarRocks SQL node development

Create a Serverless StarRocks SQL node to run distributed SQL queries on structured data. This node uses an EMR Serverless StarRocks computing resource to run jobs more efficiently.

Prerequisites

Computing resource constraints: Only EMR Serverless StarRocks computing resources are supported. You must ensure network connectivity between the resource group and the computing resource.
Resource group constraints: This task runs only in a Serverless resource group.
(Optional; required only for RAM users) Add the RAM user who develops tasks to the corresponding workspace and assign the Developer or Workspace Administrator role. The Workspace Administrator role has extensive permissions, so assign it with caution. For more information about adding members, see Add members to a workspace.
You can skip this step if you use an Alibaba Cloud account.

Create a node

For instructions, see Create a node.

Develop the node

In the SQL editor, write the task code. You can define variables by using the ${variable_name} format. Then, assign values to the variables in the Scheduling Parameters section of the Scheduling Settings panel on the right. This enables dynamic parameter passing in scheduled jobs. For more information about how to use scheduling parameters, see Sources and expressions of scheduling parameters. The following code provides an example.

SHOW TABLES;
-- Define a variable named var. If you assign the value ${yyyymmdd} to var in the
-- scheduling parameters, the table name gets a suffix that indicates the business date.
CREATE TABLE IF NOT EXISTS userinfo_new_${var} (
  ip STRING COMMENT 'IP address',
  uid STRING COMMENT 'User ID',
  dt STRING NOT NULL COMMENT 'Partition column'
)
-- StarRocks declares partition columns in the column list and uses PARTITION BY
-- rather than the Hive-style PARTITIONED BY (assumes StarRocks 3.1 or later).
PARTITION BY (dt);
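
To make the substitution concrete, the following sketch shows a hypothetical query against the example table and the statement that would be sent to StarRocks after the variable is resolved. It assumes that var is assigned ${yyyymmdd} and that the business date of the run is January 1, 2024. Both the date and the query are illustrative and are not part of the original example.

```sql
-- Statement as written in the node (placeholder not yet resolved):
--   SELECT ip, uid FROM userinfo_new_${var} LIMIT 10;

-- Statement as executed, assuming var resolves to 20240101
-- (hypothetical business date):
SELECT ip, uid FROM userinfo_new_20240101 LIMIT 10;
```

Because the variable is replaced before the statement runs, the assigned value must produce a valid table name when it is appended to the prefix.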

Note

The maximum size of a single SQL statement is 130 KB.

Test the node

Configure run properties.

In the Run Configuration panel on the right, configure the Compute Resource and Resource Group. The following table describes the parameters.

Parameter	Description
Computing resource	Select the EMR Serverless StarRocks computing resource to use. If the drop-down list contains no computing resources, select Create Computing Resource. Important: The computing resource and the resource group must be able to reach each other over the network. For more information, see network connectivity solution.
Resource group	Select the resource group that passed the connectivity test when you attached the computing resource.
Script Parameters	If the node code defines variables in the ${parameter_name} format, specify the Parameter Name and Parameter Value for each variable in the Script Parameters section. At runtime, the system replaces each variable with its specified value. For more information, see Sources and expressions of scheduling parameters.

Run the node.
Click Save, and then click Run.
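
After the run finishes, a quick check such as the following can confirm that the example table was created. This is an optional sketch. It assumes that the test run used the value 20240101 for var in Script Parameters, which is a hypothetical value, so the table name below is illustrative.

```sql
-- List tables whose names start with the example prefix.
SHOW TABLES LIKE 'userinfo_new_%';

-- Inspect the columns of the table created by the test run
-- (assumes var was set to 20240101).
DESC userinfo_new_20240101;
```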

Next steps

Configure node scheduling: If you need to run a node periodically, configure its Scheduling Policy in the Scheduling Settings panel on the right.
Publish a node: To run a task in the production environment, publish the node by clicking the publish icon. A node runs on schedule only after it is published to the production environment.
Task O&M: After a task is published, you can monitor the status of its periodic runs in the Operation Center. For more information, see Get started with Operation Center.