An EMR Serverless StarRocks SQL node runs SQL statements on an EMR Serverless StarRocks computing resource, which uses a distributed SQL query engine to process structured data.
Prerequisites
Before you begin, make sure that you have:
An EMR Serverless StarRocks computing resource that is network-connected to a Serverless resource group. Both are required: this node type does not support other computing resource types or resource group types.
(Resource Access Management (RAM) users only) A RAM user that has been added to the workspace and assigned the Developer or Workspace Administrator role. For details, see Add members to a workspace.
If you use an Alibaba Cloud account, you can skip this step.
The Workspace Administrator role has broad permissions. Assign it with caution.
Create a node
See Create a node for instructions.
Develop the node
Write SQL in the editing area. To pass dynamic values at runtime, reference variables in your SQL as ${variable_name}, then assign their values in the Scheduling Parameters section under Scheduling in the right panel.
The maximum size of a single SQL statement is 130 KB.
Example:
SHOW TABLES;
-- ${var} is a variable. If you set its value to ${yyyymmdd},
-- the table name ends with a date in yyyymmdd format.
CREATE TABLE IF NOT EXISTS userinfo_new_${var} (
    ip STRING COMMENT 'IP address',
    uid STRING COMMENT 'User ID'
)
PARTITIONED BY (
    dt STRING
);
For supported variable formats, see Supported formats for scheduling parameters.
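To make the variable substitution and the size limit concrete, the following is a minimal local sketch in Python, not the DataWorks implementation. The helper names render_sql and oversized_statements are hypothetical, the value 20240101 is an arbitrary sample, and the code assumes that 130 KB means 130 * 1024 bytes of UTF-8 text and that the limit is checked per statement after substitution.

```python
import re

# Assumption: "130 KB" is read here as 130 * 1024 bytes of UTF-8 text.
MAX_STATEMENT_BYTES = 130 * 1024

# Matches ${name} placeholders such as ${var}.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def render_sql(template: str, values: dict[str, str]) -> str:
    """Replace each ${name} placeholder with its value from `values`."""
    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value assigned for ${{{name}}}")
        return values[name]

    return PLACEHOLDER.sub(lookup, template)


def oversized_statements(script: str) -> list[str]:
    """Return the statements in `script` that exceed MAX_STATEMENT_BYTES."""
    # Naive split on ';' (does not handle semicolons inside string literals).
    statements = [s.strip() for s in script.split(";") if s.strip()]
    return [s for s in statements if len(s.encode("utf-8")) > MAX_STATEMENT_BYTES]


template = "CREATE TABLE IF NOT EXISTS userinfo_new_${var} (ip STRING, uid STRING);"
rendered = render_sql(template, {"var": "20240101"})
print(rendered)
print(oversized_statements(rendered))
```

Running a check like this before pasting a long generated script into the editing area can flag a statement that is likely to be rejected for size, although the platform's own accounting may differ from the byte count assumed here.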
Test the node
In the Run Configuration section on the right panel, configure the following parameters.
Computing resource: Select the EMR Serverless StarRocks computing resource to use. If none appear, select Create Computing Resource from the list. The computing resource must be network-connected to the resource group. See Network connectivity solutions.
Resource group: Select the resource group that passed the connectivity test when you attached the computing resource.
Script Parameters: If your SQL uses ${variable_name} variables, enter the parameter name and value here. At runtime, each variable is replaced with the value you specify.
Click Save, then click Run.
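Before a test run, it can help to confirm that every ${variable_name} in the SQL has a matching entry in Script Parameters. The following Python sketch shows one way to do that locally. The function check_script_parameters and the sample parameter names are hypothetical and are not part of DataWorks.

```python
import re

# Matches ${name} placeholders in the SQL text.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def check_script_parameters(sql: str, params: dict[str, str]) -> tuple[list[str], list[str]]:
    """Return (missing, unused).

    missing: variables referenced in the SQL that have no value in `params`.
    unused:  parameters in `params` that the SQL never references.
    """
    used = set(PLACEHOLDER.findall(sql))
    provided = set(params)
    return sorted(used - provided), sorted(provided - used)


sql = "SELECT uid FROM userinfo_new_${var} WHERE dt = '${dt}';"
missing, unused = check_script_parameters(sql, {"var": "20240101", "day": "01"})
print("missing:", missing)
print("unused:", unused)
```

In this sample, dt is reported as missing because no value was supplied for it, and day is reported as unused because the SQL never references it.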
Next steps
Schedule a node: Set Scheduling Policies in the Scheduling section to run the node on a recurring schedule.
Publish a node: Click the publish icon to publish the node. A node runs on schedule only after it is published to the production environment.
Node O&M: After publishing, monitor scheduled task status in Operation Center. See Get started with Operation Center.