DataWorks provides StarRocks Reader and StarRocks Writer for reading from and writing to StarRocks data sources. With a StarRocks node, you can write SQL to develop tasks, debug them interactively, schedule them on a recurring basis, and chain them with other task types in a workflow.
The DataWorks resource group acts as the execution layer between DataWorks and the StarRocks instance. To connect over the internal network, the resource group must be in the same virtual private cloud (VPC) as the StarRocks instance and have network access to it.
What you can do with a StarRocks node: write and run SQL statements (DDL and DML), schedule tasks to run on a recurring basis, and chain tasks with other DataWorks node types. For ETL sync tasks, use the StarRocks data source in Data Integration instead.
Prerequisites
Before you begin, ensure that you have:
A DataWorks workspace. See Activate DataWorks.
A resource group purchased, associated with the workspace, and with network settings configured. See Resource group management.
An EMR Serverless StarRocks instance. See Create an instance.
Step 1: Configure network access
Add the IP address or CIDR block of your DataWorks resource group to the internal IP address whitelist of the StarRocks instance. This lets the resource group reach the instance over the internal network.
To get the IP address or CIDR block of your resource group, see Configure an IP address whitelist.
To add an entry to the StarRocks instance whitelist, see Network access and security configuration.
Step 2: Create a StarRocks data source
Go to the Data Integration page. Log on to the DataWorks console. In the top navigation bar, select the region. In the left-side navigation pane, choose Data Integration > Data Integration. Select the workspace from the drop-down list and click Go to Data Integration.
In the left-side navigation pane, click Data source to open the Data Sources page.
Click Add Data Source. In the Add Data Source dialog box, search for StarRocks and click StarRocks.
In the Add StarRocks Data Source dialog box, configure the following parameters:
Parameter | Description | Example
Data Source Name | A name for this data source based on your business requirements. | StarRocks
Configuration Mode | Select Alibaba Cloud Instance Mode to connect over the internal network (requires the same VPC). To connect over the Internet, select Connection String Mode. See StarRocks data source. | Alibaba Cloud Instance Mode
Region | The region where the EMR Serverless StarRocks instance resides. | China East 1 (Hangzhou)
Instance | Select the EMR Serverless StarRocks instance from the drop-down list. | -
Database Name | The name of the database to connect to. Get it from EMR StarRocks Manager (Metadata Management page), or use a built-in database. To access tables across databases in SQL, specify the table as <database name>.<table name> and ensure you have the required permissions. | information_schema
Username and Password | Credentials for the StarRocks instance. The default administrator user is admin. The password is the one you set when creating the instance. To reset a forgotten password, see How do I reset the password of a StarRocks instance? | admin
In the Connection Configuration section, find the resource group associated with the workspace and click Test Network Connectivity in the Connection Status column.
If Connected appears, proceed to the next step.
If Connection failed appears, open the Network Connectivity Diagnostic Tool panel to review the failure cause and troubleshoot.
Click Complete.
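The Database Name parameter sets the database that the data source connects to, and tables in other databases are referenced as <database name>.<table name>. A minimal sketch of such a cross-database query follows. The database sales_db and the table orders are hypothetical names used only for illustration, and the query assumes the data source user has SELECT permission on that table.

```sql
-- Assumption: the data source connects to information_schema (the example
-- value in the parameter table), while the table queried here lives in a
-- different, hypothetical database named sales_db.
SELECT COUNT(*) AS row_count
FROM sales_db.orders;  -- <database name>.<table name>
```

Qualifying the table name this way avoids creating a separate data source for each database, provided the same credentials can access both.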
Step 3: Create a StarRocks node
Go to the DataStudio page. Log on to the DataWorks console. In the top navigation bar, select the region. In the left-side navigation pane, choose Data Development and O&M > Data Development. Select the workspace from the drop-down list and click Go to Data Development.
Find the target workflow. Right-click the workflow name and choose Create Node > Database > StarRocks.
In the Create Node dialog box, set Name and click Confirm.
Step 4: Develop StarRocks tasks
The following examples show two common SQL tasks: creating a database and querying table metadata. Both patterns apply to any SQL you write in the node.
On the configuration tab of the StarRocks node, select the StarRocks data source from the Select Data Source drop-down list.
Write SQL statements and click Run. When prompted, select the scheduling resource group.
Example 1: Create a database
CREATE DATABASE IF NOT EXISTS load_test;
To verify, open EMR StarRocks Manager and run the following:
In the left-side navigation pane, click SQL Editor.
Create a file, enter the following command, and click Run:
SHOW DATABASES;
If load_test appears in the result, the database was created successfully.
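Because the node accepts both DDL and DML, the same pattern extends to creating and populating a table in the new database. The following is a sketch only: the table demo_events, its columns, and the bucket count are hypothetical choices made for illustration, and it assumes the node runs multiple statements in sequence. Depending on the instance's deployment, you may also need table properties (for example, a replica count) that are not shown here.

```sql
-- DDL: create a hypothetical Duplicate Key table in load_test.
CREATE TABLE IF NOT EXISTS load_test.demo_events (
    event_id   BIGINT,
    event_name VARCHAR(64),
    event_time DATETIME
)
DUPLICATE KEY (event_id)
DISTRIBUTED BY HASH (event_id) BUCKETS 4;

-- DML: insert a few sample rows.
INSERT INTO load_test.demo_events VALUES
    (1, 'signup', '2024-01-01 10:00:00'),
    (2, 'login',  '2024-01-01 10:05:00');

-- Read the rows back to confirm the write.
SELECT event_name, COUNT(*) AS cnt
FROM load_test.demo_events
GROUP BY event_name;
```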

Example 2: Query table metadata
SELECT * FROM information_schema.tables
WHERE table_type = 'BASE TABLE';
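On an instance with many databases, the query above can return a long result. A narrower variant, sketched here under the assumption that you only care about the load_test database from Example 1, filters on table_schema and selects just the identifying columns:

```sql
-- List only base tables in the load_test database.
SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_schema = 'load_test'
ORDER BY table_name;
```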
What's next
To schedule the StarRocks node and configure dependencies, see Configure a StarRocks node.