To run EMR Serverless StarRocks tasks in DataWorks, associate your EMR Serverless StarRocks instance as a computing resource. Once associated, the resource is available for data development tasks.
Before you begin
Make sure that the following requirements are met:
Workspace requirements:
A DataWorks workspace is created, and it uses the new version of DataStudio. For workspaces that do not use the new version of DataStudio, create a StarRocks data source instead.
The Resource Access Management (RAM) user is added to the workspace and assigned the Workspace Administrator role.
Instance requirements:
An EMR Serverless StarRocks instance exists in the same region as the DataWorks workspace. Cross-region association is not supported.
Network requirements:
A resource group is associated with the workspace and network connectivity is verified:
Serverless resource group: the computing resource instance must connect to the Serverless resource group.
Exclusive resource group (earlier version): the computing resource instance must connect to the exclusive resource group for scheduling. See Use an exclusive resource group of an earlier version.
The internal network CIDR block of the resource group is added to the internal network whitelist of the EMR Serverless StarRocks instance. To find the CIDR block, see Obtain the internal network CIDR block of the resource group.
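Before you add a CIDR block to the whitelist, it can help to confirm which addresses it covers. The following is a minimal sketch that uses Python's standard `ipaddress` module. The function name `is_covered`, the CIDR block, and the addresses are hypothetical placeholders for illustration, not values taken from a real resource group or instance.

```python
import ipaddress


def is_covered(address: str, whitelist_cidrs: list[str]) -> bool:
    """Return True if `address` falls inside any CIDR block in the whitelist."""
    ip = ipaddress.ip_address(address)
    # strict=False tolerates CIDR strings whose host bits are set, e.g. 192.168.0.1/24.
    return any(
        ip in ipaddress.ip_network(cidr, strict=False) for cidr in whitelist_cidrs
    )


# Hypothetical values for illustration only.
whitelist = ["192.168.0.0/24"]
print(is_covered("192.168.0.15", whitelist))  # True
print(is_covered("10.0.0.8", whitelist))      # False
```

A check like this only confirms that an address is inside the whitelisted range. It does not replace the connectivity test in the DataWorks console.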
Limits
Supported regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Hong Kong), Japan (Tokyo), Singapore, Indonesia (Jakarta), Germany (Frankfurt), US (Silicon Valley), and US (Virginia).
Required permissions:
| Operator | Required permissions |
|---|---|
| Alibaba Cloud account | No extra permissions required. |
| RAM user or RAM role | Must have the O&M or Workspace Administrator role, or the AliyunDataWorksFullAccess permission to create computing resources. See Grant the Workspace Administrator role to a user. |
Associate a Serverless StarRocks computing resource
Go to the computing resource page
Log on to the DataWorks console. Switch to the target region, then in the left navigation pane choose More > Management Center. Select your workspace from the drop-down list and click Go To Management Center.
In the left navigation pane, click Computing Resource.
Configure the computing resource
Click Associate Computing Resource.
On the Associate Computing Resource page, set the computing resource type to Serverless StarRocks. The Associate Serverless StarRocks Computing Resource configuration page opens.
Configure the parameters for your chosen configuration mode.
Alibaba Cloud Instance Mode
| Parameter | Description |
|---|---|
| Instance | Select the EMR Serverless StarRocks instance to associate. Alternatively, click Create in the drop-down list to create an EMR Serverless StarRocks instance. If you isolated the production and development environments when creating the workspace, select StarRocks instances for both environments. |
| Database Name | Select a database in the EMR Serverless StarRocks instance. If no database exists, create one directly in the instance. |
| Authentication Method | Available only for the Enterprise Edition. Other editions support only the Account and Password method. If set to Owner, the tenant administrator for the DataWorks Enterprise Edition must configure mappings between RAM users and engine accounts in Security Center by adding Manage Ranger and Identity Credentials. |
| Username | The username set when the EMR Serverless StarRocks instance was created. The default username is admin. |
| Password | The password set when the EMR Serverless StarRocks instance was created. To reset a forgotten password, go to the instance details page and click Reset The Password. |

Connection String Mode
| Parameter | Description |
|---|---|
| Host/IP Address | The IP address of the StarRocks frontend (FE). |
| Port | The query port of the StarRocks FE. The default is 9030. Adjust based on your StarRocks configuration. |
| Load URL | The StarRocks FE address for StreamLoad, in the format `FE_IP:FE_HTTP`. Separate multiple addresses with commas. `FE_IP` must be an internal network address; internet addresses are not supported. `FE_HTTP` is typically `8030` or `18030`. |
| Database Name | Select a database in the EMR Serverless StarRocks instance. If no database exists, create one in the SQL Editor of the instance. |
| Username | The username set when the EMR Serverless StarRocks instance was created. The default username is admin. |
| Password | The password set when the EMR Serverless StarRocks instance was created. To reset a forgotten password, go to the instance details page and click Reset The Password. |
| Advanced Parameters | Optional. Click Add Property to add extra connection properties. For available properties, see the MySQL Connector/J configuration reference. |
| Computing Resource Instance Name | A name that identifies the computing resource. Tasks use this name to select the resource at run time. |

In the connection configuration section, select the resource group that DataWorks uses to run StarRocks node tasks, then click Test Network Connectivity to verify that the resource group can reach the EMR Serverless StarRocks instance. For troubleshooting, see Overview of network connectivity solutions.
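The Load URL format described for Connection String Mode (`FE_IP:FE_HTTP`, comma-separated, internal addresses only) can be checked before you enter it in the console. The following is a minimal sketch, not part of DataWorks or StarRocks. The function name `parse_load_url` and the sample addresses are hypothetical, and the sketch assumes that "internal network address" can be approximated by `ipaddress.ip_address(...).is_private` from Python's standard library.

```python
import ipaddress


def parse_load_url(load_url: str) -> list[tuple[str, int]]:
    """Split a comma-separated Load URL into (FE_IP, FE_HTTP) pairs.

    Raises ValueError if an entry is malformed or its IP is not private.
    """
    endpoints = []
    for entry in load_url.split(","):
        host, sep, port_text = entry.strip().rpartition(":")
        if not sep or not host or not port_text.isdecimal():
            raise ValueError(f"expected FE_IP:FE_HTTP, got {entry.strip()!r}")
        ip = ipaddress.ip_address(host)  # raises ValueError if host is not an IP
        # Assumption: a private address stands in for "internal network address".
        if not ip.is_private:
            raise ValueError(f"{host} is not an internal network address")
        port = int(port_text)
        if not 1 <= port <= 65535:
            raise ValueError(f"port {port} is out of range")
        endpoints.append((host, port))
    return endpoints


# Hypothetical addresses for illustration only.
print(parse_load_url("192.168.1.10:8030, 192.168.1.11:8030"))
```

Catching a malformed or public address early avoids a failed connectivity test later. The authoritative check remains Test Network Connectivity in the console.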
Click OK.
Note: A Serverless StarRocks data source with the same name is automatically created in the Data Sources section of the current workspace.
What's next
After the computing resource is associated, use it for data development in the following node types:
Batch synchronization tasks: in a Data Integration > Offline Synchronization node. See Batch Synchronization nodes.
EMR Serverless StarRocks tasks: in a Serverless StarRocks SQL node. See Serverless StarRocks SQL nodes.